New articles on Quantitative Biology


[1] 2603.26705

PI-Mamba: Linear-Time Protein Backbone Generation via Spectrally Initialized Flow Matching

Motivation: Generative models for protein backbone design have to simultaneously ensure geometric validity, sampling efficiency, and scalability to long sequences. However, most existing approaches rely on iterative refinement, quadratic attention mechanisms, or post-hoc geometry correction, leading to a persistent trade-off between computational efficiency and structural fidelity. Results: We present Physics-Informed Mamba (PI-Mamba), a generative model that enforces exact local covalent geometry by construction while enabling linear-time inference. PI-Mamba integrates a differentiable constraint-enforcement operator into a flow-matching framework and couples it with a Mamba-based state-space architecture. To improve optimisation stability and backbone realism, we introduce a spectral initialization derived from the Rouse polymer model and an auxiliary cis-proline awareness head. Across benchmark tasks, PI-Mamba achieves 0.0\% local geometry violations and high designability (scTM = $0.91\pm 0.03$, n = 100), while scaling to proteins exceeding 2,000 residues on a single A5000 GPU (24 GB).


[2] 2603.26809

Dictionary-based Pathology Mining with Hard-instance-assisted Classifier Debiasing for Genetic Biomarker Prediction from WSIs

Prediction of genetic biomarkers, e.g., microsatellite instability in colorectal cancer, is crucial for clinical decision making. However, two primary challenges hamper accurate prediction: (1) It is difficult to construct a pathology-aware representation that captures the complex interconnections among pathological components. (2) Whole-slide images (WSIs) contain a large proportion of areas unrelated to genetic biomarkers, which makes the model prone to overfitting simple but irrelevant instances. We propose a Dictionary-based hierarchical pathology mining with hard-instance-assisted classifier Debiasing framework, dubbed D2Bio, to address these challenges. Our first module, dictionary-based hierarchical pathology mining, mines diverse and very fine-grained pathological contextual interactions without restricting the distance between patches. The second module, hard-instance-assisted classifier debiasing, learns a debiased classifier by focusing on hard but task-related features, without any additional annotations. Experimental results on five cohorts show the superiority of our method, with over 4% improvement in AUROC compared with the second best on the TCGA-CRC-MSI cohort. Our analysis further shows the clinical interpretability of D2Bio in genetic biomarker diagnosis and potential clinical utility in survival analysis. Code will be available at this https URL.


[3] 2603.26860

Ecological systems in a modeling perspective

May (1974, 1976) opened the debate on whether biological populations might exhibit nonlinear dynamics and chaos. However, it has in general been difficult to verify nonlinear dynamics in biological populations. There are many reports concerning problems with this issue and some of them can be traced back to Hassell, Lawton, and May (1976) and Morris (1990). Our objective is not a discussion of the presence of nonlinear dynamics in biological populations. Instead, we analyze whether ecological census data can be used for validating nonlinearities at all. We choose our models and our situation so that as much as possible can be done rigorously by hand. We consider a clearly nonlinear, chemostat-based model that is isolated. Some noise must be considered, and we choose a minimal approach: only noise originating from the fact that ecological populations remain finite is considered, cf. Bailey (1964). In ecology, exceptionally long and famous time series are those collected by Nicholson (1954) and Utida (1957). Our judgement is that an ecological time series containing a few hundred data points is exceptionally long.


[4] 2603.26974

Recent advances in modeling and simulation of biological phenomena in crowded and cellular environments

While experiments and computer simulations to study biological phenomena are usually performed in dilute in vitro conditions, such phenomena happen inside the cellular cytoplasm, an environment densely packed with diverse macromolecules. Here, we review recent computational methods to investigate crowded and cellular environments. Protein crowders, inert crowders and small molecules were used to mimic crowding. Simulations were performed for models of the cytoplasm. New methods were developed to simulate crowded systems. Despite the challenges, modeling and simulation of biological phenomena inside cells is a growing field with considerable potential to improve our understanding of how such phenomena happen in vivo.


[5] 2603.27017

Beyond BMI: Smartphone Body Composition Phenotyping for Cardiometabolic Risk Assessment

Body Mass Index (BMI) is a widely accessible but imprecise proxy of cardiometabolic health. While assessing true body composition is superior, gold-standard methods like Dual-Energy X-ray Absorptiometry (DXA) are not scalable. We address this gap by developing and validating "PhotoScan," a method to estimate body composition from smartphone imagery. We pretrained a deep learning model on UK Biobank participants (N=35,323) and fine-tuned on a newly recruited clinical cohort (PhotoBIA cohort, N=677) with diverse ethnicity, age, and body fat distribution, achieving high accuracy against DXA for total body fat percentage (BF%, MAE = 2.15%), Android-to-Gynoid fat ratio (A/G, MAE = 0.11), and visceral-to-subcutaneous fat area ratio (V/S, MAE = 0.09). Generalizability of the model was demonstrated on an independent metabolic health study cohort (MetabolicMosaic cohort, N=132 participants), achieving MAEs of 2.13% for BF%, 0.09 for A/G, and 0.09 for V/S. We then evaluated the clinical utility of these metrics in the MetabolicMosaic cohort by predicting insulin resistance (IR). Adding PhotoScan-derived body composition metrics to a baseline demographics model (Age, Sex, BMI) significantly improved insulin resistance classification (Area Under the Receiver Operating Characteristic Curve "AUROC" 76.0% vs 69.2%, DeLong test p=0.002, Net Reclassification Index "NRI" 0.593). Crucially, this accessible smartphone method achieved performance nearly equivalent to adding clinical-grade DXA data to the baseline demographics model (AUROC 77.3% vs 69.2%, DeLong test p=0.004, NRI 0.748). These findings demonstrate that smartphone-based phenotyping captures clinically meaningful risk signals missed by BMI and anthropometrics, offering a scalable alternative to DXA for cardiometabolic risk stratification.


[6] 2603.27104

Autonomous Agent-Orchestrated Digital Twins (AADT): Leveraging the OpenClaw Framework for State Synchronization in Rare Genetic Disorders

Background: Medical Digital Twins (MDTs) are computational representations of individual patients that integrate clinical, genomic, and physiological data to support diagnosis, treatment planning, and outcome prediction. However, most MDTs remain static or passively updated, creating a critical synchronization gap, especially in rare genetic disorders where phenotypes, genomic interpretations, and care guidelines evolve over time. Methods: We propose an agent-orchestrated digital twin framework using OpenClaw's proactive "heartbeat" mechanism and modular Agent Skills. This Autonomous Agent-orchestrated Digital Twin (AADT) system continuously monitors local and external data streams (e.g., patient-reported phenotypes and updates in variant classification databases) and executes automated workflows for data ingestion, normalization, state updates, and trigger-based analysis. Results: A prototype implementation demonstrates that agent orchestration can continuously synchronize MDT states with both longitudinal phenotype updates and evolving genomic knowledge. In rare disease settings, this enables earlier diagnosis and more accurate modeling of disease progression. We present two case studies, including variant reinterpretation and longitudinal phenotype tracking, highlighting how AADTs support timely, auditable updates for both research and clinical care. Conclusion: The AADT framework addresses the key bottleneck of real-time synchronization in MDTs, enabling scalable and continuously updated patient models. We also discuss data security considerations and mitigation strategies through human-in-the-loop system design.


[7] 2603.27145

Pan-Cancer Mapping of the Tumor Immune Landscape through Metagene Clustering and Predictive Modeling

As immunotherapies become standard cancer treatments, it is increasingly important to identify a patient's immune profile, which encompasses the activity of immune cells within the tumor microenvironment and the presence of specific biomarkers. However, despite advances in immune profiling with high-throughput sequencing, we lack mechanistic explanations of what drives immune phenotypes. This study aimed to identify novel, robust immune-related gene clusters (metagenes) and evaluate their prognostic significance and functional relevance across cancer types using a comprehensive computational pipeline. We acquired pan-cancer bulk RNA-seq and established immune subtypes from The Cancer Genome Atlas (TCGA). Using expression-based filtering and clustering of genes with ANOVA and Gaussian Mixture Model (GMM), we identified 48 unique metagenes. These metagenes achieved 87% accuracy in predicting the established subtypes. SHAP analysis revealed the most predictive metagenes per subtype, while functional enrichment analysis identified their associated pathways. Genes were ranked by differential expression between high- and low-expression groups. The metagenes revealed insights, including co-expression of immune activation and regulatory factors, links between cell cycle regulation and immune evasion, and dynamic microenvironment remodeling signatures. Kaplan-Meier survival analysis and multivariate Cox regression revealed that many metagenes had prognostic value for overall survival. Overall, the metagenes represent coordinated biological programs across diverse cancer types, providing a foundation for developing robust, broadly applicable immuno-oncology biomarkers that extend beyond single-gene markers. They demonstrate prognostic value across cancer types and hold potential to guide immunotherapy treatment decisions.


[8] 2603.27255

When can fitness epistasis be ignored in a polygenic trait at equilibrium?

Although many phenotypic traits are determined by a large number of genetic variants, the behavior of allele frequencies in a polygenic trait is not completely understood. The problem is especially challenging when the quantitative trait of interest is under epistatic selection, as the allele frequency at a locus is affected by those at other loci. Here, we consider a panmictic, diploid finite population evolving under stabilizing selection and symmetric mutations when the population is in linkage equilibrium. In the stationary state, using a diffusion theory, we calculate the marginal distribution of allele frequency, and find parameter regimes where fitness epistasis cannot be ignored for an accurate description of the frequency distribution. For such parameters, the mean deviation in the phenotypic optimum and genic variance are, however, found to be well captured even when epistatic interactions are neglected. Thus, while the presence of epistasis may not be evident in phenotypic quantities, it can strongly affect the allele frequency distribution. We also find that the allele frequency distribution at a locus is unimodal if its effect size is below a threshold effect and bimodal otherwise; these results are the stochastic analog of the deterministic ones where the stable allele frequency becomes bistable when the effect size exceeds a threshold. Our analytical results are verified against Monte Carlo simulations and numerical integration of a Langevin equation.


[9] 2603.27347

Information in a recurrent Retina-V1 network with realistic noise, feedback and nonlinearities

Quantitative estimation of information flow in early vision with psychophysically realistic networks is still an open issue. This is because, to date, the necessary elements (general and plausible network, accurate noise, and reliable information measures) have not been put together. As a result, previous works made different approximations that limit the generality of their results. This work combines the following elements for the first time: (1) General and plausible recurrent net: a cascade of linear+nonlinear psychophysically tuned layers [IEEE TIP.06, J.Neurophysiol.19, J.Math.Neurosci.20, Neurocomp.24], augmented to consider top-down feedback following [Nat.Neurosci.21, Neurips.22]. (2) Accurate noise in every layer, which is tuned to reproduce psychometric functions both in contrast detection and discrimination following [this http URL 25]. (3) Reliable information measures that have been checked with analytical results, both in general [IEEE PAMI 24], and in similar visual neuroscience contexts [Neurocomp.24], and hence can be applied in this (more general) case where analytical results are difficult to obtain. The joint use of these elements allows a reliable study of information flow depending on different connectivity schemes (different nonlinearities and interactions), different noise sources, and different stimuli. Results show the benefits of feedback in two ways: (1) the information loss in the data processing inequality along the pathway is reduced by the V1 -> LGN recurrence for values of feedback that give stable steady state solutions, and (2) the stability of the net is assessed through standard Poincaré analysis and we find an optimal value for the feedback in terms of the accuracy of the reconstructed signal from the cortical representation.


[10] 2603.27410

Grounding Social Perception in Intuitive Physics

People infer rich social information from others' actions. These inferences are often constrained by the physical world: what agents can do, what obstacles permit, and how the physical actions of agents causally change an environment and other agents' mental states and behavior. We propose that such rich social perception is not mere visual pattern matching but a reasoning process grounded in an integration of intuitive psychology with intuitive physics. To test this hypothesis, we introduce PHASE (PHysically grounded Abstract Social Events), a large dataset of procedurally generated animations depicting physically simulated two-agent interactions on a 2D surface. Each animation follows the style of the Heider and Simmel movie, with systematic variation in environment geometry, object dynamics, agent capacities, goals, and relationships (friendly/adversarial/neutral). We then present a computational model, SIMPLE, a physics-grounded Bayesian inverse planning model that integrates planning, probabilistic planning, and physics simulation to infer agents' goals and relations from their trajectories. Our experimental results showed that SIMPLE achieved high accuracy and agreement with human judgments across diverse scenarios, while feedforward baseline models -- including strong vision-language models -- and physics-agnostic inverse planning failed to achieve human-level performance and did not align with human judgments. These results suggest that our model provides a computational account for how people understand physically grounded social scenes by inverting a generative model of physics and agents.


[11] 2603.27465

Poisoning the Genome: Targeted Backdoor Attacks on DNA Foundation Models

Genomic foundation models trained on DNA sequences have demonstrated remarkable capabilities across diverse biological tasks, from variant effect prediction to genome design. These models are typically trained on massive, publicly sourced genomic datasets comprising trillions of nucleotide tokens, which renders them intrinsically susceptible to errors, artifacts, and adversarial issues embedded in the training data. Unlike natural language, DNA sequences lack the semantic transparency that might allow model makers to filter out corrupted entries, making genomic training corpora particularly susceptible to undetected manipulation. While training data poisoning has been established as a credible threat to large language models, its implications for genomic foundation models remain unexplored. Here, we present the first systematic investigation of training data poisoning in genomic language models. We demonstrate two complementary attack vectors. First, we show that adversarially crafted sequences can selectively degrade generative behavior on targeted genomic contexts, with backdoor activation following a sigmoidal dose-response relationship and full implantation achieved at 1 percent cumulative poison exposure. Second, targeted label corruption of downstream training data can selectively compromise clinically relevant variant classification, demonstrated using BRCA1 variant effect prediction. Our results reveal that genomic foundation models are vulnerable to targeted data poisoning attacks, underscoring the need for data provenance tracking, integrity verification, and adversarial robustness evaluation in the genomic foundation model development pipeline.


[12] 2603.27484

Quantitative mapping of dynamic 3D transport in growing cells via volumetric spatio-temporal image correlation spectroscopy (vSTICS)

Quantitatively mapping three-dimensional (3D) flow, diffusion, and particle density in crowded living cells remains challenging because most dynamic optical microscopy measurements are effectively planar and existing analysis methods struggle with dense, noisy volumetric data. We introduce volumetric spatio-temporal image correlation spectroscopy (vSTICS), a framework that recovers voxel-resolved flow, diffusion coefficients, and particle densities from 3D fluorescence time series. Growing Camellia japonica pollen tubes were imaged with field-synthesis lattice light-sheet microscopy, and localized 3D spatio-temporal correlation analysis was applied to overlapping volumetric samples to generate maps of velocity, diffusion, and density. Validation with synthetic flow-diffusion simulations showed accurate recovery of seeded transport parameters, including velocities near $3$ $\mu$m s$^{-1}$ and diffusion near $10^{-3}$ $\mu$m$^2$ s$^{-1}$. Fluorescent microsphere experiments verified particle number and point spread function readouts and measured diffusion coefficients of $0.3 \pm 0.1$ $\mu$m$^2$ s$^{-1}$ in gel, consistent with imaging-FCS measurements of $0.5 \pm 0.2$ $\mu$m$^2$ s$^{-1}$. Applied to mitochondria in pollen tubes, vSTICS resolved a bidirectional reverse-fountain pattern with slower anterograde transport ($0.1$-$1$ $\mu$m s$^{-1}$) and faster retrograde motion peaking near $3$ $\mu$m s$^{-1}$, plus a retrograde corridor about $2$ $\mu$m wide. Density and diffusion maps indicated a denser, more advective core and higher peripheral diffusion. High-density sub-diffraction vesicle mapping produced similar velocity landscapes with about ten-fold higher particle densities. These results establish vSTICS as a practical method for quantitative 3D mapping of intracellular transport and refine the reverse-fountain model by revealing asymmetric, predominantly transverse circulation.


[13] 2603.27644

Energy Landscapes of Emotion: Quantifying Brain Network Stability During Happy and Sad Face Processing Using EEG-Based Hopfield Energy

Understanding how the human brain instantiates distinct emotional states is a key challenge in affective neuroscience. While network-based approaches have advanced emotion processing research, they remain largely descriptive, leaving the dynamical stability of emotional brain states unexplored. This study introduces a novel framework to quantify this stability by applying Hopfield network energy to empirically derived functional connectivity. High-density EEG was recorded from 20 healthy adults during a happy versus sad facial expression discrimination task. Functional connectivity was estimated using the weighted Phase Lag Index to obtain artifact-robust, frequency-specific matrices, which served as coupling weights in a continuous Hopfield energy model to calculate a scalar energy value per trial. Statistical comparisons showed sad emotional processing was associated with significantly lower (more negative) energy in delta, theta, and alpha bands, with the strongest effect in the alpha band (Cohen's d = 0.83). Energy correlated strongly negatively with global efficiency (r = -0.72), indicating hyperconnected, efficient networks correspond to more stable states. Alpha-band energy also correlated positively with reaction time during sad trials (r = 0.61), linking deeper network stability to increased cognitive effort. These findings demonstrate emotional valence corresponds to distinct attractor basins in the brain's functional landscape, with sadness occupying a deeper, more stable configuration than happiness. The Hopfield energy metric provides a principled, quantifiable measure of emotional brain state stability, opening new avenues for understanding affective dynamics in health and disease.
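
To make the energy measure concrete, the sketch below evaluates the standard quadratic Hopfield energy, E = -1/2 * sum over i != j of w_ij * s_i * s_j, for a symmetric coupling matrix. It is an illustration only: the abstract does not say how the state vector is built from EEG, nor whether the extra terms of the continuous Hopfield formulation are included, so the function name `hopfield_energy`, the toy weights, and the +1/-1 states are all assumptions.

```python
def hopfield_energy(weights, state):
    """Quadratic Hopfield energy E = -1/2 * sum_{i != j} w_ij * s_i * s_j.

    weights: square, symmetric list of lists (hypothetical stand-in for a
             connectivity matrix such as one derived from wPLI).
    state:   list of node activations (assumed here; not specified in the abstract).
    """
    n = len(state)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:  # self-coupling is excluded
                total += weights[i][j] * state[i] * state[j]
    return -0.5 * total


# Toy example: two positively coupled nodes.
toy_weights = [[0.0, 1.0],
               [1.0, 0.0]]
aligned = hopfield_energy(toy_weights, [1, 1])      # nodes agree
opposed = hopfield_energy(toy_weights, [1, -1])     # nodes disagree
```

In this toy case the aligned state has the lower (more negative) energy, which is the sense in which lower energy is read as a more stable configuration.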


[14] 2603.27787

Cardiovascular-Kidney-Metabolic Health: Insights from Wearables and Blood Biomarkers

Cardiovascular-Kidney-Metabolic (CKM) syndrome represents a growing public health crisis, yet the subclinical heterogeneity of its component systems remains underexplored. Early detection of physiological deviation is critical for preventing irreversible organ damage and mortality. Here, we characterize the prevalence and interplay of CKM impairment in a US cohort (N=841) by integrating continuous wearable data with clinical biomarkers. We assessed cardiovascular and kidney health via clinical biomarkers, namely Chol/HDL and eGFR, and metabolic health risk through the Homeostatic Model Assessment of Insulin Resistance (HOMA-IR). We show that while metabolic and cardiovascular disruptions are significantly associated (r=0.26, p<0.001), early-stage kidney impairment manifests independently. Utilizing a normalized deviance score, we identified significant health impairments in 29.0% of the cohort. Cardiovascular deviation was the most prevalent singular phenotype (13.3%), followed by metabolic (9.1%) and renal (6.25%) deviations, with dual metabolic-cardiovascular impairment occurring in only 2.2% of participants. These findings suggest that high system-specific deviance may serve as an indicator for accelerated physiological aging within the respective organ system. Furthermore, feature ablation analysis revealed that step count, Active Zone Minutes, and resting heart rate are the most potent wearable-derived predictors of cardiovascular and metabolic decline. These findings underscore the necessity of a multi-system subtyping approach, demonstrating that wearable-derived phenotypes can facilitate the early, targeted interventions required to manage the complex landscape of CKM syndrome.


[15] 2603.27926

Allocentric Navigation Is Computationally Universal

This report presents three proofs showing that idealized architectures capable of navigation guided by allocentric maps with landmark structure can be computationally universal. The navigation may occur either online (in the environment) or offline (in the animal's head). The first proof proceeds from a universal two-counter machine by encoding counters as the positions of two movable markers on orthogonal coordinate axes. The second proof directly simulates an ordinary one-tape Turing machine by using a writable tape-path embedded in the map. The third proof strengthens locality by replacing the globally designated path with a two-dimensional field of landmarks that carries only local predecessor/successor information. These constructions are mathematically close to classical graph-based models in computability theory, including Kolmogorov-Uspensky machines, storage-modification machines, graph Turing machines, and related navigation-on-graphs models. Accordingly, the bare universality results are mathematically unsurprising. Nevertheless, the present treatment is, as far as I know, the first self-contained reconstruction of such universality demonstrations in the idiom of allocentric cognitive maps and offline navigation, that is, within an architecture whose core representational and computational primitives are drawn from a body of empirical and theoretical work on spatial navigation. The report therefore reframes known computability-theoretic ideas to show that an allocentric navigation-based architecture can be computationally universal.
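
The first construction can be pictured with a small interpreter for a two-counter machine in which each counter is the position of a marker on one of two orthogonal axes. This is a minimal sketch under assumptions: the instruction format (`inc`, `dec`, `halt`), the function name `run_two_counter_machine`, and the example program are hypothetical and are not taken from the report.

```python
def run_two_counter_machine(program, x_marker=0, y_marker=0, max_steps=10_000):
    """Run a two-counter machine whose counters are marker positions.

    program: list of instructions (hypothetical encoding):
        ("inc", axis, next_pc)                     move the marker one step outward
        ("dec", axis, pc_if_moved, pc_if_at_origin) move inward, or branch if at 0
        ("halt",)
    Returns the final (x, y) marker positions.
    """
    position = {"x": x_marker, "y": y_marker}
    pc = 0
    for _ in range(max_steps):
        instruction = program[pc]
        op = instruction[0]
        if op == "halt":
            return position["x"], position["y"]
        if op == "inc":
            _, axis, next_pc = instruction
            position[axis] += 1
            pc = next_pc
        elif op == "dec":
            _, axis, pc_if_moved, pc_if_at_origin = instruction
            if position[axis] == 0:
                pc = pc_if_at_origin      # zero test: marker sits at the origin
            else:
                position[axis] -= 1
                pc = pc_if_moved
        else:
            raise ValueError(f"unknown instruction: {op!r}")
    raise RuntimeError("step limit reached without halting")


# Example program: walk the x marker back to the origin while pushing
# the y marker outward by the same number of steps (i.e. y := x + y).
ADD_X_INTO_Y = [
    ("dec", "x", 1, 2),   # 0: if x > 0 step inward and go to 1, else go to 2
    ("inc", "y", 0),      # 1: step y outward, loop back to 0
    ("halt",),            # 2
]
final_positions = run_two_counter_machine(ADD_X_INTO_Y, x_marker=3, y_marker=2)
```

Only two primitive moves and an "am I at the origin?" test are needed, which is why marker positions on two axes suffice to stand in for the counters.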


[16] 2603.28600

A Normative Theory of Decision Making from Multiple Stimuli: The Contextual Diffusion Decision Model

The dynamics of simple two-alternative forced-choice (2AFC) decisions are well-modeled by a class of random walk models (e.g. Laming, 1968; Ratcliff, 1978; Usher & McClelland, 2001; Bogacz et al., 2006). However, in real life, even simple decisions involve the dynamically changing influence of additional information. In this work, we describe a computational theory of decision making from multiple sources of information, grounded in Bayesian inference and consistent with a simple neural network. This Contextual Diffusion Decision Model (CDDM) is a formal generalization of the Diffusion Decision Model (DDM), a popular existing model of fixed-context decision making (Ratcliff, 1978), and shares with it both a mechanistic and a probabilistic motivation. Just as the DDM is a model for a variety of simple 2AFC decision making tasks, we demonstrate that the CDDM supports a variety of simple context-dependent tasks of longstanding interest in psychology, including the Flanker (Eriksen & Eriksen, 1974), AX-CPT (Servan-Schreiber et al., 1996), Stop-Signal (Logan & Cowan, 1984), Cueing (Posner, 1980), and Prospective Memory paradigms (Einstein & McDaniel, 2005). Further, we use the CDDM to perform a number of normative rational analyses exploring optimal response and memory allocation policies. Finally, we show how the use of a consistent model across tasks allows us to recover consistent qualitative data patterns in multiple tasks, using the same model parameters.
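
For readers unfamiliar with the random walk models mentioned above, here is a minimal sketch of a single trial of the classical fixed-context DDM, simulated with Euler-Maruyama steps between two symmetric bounds. It is not the CDDM; the function name `simulate_ddm_trial`, the parameter names, and the default values are assumptions chosen for illustration.

```python
import math
import random


def simulate_ddm_trial(drift, threshold, noise_sd=1.0, dt=0.001,
                       start=0.0, max_time=10.0, rng=None):
    """Simulate one trial of a classical two-boundary drift-diffusion process.

    Evidence x starts at `start` and evolves as
        x += drift * dt + noise_sd * sqrt(dt) * N(0, 1)
    until it crosses +threshold ("upper") or -threshold ("lower"),
    or until max_time elapses (choice is then None).
    Returns (choice, decision_time).
    """
    rng = rng or random.Random()
    x, t = start, 0.0
    while abs(x) < threshold and t < max_time:
        x += drift * dt + noise_sd * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
    if x >= threshold:
        return "upper", t
    if x <= -threshold:
        return "lower", t
    return None, t


# Usage: a handful of noisy trials with a fixed seed for repeatability.
rng = random.Random(0)
trials = [simulate_ddm_trial(drift=1.0, threshold=1.0, rng=rng) for _ in range(5)]
```

With the noise switched off the process reduces to a straight line that hits the upper bound at roughly threshold / drift, which is a convenient sanity check on the step size.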


[17] 2603.26707

The Cognitive Divergence: AI Context Windows, Human Attention Decline, and the Delegation Feedback Loop

This paper documents and theorises a self-reinforcing dynamic between two measurable trends: the exponential expansion of large language model (LLM) context windows and the secular contraction of human sustained-attention capacity. We term the resulting asymmetry the Cognitive Divergence. AI context windows have grown from 512 tokens in 2017 to 2,000,000 tokens by 2026 (factor ~3,906; fitted lambda = 0.59/yr; doubling time ~14 months). Over the same period, human Effective Context Span (ECS) -- a token-equivalent measure derived from validated reading-rate meta-analysis (Brysbaert, 2019) and an empirically motivated Comprehension Scaling Factor -- has declined from approximately 16,000 tokens (2004 baseline) to an estimated 1,800 tokens (2026, extrapolated from longitudinal behavioural data ending 2020 (Mark, 2023); see Section 9 for uncertainty discussion). The AI-to-human ratio grew from near parity at the ChatGPT launch (November 2022) to 556--1,111x raw and 56--111x quality-adjusted, after accounting for retrieval degradation (Liu et al., 2024; Chroma, 2025). Beyond documenting this divergence, the paper introduces the Delegation Feedback Loop hypothesis: as AI capability grows, the cognitive threshold at which humans delegate to AI falls, extending to tasks of negligible demand; the resulting reduction in cognitive practice may further attenuate the capacities already documented as declining (Gerlich, 2025; Kim et al., 2026; Kosmyna et al., 2025). Neither trend reverses spontaneously. The paper characterises the divergence statistically, reviews neurobiological mechanisms across eight peer-reviewed neuroimaging studies, presents empirical evidence bearing on the delegation threshold, and proposes a research agenda centred on a validated ECS psychometric instrument and longitudinal study of AI-mediated cognitive change.
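
The headline figures can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the abstract; reading the upper raw ratio as 2,000,000 tokens divided by the 1,800-token ECS estimate is an assumption, since the abstract does not state how the range was formed.

```python
import math

# Figures quoted in the abstract.
tokens_2017 = 512
tokens_2026 = 2_000_000
fitted_lambda_per_year = 0.59
ecs_2026_tokens = 1_800

# Overall growth factor of the context window: 2,000,000 / 512 = 3906.25.
growth_factor = tokens_2026 / tokens_2017

# Doubling time implied by the fitted exponential rate: ln(2) / lambda.
doubling_time_months = 12 * math.log(2) / fitted_lambda_per_year   # about 14.1

# Assumed reading of the upper raw AI-to-human ratio: about 1,111.
upper_raw_ratio = tokens_2026 / ecs_2026_tokens
```

Note that the fitted rate is not the same as the rate implied by the two endpoints alone (ln(3906.25) / 9 years is about 0.92 per year), so the 14-month doubling time follows from the fitted lambda rather than from the 2017 and 2026 values.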


[18] 2603.26708

Fractional epidemics from quantum loops

Classical compartmental models of epidemiology rely on well-mixed, local interaction approximations that fail to capture the heavy-tailed burst dynamics and long-range spatial correlations observed in real-world outbreaks. While fractional calculus is frequently employed to model these anomalous behaviors, fractional operators are introduced phenomenologically. In this work, we demonstrate that fractional space-time epidemic dynamics emerge naturally and rigorously from first principles using a non-equilibrium quantum field theory model. By mapping the stochastic contagion process to a gauge-mediated field theory via the Doi-Peliti formalism, we go beyond the static mean-field approximation to compute the full dynamical one-loop vacuum polarization. We prove that integrating out a dynamically fluctuating host vacuum generates anomalous momentum and frequency scaling. Transitioning back to coordinate space, this yields a system of coupled space-time fractional integro-differential equations, where the non-linear transmission vertex is governed by parabolic Riesz potentials and Riemann-Liouville time derivatives. We show that in the anomalous regime ($\alpha < 2$), local Debye screening is modified, facilitating Lévy flight super-spreading and temporal avalanches. Consequently, the effective reproductive number ($R_{eff}$) ceases to be a scalar, transforming into a spectral dispersion relation bounded strictly by the ultraviolet spatial cutoff.


[19] 2603.26811

Implicit neural representations for larval zebrafish brain microscopy: a reproducible benchmark on the MapZebrain atlas

Implicit neural representations (INRs) offer continuous coordinate-based encodings for atlas registration, cross-modality resampling, sparse-view completion, and compact sharing of neuroanatomical data. Yet reproducible evaluation is lacking for high-resolution larval zebrafish microscopy, where preserving neuropil boundaries and fine neuronal processes is critical. We present a reproducible INR benchmark for the MapZebrain larval zebrafish brain atlas. Using a unified, seed-controlled protocol, we compare SIREN, Fourier features, Haar positional encoding, and a multi-resolution grid on 950 grayscale microscopy images, including atlas slices and single-neuron projections. Images are normalized with per-image (1,99) percentiles estimated from 10% of pixels in non-held-out columns, and spatial generalization is tested with a deterministic 40% column-wise hold-out along the X-axis. Haar and Fourier achieve the strongest macro-averaged reconstruction fidelity on held-out columns (about 26 dB), while the grid is moderately behind. SIREN performs worse in macro averages but remains competitive on area-weighted micro averages in the all-in-one regime. SSIM and edge-focused error further show that Haar and Fourier preserve boundaries more accurately. These results indicate that explicit spectral and multiscale encodings better capture high-frequency neuroanatomical detail than smoother-bias alternatives. For MapZebrain workflows, Haar and Fourier are best suited to boundary-sensitive tasks such as atlas registration, label transfer, and morphology-preserving sharing, while SIREN remains a lightweight baseline for background modelling or denoising.


[20] 2603.26822

Modularity, asymmetry, and polarization shape consensus speed in the voter model

In populations with community structure, the formation of consensus requires both alignment within and diffusion of beliefs across groups, processes that evolve on distinct time scales. How do modularity, asymmetry, and polarization shape this process? We study a variant of the voter model in which a population is divided into two cliques of sizes $N_1$ and $N_2$. At each time step, a pair of nodes is selected; if their binary opinions differ, each agent adopts the opinion of the other with probability $p$. With probability $\alpha$, the pairing occurs within a single clique, and with probability $1-\alpha$, across cliques. We analyze how this coupling strength, population imbalance, and initial polarization jointly determine the time to consensus. Formation of consensus generally starts with inter-clique interactions rapidly synchronizing the two cliques' opinion fractions, after which consensus is reached through a slower diffusion along the synchronized manifold; this slow stage is largely insensitive to $\alpha$ except when the cliques are nearly disconnected. To analyze these dynamics, we derive stochastic differential equations and Fokker-Planck approximations in the large-population limit, and assess their accuracy against the discrete model. While $\alpha$ primarily affects the fast alignment stage, initially polarized and asymmetric populations exhibit nontrivial effects, including regimes in which an intermediate level of coupling minimizes consensus time. A small-clique scaling analysis reveals that this optimum arises from a competition between fast alignment drift and noise amplification in the smaller group, and provides an approximate decomposition of consensus time into fast and slow contributions.
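
The update rule described above can be sketched as a direct simulation. Several details are assumptions, because the abstract leaves them open: the clique for a same-clique pairing is chosen in proportion to its size, and when opinions differ a single randomly chosen member of the pair copies the other with probability p. The function name `consensus_time` and its parameters are hypothetical.

```python
import random


def consensus_time(n1, n2, alpha, p, frac1, frac2, rng, max_steps=10_000_000):
    """Simulate a two-clique voter model and return the number of steps to consensus.

    n1, n2:        clique sizes (each must be at least 2)
    alpha:         probability that the selected pair lies inside one clique
    p:             probability that a disagreeing pair triggers an opinion copy
    frac1, frac2:  initial fraction of opinion 1 in each clique
    Returns None if consensus is not reached within max_steps.
    """
    cliques = [
        [1 if i < round(frac1 * n1) else 0 for i in range(n1)],
        [1 if i < round(frac2 * n2) else 0 for i in range(n2)],
    ]
    total = n1 + n2
    ones = sum(cliques[0]) + sum(cliques[1])
    for step in range(max_steps):
        if ones == 0 or ones == total:
            return step
        if rng.random() < alpha:
            # Same-clique pair; clique picked in proportion to size (assumption).
            c = 0 if rng.random() < n1 / total else 1
            i, j = rng.sample(range(len(cliques[c])), 2)
            a, b = (c, i), (c, j)
        else:
            # Cross-clique pair: one node from each clique.
            a, b = (0, rng.randrange(n1)), (1, rng.randrange(n2))
        opinion_a = cliques[a[0]][a[1]]
        opinion_b = cliques[b[0]][b[1]]
        if opinion_a != opinion_b and rng.random() < p:
            # One randomly chosen member copies the other (assumption).
            if rng.random() < 0.5:
                cliques[a[0]][a[1]] = opinion_b
                ones += opinion_b - opinion_a
            else:
                cliques[b[0]][b[1]] = opinion_a
                ones += opinion_a - opinion_b
    return None


# Usage: a fully polarized start, one clique all 1s and the other all 0s.
steps = consensus_time(5, 5, alpha=0.5, p=0.5, frac1=1.0, frac2=0.0,
                       rng=random.Random(1))
```

Averaging the returned step counts over many seeds while sweeping `alpha` is one way to look for the intermediate-coupling minimum the abstract reports.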


[21] 2603.26858

A Hierarchical Sheaf Spectral Embedding Framework for Single-Cell RNA-seq Analysis

Single-cell RNA-seq data analysis typically requires representations that capture heterogeneous local structure across multiple scales while remaining stable and interpretable. In this work, we propose a hierarchical sheaf spectral embedding (HSSE) framework that constructs informative cell-level features based on persistent sheaf Laplacian analysis. Starting from scale-dependent low-dimensional embeddings, we define cell-centered local neighborhoods at multiple resolutions. For each local neighborhood, we construct a data-driven cellular sheaf that encodes local relationships among cells. We then compute persistent sheaf Laplacians over sampled filtration intervals and extract spectral statistics that summarize the evolution of local relational structure across scales. These spectral descriptors are aggregated into a unified feature vector for each cell and can be directly used in downstream learning tasks without additional model training. We evaluate HSSE on twelve benchmark single-cell RNA-seq datasets covering diverse biological systems and data scales. Under a consistent classification protocol, HSSE achieves competitive or improved performance compared with existing multiscale and classical embedding-based methods across multiple evaluation metrics. The results demonstrate that sheaf spectral representations provide a robust and interpretable approach for single-cell RNA-seq data representation learning.


[22] 2603.26994

ImmSET: Sequence-Based Predictor of TCR-pMHC Specificity at Scale

T cells are a critical component of the adaptive immune system, playing a role in infectious disease, autoimmunity, and cancer. T cell function is mediated by the T cell receptor (TCR) protein, a highly diverse receptor targeting specific peptides presented by the major histocompatibility complex (pMHCs). Predicting the specificity of TCRs for their cognate pMHCs is central to understanding adaptive immunity and enabling personalized therapies. However, accurate prediction of this protein-protein interaction remains challenging due to the extreme diversity of both TCRs and pMHCs. Here, we present ImmSET (Immune Synapse Encoding Transformer), a novel sequence-based architecture designed to model interactions among sets of variable-length biological sequences. We train this model across a range of dataset sizes and compositions and study the resulting models' generalization to pMHC targets. We describe a failure mode in prior sequence-based approaches that inflates previously reported performance on this task and show that ImmSET remains robust under stricter evaluation. In systematically testing the scaling behavior of ImmSET with training data, we show that performance scales consistently with data volume across multiple data types and compares favorably with the pre-trained protein language model ESM2 fine-tuned on the same datasets. Finally, we demonstrate that ImmSET can outperform AlphaFold2 and AlphaFold3-based pipelines on TCR-pMHC specificity prediction when provided sufficient training data. This work establishes ImmSET as a scalable modeling paradigm for multi-sequence interaction problems, demonstrated in the TCR-pMHC setting but generalizable to other biological domains where high-throughput sequence-driven reasoning complements structure prediction and experimental mapping.


[23] 2603.27188

Persistent Memory Through Triple-Loop Consolidation in a Non-Gradient Dissipative Cognitive Architecture

Dissipative cognitive architectures maintain computation through continuous energy expenditure, where units that exhaust their energy are stochastically replaced with fresh random state. This creates a fundamental challenge: how can persistent, context-specific memory survive when all learnable state is periodically destroyed? Existing memory mechanisms -- including elastic weight consolidation, synaptic intelligence, and surprise-driven gating -- rely on gradient computation and are inapplicable to non-gradient dissipative systems. We introduce Deep Memory (DM), a non-gradient persistent memory mechanism operating through a triple-loop consolidation cycle: (1) recording of expert-specific content centroids, (2) seeding of replaced units with stored representations, and (3) stabilization through continuous re-entry. We demonstrate that discrete expert routing via Mixture-of-Experts (MoE) gating is a causal prerequisite for DM, preventing centroid convergence that would render stored memories identical. Across ${\sim}970$ simulation runs spanning thirteen experimental blocks: (i) discrete routing is causally necessary for specialization ($\text{MI}=1.10$ vs. $0.001$; $n=91$); (ii) DM achieves $R=0.984$ vs. $0.385$ without memory ($n=16$); (iii) continuous seeding reconstructs representations after interference ($R_\mathrm{recon}=0.978$; one-shot fails; $n=30$); (iv) the mechanism operates within a characterized $(K,p)$ envelope ($n=350$); (v) recording $\times$ seeding is the minimal critical dyad ($n=40$); (vi) DM outperforms non-gradient baselines (Hopfield, ESN) under matched turnover ($n=370$). These results establish DM as a falsifiable mechanism for persistent memory in non-gradient cognitive systems, with functional parallels to hippocampal consolidation.


[24] 2603.27303

Self-evolving AI agents for protein discovery and directed evolution

Protein scientific discovery is bottlenecked by the manual orchestration of information and algorithms, while general agents are insufficient in complex domain projects. VenusFactory2 provides an autonomous framework that shifts from static tool usage to dynamic workflow synthesis via a self-evolving multi-agent infrastructure to address protein-related demands. It outperforms a set of well-known agents on the VenusAgentEval benchmark, and autonomously organizes the discovery and optimization of proteins from a single natural language prompt.


[25] 2603.27597

From indicators to biology: the calibration problem in artificial consciousness

Recent work on artificial consciousness shifts evaluation from behaviour to internal architecture, deriving indicators from theories of consciousness and updating credences accordingly. This is progress beyond naive Turing-style tests. But the indicator-based programme remains epistemically under-calibrated: consciousness science is theoretically fragmented, indicators lack independent validation, and no ground truth of artificial phenomenality exists. Under these conditions, probabilistic consciousness attribution to current AI systems is premature. A more defensible near-term strategy is to redirect effort toward biologically grounded engineering -- biohybrid, neuromorphic, and connectome-scale systems -- that reduces the gap with the only domain where consciousness is empirically anchored: living systems.


[26] 2603.27611

What does a system modify when it modifies itself?

When a cognitive system modifies its own functioning, what exactly does it modify: a low-level rule, a control rule, or the norm that evaluates its own revisions? Cognitive science describes executive control, metacognition, and hierarchical learning with precision, but lacks a formal framework distinguishing these targets of transformation. Contemporary artificial intelligence likewise exhibits self-modification without common criteria for comparison with biological cognition. We show that the question of what counts as a self-modifying system entails a minimal structure: a hierarchy of rules, a fixed core, and a distinction between effective rules, represented rules, and causally accessible rules. Four regimes are identified: (1) action without modification, (2) low-level modification, (3) structural modification, and (4) teleological revision. Each regime is anchored in a cognitive phenomenon and a corresponding artificial system. Applied to humans, the framework yields a central result: a crossing of opacities. Humans have self-representation and causal power concentrated at upper hierarchical levels, while operational levels remain largely opaque. Reflexive artificial systems display the inverse profile: rich representation and causal access at operational levels, but none at the highest evaluative level. This crossed asymmetry provides a structural signature for human-AI comparison. The framework also offers insight into artificial consciousness, with higher-order theories and Attention Schema Theory as special cases. We derive four testable predictions and identify four open problems: the independence of transformativity and autonomy, the viability of self-modification, the teleological lock, and identity under transformation.


[27] 2603.27716

The role of neuromorphic principles in the future of biomedicine and healthcare

Neuromorphic engineering has matured over the past four decades and is currently experiencing explosive growth with the potential to transform biomedical engineering and neurotechnologies. Participants at the Neuromorphic Principles in Biomedicine and Healthcare (NPBH) Workshop (October 2024) -- representing a broad cross-section of the community, including early-career and established scholars, engineers, scientists, clinicians, industry, and funders -- convened to discuss the state of the field, current and future challenges, and strategies for advancing neuromorphic research and development for biomedical applications. Publicly approved recordings with transcripts (this https URL) and slides (this https URL) can be found at the workshop website.


[28] 2603.28200

A Deep Reinforcement Learning Framework for Closed-loop Guidance of Fish Schools via Virtual Agents

Guiding collective motion in biological groups is a fundamental challenge in understanding social interaction rules and developing automated systems for animal management. In this study, we propose a deep reinforcement learning (RL) framework for the closed-loop guidance of fish schools using virtual agents. These agents are controlled by policies trained via Proximal Policy Optimization (PPO) in simulation and deployed in physical experiments with rummy-nose tetras (Petitella bleheri), enabling real-time interaction between artificial agents and live individuals. To cope with the stochastic behavior of live individuals, we design a composite reward function to balance directional guidance with social cohesion. Our systematic evaluation of visual parameters shows that a white background and larger stimulus sizes maximize guidance efficacy in physical trials. Furthermore, evaluation across group sizes revealed that while the system demonstrates effective guidance for groups of five individuals, this capability markedly degrades as group size increases to eight. This study highlights the potential of deep RL for automated guidance of biological collectives and identifies challenges in maintaining artificial influence in larger groups.


[29] 2603.28285

Global stability and uniform persistence in an epidemic model with saturating fomite-mediated transmission

We analyse the global dynamics of a Susceptible--Vaccinated--Exposed--Infected--Recovered (SVEIR) epidemic model with demographic turnover, imperfect vaccination, and two transmission routes: direct host-to-host contagion and indirect transmission via contaminated fomites. Indirect transmission is described through an environmental pathogen concentration and a Holling-type dose--response function, accounting for nonlinear incidence at high contamination levels. Threshold conditions separating disease elimination from long-term persistence are expressed in terms of the control reproduction number $\mathcal R_c$, and the classical threshold condition $\mathcal R_c<1$ is derived for the local asymptotic stability of the disease-free equilibrium. For the Holling type~II case, we further obtain an explicit closed-form sufficient condition for the global asymptotic stability of the disease-free equilibrium by applying the Kamgang--Sallet approach for monotone systems with a Metzler infected subsystem. In the absence of vaccination, this criterion recovers the sharp threshold $\mathcal R_0\le 1$ for the global asymptotic stability of the disease-free equilibrium, where $\mathcal R_0$ denotes the basic reproduction number. Conversely, when $\mathcal R_c>1$, we establish uniform persistence of the infection and the existence of at least one endemic equilibrium using persistence theory for semiflows and an acyclicity analysis of the boundary dynamics. Overall, our results quantify the combined impact of vaccination and saturating fomite-mediated transmission on the global behaviour of the model.


[30] 2603.28464

Will a time-varying complex system be stable?

Randomly-assembled dynamical systems are theoretically predicted to be unstable upon crossing a critical threshold of complexity, as first shown by May. Yet, empirical complex systems exhibit remarkable stability, indicating the presence of additional mechanisms playing a stabilizing role. The relation between complexity and stability is typically assessed by assuming fixed interactions, whereas real systems often evolve in intrinsically time-dependent states. To understand how this affects stability, we linearize a general non-autonomous dynamics around a reference operating state and model the resulting parameters as stochastic processes, which represent the minimal extension of static random interactions to time-varying ones. We derive exact stability bounds that generalize complexity-stability theory to dynamically varying systems. Notably, we find that temporal variability allows systems to remain stable even when their instantaneous Jacobian would predict instability. We compare our results against a non-linear neural network model, where our theory applies exactly, and the generalized Lotka-Volterra equations, where we numerically find that time-varying interactions systematically postpone the onset of replica-symmetry breaking. Overall, our results indicate that temporal variability systematically improves stability, demonstrating a general mechanism by which complex systems can violate classical complexity-stability bounds.


[31] 2603.28764

Geometry-aware similarity metrics for neural representations on Riemannian and statistical manifolds

Similarity measures are widely used to interpret the representational geometries used by neural networks to solve tasks. Yet, because existing methods compare the extrinsic geometry of representations in state space, rather than their intrinsic geometry, they may fail to capture subtle yet crucial distinctions between fundamentally different neural network solutions. Here, we introduce metric similarity analysis (MSA), a novel method which leverages tools from Riemannian geometry to compare the intrinsic geometry of neural representations under the manifold hypothesis. We show that MSA can be used to i) disentangle features of neural computations in deep networks with different learning regimes, ii) compare nonlinear dynamics, and iii) investigate diffusion models. Hence, we introduce a mathematically grounded and broadly applicable framework to understand the mechanisms behind neural computations by comparing their intrinsic geometries.


[32] 1904.03236

Log-normal Superstatistics Reveals Statistical Resilience in the Panic Response of Confined Ants

We report the emergence of Log-normal Superstatistics in the collective motion of ants confined in a quasi-2D arena and exposed to a panic-inducing stimulus. A data-driven superstatistical Langevin model accurately reproduces the transition from stationary behavior to an organized escape response, characterized by non-Gaussian velocity distributions and a stochastic diffusion coefficient. Our findings show that danger information propagates via a memory-limited, cascade-like mechanism, resulting in a stable cluster formation despite individual memory constraints. These results indicate that a slowly varying diffusivity arises from the multiplicative combination of interaction-mediated processes under confinement, leading naturally to Log-normal fluctuations. The persistence of this statistical structure under panic reveals a form of collective resilience, establishing a mechanistic bridge between Superstatistics and living active matter in confined environments.


[33] 2510.11642

Competing forces of polarization and adhesion generate directional migration bias in a minimal model

Left-right axis specification establishes embryonic laterality through asymmetric signaling cascades originating at the cellular scale. We previously reported the presence of a directionality bias in confined pairs of endothelial (and fibroblast) cells exhibiting persistent circular motion, with cytoskeletal contractility modulating the direction. The relative simplicity of the experimental setup makes it a perfect testing ground for the physical forces that could endow this system with a tunable directional migration bias. We model self-propelling biological cells migrating in response to confinement, polarity, and pairwise repulsive forces. Our framework reproduces three key experimental observations: spontaneous coherent circular movement of confined cell pairs, emergence of directional bias when cells have asymmetric properties, and contractility-modulated switching of the rotation direction. Two key assumptions are required: an internal torque arising from cytoskeletal organization (previously observed in other cellular systems), and an asymmetric polarity response between cells, which introduces a difference in how quickly each cell reorients its migration direction. New experiments on daughter cell pairs support this asymmetry requirement in cellular properties. Tuning the polarity response timescale (or strength) relative to centering forces from confinement and cell-cell adhesion can amplify or reverse the directional migration bias.


[34] 2510.16082

BIOGEN: Evidence-Grounded Multi-Agent Reasoning Framework for Transcriptomic Interpretation in Antimicrobial Resistance

Interpreting gene clusters from RNA-seq remains challenging, especially in antimicrobial resistance studies where mechanistic context is essential for hypothesis generation. Conventional enrichment methods summarize co-expressed modules using predefined categories, but often return sparse results and lack cluster-specific, literature-linked explanations. We present BIOGEN, an evidence-grounded multi-agent framework for post hoc interpretation of RNA-seq transcriptional modules that integrates biomedical retrieval, structured reasoning, and multi-critic verification. BIOGEN organizes evidence from PubMed and UniProt into traceable cluster-level interpretations with explicit support and confidence tiering. On a primary Salmonella enterica dataset, BIOGEN achieved strong evidence-grounding performance while reducing hallucination from 0.67 in an unconstrained LLM setting to 0.00 under retrieval-grounded configurations. Compared with KEGG/ORA and GO/ORA, BIOGEN recovered broader biological coverage, identifying substantially more biological themes per cluster. Across four additional bacterial RNA-seq datasets, BIOGEN maintained zero hallucination and consistently outperformed KEGG/ORA in cluster-level thematic coverage. These results position BIOGEN as an interpretive support framework that complements transcriptomic workflows through improved traceability, evidential transparency, and biological coverage.


[35] 2511.15839

Comparing Bayesian and Frequentist Inference in Biological Models: A Comparative Analysis of Accuracy, Uncertainty, and Identifiability

Mathematical models support inference and forecasting in ecology and epidemiology, but results depend on the estimation framework. We compare Bayesian and Frequentist approaches across three biological models using four datasets: Lotka-Volterra predator-prey dynamics (Hudson Bay), a generalized logistic model (lung injury and 2022 U.S. mpox), and an SEIUR epidemic model (COVID-19 in Spain). Both approaches use a normal error structure to ensure a fair comparison. We first assessed structural identifiability to determine which parameters can theoretically be recovered from the data. We then evaluated practical identifiability and forecasting performance using four metrics: mean absolute error (MAE), mean squared error (MSE), 95 percent prediction interval (PI) coverage, and weighted interval score (WIS). For the Lotka-Volterra model with both prey and predator data, we analyzed three scenarios: prey only, predator only, and both. The Frequentist workflow used QuantDiffForecast (QDF) in MATLAB, which fits ODE models via nonlinear least squares and quantifies uncertainty through parametric bootstrap. The Bayesian workflow used BayesianFitForecast (BFF), which employs Hamiltonian Monte Carlo sampling via Stan to generate posterior distributions and diagnostics such as the Gelman-Rubin R-hat statistic. Results show that Frequentist inference performs best when data are rich and fully observed, while Bayesian inference excels when latent-state uncertainty is high and data are sparse, as in the SEIUR COVID-19 model. Structural identifiability clarifies these patterns: full observability benefits both frameworks, while limited observability constrains parameter recovery. This comparison provides guidance for choosing inference frameworks based on data richness, observability, and uncertainty needs.
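The four forecasting metrics named in this abstract can be written out in a short sketch. The weighted interval score below follows the commonly used definition, in which the absolute error of the median gets weight 1/2 and each central $(1-\alpha)$ prediction interval gets weight $\alpha/2$; whether the two toolboxes use exactly this form is an assumption. The function names are hypothetical.

```python
def mae(y, pred):
    # Mean absolute error over paired observations and point forecasts.
    return sum(abs(a - b) for a, b in zip(y, pred)) / len(y)

def mse(y, pred):
    # Mean squared error over paired observations and point forecasts.
    return sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y)

def pi_coverage(y, lower, upper):
    # Fraction of observations that fall inside their prediction interval.
    hits = sum(1 for a, lo, hi in zip(y, lower, upper) if lo <= a <= hi)
    return hits / len(y)

def interval_score(y, lower, upper, alpha):
    # Interval width plus penalties for observations outside the interval.
    return ((upper - lower)
            + (2.0 / alpha) * max(lower - y, 0.0)
            + (2.0 / alpha) * max(y - upper, 0.0))

def wis(y, median, intervals):
    # intervals maps alpha -> (lower, upper) for the central (1 - alpha) PI.
    total = 0.5 * abs(y - median)
    for alpha, (lo, hi) in intervals.items():
        total += (alpha / 2.0) * interval_score(y, lo, hi, alpha)
    return total / (len(intervals) + 0.5)
```

For a single observation, `wis(10.0, 10.0, {0.5: (8.0, 12.0)})` penalizes only the interval width, while an observation outside the interval adds a penalty that grows with its distance from the nearest bound.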


[36] 2602.11478

Defining causal mechanism in dual process theory and two types of feedback control

Mental events are considered to supervene on physical events. A supervenient event does not change without a corresponding change in the underlying subvenient physical events. Since wholes and their parts exhibit the same supervenience-subvenience relations, inter-level causation has been expected to serve as a model for mental causation. We proposed an inter-level causation mechanism to construct a model of consciousness and an agent's self-determination. However, a significant gap exists between this mechanism and cognitive functions. Here, we demonstrate how to integrate the inter-level causation mechanism with the widely known dual-process theories. We assume that the supervenience level is composed of multiple supervenient functions (i.e., neural networks), and we argue that inter-level causation can be achieved by controlling the feedback error defined through changing algebraic expressions combining these functions. Using inter-level causation allows for a dual laws model in which each level possesses its own distinct dynamics. In this framework, the feedback error is determined independently by two processes: (1) the selection of equations combining supervenient functions, and (2) the negative feedback error reduction to satisfy the equations through adjustments of neurons and synapses. We interpret these two independent feedback controls as Type 1 and Type 2 processes in the dual process theories. As a result, theories of consciousness, agency, and dual process theory are unified into a single framework, and the characteristic features of Type 1 and Type 2 processes are naturally derived.


[37] 2602.17265

Spatio-temporal air flow properties in a 3D personalised model of the human lung

We propose a multi-scale lung model to investigate spatio-temporal distributions of ventilation variables. Lung envelope and large airway geometries are derived from CT scans; smaller airways are generated using a physiologically consistent algorithm. Tissue mechanics is modeled using nonlinear elasticity under small deformations, coupled with local air pressure from fluid dynamics within the bronchial tree. Airflow accounts for inertia and static airway compliance. Simulations employ finite elements. Using this model, we explore spatio-temporal airflows and shear stress distributions.


[38] 2603.02627

Topological bounds on the dynamical growth rate of chemical reaction networks

Growth and decay are system-level properties of chemical reaction networks (CRNs) relevant from prebiotic chemistry to cellular metabolism. Their properties are typically analyzed through the kinetics of particular models, which requires specification of the full set of kinetic laws and parameters. In this work, we derive stoichiometry-based constraints on the growth (or shrinkage) rate in the balanced-growth regime of scalable CRNs. The resulting bounds are controlled by a topological quantity, the maximum amplification factor, defined via a von Neumann max-min problem over feasible fluxes, as illustrated by numerical tests on random-network ensembles of CRNs. We argue for the relevance of our results both in the context of origin-of-life studies and for designing synthetic chemical reaction networks.


[39] 2603.22498

Modelling SARS-CoV-2 epidemics via compartmental and cellular automaton SEIRS model with temporal immunity and vaccination

We consider an SEIRS epidemiological model that incorporates the following features of the COVID-19 outbreak: an abundance of unidentified infected individuals, a limited duration of immunity, and the possibility of vaccination. The pandemic dynamics can be controlled by restricting the transmission rate, by increasing the identification and isolation rate of infected individuals, and by vaccination. For the compartmental version of this model, we find stable disease-free and endemic stationary states. The basic reproductive number is analysed with respect to balancing quarantine and vaccination measures. The positions and heights of the first outbreak peak are obtained numerically and fitted to easy-to-use algebraic forms. A lattice-based realization of this model is studied by means of an asynchronous cellular automaton algorithm, which makes it possible to study the effect of social distancing by varying the neighbourhood size of the model. An attempt is made to match the quarantine and vaccination effects.


[40] 2410.11328

Crossed laser phase plates for transmission electron microscopy

For decades since the development of phase-contrast optical microscopy, an analogous approach has been sought for maximizing the image contrast of weakly-scattering objects in transmission electron microscopy (TEM). The recent development of the laser phase plate (LPP) has demonstrated that an amplified, focused laser standing wave provides stable, tunable phase shift to the high-energy electron beam, achieving phase-contrast TEM. Building on proof-of-concept experimental demonstrations, this paper explores design improvements tailored to biological imaging. In particular, we introduce the approach of crossed laser phase plates (XLPP): two laser standing waves intersecting in the diffraction plane of the TEM, rather than a single beam as in the current LPP. We provide a theoretical model for the XLPP inside the microscope and use simulations to quantify its effect on image formation. Using simulations, we find that the XLPP increases information transfer at low spatial frequencies while also suppressing the ghost images formed by Kapitza-Dirac diffraction of the electron beam by the laser beam. We also present a simple acquisition scheme, enabled by the XLPP, which dramatically suppresses unwanted diffraction effects. Finally, we discuss important practical considerations of XLPP design and show experimental results from a prototype. The results of this study chart the course for future developments of LPP hardware.


[41] 2502.07297

MM-DADM: Multimodal Drug-Aware Diffusion Model for Virtual Clinical Trials

High failure rates in cardiac drug development necessitate virtual clinical trials via electrocardiogram (ECG) generation to reduce risks and costs. However, existing ECG generation models struggle to balance morphological realism with pathological flexibility, fail to disentangle demographics from genuine drug effects, and are severely bottlenecked by early-phase data scarcity. To overcome these hurdles, we propose the Multimodal Drug-Aware Diffusion Model (MM-DADM), the first generative framework for generating individualized drug-induced ECGs. Specifically, our proposed MM-DADM integrates a Dynamic Cross-Attention (DCA) module that adaptively fuses External Physical Knowledge (EPK) to preserve morphological realism while avoiding the suppression of complex pathological nuances. To resolve feature entanglement, a Causal Feature Encoder (CFE) actively filters out demographic noise to extract pure pharmacological representations. These representations subsequently guide a Causal-Disentangled ControlNet (CDC-Net), which leverages counterfactual data augmentation to explicitly learn intrinsic pharmacological mechanisms despite limited clinical data. Extensive experiments on $9,443$ ECGs across $8$ drug regimens demonstrate that MM-DADM outperforms $10$ state-of-the-art ECG generation models, improving simulation accuracy by at least $6.13\%$ and recall by $5.89\%$, while providing highly effective data augmentation for downstream classification tasks.


[42] 2508.01277

Foundation Models for Bioacoustics -- a Comparative Review

Automated bioacoustic analysis is essential for biodiversity monitoring and conservation, requiring advanced deep learning models that can adapt to diverse bioacoustic tasks. This article presents a comprehensive review of large-scale pretrained bioacoustic foundation models and systematically investigates their transferability across multiple bioacoustic classification tasks. We overview bioacoustic representation learning by analysing pretraining data sources and benchmarks. On this basis, we review bioacoustic foundation models, dissecting the models' training data, preprocessing, augmentations, architecture, and training paradigm. Additionally, we conduct an extensive empirical study of selected models on the BEANS and BirdSet benchmarks, evaluating generalisability under linear and attentive probing. Our experimental analysis reveals that Perch~2.0 achieves the highest BirdSet score (restricted evaluation) and the strongest linear probing result on BEANS, building on diverse multi-taxa supervised pretraining; that BirdMAE is the best model among probing-based strategies on BirdSet and second on BEANS after BEATs$_{NLM}$, the encoder of NatureLM-audio; that attentive probing is beneficial to extract the full performance of transformer-based models; and that general-purpose audio models trained with self-supervised learning on AudioSet outperform many specialised bird sound models on BEANS when evaluated with attentive probing. These findings provide valuable guidance for practitioners selecting appropriate models to adapt them to new bioacoustic classification tasks via probing.


[43] 2508.15603

Nonequilibrium protein complexes as molecular automata

Biology stores information and computes at the molecular scale, yet the ways in which it does so are often distinct from human-engineered computers. Mapping biological computation onto architectures familiar to computer science remains an outstanding challenge. Here, inspired by Crick's proposal for molecular memory, we analyse a thermodynamically-consistent model of a protein complex subject to driven, nonequilibrium enzymatic reactions. In the strongly driven limit, we find that the system maps onto a stochastic, asynchronous variant of cellular automata, where each rule corresponds to a different set of enzymes being present. We find a broad class of phenomena in these 'molecular automata' that can be exploited for molecular computation, including error-tolerant memory via multistable attractors, and long transients that can be used as molecular stopwatches. By systematically enumerating all possible dynamical rules, we identify those that allow molecular automata to implement simple computational architectures such as finite-state machines. Overall, our results provide a framework for engineering synthetic molecular automata, and offer a route to building protein-based computation in living cells.


[44] 2511.09588

Diffusion-Based Quality Control of Medical Image Segmentations across Organs

Medical image segmentation using deep learning (DL) has enabled the development of automated analysis pipelines for large-scale population studies. However, state-of-the-art DL methods are prone to hallucinations, which can result in anatomically implausible segmentations. With manual correction impractical at scale, automated quality control (QC) techniques have to address the challenge. While promising, existing QC methods are organ-specific, limiting their generalizability and usability beyond their original intended task. To overcome this limitation, we propose no-new Quality Control (nnQC), a robust QC framework based on a diffusion-generative paradigm that self-adapts to any input organ dataset. Central to nnQC is a novel Team of Experts (ToE) architecture, where two specialized experts independently encode 3D spatial awareness, represented by the relative spatial position of an axial slice, and anatomical information derived from visual features from the original image. A weighted conditional module dynamically combines the pair of independent embeddings, or opinions, to condition the sampling mechanism within a diffusion process, enabling the generation of a spatially aware pseudo-ground truth for predicting QC scores. Within its framework, nnQC integrates fingerprint adaptation to ensure adaptability across organs, datasets, and imaging modalities. We evaluated nnQC on seven organs using twelve publicly available datasets. Our results demonstrate that nnQC consistently outperforms state-of-the-art methods across all experiments, including cases where segmentation masks are highly degraded or completely missing, confirming its versatility and effectiveness across different organs.


[45] 2511.12931

cryoSENSE: Compressive Sensing Enables High-throughput Microscopy with Sparse and Generative Priors on the Protein Cryo-EM Image Manifold

Cryo-electron microscopy (cryo-EM) enables the atomic-resolution visualization of biomolecules; however, modern direct detectors generate data volumes that far exceed the available storage and transfer bandwidth, thereby constraining practical throughput. We introduce cryoSENSE, the computational realization of a hardware-software co-designed framework for compressive cryo-EM sensing and acquisition. We show that cryo-EM images of proteins lie on low-dimensional manifolds that can be independently represented using sparse priors in predefined bases and generative priors captured by a denoising diffusion model. cryoSENSE leverages these low-dimensional manifolds to enable faithful image reconstruction from spatial and Fourier-domain undersampled measurements while preserving downstream structural resolution. In experiments, cryoSENSE increases acquisition throughput by up to 2.5$\times$ while retaining the original 3D resolution, offering controllable trade-offs between the number of masked measurements and the level of downsampling. Sparse priors favor faithful reconstruction from Fourier-domain measurements and moderate compression, whereas generative diffusion priors achieve accurate recovery from pixel-domain measurements and more severe undersampling. Project website: this https URL.
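
As a rough illustration of reconstruction under a sparse prior, the sketch below recovers a sparse vector from fewer measurements than unknowns using ISTA (iterative soft-thresholding). This is a generic textbook technique swapped in for illustration, not the cryoSENSE pipeline: the random Gaussian measurement matrix, the problem sizes, the regularisation weight `lam`, and all function names are assumptions.

```python
import numpy as np


def soft_threshold(v, t):
    """Shrink each entry of v towards zero by t (proximal map of t * L1 norm)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def objective(A, y, x, lam):
    """Lasso objective: 0.5 * ||A x - y||^2 + lam * ||x||_1."""
    r = A @ x - y
    return 0.5 * float(r @ r) + lam * float(np.abs(x).sum())


def ista(A, y, lam=0.01, n_iter=500):
    """Minimise the lasso objective by iterative soft-thresholding."""
    # Squared spectral norm of A is the Lipschitz constant of the gradient.
    lipschitz = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - grad / lipschitz, lam / lipschitz)
    return x


# Toy problem (all sizes are illustrative): 128 unknowns, 48 measurements,
# 4 non-zero entries in the signal to be recovered.
rng = np.random.default_rng(0)
n, m, k = 128, 48, 4
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)
A = rng.normal(size=(m, n)) / np.sqrt(m)
y = A @ x_true
x_hat = ista(A, y)
```

The measurement count `m` plays the role of the compression level: lowering it trades acquisition cost against reconstruction quality, which is the kind of trade-off the abstract describes.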