New articles on Quantitative Biology


[1] 2602.10152

Validating Interpretability in siRNA Efficacy Prediction: A Perturbation-Based, Dataset-Aware Protocol

Saliency maps are increasingly used as \emph{design guidance} in siRNA efficacy prediction, yet attribution methods are rarely validated before motivating sequence edits. We introduce a \textbf{pre-synthesis gate}: a protocol for \emph{counterfactual sensitivity faithfulness} that tests whether mutating high-saliency positions changes model output more than composition-matched controls. Cross-dataset transfer reveals two failure modes that would otherwise go undetected: \emph{faithful-but-wrong} (saliency valid, predictions fail) and \emph{inverted saliency} (top-saliency edits less impactful than random). Strikingly, models trained on mRNA-level assays collapse on a luciferase reporter dataset, demonstrating that protocol shifts can silently invalidate deployment. Across four benchmarks, 19/20 fold instances pass; the single failure shows inverted saliency. A biology-informed regularizer (BioPrior) strengthens saliency faithfulness with modest, dataset-dependent predictive trade-offs. Our results establish saliency validation as essential pre-deployment practice for explanation-guided therapeutic design. Code is available at this https URL.
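The protocol's core statistic can be sketched in a few lines. The snippet below is a hypothetical illustration, not the paper's predictor or attribution method: `toy_model`, the occlusion-style saliency, and the 19-nt length are stand-ins, and the controls here are plain random position sets rather than the composition-matched controls the paper uses.

```python
import numpy as np

rng = np.random.default_rng(0)
BASES = np.array(list("ACGU"))

def toy_model(seq):
    # Hypothetical stand-in for an siRNA efficacy predictor: scores GC
    # content in the seed region (positions 1-7 of the guide strand).
    return float(np.mean([b in "GC" for b in seq[1:8]]))

def saliency(seq):
    # Occlusion-style saliency: mean |output change| over the three
    # alternative bases at each position.
    base = toy_model(seq)
    sal = np.zeros(len(seq))
    for i in range(len(seq)):
        deltas = []
        for b in BASES:
            if b != seq[i]:
                mutant = seq.copy()
                mutant[i] = b
                deltas.append(abs(toy_model(mutant) - base))
        sal[i] = np.mean(deltas)
    return sal

def sensitivity_gap(seq, k=3, n_trials=200):
    # Counterfactual sensitivity statistic: |output shift| from mutating
    # the top-k saliency positions minus the mean shift from random
    # k-position controls. A positive gap suggests faithful saliency;
    # a negative gap is the "inverted saliency" failure mode.
    base = toy_model(seq)
    top = np.argsort(saliency(seq))[-k:]

    def mutate(positions):
        mutant = seq.copy()
        for i in positions:
            mutant[i] = rng.choice([b for b in BASES if b != seq[i]])
        return abs(toy_model(mutant) - base)

    top_effect = np.mean([mutate(top) for _ in range(n_trials)])
    ctrl_effect = np.mean([mutate(rng.choice(len(seq), k, replace=False))
                           for _ in range(n_trials)])
    return top_effect - ctrl_effect

seq = rng.choice(BASES, size=19)  # a random 19-nt guide sequence
gap = sensitivity_gap(seq)
```

Because the toy model only reads the seed region, saliency is exactly zero elsewhere and the gap comes out positive, i.e. the toy saliency passes the gate.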


[2] 2602.10156

STRAND: Sequence-Conditioned Transport for Single-Cell Perturbations

Predicting how genetic perturbations change cellular state is a core problem for building controllable models of gene regulation. Perturbations targeting the same gene can produce different transcriptional responses depending on their genomic locus, including different transcription start sites and regulatory elements. Gene-level perturbation models collapse these distinct interventions into the same representation. We introduce STRAND, a generative model that predicts single-cell transcriptional responses by conditioning on regulatory DNA sequence. STRAND represents a perturbation by encoding the sequence at its genomic locus and uses this representation to parameterize a conditional transport process from control to perturbed cell states. Representing perturbations by sequence, rather than by a fixed set of gene identifiers, supports zero-shot inference at loci not seen during training and expands inference-time genomic coverage from ~1.5% for gene-level single-cell foundation models to ~95% of the genome. We evaluate STRAND on CRISPR perturbation datasets in K562, Jurkat, and RPE1 cells. STRAND improves discrimination scores by up to 33% in low-sample regimes, achieves the best average rank on unseen gene perturbation benchmarks, and improves transfer to novel cell lines by up to 0.14 in Pearson correlation. Ablations isolate the gains to sequence conditioning and transport, and case studies show that STRAND resolves functionally alternative transcription start sites missed by gene-level models.


[3] 2602.10163

Beyond SMILES: Evaluating Agentic Systems for Drug Discovery

Agentic systems for drug discovery have demonstrated autonomous synthesis planning, literature mining, and molecular design. We ask how well they generalize. Evaluating six frameworks against 15 task classes drawn from peptide therapeutics, in vivo pharmacology, and resource-constrained settings, we find five capability gaps: no support for protein language models or peptide-specific prediction, no bridges between in vivo and in silico data, reliance on LLM inference with no pathway to ML training or reinforcement learning, assumptions tied to large-pharma resources, and single-objective optimization that ignores safety-efficacy-stability trade-offs. A paired knowledge-probing experiment suggests the bottleneck is architectural rather than epistemic: four frontier LLMs reason about peptides at levels comparable to small molecules, yet no framework exposes this capability. We propose design requirements and a capability matrix for next-generation frameworks that function as computational partners under realistic constraints.


[4] 2602.10168

EVA: Towards a universal model of the immune system

The effective application of foundation models to translational research in immune-mediated diseases requires multimodal patient-level representations that can capture complex phenotypes emerging from multicellular interactions. Yet most current biological foundation models operate only at single-cell resolution and are evaluated on technical metrics often disconnected from actual drug development tasks and challenges. Here, we introduce EVA, the first cross-species, multimodal foundation model of immunology and inflammation, a therapeutic area where shared pathogenic mechanisms create unique opportunities for transfer learning. EVA harmonizes transcriptomics data across species, platforms, and resolutions, and integrates histology data to produce rich, unified patient representations. We establish clear scaling laws, demonstrating that increasing model size and compute translates to improvements in both pretraining and downstream task performance. We introduce a comprehensive evaluation suite of 39 tasks spanning the drug development pipeline: zero-shot target efficacy and gene function prediction for discovery, cross-species and cross-disease molecular perturbation prediction for preclinical development, and patient stratification with treatment response or disease activity prediction for clinical trial applications. We benchmark EVA against several state-of-the-art biological foundation models and baselines on these tasks, and demonstrate state-of-the-art results in each task category. Using mechanistic interpretability, we further identify biologically meaningful features, revealing intertwined representations across species and technologies. We release an open version of EVA for transcriptomics to accelerate research on immune-mediated diseases.


[5] 2602.10361

ENIGMA: EEG-to-Image in 15 Minutes Using Less Than 1% of the Parameters

To be practical for real-life applications, models for brain-computer interfaces must be easily and quickly deployable on new subjects, effective on affordable scanning hardware, and small enough to run locally on accessible computing resources. To directly address these current limitations, we introduce ENIGMA, a multi-subject electroencephalography (EEG)-to-Image decoding model that reconstructs seen images from EEG recordings and achieves state-of-the-art (SOTA) performance on the research-grade THINGS-EEG2 and consumer-grade AllJoined-1.6M benchmarks, while fine-tuning effectively on new subjects with as little as 15 minutes of data. ENIGMA boasts a simpler architecture and requires less than 1% of the trainable parameters necessary for previous approaches. Our approach integrates a subject-unified spatio-temporal backbone along with a set of multi-subject latent alignment layers and an MLP projector to map raw EEG signals to a rich visual latent space. We evaluate our approach using a broad suite of image reconstruction metrics that have been standardized in the adjacent field of fMRI-to-Image research, and we describe the first EEG-to-Image study to conduct extensive behavioral evaluations of our reconstructions using human raters. Our simple and robust architecture provides a significant performance boost across both research-grade and consumer-grade EEG hardware, and a substantial improvement in fine-tuning efficiency and inference cost. Finally, we provide extensive ablations to determine the architectural choices most responsible for our performance gains in both single and multi-subject cases across multiple benchmark datasets. Collectively, our work provides a substantial step towards the development of practical brain-computer interface applications.


[6] 2602.10644

Towards Universal Spatial Transcriptomics Super-Resolution: A Generalist Physically Consistent Flow Matching Framework

Spatial transcriptomics provides an unprecedented perspective for deciphering tissue spatial heterogeneity. However, high-resolution spatial transcriptomic technology remains constrained by limited gene coverage, technical complexity, and high cost. Existing methods for super-resolving spatial transcriptomics from low-resolution data suffer from two fundamental limitations: poor out-of-distribution generalization stemming from neglect of inherent biological heterogeneity, and a lack of physical consistency. To address these challenges, we propose SRast, a novel physically constrained generalist framework for robust spatial transcriptomics super-resolution. To tackle heterogeneity, SRast employs a strategic decoupling architecture that explicitly separates gene semantic representation from spatial geometry deconvolution, using self-supervised learning to align latent distributions and mitigate cross-sample shifts. Regarding physical priors, SRast reformulates the task as ratio prediction on the simplex, using a flow matching model to learn optimal transport-based geometric transformations that strictly enforce local mass conservation. Extensive experiments across diverse species, tissues, and platforms demonstrate that SRast achieves state-of-the-art performance, exhibiting superior zero-shot generalization and ensuring physical consistency in recovering fine-grained biological structures.


[7] 2602.11054

A Dynamical Microscope for Multivariate Oscillatory Signals: Validating Regime Recovery on Shared Manifolds

Multivariate oscillatory signals from complex systems often exhibit non-stationary dynamics and metastable regime structure, making dynamical interpretation challenging. We introduce a ``dynamical microscope'' framework that converts multichannel signals into circular phase--amplitude features, learns a data-driven latent trajectory representation with an autoencoder, and quantifies dynamical regimes through trajectory geometry and flow field metrics. Using a coupled Stuart--Landau oscillator network with topology-switching as ground-truth validation, we demonstrate that the framework recovers differences in dynamical laws even when regimes occupy overlapping regions of state space. Group differences can be expressed as changes in latent trajectory speed, path geometry, and flow organization on a shared manifold, rather than requiring discrete state separation. Speed and explored variance show strong regime discriminability ($\eta^2 > 0.5$), while some metrics (e.g., tortuosity) capture trajectory geometry orthogonal to topology contrasts. The framework provides a principled approach for analyzing regime structure in multivariate time series from neural, physiological, or physical systems.
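A minimal sketch of the front end of such a pipeline, under stated assumptions: the circular phase-amplitude features are taken from the analytic signal (Hilbert transform), the autoencoder stage is omitted, and trajectory speed is computed directly in feature space on a synthetic two-regime signal (the signal, parameters, and regime construction are illustrative, not the paper's Stuart-Landau benchmark).

```python
import numpy as np
from scipy.signal import hilbert

rng = np.random.default_rng(1)

# Synthetic two-channel oscillatory signal with a regime switch at t = 4 s:
# the oscillation frequency jumps from 10 Hz to 18 Hz, a crude stand-in
# for a change in the underlying dynamical law.
fs = 250
t = np.arange(0, 8, 1 / fs)
freq = np.where(t < 4, 10.0, 18.0)
phi = 2 * np.pi * np.cumsum(freq) / fs
x = np.stack([np.sin(phi), np.sin(phi + 0.5)]) + 0.05 * rng.standard_normal((2, t.size))

# Circular phase-amplitude features from the analytic signal.
analytic = hilbert(x, axis=1)
phase, amp = np.angle(analytic), np.abs(analytic)
features = np.concatenate([np.cos(phase), np.sin(phase), amp]).T  # (time, 6)

# A simple flow-field metric: trajectory speed in feature space.
# Faster phase rotation (the 18 Hz regime) yields larger steps, so the
# two regimes separate by speed even without discrete state separation.
speed = np.linalg.norm(np.diff(features, axis=0), axis=1)
speed_slow = speed[t[1:] < 4].mean()
speed_fast = speed[t[1:] >= 4].mean()
```

This mirrors the abstract's point that regime differences can show up as trajectory speed on a shared manifold rather than as disjoint clusters.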


[8] 2602.10242

Whodunnit? The case of midge swarms

As collective states of animal groups go, swarms of midge insects pose a number of puzzling questions. Their ordering polarization parameter is quite small, and the insects are weakly coupled among themselves but strongly coupled to the swarm. In laboratory studies (free of external perturbations) the correlation length is small, whereas in field studies midge swarms exhibit strong correlations, scale-free behavior, and power laws for correlation length, susceptibility, and correlation time. Data for the dynamic correlation function versus time collapse to a single curve only for small values of time scaled with the correlation time. Is there a theory that explains these disparate observations? Among the existing theories, whodunnit? Here we review and discuss several models proposed in the literature and extend our own, the harmonically confined Vicsek model, to anisotropic confinement. Numerical simulations of the latter produce elongated swarm shapes and values of the static critical exponents between those of the two-dimensional and isotropic three-dimensional models. The new values agree better with those measured in natural swarms.
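A numerical sketch of a harmonically confined Vicsek model with anisotropic confinement. The update rule, neighborhood radius, and parameter values below are illustrative choices, not the authors' exact scheme: each particle aligns with neighbors, feels a harmonic pull toward the origin that is stronger along one axis, and receives rotational noise.

```python
import numpy as np

rng = np.random.default_rng(2)

def step(x, v, beta, eta, r0=1.0, v0=0.5):
    # One update: align with the mean velocity of neighbors within r0,
    # add a harmonic confinement term -beta * x (componentwise, so beta
    # can be anisotropic), perturb the heading by noise of strength eta,
    # then move at fixed speed v0.
    d = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
    neigh = d < r0                       # includes self, so never empty
    v_new = np.empty_like(v)
    for i in range(len(x)):
        target = v[neigh[i]].mean(axis=0) - beta * x[i]
        ang = np.arctan2(target[1], target[0]) + eta * rng.uniform(-np.pi, np.pi)
        v_new[i] = v0 * np.array([np.cos(ang), np.sin(ang)])
    return x + v_new, v_new

n = 100
x = rng.normal(0.0, 1.0, size=(n, 2))
th = rng.uniform(0, 2 * np.pi, n)
v = 0.5 * np.stack([np.cos(th), np.sin(th)], axis=1)

beta = np.array([0.05, 0.5])  # anisotropic confinement: weak in x, strong in y
for _ in range(600):
    x, v = step(x, v, beta, eta=0.3)
spread = x.std(axis=0)  # swarm elongates along the weakly confined axis
```

The anisotropy in `beta` is what produces the elongated swarm shapes the abstract describes.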


[9] 2602.10303

ICODEN: Ordinary Differential Equation Neural Networks for Interval-Censored Data

Predicting time-to-event outcomes when event times are interval censored is challenging because the exact event time is unobserved. Many existing survival analysis approaches for interval-censored data rely on strong model assumptions or cannot handle high-dimensional predictors. We develop ICODEN, an ordinary differential equation-based neural network for interval-censored data that models the hazard function through deep neural networks and obtains the cumulative hazard by solving an ordinary differential equation. ICODEN does not require the proportional hazards assumption or a prespecified parametric form for the hazard function, thereby permitting flexible survival modeling. Across simulation settings with proportional or non-proportional hazards and both linear and nonlinear covariate effects, ICODEN consistently achieves satisfactory predictive accuracy and remains stable as the number of predictors increases. Applications to data from multiple phases of the Alzheimer's Disease Neuroimaging Initiative (ADNI) and to two Age-Related Eye Disease Studies (AREDS and AREDS2) for age-related macular degeneration (AMD) demonstrate ICODEN's robust prediction performance. In both applications, predicting time-to-AD or time-to-late AMD, ICODEN effectively uses hundreds to more than 1,000 SNPs and supports data-driven subgroup identification with differential progression risk profiles. These results establish ICODEN as a practical assumption-lean tool for prediction with interval-censored survival data in high-dimensional biomedical settings.
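The modeling idea reduces to three ingredients: a flexible positive hazard, a cumulative hazard obtained by integrating the ODE dH/dt = h(t), and an interval-censored likelihood built from S(L) - S(R) with S(t) = exp(-H(t)). A minimal sketch, with a log-linear toy hazard and trapezoidal integration standing in for ICODEN's deep network and ODE solver:

```python
import numpy as np

rng = np.random.default_rng(3)

def hazard(t, x, w):
    # Stand-in for a neural hazard: any positive function of time and
    # covariates works here (ICODEN uses a deep network).
    return np.exp(w @ x) * (0.5 + 0.1 * t)

def cum_hazard(t, x, w, n_grid=200):
    # Cumulative hazard H(t) = integral of h(s) ds on [0, t], i.e. the
    # solution of dH/ds = h(s); trapezoidal quadrature stands in for an
    # adaptive ODE solver.
    s = np.linspace(0.0, t, n_grid)
    h = hazard(s, x, w)
    return float(np.sum((h[1:] + h[:-1]) * np.diff(s)) / 2.0)

def interval_censored_nll(intervals, X, w):
    # Interval censoring: the event time is only known to lie in (L, R].
    # Each subject contributes S(L) - S(R); right censoring is encoded
    # as R = inf, for which S(R) = 0.
    nll = 0.0
    for (L, R), x in zip(intervals, X):
        sL = np.exp(-cum_hazard(L, x, w))
        sR = np.exp(-cum_hazard(R, x, w)) if np.isfinite(R) else 0.0
        nll -= np.log(sL - sR + 1e-12)
    return nll

X = rng.normal(size=(20, 3))
intervals = [(1.0, 2.0)] * 10 + [(2.0, np.inf)] * 10
w = np.array([0.5, -0.3, 0.1])
nll = interval_censored_nll(intervals, X, w)
```

In the full model, minimizing this negative log-likelihood trains the hazard network; no proportional-hazards or parametric-form assumption enters anywhere.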


[10] 2602.10375

Morphological instability of an invasive active-passive interface

Morphological instabilities of growing tissues that impinge on passive materials are typical of invasive cancers. To explain these instabilities in experiments on breast epithelial spheroids in an extracellular matrix, we develop a continuum phase field model of a growing active liquid expanding into a passive viscoelastic matrix. Linear stability analysis of the sharp-interface limit of the governing equations predicts that the tissue interface can develop long-wavelength instabilities, but that these are suppressed when the active carcinoid is embedded in an elastic matrix. We develop a theoretical morphological phase diagram and complement it with two-dimensional finite element (FEM) phase-field simulations that track the nonlinear evolution of the interface, with results consistent with theoretical predictions and experimental observations. Our study provides a basis for understanding the emergence of interfacial instabilities in active-passive systems, with the potential to control them.


[11] 2602.10856

Fragile $\mathit{vs}$ robust Multiple Equilibria phases in generalized Lotka-Volterra model with non-reciprocal interactions

We investigate the Multiple Equilibria (ME) phase of generalized Lotka-Volterra dynamics with random, non-reciprocal interactions. We compute the topological complexity of equilibria, which quantifies how rapidly the number of equilibria of the dynamical equations grows with the total number of species. We perform the calculation for arbitrary degree of non-reciprocity in the interactions, distinguishing between configurations that are dynamically stable to invasions by species absent from the equilibrium, and those that are not. We characterize the properties of typical (i.e., most numerous) equilibria at a given diversity, including their average abundance, mutual similarity, and internal stability. This analysis reveals the existence of two distinct ME phases, which differ in how internally stable equilibria behave under invasions by absent species. We discuss the implications of this finding for the system's dynamical behavior.
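A minimal sketch of the model class under study: generalized Lotka-Volterra dynamics with a random interaction matrix whose degree of reciprocity is tuned by a correlation parameter gamma between alpha_ij and alpha_ji. Parameter values and the forward-Euler integrator are illustrative choices, not the paper's analytical setup.

```python
import numpy as np

rng = np.random.default_rng(4)

def random_interactions(S, mu=0.2, sigma=0.1, gamma=0.0):
    # Random interactions with tunable reciprocity: gamma is the
    # correlation between alpha_ij and alpha_ji (gamma = 1: symmetric,
    # gamma = 0: fully non-reciprocal, gamma = -1: antisymmetric).
    a, b = rng.normal(size=(S, S)), rng.normal(size=(S, S))
    sym = (a + a.T) / np.sqrt(2)          # unit-variance symmetric part
    asym = (b - b.T) / np.sqrt(2)         # unit-variance antisymmetric part
    alpha = mu / S + (sigma / np.sqrt(S)) * (
        np.sqrt((1 + gamma) / 2) * sym + np.sqrt((1 - gamma) / 2) * asym)
    np.fill_diagonal(alpha, 0.0)
    return alpha

def glv_rhs(N, alpha, r=1.0, K=1.0):
    # Generalized Lotka-Volterra: dN_i/dt = r N_i (1 - N_i/K - (alpha N)_i)
    return r * N * (1.0 - N / K - alpha @ N)

S = 30
alpha = random_interactions(S, gamma=0.0)
N = rng.uniform(0.1, 1.0, S)
dt = 0.01
for _ in range(20_000):                   # forward Euler integration
    N = np.clip(N + dt * glv_rhs(N, alpha), 0.0, None)
survivors = int((N > 1e-6).sum())
```

Counting and classifying the equilibria of exactly this kind of system, as a function of gamma, is what the topological-complexity calculation in the paper does analytically.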


[12] 2409.17038

Omnibenchmark: transparent, reproducible, extensible and standardized orchestration of solo and collaborative benchmarks

Benchmarking involves designing, running and disseminating rigorous performance assessments of methods, most often for data analysis and software tools, but the process can also be applied to experimental systems. Ideally, a benchmarking system facilitates the benchmarking process by providing a structured entrypoint to design, coordinate, execute, and store standardized benchmarks. We describe a novel benchmarking system, Omnibenchmark, that facilitates benchmark formalization and execution in both solo and community efforts. Omnibenchmark provides a flexible benchmark plan syntax (i.e., a configuration YAML file), dynamic workflow generation based on Snakemake, S3-compatible storage handling, and reproducible software environments using environment modules, Apptainer or Conda. This setup provides unprecedented flexibility: existing benchmark designs can be forked and extended, and run separately or collaboratively, yielding versioned, standardized result outputs and thus much-needed transparency in the analysis and interpretation of benchmark results. Tutorials and installation instructions are available from this https URL.
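The abstract's ingredients (a YAML benchmark plan, staged modules, pluggable software backends) can be pictured with a sketch. The field names below are hypothetical illustrations only, not Omnibenchmark's actual schema, and the repository URLs are placeholders; the linked tutorials document the real syntax.

```yaml
# Hypothetical benchmark plan -- field names are illustrative only,
# not the actual Omnibenchmark YAML schema.
id: clustering_benchmark
version: "1.0"
software_backend: conda            # or an environment-module / Apptainer backend
stages:
  - id: data
    modules:
      - id: dataset_a
        repository: https://example.org/datasets/a    # placeholder URL
  - id: methods
    modules:
      - id: method_x
        repository: https://example.org/methods/x     # placeholder URL
  - id: metrics
    modules:
      - id: ari
        repository: https://example.org/metrics/ari   # placeholder URL
```

The point of such a plan is that forking the file and adding a module under `methods` is all a contributor needs to do to extend the benchmark.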


[13] 2512.21320

An Allele-Centric Pan-Graph-Matrix Representation for Scalable Pangenome Analysis

Population-scale pangenome analysis increasingly requires representations that unify single-nucleotide and structural variation while remaining scalable across large cohorts. Existing formats are typically sequence-centric, path-centric, or sample-centric, and often obscure population structure or fail to exploit carrier sparsity. We introduce the H1 pan-graph-matrix, an allele-centric representation that encodes exact haplotype membership using adaptive per-allele compression. By treating alleles as first-class objects and selecting optimal encodings based on carrier distribution, H1 achieves near-optimal storage across both common and rare variants. We further introduce H2, a path-centric dual representation derived from the same underlying allele-haplotype incidence information that restores explicit haplotype ordering while remaining exactly equivalent in information content. Using real human genome data, we show that this representation yields substantial compression gains, particularly for structural variants, while remaining equivalent in information content to pangenome graphs. H1 provides a unified, population-aware foundation for scalable pangenome analysis and downstream applications such as rare-variant interpretation and drug discovery.
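The carrier-sparsity idea behind adaptive per-allele compression can be sketched directly. This is a toy encoding, not the H1 format: each allele's carrier set is stored in whichever of two forms is smaller, a sparse carrier-ID list (rare alleles) or a bitmap (common alleles).

```python
import numpy as np

def encode_allele(carriers, n_haplotypes):
    # Adaptive per-allele encoding: pick the smaller of a sparse list of
    # 32-bit haplotype IDs and a one-bit-per-haplotype bitmap.
    carriers = np.asarray(sorted(carriers))
    sparse_bytes = 4 * len(carriers)
    bitmap_bytes = (n_haplotypes + 7) // 8
    if sparse_bytes <= bitmap_bytes:
        return ("sparse", carriers)
    bitmap = np.zeros(n_haplotypes, dtype=bool)
    bitmap[carriers] = True
    return ("bitmap", np.packbits(bitmap))

def decode_allele(encoded, n_haplotypes):
    # Both encodings recover the exact carrier set: the representation is
    # lossless, only the storage form adapts.
    kind, payload = encoded
    if kind == "sparse":
        return set(payload.tolist())
    bits = np.unpackbits(payload)[:n_haplotypes]
    return set(np.flatnonzero(bits).tolist())

n_hap = 10_000
rare = {17, 4242}                    # rare allele: 2 carriers
common = set(range(0, n_hap, 2))     # common allele: 5,000 carriers
enc_rare = encode_allele(rare, n_hap)
enc_common = encode_allele(common, n_hap)
```

Treating alleles (rather than samples or paths) as the unit of encoding is what lets the choice be made per allele, which is where the near-optimal storage across both common and rare variants comes from.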


[14] 2602.09649

Population-scale Ancestral Recombination Graphs with tskit 1.0

Ancestral recombination graphs (ARGs) are an increasingly important component of population and statistical genetics. The tskit library has become key infrastructure for the field, providing an expressive and general representation of ARGs together with a suite of efficient fundamental operations. In this note, we announce tskit version 1.0, describe its underlying rationale, and document its stability guarantees. These guarantees provide a foundation for durable computational artefacts and support long-term reproducibility of code and analyses.


[15] 2310.03111

Multi-modal Gaussian Process Variational Autoencoders for Neural and Behavioral Data

Characterizing the relationship between neural population activity and behavioral data is a central goal of neuroscience. While latent variable models (LVMs) are successful in describing high-dimensional time-series data, they are typically only designed for a single type of data, making it difficult to identify structure shared across different experimental data modalities. Here, we address this shortcoming by proposing an unsupervised LVM which extracts temporally evolving shared and independent latents for distinct, simultaneously recorded experimental modalities. We do this by combining Gaussian Process Factor Analysis (GPFA), an interpretable LVM for neural spiking data with temporally smooth latent space, with Gaussian Process Variational Autoencoders (GP-VAEs), which similarly use a GP prior to characterize correlations in a latent space, but admit rich expressivity due to a deep neural network mapping to observations. We achieve interpretability in our model by partitioning latent variability into components that are either shared between modalities or independent within a single modality. We parameterize the latents of our model in the Fourier domain, and show improved latent identification using this approach over standard GP-VAE methods. We validate our model on simulated multi-modal data consisting of Poisson spike counts and MNIST images that scale and rotate smoothly over time. We show that the multi-modal GP-VAE (MM-GPVAE) is able to not only identify the shared and independent latent structure across modalities accurately, but provides good reconstructions of both images and neural rates on held-out trials. Finally, we demonstrate our framework on two real world multi-modal experimental settings: Drosophila whole-brain calcium imaging alongside tracked limb positions, and Manduca sexta spike train measurements from ten wing muscles as the animal tracks a visual stimulus.


[16] 2408.01253

Metareasoning in uncertain environments: a meta-BAMDP framework

\textit{Reasoning} may be viewed as an algorithm $P$ that makes a choice of an action $a^* \in \mathcal{A}$, aiming to optimize some outcome. However, executing $P$ itself bears costs (time, energy, limited capacity, etc.) and needs to be considered alongside explicit utility obtained by making the choice in the underlying decision problem. Finding the right $P$ can itself be framed as an optimization problem over the space of reasoning processes $P$, generally referred to as \textit{metareasoning}. Conventionally, human metareasoning models assume that the agent knows the transition and reward distributions of the underlying MDP. This paper generalizes such models by proposing a meta Bayes-Adaptive MDP (meta-BAMDP) framework to handle metareasoning in environments with unknown reward/transition distributions, which encompasses a far larger and more realistic set of planning problems that humans and AI systems face. As a first step, we apply the framework to Bernoulli bandit tasks. Owing to the meta problem's complexity, our solutions are necessarily approximate. However, we introduce two novel theorems that significantly enhance the tractability of the problem, enabling stronger approximations that are robust within a range of assumptions grounded in realistic human decision-making scenarios. These results offer a resource-rational perspective and a normative framework for understanding human exploration under cognitive constraints, as well as providing experimentally testable predictions about human behavior in Bernoulli Bandit tasks.


[17] 2508.07465

MOTGNN: Interpretable Graph Neural Networks for Multi-Omics Disease Classification

Integrating multi-omics data, such as DNA methylation, mRNA expression, and microRNA (miRNA) expression, offers a comprehensive view of the biological mechanisms underlying disease. However, the high dimensionality of multi-omics data, the heterogeneity across modalities, and the lack of reliable biological interaction networks make meaningful integration challenging. In addition, many existing models rely on handcrafted similarity graphs, are vulnerable to class imbalance, and often lack built-in interpretability, limiting their usefulness in biomedical applications. We propose Multi-Omics integration with Tree-generated Graph Neural Network (MOTGNN), a novel and interpretable framework for binary disease classification. MOTGNN employs eXtreme Gradient Boosting (XGBoost) for omics-specific supervised graph construction, followed by modality-specific Graph Neural Networks (GNNs) for hierarchical representation learning, and a deep feedforward network for cross-omics integration. Across three real-world disease datasets, MOTGNN outperforms state-of-the-art baselines by 5-10% in accuracy, ROC-AUC, and F1-score, and remains robust to severe class imbalance. The model maintains computational efficiency through the use of sparse graphs and provides built-in interpretability, revealing both top-ranked biomarkers and the relative contributions of each omics modality. These results highlight the potential of MOTGNN to improve both predictive accuracy and interpretability in multi-omics disease modeling.


[18] 2602.02128

Scalable Spatio-Temporal SE(3) Diffusion for Long-Horizon Protein Dynamics

Molecular dynamics (MD) simulations remain the gold standard for studying protein dynamics, but their computational cost limits access to biologically relevant timescales. Recent generative models have shown promise in accelerating simulations, yet they struggle with long-horizon generation due to architectural constraints, error accumulation, and inadequate modeling of spatio-temporal dynamics. We present STAR-MD (Spatio-Temporal Autoregressive Rollout for Molecular Dynamics), a scalable SE(3)-equivariant diffusion model that generates physically plausible protein trajectories over microsecond timescales. Our key innovation is a causal diffusion transformer with joint spatio-temporal attention that efficiently captures complex space-time dependencies while avoiding the memory bottlenecks of existing methods. On the standard ATLAS benchmark, STAR-MD achieves state-of-the-art performance across all metrics--substantially improving conformational coverage, structural validity, and dynamic fidelity compared to previous methods. STAR-MD successfully extrapolates to generate stable microsecond-scale trajectories where baseline methods fail catastrophically, maintaining high structural quality throughout the extended rollout. Our comprehensive evaluation reveals severe limitations in current models for long-horizon generation, while demonstrating that STAR-MD's joint spatio-temporal modeling enables robust dynamics simulation at biologically relevant timescales, paving the way for accelerated exploration of protein function.


[19] 2602.08640

Universal Approximation Theorems for Dynamical Systems with Infinite-Time Horizon Guarantees

Universal approximation theorems establish the expressive capacity of neural network architectures. For dynamical systems, existing results are limited to finite time horizons or systems with a globally stable equilibrium, leaving multistability and limit cycles unaddressed. We prove that Neural ODEs achieve $\varepsilon$-$\delta$ closeness -- trajectories within error $\varepsilon$ except for initial conditions of measure $< \delta$ -- over the \emph{infinite} time horizon $[0,\infty)$ for three target classes: (1) Morse-Smale systems (a structurally stable class) with hyperbolic fixed points, (2) Morse-Smale systems with hyperbolic limit cycles via exact period matching, and (3) systems with normally hyperbolic continuous attractors via discretization. We further establish a temporal generalization bound: $\varepsilon$-$\delta$ closeness implies $L^p$ error $\leq \varepsilon^p + \delta \cdot D^p$ for all $t \geq 0$, bridging topological guarantees to training metrics. These results provide the first universal approximation framework for multistable infinite-horizon dynamics.
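The stated temporal generalization bound follows in one line by splitting the expectation over initial conditions into the good set and its complement. Here $G$ is the set where the trajectory error stays below $\varepsilon$, $B$ its complement with $\mu(B) < \delta$, and $D$ a uniform bound on the attainable error (implicit in the statement, e.g. the diameter of the relevant compact set):

```latex
\mathbb{E}_{x_0}\!\left[\|\phi_t(x_0)-\hat\phi_t(x_0)\|^p\right]
  = \int_{G}\|\phi_t-\hat\phi_t\|^p \, d\mu
  + \int_{B}\|\phi_t-\hat\phi_t\|^p \, d\mu
  \le \varepsilon^p\,\mu(G) + D^p\,\mu(B)
  \le \varepsilon^p + \delta\, D^p ,
```

which holds uniformly for all $t \ge 0$, turning the topological $\varepsilon$-$\delta$ guarantee into a bound on the $L^p$ training metric.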


[20] 2602.09116

Importance inversion transfer identifies shared principles for cross-domain learning

The capacity to transfer knowledge across scientific domains relies on shared organizational principles. However, existing transfer-learning methodologies often fail to bridge radically heterogeneous systems, particularly under severe data scarcity or stochastic noise. This study formalizes Explainable Cross-Domain Transfer Learning (X-CDTL), a framework unifying network science and explainable artificial intelligence to identify structural invariants that generalize across biological, linguistic, molecular, and social networks. By introducing the Importance Inversion Transfer (IIT) mechanism, the framework prioritizes domain-invariant structural anchors over idiosyncratic, highly discriminative features. In anomaly detection tasks, models guided by these principles achieve significant performance gains over traditional baselines, including a 56% relative improvement in decision stability under extreme noise. These results provide evidence for a shared organizational signature across heterogeneous domains, establishing a principled paradigm for cross-disciplinary knowledge propagation. By shifting from opaque latent representations to explicit structural laws, this work advances machine learning as a robust engine for scientific discovery.