New articles on Quantitative Biology


[1] 2602.02620

CryoLVM: Self-supervised Learning from Cryo-EM Density Maps with Large Vision Models

Cryo-electron microscopy (cryo-EM) has revolutionized structural biology by enabling near-atomic-level visualization of biomolecular assemblies. However, the exponential growth in cryo-EM data throughput and complexity, coupled with diverse downstream analytical tasks, necessitates unified computational frameworks that transcend current task-specific deep learning approaches with limited scalability and generalizability. We present CryoLVM, a foundation model that learns rich structural representations from experimental density maps with resolved structures by leveraging the Joint-Embedding Predictive Architecture (JEPA) integrated with a SCUNet-based backbone, and that can be rapidly adapted to various downstream tasks. We further introduce a novel histogram-based distribution alignment loss that accelerates convergence and enhances fine-tuning performance. We demonstrate CryoLVM's effectiveness across three critical cryo-EM tasks: density map sharpening, density map super-resolution, and missing wedge restoration. Our method consistently outperforms state-of-the-art baselines across multiple density map quality metrics, confirming its potential as a versatile model for a wide spectrum of cryo-EM applications.
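The histogram-based distribution alignment loss is not detailed in the abstract; one plausible minimal form — a kernel-smoothed intensity histogram compared via an L1 distance, where every name and design choice below is an assumption, not the paper's definition — might look like:

```python
import numpy as np

def soft_histogram(x, bins=32, lo=0.0, hi=1.0, sigma=None):
    """Kernel-smoothed intensity histogram, normalized to a distribution."""
    centers = np.linspace(lo, hi, bins)
    sigma = sigma or (centers[1] - centers[0])
    # Gaussian membership of every value in every bin, summed per bin.
    w = np.exp(-0.5 * ((x.reshape(-1, 1) - centers) / sigma) ** 2)
    h = w.sum(axis=0)
    return h / h.sum()

def hist_alignment_loss(pred, target, bins=32):
    """L1 distance between the two smoothed intensity distributions."""
    return np.abs(soft_histogram(pred, bins) - soft_histogram(target, bins)).sum()

rng = np.random.default_rng(0)
target = rng.beta(2, 5, size=10_000)          # stand-in for a density map's intensities
close = target + rng.normal(0, 0.01, 10_000)  # nearly matching distribution
far = rng.uniform(0, 1, 10_000)               # mismatched distribution
assert hist_alignment_loss(close, target) < hist_alignment_loss(far, target)
```

In a training setting the same construction would be written with differentiable tensor ops so the loss can backpropagate into the network.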


[2] 2602.02916

Mathematical Modeling of Lesion Pattern Formation in Dendritic Keratitis

Dendritic keratitis is a form of eye infection caused by herpes simplex virus (HSV). The virus spreads via direct cell-to-cell infection among corneal epithelial cells. This leads to the formation of dendritic lesions characterized by terminal bulbs at their tips. Under immunosuppression, the condition may progress to geographic keratitis, which is a map-shaped lesion with dendritic tails. The mechanism of this pattern formation remains to be elucidated. In this study, we propose a mathematical model to elucidate the mechanisms of lesion pattern formation in dendritic keratitis. Our model shows that increased production of infection-suppressive cytokines induces dendritic patterns with terminal bulbs, whereas reduced cytokine levels lead to geographic patterns. Furthermore, altering the spatial distribution of cytokine production can reproduce dendritic tails. By including external cytokine secretion, we could reproduce tapered lesions observed in non-HSV keratitis. By clarifying the mechanisms behind terminal bulb formation and reproducing atypical lesion morphologies, our findings enhance the understanding of herpetic keratitis and highlight the utility of mathematical modeling in ophthalmology.


[3] 2602.03228

Asymptotic Behavior of Integral Projection Models via Genealogical Quantities

Multi-state structured population models, including integral projection models (IPMs) and age-structured McKendrick equations, link individual life histories to population growth and composition, yet the demographic meaning of their dominant eigenstructure can be difficult to interpret. A main goal of this paper is to derive interpretable demographic indicators for multi-state heterogeneity -- in particular expected generation numbers, which act as an effective genealogical memory length (in generations) of the ancestry-weighted contributions driving growth -- together with type reproduction numbers and generation intervals, directly from life-history transition kernels. To this end we develop a determinant-free genealogical framework based on a reference-point operator, a rank-one construction at the kernel level that singles out a biologically chosen reference state and organizes lineages by their contributions relative to that state. This yields stable distributions and reproductive values as convergent series of iterated kernels, and leads to an Euler--Lotka-like characteristic equation expressed by reference-point moments. The resulting expansion admits a closed combinatorial form via ordinary partial Bell polynomials, providing a direct bridge from transition kernels to genealogical quantities. We extend the approach to multi-state McKendrick equations and show how these indicators quantify how population scale and composition are determined by ancestry-weighted initial-state information. The framework avoids restrictive Hilbert--Schmidt assumptions and clarifies how temporal memory and multi-type heterogeneity emerge from cross-generational accumulation, yielding a unified and interpretable route from transition kernels to multi-state demographic indicators.


[4] 2602.03240

Estimating measures of information processing during cognitive tasks using functional magnetic resonance imaging

Cognition is increasingly framed in terms of information processing, yet most fMRI analyses focus on activation or functional connectivity rather than quantifying how information is stored and transferred. To remedy this problem, we propose a framework for estimating measures of information processing: active information storage (AIS), transfer entropy (TE), and net synergy from task-based fMRI. AIS measures information maintained within a region, TE captures directed information flow, and net synergy contrasts higher-order synergistic interactions with redundant ones. Crucially, to enable this framework we utilised a recently developed approach for calculating information-theoretic measures: the cross mutual information. This approach combines resting-state and task data to address the challenges of limited sample size, non-stationarity and context in task-based fMRI. We applied this framework to the working memory (N-back) task from the Human Connectome Project (470 participants). Results show that AIS increases in fronto-parietal regions with working memory load, TE reveals enhanced directed information flows across control pathways, and net synergy indicates a global shift to redundancy. This work establishes a novel methodology for quantifying information processing in task-based fMRI.
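Under a Gaussian (linear) assumption, transfer entropy reduces to a log-ratio of residual variances; a minimal lag-1 sketch of that standard estimator (illustrative only — the paper's estimator is the cross mutual information, not this one):

```python
import numpy as np

def _resid_var(y, cols):
    """Residual variance of y after linear regression on the given columns (plus intercept)."""
    X = np.column_stack([np.ones(len(y))] + list(cols))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return (y - X @ beta).var()

def transfer_entropy(x, y, lag=1):
    """Gaussian-estimator TE from x to y (in nats): the reduction in uncertainty
    about y_t gained by adding x's past to y's own past."""
    yt, ypast, xpast = y[lag:], y[:-lag], x[:-lag]
    return 0.5 * np.log(_resid_var(yt, [ypast]) / _resid_var(yt, [ypast, xpast]))

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):                      # y is driven by x's past, not vice versa
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()
assert transfer_entropy(x, y) > transfer_entropy(y, x)
```

The same residual-variance trick gives Gaussian AIS by comparing the variance of y_t with and without conditioning on y's own past.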


[5] 2602.03269

Systematic review of self-supervised foundation models for brain network representation using electroencephalography

Automated analysis of electroencephalography (EEG) has recently undergone a paradigm shift. The introduction of transformer architectures and self-supervised pretraining (SSL) has led to the development of EEG foundation models. These models are pretrained on large amounts of unlabeled data and can be adapted to a range of downstream tasks. This systematic review summarizes recent SSL-trained EEG foundation models that learn whole-brain representations from multichannel EEG rather than representations derived from a single channel. We searched PubMed, IEEE Xplore, Scopus, and arXiv through July 21, 2025. Nineteen preprints and peer-reviewed articles met inclusion criteria. We extracted information regarding pretraining datasets, model architectures, pretraining SSL objectives, and downstream task applications. While pretraining data heavily relied on the Temple University EEG corpus, there was significant heterogeneity in model architecture and training objectives across studies. Transformer architectures were identified as the predominant pretraining architecture, with state-space models such as MAMBA and S4 as emerging alternatives. Concerning SSL objectives, masked auto-encoding was most common, while other studies incorporated contrastive learning. Downstream tasks varied widely and implemented diverse fine-tuning strategies, which made direct comparison challenging. Furthermore, most studies used single-task fine-tuning, and a generalizable EEG foundation model remains lacking. In conclusion, the field is advancing rapidly but remains constrained by limited dataset diversity and the absence of standardized benchmarks. Progress will likely depend on larger and more diverse pretraining datasets, standardized evaluation protocols, and multi-task validation. These developments will advance EEG foundation models towards robust, general-purpose tools relevant to both basic and clinical applications.


[6] 2602.03779

Generative AI for Enzyme Design and Biocatalysis

Sparked by innovations in generative artificial intelligence (AI), the field of protein design has undergone a paradigm shift with an explosion of new models for optimizing existing enzymes or creating them from scratch. After more than a decade of low success rates for computationally designed enzymes, generative AI models are now frequently used for designing proficient enzymes. Here, we provide a comprehensive overview and classification of generative AI models for enzyme design, highlighting models with experimental validation relevant to real-world settings and outlining their respective limitations. We argue that generative AI models now have the maturity to create and optimize enzymes for industrial applications. Wider adoption of generative AI models with experimental feedback loops can speed up the development of biocatalysts and serve as a community assessment to inform the next generation of models.


[7] 2602.03824

Deep-learning-based pan-phenomic data reveals the explosive evolution of avian visual disparity

The evolution of biological morphology is critical for understanding the diversity of the natural world, yet traditional analyses often involve subjective biases in the selection and coding of morphological traits. This study employs deep learning techniques, utilising a ResNet34 model capable of recognising over 10,000 bird species, to explore avian morphological evolution. We extract weights from the model's final fully connected (fc) layer and investigate the semantic alignment between the high-dimensional embedding space learned by the model and biological phenotypes. The results demonstrate that the high-dimensional embedding space encodes phenotypic convergence. Subsequently, we assess the morphological disparity among various taxa and evaluate the association between morphological disparity and species richness, demonstrating that species richness is the primary driver of morphospace expansion. Moreover, the disparity-through-time analysis reveals a visual "early burst" after the K-Pg extinction. While mainly aimed at evolutionary analysis, this study also provides insights into the interpretability of Deep Neural Networks. We demonstrate that hierarchical semantic structures (biological taxonomy) emerged in the high-dimensional embedding space despite being trained on flat labels. Furthermore, through adversarial examples, we provide evidence that our model in this task can overcome texture bias and learn holistic shape representations (body plans), challenging the prevailing view that CNNs rely primarily on local textures.


[8] 2602.02553

Indirect Reciprocity with Environmental Feedback

Indirect reciprocity maintains cooperation in stranger societies by mapping individual behaviors onto reputation signals via social norms. Existing theoretical frameworks assume static environments with constant resources and fixed payoff structures. However, in real-world systems, individuals' strategic behaviors not only shape their reputation but also induce collective-level resource changes in ecological, economic, or other external environments, which in turn reshape the incentives governing future individual actions. To overcome this limitation, we establish a co-evolutionary framework that couples moral assessment, strategy updating, and environmental dynamics, allowing the payoff structure to dynamically adjust in response to the ecological consequences of collective actions. We find that this environmental feedback mechanism helps lower the threshold for the emergence of cooperation, enabling the system to spontaneously transition from a low-cooperation state to a stable high-cooperation regime, thereby reducing the dependence on specific initial conditions. Furthermore, while lenient norms demonstrate adaptability in static environments, norms with strict discrimination are shown to be crucial for curbing opportunism and maintaining evolutionary resilience in dynamic settings. Our results reveal the evolutionary dynamics of coupled systems involving reputation institutions and environmental constraints, offering a new theoretical perspective for understanding collective cooperation and social governance in complex environments.


[9] 2602.02558

PA-MIL: Phenotype-Aware Multiple Instance Learning Guided by Language Prompting and Genotype-to-Phenotype Relationships

Deep learning has been extensively researched in the analysis of pathology whole-slide images (WSIs). However, most existing methods are limited to providing prediction interpretability by locating the model's salient areas in a post-hoc manner, failing to offer more reliable and accountable explanations. In this work, we propose Phenotype-Aware Multiple Instance Learning (PA-MIL), a novel ante-hoc interpretable framework that identifies cancer-related phenotypes from WSIs and utilizes them for cancer subtyping. To facilitate PA-MIL in learning phenotype-aware features, we 1) construct a phenotype knowledge base containing cancer-related phenotypes and their associated genotypes; 2) utilize the morphological descriptions of phenotypes as language prompting to aggregate phenotype-related features; and 3) devise the Genotype-to-Phenotype Neural Network (GP-NN), grounded in genotype-to-phenotype relationships, which provides multi-level guidance for PA-MIL. Experimental results on multiple datasets demonstrate that PA-MIL achieves competitive performance compared to existing MIL methods while offering improved interpretability. PA-MIL leverages phenotype saliency as evidence and, using a linear classifier, achieves competitive results compared to state-of-the-art methods. Additionally, we thoroughly analyze the genotype-phenotype relationships, as well as cohort-level and case-level interpretability, demonstrating the reliability and accountability of PA-MIL.


[10] 2602.02562

A Distinct Communication Strategies Model of the Double Empathy Problem

The double empathy problem recasts the difficulty of forming empathy bonds in social interactions between autistic and neurotypical individuals as a bidirectional problem, rather than due to a deficit exclusive to the person on the spectrum. However, no explicit mechanism to explain such a phenomenon has been proposed. Here we build a feedback-loop mathematical model that would theoretically induce the empathy degradation observed during communication in neurotypical-autistic pairs solely due to differences in communication preferences between neurotypical and neurodivergent individuals. Numerical simulations of dyadic interactions show the model, whose mechanism is based solely on communication preferences, can illustrate the breakdown of empathic bonding observed clinically. Stability analysis of the model provides a way to predict the overall trajectory of the interaction in the empathy space. Furthermore, we suggest experimental designs to measure several parameters outlined here and discuss the future directions for testing the proposed model.


[11] 2602.02587

The Evolution of Lying in a Spatially-Explicit Prisoner's Dilemma Model

I present the results from a spatial model of the prisoner's dilemma, played on a toroidal lattice. Each individual has a default strategy of either cooperating ($C$) or defecting ($D$). Two strategies were tested: ``tit-for-tat'' (TFT), in which individuals play their opponent's last play, and simply playing one's own default play. Each individual also has a probability of telling the truth ($0 \leq P_{truth} \leq 1$) about their last play. This parameter, which can evolve over time, allows individuals to be, for instance, a defector but present as a cooperator regarding their last play. This leads to interesting dynamics where mixed populations of defectors and cooperators with $P_{truth} \geq 0.75$ move toward populations of truth-telling cooperators. Likewise, mixed populations with $P_{truth} < 0.7$ become populations of lying defectors. Both such populations are stable because they each have higher average scores than populations with intermediate values of $P_{truth}$. Applications of this model are discussed with regards to both humans and animals.
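The reporting mechanic at the heart of this model can be sketched in a few lines (the 5/3/1/0 payoff values are the conventional PD choices, which the abstract does not specify):

```python
import numpy as np

# Prisoner's dilemma payoffs: rows = my play, cols = opponent's play (C=0, D=1).
PAYOFF = np.array([[3, 0],
                   [5, 1]])

def report(last_play, p_truth, rng):
    """An individual reports its last play truthfully with probability p_truth,
    otherwise claims the opposite play."""
    return last_play if rng.random() < p_truth else 1 - last_play

def tft_response(opponent_report):
    """Tit-for-tat: copy what the opponent *claims* to have played last."""
    return opponent_report

rng = np.random.default_rng(2)
# A defector (D=1) that always lies presents as a cooperator ...
liar_claim = report(1, p_truth=0.0, rng=rng)
assert liar_claim == 0
# ... so a TFT partner cooperates and the lying defector collects the temptation payoff.
assert PAYOFF[1, tft_response(liar_claim)] == 5
```

A full reproduction would place such agents on a toroidal lattice and let $P_{truth}$ evolve, but the snippet shows why lying pays against TFT partners in the short run.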


[12] 2602.02605

Fine-Tuning Language Models to Know What They Know

Metacognition is a critical component of intelligence, specifically regarding the awareness of one's own knowledge. While humans rely on shared internal memory for both answering questions and reporting their knowledge state, this dependency in LLMs remains underexplored. This study proposes a framework to measure metacognitive ability $d_{\rm{type2}}'$ using a dual-prompt method, followed by the introduction of Evolution Strategy for Metacognitive Alignment (ESMA) to bind a model's internal knowledge to its explicit behaviors. ESMA demonstrates robust generalization across diverse untrained settings, indicating an enhancement in the model's ability to reference its own knowledge. Furthermore, parameter analysis attributes these improvements to a sparse set of significant modifications.


[13] 2602.02638

hSNMF: Hybrid Spatially Regularized NMF for Image-Derived Spatial Transcriptomics

High-resolution spatial transcriptomics platforms, such as Xenium, generate single-cell images that capture both molecular and spatial context, but their extremely high dimensionality poses major challenges for representation learning and clustering. In this study, we analyze data from the Xenium platform, which captures high-resolution images of tumor microarray (TMA) tissues and converts them into cell-by-gene matrices suitable for computational analysis. We benchmark and extend nonnegative matrix factorization (NMF) for spatial transcriptomics by introducing two spatially regularized variants. First, we propose Spatial NMF (SNMF), a lightweight baseline that enforces local spatial smoothness by diffusing each cell's NMF factor vector over its spatial neighborhood. Second, we introduce Hybrid Spatial NMF (hSNMF), which performs spatially regularized NMF followed by Leiden clustering on a hybrid adjacency that integrates spatial proximity (via a contact-radius graph) and transcriptomic similarity through a tunable mixing parameter alpha. Evaluated on a cholangiocarcinoma dataset, SNMF and hSNMF achieve markedly improved spatial compactness (CHAOS < 0.004, Moran's I > 0.96), greater cluster separability (Silhouette > 0.12, DBI < 1.8), and higher biological coherence (CMC and enrichment) compared to other spatial baselines. Availability and implementation: this https URL
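The SNMF smoothing step described above — diffusing each cell's NMF factor vector over its spatial neighborhood — might look like the following sketch (synthetic coordinates; a fixed radius and a mixing weight `lam` stand in for the paper's neighborhood construction):

```python
import numpy as np

def spatial_smooth_factors(H, coords, radius=1.5, lam=0.5):
    """Diffuse each cell's factor vector toward the mean of its spatial neighbors:
    H_smooth[i] = (1 - lam) * H[i] + lam * mean(H over neighbors within radius)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    A = (d <= radius) & (d > 0)                      # neighbor adjacency
    out = H.copy()
    for i in range(len(H)):
        if A[i].any():
            out[i] = (1 - lam) * H[i] + lam * H[A[i]].mean(axis=0)
    return out

rng = np.random.default_rng(3)
coords = np.array([[0, 0], [1, 0], [0, 1], [10, 10]], float)  # three neighbors + one isolate
H = rng.random((4, 3))                                        # cell-by-factor matrix
Hs = spatial_smooth_factors(H, coords)
# Neighboring cells' factors move toward each other; the isolated cell is unchanged.
assert np.linalg.norm(Hs[0] - Hs[1]) <= np.linalg.norm(H[0] - H[1])
assert np.allclose(Hs[3], H[3])
```

hSNMF would then cluster with Leiden on an adjacency that mixes this spatial graph with transcriptomic similarity via the parameter alpha.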


[14] 2602.02825

On the consistent and scalable detection of spatial patterns

Detecting spatial patterns is fundamental to scientific discovery, yet current methods lack statistical consensus and face computational barriers when applied to large-scale spatial omics datasets. We unify major approaches through a single quadratic form and derive general consistency conditions. We reveal that several widely used methods, including Moran's I, are inconsistent, and propose scalable corrections. The resulting test enables robust pattern detection across millions of spatial locations and single-cell lineage-tracing datasets.
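As a concrete instance of the quadratic-form view, Moran's I can be written as $I = (n/S_0)\, \tilde z^\top W \tilde z / \tilde z^\top \tilde z$ for mean-centered values $\tilde z$, weight matrix $W$, and total weight $S_0$; a minimal sketch:

```python
import numpy as np

def morans_I(z, W):
    """Moran's I as a quadratic form: (n / S0) * (z~' W z~) / (z~' z~),
    with z~ the mean-centered values and S0 the sum of all weights."""
    z = z - z.mean()
    n, S0 = len(z), W.sum()
    return (n / S0) * (z @ W @ z) / (z @ z)

# 1D chain of 6 locations, rook adjacency as the weight matrix.
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

smooth = np.array([1.0, 2, 3, 4, 5, 6])          # spatially autocorrelated gradient
alternating = np.array([1.0, -1, 1, -1, 1, -1])  # checkerboard pattern
assert morans_I(smooth, W) > 0 > morans_I(alternating, W)
```

The paper's point concerns the consistency of such statistics as tests; the quadratic form itself is the shared skeleton the unification builds on.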


[15] 2602.02918

A Multi-scale Linear-time Encoder for Whole-Slide Image Analysis

We introduce Multi-scale Adaptive Recurrent Biomedical Linear-time Encoder (MARBLE), the first \textit{purely Mamba-based} multi-state multiple instance learning (MIL) framework for whole-slide image (WSI) analysis. MARBLE processes multiple magnification levels in parallel and integrates coarse-to-fine reasoning within a linear-time state-space model, efficiently capturing cross-scale dependencies with minimal parameter overhead. WSI analysis remains challenging due to gigapixel resolutions and hierarchical magnifications, while existing MIL methods typically operate at a single scale and transformer-based approaches suffer from quadratic attention costs. By coupling parallel multi-scale processing with linear-time sequence modeling, MARBLE provides a scalable and modular alternative to attention-based architectures. Experiments on five public datasets show improvements of up to \textbf{6.9\%} in AUC, \textbf{20.3\%} in accuracy, and \textbf{2.3\%} in C-index, establishing MARBLE as an efficient and generalizable framework for multi-scale WSI analysis.


[16] 2602.02920

A Reproducible Framework for Bias-Resistant Machine Learning on Small-Sample Neuroimaging Data

We introduce a reproducible, bias-resistant machine learning framework that integrates domain-informed feature engineering, nested cross-validation, and calibrated decision-threshold optimization for small-sample neuroimaging data. Conventional cross-validation frameworks that reuse the same folds for both model selection and performance estimation yield optimistically biased results, limiting reproducibility and generalization. Demonstrated on a high-dimensional structural MRI dataset of deep brain stimulation cognitive outcomes, the framework achieved a nested-CV balanced accuracy of 0.660\,$\pm$\,0.068 using a compact, interpretable subset selected via importance-guided ranking. By combining interpretability and unbiased evaluation, this work provides a generalizable computational blueprint for reliable machine learning in data-limited biomedical domains.
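The bias the framework guards against — selecting hyperparameters on the same folds used for the performance estimate — can be illustrated with a generic nested-CV skeleton (pure NumPy, with a hypothetical ridge "classifier"; this is not the authors' pipeline):

```python
import numpy as np

def ridge_fit_predict(Xtr, ytr, Xte, alpha):
    """Ridge 'classifier': regress +/-1 labels, predict by sign."""
    A = Xtr.T @ Xtr + alpha * np.eye(Xtr.shape[1])
    w = np.linalg.solve(A, Xtr.T @ ytr)
    return np.sign(Xte @ w)

def kfold(n, k, rng):
    return np.array_split(rng.permutation(n), k)

def nested_cv(X, y, alphas, outer_k=5, inner_k=3, seed=0):
    """Outer folds estimate performance; inner folds (run on the outer-train
    portion only) pick alpha, so model selection never sees the test fold."""
    rng = np.random.default_rng(seed)
    accs = []
    for te in kfold(len(y), outer_k, rng):
        tr = np.setdiff1d(np.arange(len(y)), te)
        def inner_acc(a):
            scores = []
            for ite in kfold(len(tr), inner_k, rng):
                itr = np.setdiff1d(np.arange(len(tr)), ite)
                pred = ridge_fit_predict(X[tr[itr]], y[tr[itr]], X[tr[ite]], a)
                scores.append((pred == y[tr[ite]]).mean())
            return np.mean(scores)
        best = max(alphas, key=inner_acc)
        pred = ridge_fit_predict(X[tr], y[tr], X[te], best)
        accs.append((pred == y[te]).mean())
    return float(np.mean(accs))

rng = np.random.default_rng(4)
n, p = 120, 50                       # small-sample, high-dimensional regime
X = rng.normal(size=(n, p))
y = np.sign(X[:, 0] + 0.5 * rng.normal(size=n))  # one informative feature
acc = nested_cv(X, y, alphas=[0.1, 1.0, 10.0, 100.0])
assert 0.5 < acc <= 1.0              # above chance, without selection bias
```

Reusing the outer folds for both steps would instead report the best-of-grid score, which is optimistically biased in exactly the way the abstract describes.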


[17] 2602.03172

Adversarial construction as a potential solution to the experiment design problem in large task spaces

Despite decades of work, we still lack a robust, task-general theory of human behavior even in the simplest domains. In this paper we tackle the generality problem head-on, by aiming to develop a unified model for all tasks embedded in a task-space. In particular we consider the space of binary sequence prediction tasks, in which the observations are generated by hidden Markov models (HMMs) that parameterize the task space. As the space of tasks is large, experimental exploration of the entire space is infeasible. To solve this problem we propose the adversarial construction approach, which helps identify tasks that are most likely to elicit a qualitatively novel behavior. Our results suggest that adversarial construction significantly outperforms random sampling of environments and therefore could be used as a proxy for optimal experimental design in high-dimensional task spaces.


[18] 2602.03343

MARADONER: Motif Activity Response Analysis Done Right

Inferring the activities of transcription factors from high-throughput transcriptomic or open chromatin profiling, such as RNA-/CAGE-/ATAC-Seq, is a long-standing challenge in systems biology. Identification of highly active master regulators enables mechanistic interpretation of differential gene expression, chromatin state changes, or perturbation responses across conditions, cell types, and diseases. Here, we describe MARADONER, a statistical framework and its software implementation for motif activity response analysis (MARA), utilizing the sequence-level features obtained with pattern matching (motif scanning) of individual promoters and promoter- or gene-level activity or expression estimates. Compared to the classic MARA, MARADONER (MARA-done-right) employs an unbiased variance parameter estimation and a bias-adjusted likelihood estimation of fixed effects, thereby enhancing goodness-of-fit and the accuracy of activity estimation. Further, MARADONER is capable of accounting for heteroscedasticity of motif scores and activity estimates.
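At its core, MARA models promoter activity as linear in motif counts, $E \approx N A$; a minimal synthetic sketch of recovering activities by ridge-regularized least squares (illustrative only — not MARADONER's bias-adjusted, heteroscedasticity-aware estimator):

```python
import numpy as np

rng = np.random.default_rng(5)
P, M, S = 500, 8, 4                               # promoters, motifs, samples
N = rng.poisson(2.0, size=(P, M)).astype(float)   # motif-count (site) matrix from scanning
A_true = rng.normal(size=(M, S))                  # per-sample motif activities
E = N @ A_true + 0.1 * rng.normal(size=(P, S))    # promoter expression/activity estimates

# Ridge-regularized activity estimate: A_hat = (N'N + lam I)^{-1} N' E
lam = 1.0
A_hat = np.linalg.solve(N.T @ N + lam * np.eye(M), N.T @ E)
assert np.abs(A_hat - A_true).max() < 0.1         # activities recovered up to noise
```

MARADONER's contribution sits on top of this linear skeleton: unbiased variance estimation, bias-adjusted fixed effects, and per-observation variances for motif scores and expression.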


[19] 2602.03477

ScDiVa: Masked Discrete Diffusion for Joint Modeling of Single-Cell Identity and Expression

Single-cell RNA-seq profiles are high-dimensional, sparse, and unordered, causing autoregressive generation to impose an artificial ordering bias and suffer from error accumulation. To address this, we propose scDiVa, a masked discrete diffusion foundation model that aligns generation with the dropout-like corruption process by defining a continuous-time forward masking mechanism in token space. ScDiVa features a bidirectional denoiser that jointly models discrete gene identities and continuous values, utilizing entropy-normalized serialization and a latent anchor token to maximize information efficiency and preserve global cell identity. The model is trained via depth-invariant time sampling and a dual denoising objective to simulate varying sparsity levels while ensuring precise recovery of both identity and magnitude. Pre-trained on 59 million cells, scDiVa achieves strong transfer performance across major benchmarks, including batch integration, cell type annotation, and perturbation response prediction. These results suggest that masked discrete diffusion serves as a biologically coherent and effective alternative to autoregression.


[20] 2602.03490

A Minimal Task Reveals Emergent Path Integration and Object-Location Binding in a Predictive Sequence Model

Adaptive cognition requires structured internal models representing objects and their relations. Predictive neural networks are often proposed to form such "world models", yet their underlying mechanisms remain unclear. One hypothesis is that action-conditioned sequential prediction suffices for learning such world models. In this work, we investigate this possibility in a minimal in-silico setting. Sequentially sampling tokens from 2D continuous token scenes, a recurrent neural network is trained to predict the upcoming token from current input and a saccade-like displacement. On novel scenes, prediction accuracy improves across the sequence, indicating in-context learning. Decoding analyses reveal path integration and dynamic binding of token identity to position. Interventional analyses show that new bindings can be learned late in sequence and that out-of-distribution bindings can be learned. Together, these results demonstrate how structured representations that rely on flexible binding emerge to support prediction, offering a mechanistic account of sequential world modeling relevant to cognitive science.


[21] 2602.03766

FOVI: A biologically-inspired foveated interface for deep vision models

Human vision is foveated, with variable resolution peaking at the center of a large field of view; this reflects an efficient trade-off for active sensing, allowing eye-movements to bring different parts of the world into focus with other parts of the world in context. In contrast, most computer vision systems encode the visual world at a uniform resolution, raising challenges for processing full-field high-resolution images efficiently. We propose a foveated vision interface (FOVI) based on the human retina and primary visual cortex, that reformats a variable-resolution retina-like sensor array into a uniformly dense, V1-like sensor manifold. Receptive fields are defined as k-nearest-neighborhoods (kNNs) on the sensor manifold, enabling kNN-convolution via a novel kernel mapping technique. We demonstrate two use cases: (1) an end-to-end kNN-convolutional architecture, and (2) a foveated adaptation of the foundational DINOv3 ViT model, leveraging low-rank adaptation (LoRA). These models provide competitive performance at a fraction of the computational cost of non-foveated baselines, opening pathways for efficient and scalable active sensing for high-resolution egocentric vision. Code and pre-trained models are available at this https URL and this https URL.


[22] 2410.00532

smICA: Open-Source Software for Quantitative, Lifetime-Resolved Mapping of Absolute Fluorophore Concentrations in Living Cells

Advanced microscopy techniques are essential in biomedical research for visualising and tracking biomolecules within living cells and their compartments. Conventional fluorescence microscopy methods, however, often struggle with accurately measuring the absolute concentrations of fluorescent probes in living cells. To overcome these limitations, we introduce an open-source analysis tool, smICA (Single-Molecule Image to Concentration Analyser). The smICA method offers quantitative mapping of absolute fluorophore concentrations, lifetime-resolved filtering of the signal, intensity-based cell segmentation, and requires only a few photons per pixel. Our approach also reduces the time required for the determination of the mean concentration per cell, compared to a standard FCS measurement performed at multiple positions. To highlight the robustness of the method, we validated it against standard fluorescence correlation spectroscopy (FCS) measurements by performing in vitro (aqueous solutions of polymers) and in vivo (polymers and EGFP in living cells) experiments. The presented methodology, along with the software, is a promising tool for quantitative single-cell studies, including, but not limited to, protein expression, degradation of biomolecules (such as proteins and mRNA), and monitoring of enzymatic reactions.


[23] 2505.08677

Evolving genealogies in cultural evolution, the descendant process, and the number of cultural traits

We consider a Moran-type model of cultural evolution, which describes how traits emerge, are transmitted, and get lost in populations. Our analysis focuses on the underlying cultural genealogies; they were first described by Aguilar and Ghirlanda (2015) and are closely related to the ancestral selection graph of population genetics, wherefore we call them ancestral learning graphs. We investigate their dynamical behaviour, that is, we are concerned with evolving genealogies. In particular, we consider the total length of the genealogy of the entire population as a function of the (forward) time where we start looking back. This quantity shows a sawtooth-like dynamics with linear increase interrupted by collapses to near-zero at random times. We relate this to the metastable behaviour of the stochastic logistic model, which describes the evolution of the number of ancestors as well as the number of descendants of a given sample. We superpose types to the model by assuming that new inventions appear independently in every individual, and all traits of the cultural parent are transmitted to the learner in any given learning event. The set of traits of an individual then agrees with the set of innovations along its genealogy. The properties of the genealogy thus translate into the properties of the trait set of a sample. In particular, the moments of the number of traits are obtained from the moments of the total length of the genealogy.


[24] 2508.15784

Emergent time-keeping mechanisms in a deep reinforcement learning agent performing an interval timing task

Drawing parallels between Deep Artificial Neural Networks (DNNs) and biological systems can aid in understanding complex biological mechanisms that are difficult to disentangle. Temporal processing, an extensively researched topic, is one such example that lacks a coherent understanding of its underlying mechanisms. In this study, we investigate temporal processing in a Deep Reinforcement Learning (DRL) agent performing an interval timing task and explore potential biological counterparts to its emergent behavior. The agent was successfully trained to perform a duration production task, which involved marking successive occurrences of a target interval while viewing a video sequence. Analysis of the agent's internal states revealed oscillatory neural activations, a ubiquitous pattern in biological systems. Interestingly, the agent's actions were predominantly influenced by neurons exhibiting these oscillations with high amplitudes and frequencies corresponding to the target interval. Parallels are drawn between the agent's time-keeping strategy and the Striatal Beat Frequency (SBF) model, a biologically plausible model of interval timing. Furthermore, the agent maintained its oscillatory representations and task performance when tested on different video sequences (including a blank video). Thus, once learned, the agent internalized its time-keeping mechanism and showed minimal reliance on its environment to perform the timing task. We also discuss a hypothesis about the resemblance between this emergent behavior and certain aspects of the evolution of biological processes such as circadian rhythms. This study aims to contribute to recent research efforts of utilizing DNNs to understand biological systems, with a particular emphasis on temporal processing.


[25] 2509.24926

Bifurcations and multistability in inducible three-gene toggle switch networks

Control of transcription presides over a vast array of biological processes, including those mediated by gene regulatory circuits that exhibit multistability. Within these circuits, two- and three-gene network motifs are particularly critical to the repertoire of metabolic and developmental pathways. Theoretical models of these circuits, however, often vary parameters such as dissociation constants, transcription rates, and degradation rates without specifying precisely how these parameters are controlled biologically. In this study, we examine the role of effector molecules, which can alter the concentrations of the active transcription factors that control regulation, and are ubiquitous in regulatory processes across many biological settings. We specifically consider allosteric regulation in the context of extending the standard bistable switch to three-gene networks, and explore the rich multistable dynamics exhibited in these architectures as a function of effector concentrations. We then analyze how the dynamics evolve under various interpretations of regulatory circuit mechanics, underlying inducer activity, and perturbations thereof. Notably, the biological mechanism by which we model effector control over dual-function proteins transforms not only the phenotypic trend of dynamic tuning but also the set of available dynamic regimes. In this way, we determine key parameters and regulatory features that drive phenotypic decisions, and offer an experimentally tunable structure for encoding inducible multistable behavior arising from both single and dual-function allosteric transcription factors.
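The "standard bistable switch" that the paper extends to three genes can be illustrated with the classic two-gene mutual-repression model: two different initial conditions settle into two distinct stable states. The Hill coefficient, production rate, and integration settings below are illustrative choices, not parameters from the paper.

```python
def toggle_switch(x0, y0, a=10.0, n=2, dt=0.01, steps=5000):
    """Euler integration of a minimal two-gene toggle switch:
    each gene represses the other via a Hill function, with
    linear degradation.  Parameters are illustrative only."""
    x, y = x0, y0
    for _ in range(steps):
        dx = a / (1 + y ** n) - x   # X produced, repressed by Y, degraded
        dy = a / (1 + x ** n) - y   # Y produced, repressed by X, degraded
        x, y = x + dt * dx, y + dt * dy
    return x, y

hi_x = toggle_switch(5.0, 1.0)   # starts with gene X dominant
hi_y = toggle_switch(1.0, 5.0)   # starts with gene Y dominant
```

The two runs converge to mirror-image steady states (X high / Y low, and vice versa), the bistability that effector molecules could tune by shifting active transcription-factor concentrations.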


[26] 2510.16082

BioGen: An Evidence-Grounded Framework for Interpreting RNA-seq Gene Clusters in Antimicrobial Resistance Research

The interpretation of gene clusters derived from RNA sequencing (RNA-seq) experiments remains a persistent challenge in functional genomics, particularly in antimicrobial resistance studies where mechanistic context is essential. While clustering methods effectively identify co-expressed gene modules, their interpretation typically relies on enrichment statistics and manual literature review, limiting transparency, reproducibility, and scalability. We present BioGen, an agentic framework for post hoc interpretation of RNA-seq gene clusters that emphasizes evidence-grounded and traceable biological reasoning. Rather than introducing new predictive models or clustering algorithms, BioGen organizes existing biomedical knowledge through a structured pipeline that integrates literature retrieval, hypothesis formulation, and critic-based validation. The framework enforces explicit linkage between interpretive claims and external sources such as PubMed and UniProt, enabling systematic assessment of factual grounding and semantic consistency. We apply BioGen to RNA-seq data from Salmonella enterica, demonstrating that it produces concise, literature-supported cluster-level interpretations related to efflux regulation, virulence, and metabolic adaptation. Comparative and ablation analyses indicate that retrieval augmentation and critic-based filtering reduce unsupported statements relative to unconstrained large language model baselines, albeit at the cost of reduced interpretive coverage. These results highlight the role of architectural constraints and verification logic in improving the reliability of automated biological interpretation. Overall, BioGen is intended as an interpretive support layer that complements existing transcriptomic analysis workflows by improving the auditability and reproducibility of RNA-seq cluster interpretation, rather than as a standalone discovery or predictive system.


[27] 2602.00197

Rank-and-Reason: Multi-Agent Collaboration Accelerates Zero-Shot Protein Mutation Prediction

Zero-shot mutation prediction is vital for low-resource protein engineering, yet existing protein language models (PLMs) often yield statistically confident results that ignore fundamental biophysical constraints. Currently, selecting candidates for wet-lab validation relies on manual expert auditing of PLM outputs, a process that is inefficient, subjective, and highly dependent on domain expertise. To address this, we propose Rank-and-Reason (VenusRAR), a two-stage agentic framework to automate this workflow and maximize expected wet-lab fitness. In the Rank-Stage, a Computational Expert and Virtual Biologist aggregate a context-aware multi-modal ensemble, establishing a new Spearman correlation record of 0.551 (vs. 0.518) on ProteinGym. In the Reason-Stage, an agentic Expert Panel employs chain-of-thought reasoning to audit candidates against geometric and structural constraints, improving the Top-5 Hit Rate by up to 367% on ProteinGym-DMS99. The wet-lab validation on Cas12i3 nuclease further confirms the framework's efficacy, achieving a 46.7% positive rate and identifying two novel mutants with 4.23-fold and 5.05-fold activity improvements. Code and datasets are released on GitHub (this https URL).


[28] 2405.18605

Merged ChemProt-DrugProt for Relation Extraction from Biomedical Literature

The extraction of chemical-gene relations plays a pivotal role in understanding the intricate interactions between chemical compounds and genes, with significant implications for drug discovery, disease understanding, and biomedical research. This paper presents a dataset created by merging the ChemProt and DrugProt datasets to augment sample counts and improve model accuracy. We evaluate the merged dataset using two state-of-the-art relation extraction approaches: Bidirectional Encoder Representations from Transformers (BERT), specifically BioBERT, and Graph Convolutional Networks (GCNs) combined with BioBERT. While BioBERT excels at capturing local contexts, it may benefit from incorporating the global information essential for understanding chemical-gene interactions, which can be achieved by integrating GCNs with BioBERT to harness both global and local context. Our results show that integrating the ChemProt and DrugProt datasets yields significant improvements in model performance, particularly in CPR groups shared between the datasets. Incorporating global context via GCNs further increases overall precision and recall in some CPR groups compared with using BioBERT alone.


[29] 2505.12387

Neural Thermodynamics: Entropic Forces in Deep and Universal Representation Learning

With the rapid discovery of emergent phenomena in deep learning and large language models, understanding their cause has become an urgent need. Here, we propose a rigorous entropic-force theory for understanding the learning dynamics of neural networks trained with stochastic gradient descent (SGD) and its variants. Building on the theory of parameter symmetries and an entropic loss landscape, we show that representation learning is crucially governed by emergent entropic forces arising from stochasticity and discrete-time updates. These forces systematically break continuous parameter symmetries and preserve discrete ones, leading to a series of gradient balance phenomena that resemble the equipartition property of thermal systems. These phenomena, in turn, (a) explain the universal alignment of neural representations between AI models and lead to a proof of the Platonic Representation Hypothesis, and (b) reconcile the seemingly contradictory observations of sharpness- and flatness-seeking behavior of deep learning optimization. Our theory and experiments demonstrate that a combination of entropic forces and symmetry breaking is key to understanding emergent phenomena in deep learning.


[30] 2505.13197

Inferring stochastic dynamics with growth from cross-sectional data

Time-resolved single-cell omics data offers high-throughput, genome-wide measurements of cellular states, which are instrumental to reverse-engineer the processes underpinning cell fate. Such technologies are inherently destructive, allowing only cross-sectional measurements of the underlying stochastic dynamical system. Furthermore, cells may divide or die in addition to changing their molecular state. Collectively these present a major challenge to inferring realistic biophysical models. We present a novel approach, unbalanced probability flow inference, that addresses this challenge for biological processes modelled as stochastic dynamics with growth. By leveraging a Lagrangian formulation of the Fokker-Planck equation, our method accurately disentangles drift from intrinsic noise and growth. We showcase the applicability of our approach through evaluation on a range of simulated and real single-cell RNA-seq datasets. Comparing to several existing methods, we find our method achieves higher accuracy while enjoying a simple two-step training scheme.


[31] 2506.04289

Relational reasoning and inductive bias in transformers and large language models

Transformer-based models have demonstrated remarkable reasoning abilities, but the mechanisms underlying relational reasoning remain poorly understood. We investigate how transformers perform \textit{transitive inference}, a classic relational reasoning task which requires inferring the relation between indirectly related items (e.g., if $A>B$ and $B>C$, then $A>C$), comparing in-weights learning (IWL) and in-context learning (ICL) strategies. We find that IWL naturally induces a generalization bias towards transitive inference despite training only on adjacent items, whereas ICL models develop induction circuits implementing match-and-copy strategies that fail to encode hierarchical relationships. However, when pre-trained on in-context linear regression tasks, transformers successfully exhibit in-context generalizable transitive inference, displaying both \textit{symbolic distance} and \textit{terminal item effects} characteristic of human and animal performance, without forming induction circuits. We extend these findings to large language models, demonstrating that prompting with linear geometric scaffolds improves transitive inference, while circular geometries (which violate transitivity by allowing wraparound) impair performance, particularly when models cannot rely on stored knowledge. Together, these results reveal that both the training regime and the geometric structure of induced representations critically determine transformers' capacity for transitive inference.
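How learning a linear rank axis from adjacent pairs alone yields transitive generalization can be shown with a toy model. This is a hypothetical stand-in for in-weights learning, not the paper's transformer setup: scalar scores are fit with a logistic loss on adjacent comparisons only, then probed on non-adjacent pairs.

```python
import math

items = ["A", "B", "C", "D", "E"]
adjacent = [(i, i + 1) for i in range(len(items) - 1)]  # A>B, B>C, C>D, D>E
scores = [0.0] * len(items)

# Fit one scalar score per item with a logistic (Bradley-Terry style)
# loss, using only the ADJACENT training pairs.
lr = 0.5
for _ in range(2000):
    for w, l in adjacent:
        p = 1 / (1 + math.exp(-(scores[w] - scores[l])))
        g = 1 - p                      # gradient of the log-likelihood
        scores[w] += lr * g
        scores[l] -= lr * g

# Probe: every pair, including the never-seen non-adjacent ones.
transitive_ok = all(scores[i] > scores[j]
                    for i in range(len(items))
                    for j in range(i + 1, len(items)))
```

The learned scores fall on a single monotone axis, so held-out pairs like $A>C$ come out correctly ordered, and the score gap grows with rank separation, a toy analogue of the symbolic distance effect.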


[32] 2506.04536

NOBLE -- Neural Operator with Biologically-informed Latent Embeddings to Capture Experimental Variability in Biological Neuron Models

Characterizing the cellular properties of neurons is fundamental to understanding their function in the brain. In this quest, the generation of bio-realistic models is central towards integrating multimodal cellular data sets and establishing causal relationships. However, current modeling approaches remain constrained by the limited availability and intrinsic variability of experimental neuronal data. The deterministic formalism of bio-realistic models currently precludes accounting for the natural variability observed experimentally. While deep learning is becoming increasingly relevant in this space, it fails to capture the full biophysical complexity of neurons, their nonlinear voltage dynamics, and variability. To address these shortcomings, we introduce NOBLE, a neural operator framework that learns a mapping from a continuous frequency-modulated embedding of interpretable neuron features to the somatic voltage response induced by current injection. Trained on synthetic data generated from bio-realistic neuron models, NOBLE predicts distributions of neural dynamics accounting for the intrinsic experimental variability. Unlike conventional bio-realistic neuron models, interpolating within the embedding space offers models whose dynamics are consistent with experimentally observed responses. NOBLE enables the efficient generation of synthetic neurons that closely resemble experimental data and exhibit trial-to-trial variability, offering a $4200\times$ speedup over the numerical solver. NOBLE is the first scaled-up deep learning framework that validates its generalization with real experimental data. To this end, NOBLE captures fundamental neural properties in a unique and emergent manner that opens the door to a better understanding of cellular composition and computations, neuromorphic architectures, large-scale brain circuits, and general neuroAI applications.