New articles on Quantitative Biology


[1] 2606.04004

Oxygenation and spatial heterogeneity shape radiotherapy protocol ranking through phenotypic adaptation

Tumor response to radiotherapy is strongly influenced by oxygen availability and phenotypic heterogeneity, yet their combined impact on the relative performance of fractionation schedules remains unclear. Here, we develop a mathematical model that integrates spatial oxygen dynamics with continuous phenotypic adaptation to hypoxia and radiation, and use it to systematically compare radiotherapy protocols under a common normal-tissue toxicity constraint. Under spatially uniform oxygenation, we find that alternative fractionation schedules provide little improvement over standard-of-care protocols in normoxic conditions. Under moderate hypoxia, however, a distinct class of protracted schedules with longer inter-fraction intervals substantially increases time-to-progression, in some cases by up to twofold. This regime-dependent benefit is consistent with a shift in the balance between reoxygenation and selection for resistant phenotypes. When oxygen delivery is spatially heterogeneous, treatment outcomes depend strongly on the geometric organization of oxygen sources. Even with identical total oxygen supply, different spatial configurations lead to large variability in time-to-progression and can alter the relative ranking of radiotherapy protocols. These results show that radiotherapy effectiveness is not an intrinsic property of a treatment schedule alone, but emerges from its interaction with tumor microenvironmental structure and evolutionary dynamics. Incorporating both spatial heterogeneity and phenotypic adaptation may therefore be important for the consistent evaluation and design of fractionation strategies in heterogeneous tumors.
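
As an illustration of what comparing schedules "under a common normal-tissue toxicity constraint" can look like, the sketch below uses the standard linear-quadratic biologically effective dose, BED = n d (1 + d / (α/β)). The abstract does not state which constraint the paper uses, so the BED formulation, the α/β value of 3 Gy, and the 30 × 2 Gy reference schedule are all assumptions made for this example.

```python
import math

def bed(n_fractions, dose_per_fraction, alpha_beta):
    """Biologically effective dose (Gy) under the linear-quadratic model."""
    return n_fractions * dose_per_fraction * (1.0 + dose_per_fraction / alpha_beta)

def isotoxic_dose_per_fraction(n_fractions, target_bed, alpha_beta):
    """Dose per fraction that delivers target_bed in n_fractions.

    Solves n*d*(1 + d/ab) = B, i.e. d^2 + ab*d - ab*B/n = 0, for the positive root.
    """
    ab = alpha_beta
    return (-ab + math.sqrt(ab * ab + 4.0 * ab * target_bed / n_fractions)) / 2.0

ALPHA_BETA_NORMAL = 3.0  # Gy; assumed value for late-responding normal tissue
reference_bed = bed(30, 2.0, ALPHA_BETA_NORMAL)  # assumed reference schedule: 30 x 2 Gy

for n in (5, 15, 30):
    d = isotoxic_dose_per_fraction(n, reference_bed, ALPHA_BETA_NORMAL)
    print(f"{n} fractions: {d:.2f} Gy per fraction at equal normal-tissue BED")
```

Schedules matched this way differ only in how the dose is distributed in time, which is the degree of freedom the abstract's comparison explores.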


[2] 2606.04010

The Variance Brain Foundation Models Forgot: Third-Order Statistics Predict Cognition Where Billion-Parameter Models Fail

Brain foundation models (BFMs) are self-supervised Transformers pretrained on fMRI data. We posit that these models should capture each subject's cognitive performance from their fMRI signal. Yet across three state-of-the-art BFMs and every readout we test, they predict cognition worse than a linear regression from the $\sim$80K parameters of the functional connectivity matrix (FC). The gap widens with scale: BrainLM's 650M model predicts cognition worse than its 111M. We attribute this to a variance allocation problem: BFM pretraining captures the variance components that dominate fMRI but not the higher-order structure that predicts cognition. Our per-cumulant analysis of the reconstructed signal shows that the second-order covariance is partially preserved, while the third-order co-skewness tensor is largely destroyed. To recover what BFMs lose, we design a linear pipeline that projects the fMRI signal into the subspace that best preserves its co-skewness and computes FC there. This exceeds raw FC and every pretrained BFM on every dataset and parcellation we test, outperforming prior state-of-the-art under controlled evaluation with no pretraining and no GPU. We recover the raw-FC ceiling on BrainLM's forward pass by finetuning with a loss targeted at this same subspace. This shows that the bottleneck is the pretraining objective, not the architecture or the model size.
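
For readers unfamiliar with the third-order statistic mentioned above, the following sketch computes a co-skewness tensor for a small multivariate time series: each region's signal is z-scored, and entry (i, j, k) is the time average of the product of three standardized signals. This is the textbook definition only; the paper's subspace projection is not reproduced here, and the function names are hypothetical. The sketch assumes no region has a constant signal.

```python
from statistics import fmean, pstdev

def standardize(series):
    """Z-score each column (region) of a T x N list of lists.

    Assumes every column has non-zero standard deviation.
    """
    columns = list(zip(*series))
    means = [fmean(c) for c in columns]
    stds = [pstdev(c) for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in series]

def coskewness(series):
    """Return the N x N x N tensor with entries mean_t z[t][i] * z[t][j] * z[t][k]."""
    z = standardize(series)
    n_time, n_regions = len(z), len(z[0])
    return [
        [
            [sum(r[i] * r[j] * r[k] for r in z) / n_time for k in range(n_regions)]
            for j in range(n_regions)
        ]
        for i in range(n_regions)
    ]

# Toy data: 4 time points, 2 regions (values are arbitrary).
toy = [[-1.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 4.0]]
tensor = coskewness(toy)
print(tensor[1][1][1])  # skewness of region 1, which has one large positive excursion
```

The tensor has N^3 entries, so for a parcellation with hundreds of regions a dense computation like this becomes expensive, which is one practical reason to work in a low-dimensional subspace.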


[3] 2606.04011

Towards an Ideometrics-Based Understanding of Consciousness, Time, Space and Dreams

From an ideometrics-based perspective, consciousness may reduce the informational entropy of many randomly possible future outcomes through ideometric processes. Consciousness enables a system to internally simulate alternative futures and then voluntarily act, based on ideometric processes, towards realising preferred states in external reality. This may explain why most humans gravitate towards futures that minimise threat and maximise survival, reproduction, safety and well-being. Ideometrics typically uses three fundamental criteria: attractiveness, feasibility and potential impact of many competing ideas. Feasibility and potential impact can, in principle, be computed by non-conscious systems, including artificial intelligence (AI). However, attractiveness may represent the consciously and emotionally experienced valuation of possible futures. Feasibility may have appeared first during evolution, while potential impact required predictive processing, and consciousness added subjective attractiveness to many alternative futures. Within this framework, the subjective sense of time may be intertwined with consciousness, providing causal relations and an internal ordering for external changes perceived by the senses. Time may require conscious beings to have a meaning, while consciousness may require the subjective sense of time to have a meaning. Space, in turn, provides the structured field in which ideas can acquire causal impact across nested scales. Dreaming may represent remnants of earlier evolutionary stages of internal modelling.


[4] 2606.04020

SpliceBind: Isoform-Aware Prediction of Binding Pocket Druggability

Splice-mediated drug resistance occurs in up to 40% of patients on targeted kinase inhibitors, yet state-of-the-art druggability tools operate on single structures and cannot compare across isoforms. We introduce SpliceBind, a graph neural network framework for isoform-aware druggability prediction. Beyond improving prediction accuracy (AUROC 0.703 vs. P2Rank 0.634, p = 0.026), we address a more fundamental question: when do structural methods succeed, and when must they fail? Systematic analysis of six clinically validated variants spanning five mechanism classes reveals a two-tier resistance taxonomy. Domain deletions (AR-V7, Delta = -18.39) and pocket disruptions produce structurally detectable changes, while allosteric mechanisms (BRAF-p61) remain fundamentally invisible to any pocket-centric approach -- a boundary no algorithmic improvement can cross. Notably, learned embeddings capture affinity-based resistance missed by geometry alone (ALK-L1196M: Delta_SB = -0.228 vs. Delta_P2Rank = -0.95), partially bridging the structural-biochemical gap. On 229 kinase pockets spanning 25 families, SpliceBind achieves AUROC 0.703 (p = 0.026 vs. P2Rank) with robust generalization to held-out families (AUROC 0.761). This taxonomy transforms clinical workflows: upon discovering a splice variant, clinicians can immediately determine whether computational triage suffices or biochemical validation is required -- reducing time from variant discovery to therapeutic decision.


[5] 2606.04021

Structure-Aware Prediction of PROTAC-Mediated Protein Degradability via Graph Neural Networks

Proteolysis-targeting chimeras (PROTACs) can selectively degrade disease-causing proteins, yet predicting which targets are amenable to degradation remains a critical bottleneck: existing computational methods require the complete PROTAC molecular structure, information unavailable before synthesis. We present DegradoMap, a graph neural network that predicts PROTAC-mediated degradability from protein structure and E3 ligase identity alone -- the minimal information available at the target selection stage. The model encodes biophysical priors through lysine-weighted graph pooling with per-protein normalization, models protein-E3 compatibility via cross-attention, and integrates cellular context from the Cancer Dependency Map. On the PROTAC-8K benchmark (3,101 samples, 155 targets, 10 E3 ligases), DegradoMap achieves 0.646 ± 0.124 AUROC on target-unseen evaluation (best seed: 0.7449) and 0.811 AUROC on CRBN->VHL E3-unseen transfer, outperforming GNN and machine learning baselines. The model additionally recommends optimal E3 ligases with 74% Hit@3 accuracy. Two findings carry broader implications: E(3)-equivariant architectures underperform the simpler invariant design for this scalar prediction task, and ESM-2 embeddings improve peak performance only with careful regularization -- naive integration fails. DegradoMap provides pre-synthesis computational guidance for degradability assessment; its well-calibrated confidence scores (ECE = 0.029, target-unseen) enable practitioners to prioritize high-confidence predictions for experimental follow-up. However, the high seed variance (std = 0.124) and limited E3 coverage require ensembling for reliable deployment.
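
The calibration figure quoted above (ECE) can be read with the help of the following sketch of expected calibration error for a binary predictor. It bins predictions by predicted probability and takes the sample-weighted average gap between mean predicted probability and observed positive rate in each bin. The number of bins and the equal-width binning are assumptions; the abstract does not say how the paper computes ECE.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Equal-width-bin ECE for binary predictions.

    probs  : predicted probabilities of the positive class, each in [0, 1]
    labels : observed outcomes, each 0 or 1
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, y))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        positive_rate = sum(y for _, y in members) / len(members)
        ece += len(members) / total * abs(positive_rate - mean_prob)
    return ece

# Toy example: five predictions of 0.8, four of which are positive.
print(expected_calibration_error([0.8] * 5, [1, 1, 1, 1, 0]))
```

A low ECE says that stated confidences match observed frequencies on average; it does not by itself say anything about discrimination, which is why the abstract reports AUROC separately.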


[6] 2606.04066

SC-TauPath: A Structural Connectivity Attribution Framework for Mapping Tau Propagation Pathways in Alzheimer's Disease

Understanding how structural connections are associated with tau propagation in Alzheimer's disease (AD) remains a central open question, yet existing computational models either rely heavily on biophysical assumptions or lack neurobiologically interpretable pathway maps. We present SC-TauPath, a structural connectivity (SC) attribution framework that maps tau propagation pathways from in vivo neuroimaging data. SC-TauPath combines a Network Diffusion Model (NDM)-augmented multilayer perceptron with gradient $\times$ input attribution to score each SC edge's contribution to tau prediction, then translates these attribution scores into multi-scale pathway maps (backbone edges, high-traffic routes, and hub ROIs), which validates established Braak staging anatomy. Applied to 234 ADNI participants with paired DTI SC and 18F-Flortaucipir PET, SC-TauPath achieves strong cross-validated tau prediction and yields attribution-based pathway maps consistent with established Braak staging anatomy, demonstrating that SC encodes spatially specific information about regional tau distribution in AD.
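
As background for the Network Diffusion Model mentioned above, the sketch below integrates the usual graph-diffusion equation dx/dt = -β L x, where L = D - A is the Laplacian of a connectivity matrix A and x holds regional pathology levels. It is a generic illustration with a three-node toy graph, not SC-TauPath itself: how the paper couples the NDM to its multilayer perceptron is not described in the abstract, and β, the step count, and the explicit Euler integrator are assumptions.

```python
def laplacian(adjacency):
    """Graph Laplacian L = D - A for a symmetric matrix with zero diagonal."""
    n = len(adjacency)
    return [
        [(sum(adjacency[i]) if i == j else 0.0) - adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]

def diffuse(adjacency, x0, beta=1.0, t_end=1.0, steps=1000):
    """Integrate dx/dt = -beta * L x with explicit Euler steps."""
    lap = laplacian(adjacency)
    dt = t_end / steps
    x = list(x0)
    for _ in range(steps):
        flow = [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in lap]
        x = [x_i - beta * dt * f_i for x_i, f_i in zip(x, flow)]
    return x

# Toy chain of three regions (0 - 1 - 2) with pathology seeded in region 0.
chain = [[0.0, 1.0, 0.0],
         [1.0, 0.0, 1.0],
         [0.0, 1.0, 0.0]]
print(diffuse(chain, [1.0, 0.0, 0.0]))
```

In this toy run pathology spreads from the seed along the edges, so the directly connected region fills before the distant one, which is the qualitative behaviour that connectivity-based propagation models rely on.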


[7] 2606.04154

EpiFormer: Learning Antigen-Antibody Interactions for Epitope Prediction via Geometric Deep Learning

Antibodies neutralize foreign antigens by binding to specific surface regions called epitopes. Computational epitope prediction is critical for understanding immune recognition and guiding antibody engineering. However, existing methods face three fundamental challenges: antibody-aware models encode each chain independently and combine them only at a late stage, failing to capture co-dependent structural features that define binding interfaces, and severe class imbalance and scarcity of known antibody-antigen complexes render standard training objectives ineffective. We propose EpiFormer, a general encoder-decoder framework that addresses these challenges jointly. Our key design principle is interleaved cross-attention within GNN encoding layers, enabling bidirectional antigen-antibody information flow throughout representation learning rather than only at the output. This early-fusion principle is backbone-agnostic, providing consistent gains across GNN architectures from simple GCNs to equivariant models. We further show that sparsity-aware objectives are effective when paired with early-fusion architectures for the epitope prediction task. EpiFormer improves over the previous best method by over 40% in F1 score on standard benchmarks, demonstrating generalizability and cross-dataset transferability. Notably, EpiFormer discovers known biological principles as emergent behaviors of end-to-end training, where the learned cross-attention gates favor antigen-to-antibody information flow, consistent with the asymmetric roles of the two chains at the binding interface, and the model's preference for geometric over evolutionary features aligns with the established finding that epitope residues are not evolutionarily conserved. The source code is available at: this https URL


[8] 2606.04234

An Octahedral Fibrous Constitutive Model for Heart Valve Mechanics and Function

Fibrous soft tissues derive their nonlinear mechanical response from networks of extracellular matrix fibers, whose organization gives rise to strain stiffening, the reverse Poynting effect, and anisotropic mechanical behavior. Motivated by these coupled features, we develop an anisotropic hyperelastic model for fibrous biological tissues that accounts for the contribution of the fiber network under both tensile and compressive deformation. We calibrate the model to experimental data for mitral valve leaflets using an inverse finite element approach that is coupled to automatic differentiation to facilitate efficient parameter calibration. Using the calibrated model, we investigate how anisotropy and fiber reorientation affect valve deformation under physiological loading. The results show that greater leaflet compliance in the radial direction yields proper valve closure, whereas localized fiber reorientation leads to stress concentrations that may promote progressive functional degradation. Fiber reorientation that makes the circumferential direction more compliant than the radial direction compromises valve closure and leads to mitral regurgitation. Chordal softening further amplifies the severity of this regurgitant response. These findings suggest that alterations in fiber architecture, especially when accompanied by chordal degradation, can contribute to the onset and progression of mitral valve incompetence.


[9] 2606.04426

Discrete signaling mediates chaotic regularization in recurrent neural networks

Cortical circuits operate in a regime of intrinsic chaos, where even tiny changes in input can lead to divergent neural responses. Yet, remarkably, population codes in the brain vary smoothly with sensory stimuli, forming coherent representational manifolds. How can chaotic networks sustain such stable coding? Here, we develop a theoretical framework that links the microscopic chaos of recurrent networks to the macroscopic geometry of neural representations. Combining kernel methods with dynamical mean-field theory, we show that chaotic dynamics induce local roughness (introducing sharp distortions at small scales) while preserving global smoothness across larger stimulus variations. This structural property acts as an intrinsic regularizer, enhancing generalization while maintaining expressivity. Moreover, we show how chaotic networks naturally produce power-law spectral signatures, closely matching experimental observations in cortical recordings. These results explain how chaotic spiking networks can sustain smooth, differentiable population codes and establish a theoretical framework linking network dynamics, computational structure, and recorded neural activity.


[10] 2606.04551

Quasi-birth-and-death processes evolving within trees: Applications to comparative phylogenetics

We consider a quasi-birth-and-death process (QBD) that duplicates itself at some fixed times within a tree that contains information about duplication times and potentially partially observed states. We analyse a continuous trait by discretising it to obtain the QBD level variable. Then, the phase variable is used to model the dynamics of the underlying environment. Here, we extend the framework of Soewongsono et al. to enable a more general analysis. We develop an efficient recursive algorithm for computing the likelihood of an observed tree under this model and construct several numerical examples to illustrate its application potential. Through our synthetic data examples, we show a range of potential behaviours that could be modelled with this approach. Further, we apply the framework to two empirical examples from comparative phylogenetics (the evolution of range area and body size traits across a phylogeny of 49 mammals) to gain different insights into the evolution of these continuous traits. In this setting duplication of the QBD represents speciation and continuous trait evolution is modelled in a discretised state space. In our empirical examples, we explore the impact of different parameter choices on the corresponding likelihood of observing a given phylogenetic tree and the observed levels at its tips.


[11] 2606.04566

AF_Cache: Efficient Pipeline for Running AlphaFold for High-Throughput Protein-Protein Interaction Prediction

Motivation: Accurate prediction of protein-protein interactions is essential for understanding biological processes, and recent advances such as AlphaFold2 and AlphaFold3 have enabled structure-based interaction prediction at unprecedented accuracy. However, the high computational cost of these methods, driven primarily by CPU-based repeated multiple sequence alignment (MSA) generation and, for AlphaFold2, repeated model recompilations, limits their applicability in large-scale, high-throughput settings. This creates a need for efficient pipelines that retain predictive performance while substantially reducing runtime. Results: We present AF_Cache, a high-throughput Nextflow pipeline for accelerating protein-protein interaction prediction using AlphaFold2 and AlphaFold3. AF_Cache combines GPU-accelerated MSA generation with MMseqs2, feature caching to eliminate redundant alignment computations, and sequence length bucketing to minimise repeated JAX compilations. Benchmarking on a dataset of 5,050 human mitochondrial protein pairs demonstrates a $\sim$2-fold reduction in inference time for AlphaFold2 and up to a 13-fold speedup of the MSA generation. AF_Cache enables efficient large-scale interaction screening and provides a practical framework for deploying AlphaFold-based methods in high-throughput applications. Availability and implementation: The code and Nextflow pipeline are available on GitHub here: this https URL. The code for reproducing the results of the paper, the MSAs, and the predicted models can be found at Zenodo: this https URL
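
The two generic ideas named above, feature caching and sequence length bucketing, can be sketched in a few lines. This is not AF_Cache's code: the class and function names, the bucket size of 128, and the stand-in feature function are all hypothetical, chosen only to show why both tricks save work when the same proteins recur across many pairs.

```python
import hashlib

def bucket_length(length, bucket_size=128):
    """Round a sequence length up to the next multiple of bucket_size.

    Padding inputs to a few fixed lengths lets a shape-specialised compiled
    model be reused instead of recompiled for every new length.
    """
    return -(-length // bucket_size) * bucket_size

class FeatureCache:
    """Memoise an expensive per-sequence computation, keyed by sequence hash."""

    def __init__(self, compute):
        self._compute = compute  # hypothetical stand-in for MSA/feature generation
        self._store = {}
        self.misses = 0

    def get(self, sequence):
        key = hashlib.sha256(sequence.encode()).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._compute(sequence)
        return self._store[key]

# Toy screen: three proteins, all unordered pairs.
proteins = {"A": "MKT" * 40, "B": "MSL" * 90, "C": "MAV" * 50}
cache = FeatureCache(compute=len)  # len() stands in for the real feature step
names = sorted(proteins)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        cache.get(proteins[first])
        cache.get(proteins[second])
        padded = bucket_length(len(proteins[first]) + len(proteins[second]))
print(cache.misses)  # one computation per unique protein, not per pair
```

The saving grows with screen size, because the number of pairs grows quadratically in the number of proteins while the number of unique sequences grows only linearly.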


[12] 2606.03995

Early Detection of Alzheimer's Disease Using Explainable Machine Learning on Clinical Biomarkers: A Multi-Class Classification Study Using the Alzheimer's Disease Neuroimaging Initiative (ADNI) Dataset

Background: Alzheimer's disease (AD) affects over 55 million people worldwide. Accurate, interpretable detection of normal cognition (NC), mild cognitive impairment (MCI), and AD from routine clinical assessments remains a critical unmet need. Methods: An XGBoost classifier was developed for three-class detection using eight clinical features from the Alzheimer's Disease Neuroimaging Initiative (ADNI): MMSE, CDR Global, CDR Sum of Boxes (CDR-SB), MoCA, FAQ, age, sex, and education. Hyperparameters were optimised using Optuna (50 trials); class imbalance was addressed with SMOTE. Performance was evaluated by macro AUC-ROC with 1,000-iteration bootstrap 95% confidence intervals, macro F1, balanced accuracy, and Cohen's kappa. SHAP values provided feature-level explainability. Results: The dataset comprised 1,641 baseline subjects (608 NC, 767 MCI, 266 AD). On five-fold cross-validation, mean macro AUC was 0.983 (SD 0.007), accuracy 0.944 (SD 0.006), and macro F1 0.929 (SD 0.008). On the held-out test set (n = 247), macro AUC was 0.982 (95% CI: 0.965--0.995), accuracy 0.943, balanced accuracy 0.932, macro F1 0.927, and Cohen's kappa 0.909. SHAP analysis identified CDR Global as the dominant predictor for NC and MCI, while CDR-SB and MMSE together drove AD classification. Conclusion: An explainable machine learning model trained on routine clinical assessments achieves near-perfect three-class Alzheimer's detection. SHAP analysis reveals clinically plausible, class-specific feature importance patterns supporting clinical validity. Future work will extend this framework with speech biomarkers for multimodal detection.
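
The bootstrap confidence intervals reported above follow a general recipe that can be sketched as below: resample the test set with replacement, recompute the metric each time, and read off percentiles. The sketch uses plain accuracy as the metric for brevity, whereas the abstract bootstraps macro AUC; the percentile method, the seed, and the function names are assumptions, since the abstract does not give these details.

```python
import random

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def bootstrap_ci(y_true, y_pred, metric, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for metric(y_true, y_pred)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        # Resample subject indices with replacement, keeping labels paired.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((1.0 - level) / 2.0 * n_boot)]
    upper = stats[min(n_boot - 1, int((1.0 + level) / 2.0 * n_boot))]
    return lower, upper

# Toy three-class labels (values are arbitrary, not ADNI data).
truth = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
preds = [0, 0, 1, 2, 2, 2, 0, 1, 1, 1]
print(bootstrap_ci(truth, preds, accuracy))
```

With a small test set the interval is wide, which is why the width of the reported interval is as informative as the point estimate.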


[13] 2606.04525

GENEB: Why Genomic Models Are Hard to Compare

Progress in genomic foundation models is difficult to assess due to fragmented benchmarks, incompatible evaluation protocols, and task-specific reporting. As a result, claims of superiority or generality across models are often not directly comparable. We introduce GENEB, a large-scale diagnostic benchmark that evaluates frozen representations from 40 genomic foundation models across 100 tasks spanning 13 functional categories under a unified probing-based protocol, including few-shot regimes. GENEB enables controlled comparison across model scale, architecture, tokenization, and pretraining data while explicitly exposing task-level trade-offs. Our analysis shows that aggregate leaderboards are unstable: model rankings vary sharply across task categories, scale provides only modest and inconsistent gains, and architectural and pretraining alignment frequently outweigh parameter count. These results highlight limitations of current evaluation practices and position GENEB as a reference framework for principled comparison and category-aware model selection in genomic machine learning.


[14] 2606.04552

LDARNet: DNA Adaptive Representation Network with Learnable Tokenization for Genomic Modeling

Genomic foundation models increasingly adopt large language model architectures, yet almost universally rely on fixed tokenization schemes such as $k$-mers, BPE, or single nucleotides, which impose arbitrary sequence boundaries that may obscure biologically relevant structure. We present LDARNet, a 120M-parameter hierarchical genomic foundation model that adapts H-Net-style dynamic chunking from autoregressive generation to masked language modeling, combining BiMamba-2 state-space layers with local attention, bidirectional routing, and a ratio-based regularizer to induce adaptive token boundaries without supervision. Fine-tuned on 27 tasks from the Nucleotide Transformer and Genomic Benchmarks suites, LDARNet achieves 11/18 wins among compact models ($<$300M parameters) and state-of-the-art results on 5 histone modification tasks, outperforming models up to 20$\times$ larger. A FLOPs-matched controlled experiment isolates learned routing as the source of these gains: learned boundaries beat fixed-grid boundaries by up to 14 percentage points on histone tasks at identical compute. Nucleotide-resolution analysis further shows that the learned boundaries align with canonical promoter motifs and splice junctions without supervision, providing a biological interpretation for adaptive tokenization in genomic foundation models.


[15] 2606.04994

New Benchmarking Shows Limited Generalization Power of TCR Antigenic Epitope Prediction Models

Accurate computational prediction of T cell receptor (TCR) antigen specificity would transform the study of T cell biology and enable scalable immune engineering, yet existing models lack sufficient sensitivity and specificity for broad applications. A major limitation is the absence of rigorously defined, unseen benchmark datasets that allow unbiased evaluation of model performance and generalizability. Here, we describe two complementary classes of datasets that meet this criterion and argue that they provide both a robust framework for model assessment and a foundation for next-generation TCR-antigen prediction algorithm development.


[16] 2504.16952

Last-layer committee machines for uncertainty estimations of benthic imagery

Automating the annotation of benthic imagery (i.e., images of the seafloor and its associated organisms, habitats, and geological features) is critical for monitoring rapidly changing ocean ecosystems. Deep learning approaches have succeeded in this purpose; however, consistent annotation remains challenging due to ambiguous seafloor images, potential inter-user annotation disagreements, and out-of-distribution samples. Marine scientists implementing deep learning models often obtain predictions based on one-hot representations trained using a cross-entropy loss objective with softmax normalization, resulting in a single set of model parameters. While efficient, this approach may lead to overconfident predictions for context-challenging datasets, raising reliability concerns that present risks for downstream tasks such as benthic habitat mapping and marine spatial planning. In this study, we investigated classification uncertainty as a tool to improve the labeling of benthic habitat imagery. We developed a framework for two challenging sub-datasets of the recently publicly available BenthicNet dataset using Bayesian neural networks, Monte Carlo dropout inference sampling, and a proposed single last-layer committee machine. This approach resulted in a >95% reduction of network parameters to obtain per-sample uncertainties while obtaining near-identical performance compared to computationally more expensive strategies such as Bayesian neural networks, Monte Carlo dropout, and deep ensembles. The method proposed in this research provides a strategy for obtaining prioritized lists of uncertain samples for human-in-the-loop interventions to identify ambiguous, mislabeled, out-of-distribution, and/or difficult images for enhancing existing annotation tools for benthic mapping and other applications.
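
One common way to turn several output heads into a per-sample uncertainty, sketched below, is to average the members' softmax probabilities and take the entropy of that average. This is a generic committee-style estimate under assumed inputs (a list of logit vectors, one per member); the abstract does not specify how the proposed last-layer committee machine is built or which uncertainty score it uses.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def committee_predict(member_logits):
    """Average member probabilities; return (mean_probs, predictive_entropy)."""
    member_probs = [softmax(logits) for logits in member_logits]
    n_members = len(member_probs)
    mean_probs = [sum(column) / n_members for column in zip(*member_probs)]
    entropy = -sum(p * math.log(p) for p in mean_probs if p > 0.0)
    return mean_probs, entropy

# Hypothetical logits from three heads for two images.
agreeing = [[4.0, 0.0, 0.0], [3.5, 0.2, 0.1], [4.2, -0.1, 0.0]]
disagreeing = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
print(committee_predict(agreeing)[1], committee_predict(disagreeing)[1])
```

Sorting samples by this entropy gives the kind of prioritized list for human review that the abstract describes, with the disagreeing example ranked ahead of the agreeing one.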


[17] 2506.23546

Neural Langevin Machine: a local asymmetric learning rule can be creative

Fixed points of recurrent neural networks can be leveraged to store and generate information. These fixed points can be captured by the Boltzmann-Gibbs measure, which leads to neural Langevin dynamics that can be used to find them for generative learning of a real dataset. We call this type of generative model a neural Langevin machine, which derives an asymmetric and firing-rate-speed adjusted learning rule requiring only local neural signals, thereby bearing biological relevance in terms of local predictive learning. An interesting out-of-equilibrium regime of the generative process is revealed, together with a memorization-to-generalization transition with increasing training data size. The neuro-inspired machine can also realize a continuous exploration of the phase space for different kinds of generative images and can denoise a corrupted image as well.


[18] 2512.05252

Competition, stability, and functionality in excitatory-inhibitory neural circuits

Energy-based models have become a central paradigm for understanding computation and stability in both theoretical neuroscience and machine learning. However, the energetic framework typically relies on symmetry in synaptic or weight matrices - a constraint that excludes biologically realistic systems such as excitatory-inhibitory (E-I) networks. When symmetry is relaxed, the classical notion of a global energy landscape fails, leaving the dynamics of asymmetric neural systems conceptually unanchored. In this work, we extend the energetic framework to asymmetric firing rate networks, revealing an underlying game-theoretic structure for the neural dynamics in which each neuron is an agent that seeks to minimize its own energy. In addition, we exploit rigorous stability principles from network theory to study regulation and balancing of neural activity in E-I networks. We combine the novel game-energetic interpretation and the stability results to revisit standard frameworks in theoretical neuroscience, such as the Wilson-Cowan and lateral inhibition models. These insights allow us to study cortical columns of lateral inhibition microcircuits as contrast enhancers - with the ability to selectively sharpen subtle differences in the environment through hierarchical excitation-inhibition interplay. Our results bridge energetic and game-theoretic views of neural computation, offering a pathway toward the systematic engineering of biologically grounded, dynamically stable neural architectures.


[19] 2601.10221

Cognitive Field Theory of Learning, Inference, and Emergence

Learning, inference, memory, and emergence in biological and artificial systems are often described using disparate theoretical frameworks. Here we develop a cognitive field theory in which cognition emerges as a collective nonequilibrium phenomenon governed by the infrared organization of adaptive dynamical time scales. Starting from a stochastic cognitive-field equation with homeostatic stabilization and adaptive manifold geometry, we show that cognitive dynamics is organized by slowly relaxing infrared modes embedded within a high-dimensional cognitive manifold. The resulting dynamics may be interpreted in terms of a macroscopic cognitive state continuously coupled to hierarchically organized slow-memory sectors. Integrating out these latent sectors generates retarded self-energy feedback and nonlocal memory kernels that soften the infrared response of the cognitive field. The resulting susceptibility enhancement produces a protected near-critical regime characterized by long-time contextual persistence and coherent collective dynamics. We introduce the time-scale density of states (TDOS) as a fundamental descriptor of the relaxation spectrum underlying inference, memory, and adaptive reasoning. Learning continuously reorganizes the infrared TDOS, selectively stabilizing weakly damped sectors that support contextual organization and recursive memory feedback. Near criticality, the TDOS generically develops a broad and nearly flat infrared structure associated with the accumulation of slowly relaxing modes. The resulting memory self-energy suppresses the effective forgetting gap, enhances collective susceptibility, and generates scale-free temporal organization over extended time scales. Within this framework, memory formation, adaptive reasoning, and emergent intelligence arise as hierarchical stages of infrared collective dynamical organization.


[20] 2602.11189

MuCO: Generative Peptide Cyclization Empowered by Multi-stage Conformation Optimization

Modeling peptide cyclization is critical for the virtual screening of candidate peptides with desirable physical and pharmaceutical properties. This task is challenging because a cyclic peptide often exhibits diverse, ring-shaped conformations, which cannot be well captured by deterministic prediction models derived from linear peptide folding. In this study, we propose MuCO (Multi-stage Conformation Optimization), a generative peptide cyclization method that models the distribution of cyclic peptide conformations conditioned on the corresponding linear peptide. In principle, MuCO decouples the peptide cyclization task into three stages: topology-aware backbone design, generative side-chain packing, and physics-aware all-atom optimization, thereby generating and optimizing conformations of cyclic peptides in a coarse-to-fine manner. This multi-stage framework enables an efficient parallel sampling strategy for conformation generation and allows for rapid exploration of diverse, low-energy conformations. Experiments on the large-scale CPSea dataset demonstrate that MuCO significantly and consistently outperforms state-of-the-art methods in physical stability, structural diversity, secondary structure recovery, and computational efficiency, making it a promising computational tool for exploring and designing cyclic peptides. The demo of the proposed method can be found at this https URL.


[21] 2605.00465

Intrinsic Brain Networks Underlying the Experience and Expression of Subclinical Anxiety

Anxiety includes behavioural, physiological, and subjective components that do not always align, and it remains unclear whether these dimensions are supported by distinct intrinsic brain networks. Guided by the two-system framework, we tested whether resting-state functional connectivity (rsFC) differentiates these components in subclinical anxiety. Forty-seven young adults spanning a range of subclinical anxiety levels completed a threat anticipation task measuring behavioral responses (reaction time) and physiological arousal (skin conductance), along with the NIH Fear-Affect self-report of anxiety severity. These measures were related to rsFC using region-of-interest analyses. Higher subclinical anxiety was associated with faster responses under temporally uncertain threat, consistent with increased vigilance, while no association was found with physiological arousal. At the neural level, three connectivity patterns emerged and remained significant after sequential family-wise error correction. Behavioural responses modulated by subclinical anxiety were linked to stronger connectivity between the anterior cingulate cortex (ACC) and insula. Physiological modulation was associated with connectivity between the ACC and orbitofrontal cortex (OFC). Subjective anxiety was associated with increased connectivity between the hippocampus and insula. Additional connections were observed but did not survive stricter correction. Overall, the findings indicate that behavioural, physiological, and subjective aspects of subclinical anxiety map onto partially dissociable but overlapping intrinsic brain networks, extending prior task-based results to resting-state connectivity and informing future work on early neural markers of anxiety.


[22] 2605.16331

Retrieval and competition: how a protein foundation model starts a protein

Protein language models are increasingly used to guide experimental and clinical decisions, yet it is often unclear whether a confident prediction reflects recognition of biological evidence or retrieval of a statistical default. We examine this distinction for a near-universal biological rule, that proteins begin with methionine, by tracing the computational pathway through which ESM2-8M produces this prediction. The model does not detect methionine at the masked position. Instead, it retrieves a methionine-favouring signal from a reference representation at the beginning-of-sequence token via a position-specific query assembled across layers, with the final output emerging through competition with context-dependent circuits. To understand how positional information reaches the readout, we introduce a norm-direction decomposition of attention scores within rotary frequency bands. Positional encoding operates through coupled changes in query norm and angular alignment distributed across these bands. On sequences whose true N-terminus is not methionine, where the biological question matters, the model predicts methionine anyway. This is not a correct prediction produced by an unexpected mechanism, but the output of a positional-prior retrieval circuit that matches the statistical average and fails where biology diverges from it. Distinguishing the two requires resolution at the level of individual circuits, frequency bands, and query composition, suggesting that mechanistic verification will be necessary, and challenging, for predictions where the biological stakes are higher. Even for the simplest biological rule, the model's prediction is mediated by a distributed computational circuit rather than direct recognition, suggesting that increasing task complexity will further obscure the relationship between model confidence and underlying biological evidence.
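
One plausible reading of a norm-direction decomposition is the identity $q \cdot k = \sum_b \|q_b\| \, \|k_b\| \cos\theta_b$, where $b$ indexes two-dimensional rotary frequency bands. The sketch below illustrates only that identity. The adjacent-pair band layout, the function name `band_decomposition`, and the random vectors are assumptions for illustration; the paper's exact definition and ESM2's actual pairing convention may differ.

```python
import numpy as np

def band_decomposition(q, k):
    """Split the score q.k into per-band factors: |q_b|, |k_b|, cos(angle_b).

    Assumption: bands are adjacent coordinate pairs (2b, 2b+1), and q and k
    have already been rotated by the positional encoding.
    """
    qb = q.reshape(-1, 2)
    kb = k.reshape(-1, 2)
    q_norm = np.linalg.norm(qb, axis=1)
    k_norm = np.linalg.norm(kb, axis=1)
    dots = np.sum(qb * kb, axis=1)
    denom = q_norm * k_norm
    # Guard against zero-norm bands; their cosine is set to 0.
    cos = np.divide(dots, denom, out=np.zeros_like(dots), where=denom > 0)
    return q_norm, k_norm, cos

# Hypothetical query/key vectors with 4 bands (8 dimensions).
rng = np.random.default_rng(0)
q = rng.normal(size=8)
k = rng.normal(size=8)
q_norm, k_norm, cos = band_decomposition(q, k)

# Summing the per-band terms recovers the full attention logit.
score = float(np.sum(q_norm * k_norm * cos))
```

Separating the norm factors from the cosine makes it possible to ask, band by band, whether a change in the score comes from a change in magnitude or a change in angular alignment.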


[23] 2605.22598

Efficient coding under constraint drives neural systems towards criticality and sloppiness

It is widely accepted that the brain operates near a critical state, characterized by neural avalanches that follow power-law distributions. However, the functional rationale for why neural systems attain criticality remains unclear. Here, we present a theoretical framework that links efficient coding to criticality in neural populations. Using a Gaussian population coding model, we demonstrate that maximizing Fisher information under resource constraints naturally leads to the emergence of soft modes and diverging correlation lengths, which are hallmarks of criticality. By introducing spatial structure, we unify two distinct perspectives of criticality: statistical criticality with diverging correlation lengths and dynamical criticality with critical slowing down as well as bifurcation. Furthermore, this framework provides a natural explanation for the sloppiness observed in neural systems. Numerical simulations confirm that optimization results in power-law response, providing a mechanistic link between efficient coding, sloppiness and the critical brain hypothesis.
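
For a Gaussian population code with stimulus-independent noise covariance $\Sigma$ and mean response $f(s)$, the Fisher information takes the standard form $J(s) = f'(s)^\top \Sigma^{-1} f'(s)$. The sketch below evaluates this quantity for a toy population. The Gaussian tuning curves, the isotropic covariance, and all function and parameter names are assumptions for illustration, not the model or the constraint used in the paper.

```python
import numpy as np

def tuning_derivative(s, centers, width=0.5, gain=1.0):
    """Derivative f'(s) of tuning curves f_i(s) = gain * exp(-(s - c_i)^2 / (2 width^2))."""
    f = gain * np.exp(-(s - centers) ** 2 / (2.0 * width ** 2))
    return -(s - centers) / width ** 2 * f

def fisher_information(s, centers, cov, width=0.5, gain=1.0):
    """Fisher information f'(s)^T cov^{-1} f'(s) for stimulus-independent Gaussian noise."""
    df = tuning_derivative(s, centers, width, gain)
    # Solve cov x = df instead of forming the explicit inverse.
    return float(df @ np.linalg.solve(cov, df))

# Hypothetical population: 9 neurons with evenly spaced preferred stimuli.
centers = np.linspace(-2.0, 2.0, 9)
cov = 0.1 * np.eye(centers.size)  # independent noise, variance 0.1 (assumed)
info = fisher_information(0.3, centers, cov)
```

Replacing `cov` with a correlated covariance matrix shows how noise correlations change the information carried by the same tuning curves.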


[24] 2601.03599

The Feller diffusion as the limit of a coalescent point process

The Feller diffusion is studied as the limit of a coalescent point process in which the density of the node height distribution is skewed towards zero. Using a unified approach, a number of recent results pertaining to scaling limits of branching processes are reviewed and reinterpreted as properties of the Feller diffusion arising from this limit. The notion of Bernoulli sampling of a finite population is extended to the diffusion limit to cover finite Poisson-distributed samples drawn from infinite continuum populations. We show that the coalescent tree of a Poisson-sampled Feller diffusion corresponds to a coalescent point process with a node height distribution taking the same algebraic form as that of a Bernoulli-sampled birth-death process. By adapting methods for analysing k-sampled birth-death processes, in which the sample size is pre-specified, we develop methods for studying the coalescent properties of the k-sampled Feller diffusion.
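
In a coalescent point process, a tree with $n$ leaves is encoded by $n-1$ independent node heights $H_1, \dots, H_{n-1}$, and the coalescence time of leaves $i < j$ is $\max(H_i, \dots, H_{j-1})$. The sketch below shows that encoding. The exponential node height distribution is a placeholder assumption, not the distribution skewed towards zero that the paper studies, and the function names are hypothetical.

```python
import random

def sample_node_heights(n_leaves, rate=1.0, seed=0):
    """Draw n_leaves - 1 i.i.d. node heights (exponential is a placeholder choice)."""
    rng = random.Random(seed)
    return [rng.expovariate(rate) for _ in range(n_leaves - 1)]

def coalescence_time(heights, i, j):
    """Time back to the common ancestor of leaves i and j (0-indexed).

    heights[m] is the node height separating leaf m from leaf m + 1, so the
    coalescence time is the largest node height between the two leaves.
    """
    if i == j:
        return 0.0
    lo, hi = min(i, j), max(i, j)
    return max(heights[lo:hi])

# A fixed example with 4 leaves and node heights 1, 3, 2.
heights = [1.0, 3.0, 2.0]
t_01 = coalescence_time(heights, 0, 1)  # only the first node lies between them
t_03 = coalescence_time(heights, 0, 3)  # deepest node in the whole tree

# A random tree with 10 leaves under the placeholder distribution.
random_heights = sample_node_heights(10)
```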


[25] 2602.14885

Drift-Diffusion Matching: Embedding dynamics in latent manifolds of asymmetric neural networks

Recurrent neural networks (RNNs) provide a theoretical framework for understanding computation in biological neural circuits, yet classical results, such as Hopfield's model of associative memory, rely on symmetric connectivity that restricts network dynamics to gradient-like flows. In contrast, biological networks support rich time-dependent behaviour facilitated by their asymmetry. Here we introduce a general framework, which we term drift-diffusion matching, for training continuous-time RNNs to represent arbitrary, nonlinear stochastic differential equations (SDEs), with given drift and diffusion coefficients, within a low-dimensional latent subspace. Allowing asymmetric connectivity, we show that RNNs can faithfully embed the drift and diffusion of a given SDE, including nonlinear and nonequilibrium dynamics such as chaotic attractors. As an application, we construct RNN realisations of stochastic systems that transiently explore various attractors through both input-driven switching and autonomous transitions driven by nonequilibrium currents, which we interpret as models of associative and sequential (episodic) memory. To elucidate how these dynamics are encoded in the network, we introduce decompositions of the RNN based on its asymmetric connectivity and its time-irreversibility. Our results extend attractor neural network theory beyond equilibrium, showing that asymmetric neural populations can implement a broad class of dynamical computations within low-dimensional manifolds, unifying ideas from associative memory, nonequilibrium statistical mechanics, and neural computation.
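
The target of such a framework is an SDE of the form $dx = f(x)\,dt + G(x)\,dW$ with drift $f$ and diffusion $G$. The sketch below is not the training procedure from the paper. It only simulates a target SDE with the standard Euler-Maruyama scheme, which is one way to generate reference trajectories. The function name `euler_maruyama` and the Ornstein-Uhlenbeck example are assumptions for illustration.

```python
import numpy as np

def euler_maruyama(drift, diffusion, x0, dt, n_steps, rng):
    """Simulate dx = drift(x) dt + diffusion(x) dW with the Euler-Maruyama scheme.

    drift(x) returns a vector of shape (d,); diffusion(x) returns a (d, d) matrix.
    """
    x0 = np.asarray(x0, dtype=float)
    x = np.empty((n_steps + 1, x0.size))
    x[0] = x0
    for t in range(n_steps):
        # Brownian increment over one step has standard deviation sqrt(dt).
        dw = rng.normal(size=x0.size) * np.sqrt(dt)
        x[t + 1] = x[t] + drift(x[t]) * dt + diffusion(x[t]) @ dw
    return x

# Hypothetical target: a 2D Ornstein-Uhlenbeck process with isotropic noise.
rng = np.random.default_rng(0)
traj = euler_maruyama(
    drift=lambda x: -x,
    diffusion=lambda x: 0.1 * np.eye(2),
    x0=[1.0, 2.0],
    dt=0.01,
    n_steps=500,
    rng=rng,
)
```

Swapping in a nonlinear drift, such as a chaotic system, only requires changing the `drift` callable.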