New articles on Quantitative Biology


[1] 2604.09911

The Rise and Fall of $G$ in AGI

In the psychological literature, the term `general intelligence' describes correlations between abilities, not simply the number of abilities. This paper connects Spearman's $g$-factor from psychometrics, which measures a positive manifold, to the implicit ``$G$-factor'' in claims about artificial general intelligence (AGI) performance on temporally structured benchmarks. Principal component analysis is applied to a models $\times$ benchmarks $\times$ time matrix spanning 39 models (2019--2025) and 14 benchmarks, treating LLM benchmark batteries as cognitive test batteries and model releases as subjects. Preliminary results confirm a strong positive manifold in which all 28 pairwise correlations are positive across 8 benchmarks. Analyzing the spectrum of the benchmark correlation matrix through time, PC1 explains 90\% of variance on a 5-benchmark core battery ($n=19$), declining to 77\% by 2024. On a four-benchmark battery, PC1 peaks at 92\% of the variance between 2023--2024 and falls to 64\% with the arrival of reasoning-specialized models in 2024. This coincides with a rotation in the $G$-factor as models outsource `reasoning' to tools. Analysis of partial correlation matrices through time provides evidence for the evolution of specialization beneath the positive manifold of general intelligence (AI-hedgehog), encompassing diverse high-dimensional problem-solving systems (AI-foxes). In strictly psychometric terms, AI models exhibit a general intelligence that suppresses specialized intelligences. LLMs invert the ideal of substituting complicated models with parsimonious mechanisms, a `Ptolemaic Succession' of theories, with architectures of increasing hierarchical complication and capability.
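The positive-manifold check and PC1 variance share described above reduce to a short spectral computation. The sketch below uses a synthetic models $\times$ benchmarks score matrix with one shared latent factor; the dimensions, loadings, and noise level are illustrative assumptions, not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-in for a models x benchmarks score matrix
# (39 models, 8 benchmarks): one shared latent "G" factor plus noise.
# Loadings and noise level are assumptions, not the paper's data.
g = rng.normal(size=(39, 1))                      # latent general ability
loadings = rng.uniform(0.5, 1.0, size=(1, 8))     # benchmark sensitivity to G
scores = g @ loadings + 0.3 * rng.normal(size=(39, 8))

corr = np.corrcoef(scores, rowvar=False)          # 8x8 benchmark correlations
eigvals = np.linalg.eigvalsh(corr)[::-1]          # descending eigenvalues
pc1_share = eigvals[0] / eigvals.sum()            # variance explained by PC1

n_pos = int((corr[np.triu_indices(8, k=1)] > 0).sum())
print(f"positive pairwise correlations: {n_pos}/28, PC1 share: {pc1_share:.0%}")
```

With a single strong shared factor, all 28 pairwise correlations come out positive and PC1 dominates the spectrum, mirroring the positive manifold the abstract reports.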


[2] 2604.09966

Fragmentation is a diversity ratchet

A fragmented landscape reduces the impact of interspecies connectivity, leading to higher diversity levels than otherwise possible in a connected landscape. Reconnecting a previously fragmented landscape initiates an extinction event, preferentially weeding out more highly connected species. A sequence of fragmentation-coalescence events will drive the ecosystem, in a ratchet-like effect, to higher levels of diversity than if the landscape had remained continuously connected.


[3] 2604.10036

Astrocytic resource diffusion stabilizes persistent activity in neural fields

Persistent neural activity underlying working memory requires sustained synaptic transmission, yet the metabolic and neurotransmitter support provided by astrocyte networks is largely absent from spatially extended neural circuit models. We introduce a coupled astrocyte-neural field model in which synaptic efficacy is regulated by depletion and recovery of a conserved resource pool recycled and spatially redistributed through diffusively coupled astrocytes. We obtain explicit stationary bump profiles and self-consistency conditions for bump width and amplitude on a canonical ring architecture. Linearizing about these solutions while carefully accounting for perturbations at bump boundaries, we analyze the resulting spectral problem governing stability. Our analysis, supported by numerical simulations and low-dimensional Fourier truncations, reveals a two-stage stabilization mechanism: astrocytic diffusion smooths resource asymmetries created by small bump displacements, and synaptic replenishment transfers this smoothing back to the synaptic pool. Together, sufficiently strong diffusion and replenishment suppress drift instabilities and enlarge the parameter regime in which stationary bumps persist.


[4] 2604.10571

Universal statistical signatures of evolution in artificial intelligence architectures

We test whether artificial intelligence architectural evolution obeys the same statistical laws as biological evolution. Compiling 935 ablation experiments from 161 publications, we show that the distribution of fitness effects (DFE) of architectural modifications follows a heavy-tailed Student's t-distribution with proportions (68% deleterious, 19% neutral, 13% beneficial for major ablations, n=568) that place AI between compact viral genomes and simple eukaryotes. The DFE shape matches D. melanogaster (normalized KS=0.07) and S. cerevisiae (KS=0.09); the elevated beneficial fraction (13% vs. 1-6% in biology) quantifies the advantage of directed over blind search while preserving the distributional form. Architectural origination follows logistic dynamics (R^2=0.994) with punctuated equilibria and adaptive radiation into domain niches. Fourteen architectural traits were independently invented 3-5 times, paralleling biological convergences. These results demonstrate that the statistical structure of evolution is substrate-independent, determined by fitness landscape topology rather than the mechanism of selection.


[5] 2604.10606

Relaxing in Warped Spaces: Generalized Hierarchical and Modular Dynamical Neural Network

We propose a dynamical neural network model with a hierarchical and modular structure. The network architecture can be derived by minimizing an energy function originally designed around two kinds of neurons with quite different time constants. It has multiple subspaces spanned by the neural parameters employed in the energy function, and adjacent subspaces are related to each other through a layered internetwork. Each internetwork further consists of a pair of a forward subnet and a backward one, and signals flowing through these subnets determine the total dynamics of the network. The model can operate in either a learning or an association mode. In the learning mode, when periodic signals equivalent to repetitive neuronal bursting are suitably applied to input ports in all subspaces, mapping relationships corresponding to those input signals are eventually formed in the internetworks between subspaces. Various two-dimensional mapping relationships between subspaces can be shaped by employing an appropriate set of periodic input signals with different frequencies, based on the same mechanism as a Lissajous curve. In the association mode, the model provides an overall framework in which state variables inside the network individually relax in warped spaces, each of which has been designed to be favorable for one or more state variables. The association mode is further classified into two modes: unconstrained and constrained. In the latter mode, for instance, when a sufficiently slow periodic trajectory is set as an input, a warped output trajectory appears in each subspace as if imaginary layered networks with the inverse mapping relationships of the existing forward subnets were located hierarchically from outside to inside. These results suggest that a certainty/uncertainty relation exists between an input trajectory and an output trajectory.


[6] 2604.10957

A molecular clock for writing systems reveals the quantitative impact of imperial power on cultural evolution

Writing systems are cultural replicators whose evolution has never been studied quantitatively at global scale. We compile the Global Script Database (GSD): 300 writing and notation systems, 50 binary structural characters, and 259 phylogenetic edges spanning 5,400 years. Applying four methods -- phenetics, cladistics, Bayesian inference, and neural network clustering -- we find that scripts exhibit a detectable molecular clock. The best-fitting model (Mk+Gamma strict clock) yields a substitution rate of q = 0.226 substitutions/character/millennium (95% CI: 0.034-1.22; Delta BIC = -4.1 versus relaxed clock; Delta BIC = -1,364.7 versus Mk without rate variation). Political interventions break this clock: deviation from expected divergence times correlates with intervention intensity (Spearman rho = 0.556, p < 10^{-4}), and per-character rate analysis reveals that intervention selectively rewrites deep structural features rather than merely accelerating change (rate profile correlation rho = 0.320). We identify 30 major script replacement events and rank their destructive impact. A ceiling effect suppresses independent invention wherever writing already exists (Fisher's exact OR = 0.054, p < 10^{-6}), and colonial contact predicts script extinction (Cox HR = 5.25, p = 0.0006). The Spanish Empire extinguished the most scripts (6 of 12 contacted, 50%), followed by the Empire of Japan (3 of 9, 33.3%). Feature coding was validated by inter-rater reliability testing with two independent human coders (Cohen's kappa = 0.877; human-LLM kappa = 0.929; Fleiss' kappa = 0.911).


[7] 2604.10995

How complex behavioural contagion can prevent infectious diseases from becoming endemic

Infectious disease transmission in human populations has a complex two-way interaction with changes in host behaviour. It is increasingly recognised that incorporating adaptive behavioural change into epidemic models is important for improving understanding of infectious disease dynamics and developing policy-relevant modelling tools. An important aspect of behavioural dynamics is social contagion, where people tend to adopt behaviours exhibited by others around them. In a simple behavioural contagion model, the behaviour uptake rate increases linearly with the number of contacts who have adopted a given behaviour. Here, we explore an epidemic model with complex behavioural contagion, where the behaviour uptake rate is a nonlinear function of the number of behaving contacts. We identify key bifurcation parameters of the model, which include the basic reproduction number $R_0$, the strength of the behavioural effect on disease transmission, and the speed of behaviour uptake relative to behaviour abandonment. We show that, in some regions of parameter space, the model has multiple disease-free equilibria. In this situation, the occurrence of an epidemic in a population with an initially low level of behaviour practice can trigger a self-sustaining increase in behaviour, which then causes the disease to be eliminated. In some cases, while moderate values of $R_0$ lead to the disease becoming endemic, higher values of $R_0$ may lead to behaviour-driven disease elimination. We demonstrate that this mechanism of epidemic-triggered uptake of behaviour leading to disease elimination can occur in the presence and absence of temporary post-infection immunity.
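The qualitative mechanism described (a nonlinear uptake term that lets an epidemic trigger self-sustaining behaviour, which in turn suppresses transmission) can be sketched with a toy coupled system. All functional forms and parameter values below are illustrative assumptions, not the authors' model.

```python
import numpy as np

# Toy SIS-type model coupled to behaviour B (fraction adopting protection).
# Complex contagion: uptake grows nonlinearly (~B^2) in the fraction of
# behaving contacts, and is triggered by prevalence I. All forms and
# parameters are illustrative assumptions, not the authors' model.
R0, gamma = 4.0, 1.0       # basic reproduction number, recovery rate
sigma = 0.9                # transmission reduction under full behaviour uptake
u, a = 40.0, 0.3           # behaviour uptake scale and abandonment rate
dt, steps = 0.005, 40_000  # forward-Euler integration, t in [0, 200]

S, I, B = 0.999, 0.001, 0.2
for _ in range(steps):
    beta_eff = R0 * gamma * (1.0 - sigma * B)     # behaviour suppresses spread
    new_inf = beta_eff * S * I
    S += dt * (-new_inf + gamma * I)              # SIS: recovered return to S
    I += dt * (new_inf - gamma * I)
    B += dt * (u * I * B**2 * (1.0 - B) - a * B)  # nonlinear uptake, linear loss

print(f"t=200: infected fraction I={I:.4f}, behaving fraction B={B:.3f}")
```

Varying $R_0$, the uptake scale, and the uptake/abandonment ratio in such a system is the kind of bifurcation sweep the abstract describes; the quadratic uptake term is what allows multiple behavioural equilibria to coexist.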


[8] 2604.11178

Probabilistic Prediction of Neural Dynamics via Autoregressive Flow Matching

Forecasting neural activity in response to naturalistic stimuli remains a key challenge for understanding brain dynamics and enabling downstream neurotechnological applications. Here, we introduce a generative forecasting framework for modeling neural dynamics based on autoregressive flow matching (AFM). Building on recent advances in transport-based generative modeling, our approach probabilistically predicts neural responses at scale from multimodal sensory input. Specifically, we learn the conditional distribution of future neural activity given past neural dynamics and concurrent sensory input, explicitly modeling neural activity as a temporally evolving process in which future states depend on recent neural history. We evaluate our framework on the Algonauts project 2025 challenge functional magnetic resonance imaging dataset using subject-specific models. AFM significantly outperforms both a non-autoregressive flow-matching baseline and the official challenge general linear model baseline in predicting short-term parcel-wise blood oxygenation level-dependent (BOLD) activity, demonstrating improved generalization and widespread cortical prediction performance. Ablation analyses show that access to past BOLD dynamics is a dominant driver of performance, while autoregressive factorization yields consistent, modest gains under short-horizon, context-rich conditions. Together, these findings position autoregressive flow-based generative modeling as an effective approach for short-term probabilistic forecasting of neural dynamics with promising applications in closed-loop neurotechnology.


[9] 2604.11208

The Neurobiological Craving Signature (NCS) predicts social craving and responds to social isolation

Humans are inherently social and seek connection with others for survival. Recent studies suggest that acute social isolation leads to craving for social interactions, but the brain mechanisms of social craving and their relationship to brain networks underlying drug and food craving remain incompletely understood. Here we harnessed an existing dataset and tested whether the Neurobiological Craving Signature (NCS), a recently developed fMRI-based brain signature of drug and food craving, also predicts social craving. During fMRI, participants rated their craving for images of food, social interactions, and flowers in three different sessions: after 10h of fasting from food, after 10h of social isolation, or neither (baseline; order of sessions counterbalanced). The NCS significantly predicted self-reported craving for food and social cues but not flower cues. Further, NCS responses to food were higher after fasting compared to baseline, and higher for social cues after social isolation compared to baseline, demonstrating its responsiveness to both food and social deprivation. These findings resonate with recent work showing shared brainstem circuits for hunger and social isolation, and indicate shared whole-brain circuits for social, food, and drug craving. They open new avenues for testing the NCS across different primary rewards, for assessing the consequences of their deprivation, and for examining how social deprivation, such as loneliness and isolation, interacts with overeating and drug use.


[10] 2604.11451

Neutralization titers reveal the structure of polyclonal antibody responses

The composition of a polyclonal antibody response is hard to measure experimentally but contains vital information about the robustness of immunity. Here, we argue that the statistics of neutralization titers alone can be used to make quantitative predictions about the composition of the response, circumventing challenges arising through sequencing and monoclonal antibody expression. We show that the response against influenza within a cohort can be either driven by a collective phenomenon where many antibodies contribute to neutralization, or dominated by just a few strong binders, leading to a broad distribution of titers across individuals described by a Gumbel distribution from extreme value theory. Comparing titers across cohorts, we find that Gumbel statistics accurately describe individuals prior to an immune challenge. We propose an equilibrium binding model that quantitatively captures titer data and illustrates the structure of the polyclonal response. Our approach extends generically to immune responses to other pathogens.
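The extreme-value statistics invoked here can be illustrated numerically: when each individual's titer is dominated by the single strongest of many antibody contributions, the titer distribution across individuals approaches a Gumbel law. The sketch below uses a generic maximum-of-Gaussians simulation and a moment-matched Gumbel fit; it is not the paper's equilibrium binding model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Each "individual" carries many antibodies; if the titer is set by the
# strongest single binder, titers across individuals approach a Gumbel
# distribution (extreme value theory). Illustrative sketch only.
n_individuals, n_antibodies = 10_000, 500
affinities = rng.normal(size=(n_individuals, n_antibodies))
titers = affinities.max(axis=1)          # dominant-binder titer per individual

# Moment-matched Gumbel fit: scale = std * sqrt(6) / pi,
# loc = mean - euler_gamma * scale (Gumbel mean/variance identities)
euler_gamma = 0.5772156649
scale = titers.std() * np.sqrt(6) / np.pi
loc = titers.mean() - euler_gamma * scale

print(f"Gumbel fit: loc={loc:.3f}, scale={scale:.3f}")
```

In the opposite ("collective") regime, where many comparable antibodies sum rather than one dominating, the central limit theorem would push the titer distribution toward a Gaussian instead, which is the distinction the abstract exploits.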


[11] 2604.11482

Integrated information theory: the good, the bad and the misunderstood

The integrated information theory of consciousness (IIT) is uniquely ambitious in proposing a mathematical formula, derived from apparently fundamental properties of conscious experience, to describe the quantity and quality of consciousness for any physical system that possesses it. IIT has generated considerable debate, which has engendered some misunderstandings and misrepresentations. Here we address and hope to remedy this. We begin by concisely summarising the essentials of IIT. Given IIT is supposed to apply universally, we do this with reference to an arbitrary patch of matter, as opposed to the usual system of discrete computational units. Then, after briefly summarising IIT's theoretical and empirical achievements, we focus on five points which we consider especially important for driving forward new theory and increasing understanding. First, a high value of the measure $\Phi$ is not synonymous with `more consciousness'. We describe how $\Phi$ might be replaced with a suite of quantities to obtain a multi-dimensional characterisation of states of consciousness. Second, we describe with nuance the distinct flavour of panpsychism implied by IIT -- whereby space (and time) are tiled with substrates of (proto-) consciousness -- and find this is not problematic for the theory. Third, $\Phi$ is not well-defined for real physical systems, and has not been computed on any real physical system. Fourth, so far only proxies for IIT measures have been computed, and not approximations. Fifth, for IIT to fit with current successful theories in fundamental physics, a reformulation in terms of continuous fields would be needed.


[12] 2604.11555

Will a Large Complex System be Stable? Revisited

Over fifty years ago, Robert May applied random matrix theory to show that stability decreases as ecological systems grow in size. What emerged from this result and the critique that followed was what has been called the complexity-stability debate. Decades of criticism of the assumptions May made in carrying out his analysis have not been enough to fully dispel the strength of his conclusion and close the debate. Drawing on a mathematical approach that had not yet been fully developed in the early 1970s, it is possible to revisit the argument without the use of random matrix techniques and to provide a more detailed understanding of the mechanisms that play a deciding role in the stability of ecological systems, countering the broad conclusion that launched the complexity-stability debate.


[13] 2604.11696

The origin of the genetic code is encrypted in the structure of present-day transfer RNAs

Background/Objectives: Resolving the origin of the genetic code is fundamental to understanding how life began its journey out of the chemical world. Since the code was deciphered some 60 years ago, no general theory of its emergence has been established. My objective is to present unique data that might provide insight into this particular issue. Methods: Because tRNAs (transfer RNAs) constitute a crucial piece of the present translational system and have unique structural characteristics, I hypothesized that they might be the key elements at the origin of the genetic code and therefore compared the primary structures of the tRNAs from a bacterium, Bacillus subtilis. Results: The comparison of the primary structures of the tRNAs from Bacillus subtilis generated a genealogical tree, meaning that the tRNAs are all related and appeared gradually in a precise time sequence. Remarkably, analysis of the various characteristics of this tRNA tree showed that it very likely reflects the order of entry of amino acids into the Universal Codon Table. Conclusions: These results strongly suggest that tRNA was indeed a major component in the formation of the genetic code and further provide a likely scenario for the time sequence of codon colonization of the Universal Codon Table by the various amino acids at the very beginning of life. These data are also interpreted in terms of a general theory of the origin of the genetic code that I propose, the poly-tRNA theory.


[14] 2604.09702

Identity-Aware U-Net: Fine-grained Cell Segmentation via Identity-Aware Representation Learning

Precise segmentation of objects with highly similar shapes remains a challenging problem in dense prediction, especially in scenarios with ambiguous boundaries, overlapping instances, and weak inter-instance visual differences. While conventional segmentation models are effective at localizing object regions, they often lack the discriminative capacity required to reliably distinguish a target object from morphologically similar distractors. In this work, we study fine-grained object segmentation from an identity-aware perspective and propose Identity-Aware U-Net (IAU-Net), a unified framework that jointly models spatial localization and instance discrimination. Built upon a U-Net-style encoder-decoder architecture, our method augments the segmentation backbone with an auxiliary embedding branch that learns discriminative identity representations from high-level features, while the main branch predicts pixel-accurate masks. To enhance robustness in distinguishing objects with near-identical contours or textures, we further incorporate triplet-based metric learning, which pulls target-consistent embeddings together and separates them from hard negatives with similar morphology. This design enables the model to move beyond category-level segmentation and acquire a stronger capability for precise discrimination among visually similar objects. Experiments on benchmarks including cell segmentation demonstrate promising results, particularly in challenging cases involving similar contours, dense layouts, and ambiguous boundaries.


[15] 2604.10476

The Dynamic Origin of Kleiber's Law

The ubiquitous $3/4$ metabolic scaling exponent, known as Kleiber's law, has long been attributed to the minimization of viscous dissipation within fractal transport networks. In this paper, we invert this standard narrative, demonstrating that Kleiber's law is fundamentally a signature of pulsatile wave physics rather than steady-state geometry. By coupling local branching optimization to global allometry, we derive the exact generalized metabolic exponent $\beta = d\alpha/(2d+\alpha)$, which strictly maps local transport microphysics to global organismal scaling. We show that dynamic wave-impedance matching in the proximal vasculature uniquely enforces $\beta = 3/4$ in three dimensions. This bound is dynamically protected: no static optimization of a viscous network can reproduce it. Consequently, we analytically predict the critical body mass for the wave-to-viscous transition, successfully explaining the empirical shift to steeper allometric scaling ($\beta \approx 0.9$) in small mammals and invertebrates with no free parameters. Furthermore, we demonstrate that the classical West--Brown--Enquist (WBE) derivation is structurally divergent under its own geometric assumptions, failing at the required proximal-dominance limit. Our framework is validated across nine biological systems spanning five phyla -- including vertebrate vasculature, insect tracheae, plant xylem, and sponge canals -- accurately predicting empirical branching exponents from independent biophysical measurements. Ultimately, we establish a general allometric equation of state that organizes diverse biological networks into discrete universality classes, generating falsifiable predictions across clades from shrews to flatworms.
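The generalized exponent quoted above can be checked by direct arithmetic. The sketch below evaluates $\beta = d\alpha/(2d+\alpha)$ exactly; note that $\alpha = 2$ is simply the transport exponent that recovers $3/4$ in $d = 3$ under this formula, and $\alpha = 18/7$ is the value this formula assigns to the reported $\beta \approx 0.9$ regime (both solved here, not taken from the paper).

```python
from fractions import Fraction

def beta(d, alpha):
    """Generalized metabolic exponent beta = d*alpha / (2d + alpha)."""
    return Fraction(d) * alpha / (2 * d + alpha)

# In d = 3, alpha = 2 recovers Kleiber's 3/4:
# 3*alpha/(6 + alpha) = 3/4  =>  12*alpha = 18 + 3*alpha  =>  alpha = 2.
assert beta(3, Fraction(2)) == Fraction(3, 4)

# The steeper scaling beta ~ 0.9 reported for small mammals and
# invertebrates corresponds under the same formula to alpha = 18/7.
assert beta(3, Fraction(18, 7)) == Fraction(9, 10)

print(beta(3, Fraction(2)), beta(3, Fraction(18, 7)))
```

Using exact `Fraction` arithmetic avoids any floating-point ambiguity in verifying the mapping between local transport exponents and global allometric exponents.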


[16] 2604.10609

Self-supervised Pretraining of Cell Segmentation Models

Instance segmentation enables the analysis of spatial and temporal properties of cells in microscopy images by identifying the pixels belonging to each cell. However, progress is constrained by the scarcity of high-quality labeled microscopy datasets. Many recent approaches address this challenge by initializing models with segmentation-pretrained weights from large-scale natural-image models such as Segment Anything Model (SAM). However, representations learned from natural images often encode objectness and texture priors that are poorly aligned with microscopy data, leading to degraded performance under domain shift. We propose DINOCell, a self-supervised framework for cell instance segmentation that leverages representations from DINOv2 and adapts them to microscopy through continued self-supervised training on unlabeled cell images prior to supervised fine-tuning. On the LIVECell benchmark, DINOCell achieves a SEG score of 0.784, improving by 10.42% over leading SAM-based models, and demonstrates strong zero-shot performance on three out-of-distribution microscopy datasets. These results highlight the benefits of domain-adapted self-supervised pretraining for robust cell segmentation.


[17] 2604.10614

Kinetic models of opinion-driven epidemic dynamics modulated by graphons

We introduce kinetic models to simulate epidemic spread while accounting for individuals' opinions on protective behaviors. Opinion exchanges occur on a social network represented by a graphon, leading to scenarios with or without opinion leaders. We prove convergence to equilibrium in the strong $L^1$ norm via relative entropy methods and in homogeneous Sobolev spaces $\dot{H}^{-s}$, $s \in \big(\frac{1}{2},1\big)$, using Fourier-based techniques. We then design a structure-preserving scheme for the coupled opinion-epidemiological system, highlighting graphon effects: opinion leaders supporting protective behaviors limit disease spread, whereas influenceable individuals may shift toward opposing views, worsening epidemics. Finally, we introduce a time-dependent quantity, analogous to the reproduction number, whose oscillations can generate epidemic waves without explicit external forcing. The MATLAB code implementing our algorithms is made publicly available.


[18] 2604.11287

Consistency of AI-Generated Exercise Prescriptions: A Repeated Generation Study Using a Large Language Model

Background: Large language models (LLMs) have been explored as tools for generating personalized exercise prescriptions, yet the consistency of outputs under identical conditions remains insufficiently examined. Objective: This study evaluated the intra-model consistency of LLM-generated exercise prescriptions using a repeated generation design. Methods: Six clinical scenarios were used to generate exercise prescriptions using Gemini 2.5 Flash (20 outputs per scenario; total n = 120). Consistency was assessed across three dimensions: (1) semantic consistency using SBERT-based cosine similarity, (2) structural consistency based on the FITT principle using an AI-as-a-judge approach, and (3) safety expression consistency, including inclusion rates and sentence-level quantification. Results: Semantic similarity was high across scenarios (mean cosine similarity: 0.879-0.939), with greater consistency in clinically constrained cases. Frequency showed consistent patterns, whereas variability was observed in quantitative components, particularly exercise intensity. Unclassifiable intensity expressions were observed in 10-25% of resistance training outputs. Safety-related expressions were included in 100% of outputs; however, safety sentence counts varied significantly across scenarios (H=86.18, p < 0.001), with clinical cases generating more safety expressions than healthy adult cases. Conclusions: LLM-generated exercise prescriptions demonstrated high semantic consistency but showed variability in key quantitative components. Reliability depends substantially on prompt structure, and additional structural constraints and expert validation are needed before clinical deployment.
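The semantic-consistency metric described (SBERT embeddings compared by mean pairwise cosine similarity across repeated generations) reduces to the computation below. The vectors are random stand-ins for sentence embeddings, since running an actual SBERT model is out of scope here; the embedding dimension and noise level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-ins for SBERT embeddings of 20 repeated generations: random
# vectors scattered around a shared mean, mimicking high consistency.
# Real embeddings would come from a sentence-transformer model.
base = rng.normal(size=384)
embeddings = base + 0.2 * rng.normal(size=(20, 384))

# Mean pairwise cosine similarity across the 20 outputs
unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
sims = unit @ unit.T                      # 20x20 cosine similarity matrix
iu = np.triu_indices(20, k=1)             # 190 unordered pairs
mean_sim = sims[iu].mean()

print(f"mean pairwise cosine similarity: {mean_sim:.3f}")
```

A reported similarity band like 0.879-0.939 would correspond to outputs that cluster tightly around a shared semantic center, as the toy construction above does by design.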


[19] 2604.11354

Strategy evolution on networks under payoff uncertainty and risk preference

Cooperation is a key driver of human social progress. Studies of the evolution of cooperation typically assume a deterministic outcome for social interactions. But in the real world, interaction outcomes are often subject to stochastic perturbations arising from open environments. Individuals may show different attitudes towards such uncertainty: some are risk-seeking, while others tend to be risk-averse. Here we investigate how risk preference towards uncertain payoffs affects the evolution of cooperation on social networks, where uncertainty originates from random punishment of defectors initiated by cooperators. We provide an analytical treatment of how the distribution of risk preference among individuals alters the threshold required for cooperation. We find that, at the population level, risk-averse behavior promotes or even rescues cooperation. At the node level, variation in risk preference has a significant impact when it occurs on nodes with high degree centrality. When nodes have the same degree centrality, the nodes with lower betweenness centrality exhibit a stronger effect on strategy evolution. Our analysis reveals how risk preference, together with spatial structure, jointly shapes and potentially reverses the evolutionary dynamics of cooperation.


[20] 2604.11483

CAGenMol: Condition-Aware Diffusion Language Model for Goal-Directed Molecular Generation

Goal-directed molecular generation requires satisfying heterogeneous constraints such as protein--ligand compatibility and multi-objective drug-like properties, yet existing methods often optimize these constraints in isolation, failing to reconcile conflicting objectives (e.g., affinity vs. safety), and struggle to navigate the non-differentiable chemical space without compromising structural validity. To address these challenges, we propose CAGenMol, a condition-aware discrete diffusion framework over molecular sequences that formulates molecular design as conditional denoising guided by heterogeneous structural and property signals. By coupling discrete diffusion with reinforcement learning, the model aligns the generation trajectory with non-differentiable objectives while preserving chemical validity and diversity. The non-autoregressive nature of the diffusion language model further enables iterative refinement of molecular fragments at inference time. Experiments on structure-conditioned, property-conditioned, and dual-conditioned benchmarks demonstrate consistent improvements over state-of-the-art methods in binding affinity, drug-likeness, and success rate, highlighting the effectiveness of our framework.


[21] 2505.03458

Improved classification of Alzheimer's disease and mild cognitive impairment through dynamic functional network analysis

Brain networks from functional MRI have advanced our understanding of cortical activity and its disruption in neurodegenerative disorders. Recent work has increasingly focused on dynamic (time-varying) brain networks that capture both spatial and temporal patterns of regional co-activity, yet this approach remains underexplored across the Alzheimer's disease (AD) spectrum. We analysed age- and sex-matched static and dynamic functional brain networks derived from resting-state fMRI data in 315 individuals with AD, mild cognitive impairment (MCI), and cognitively normal healthy controls (HC) from the ADNI-3 cohort. Functional networks were constructed using the Juelich brain atlas, with static connectivity estimated from full time series and dynamic connectivity derived using a sliding-window approach. Group differences were assessed at both link and node levels using non-parametric statistics and bootstrap resampling. While HC and MCI exhibited similar static and dynamic patterns at the node level, clearer differences emerged in AD. Stable (stationary) differences in functional connectivity were identified between white matter regions and parietal and somatosensory cortices, whereas temporally varying differences were consistently observed in connections involving the amygdala and hippocampal formation. Node centrality analysis further suggested that white matter connectivity differences are predominantly local in nature. These findings highlight both shared and distinct functional connectivity patterns across static and dynamic networks, underscoring the importance of incorporating temporal dynamics into brain network analyses of the Alzheimer's spectrum. Additionally, a Random Forest model trained on regional BOLD time series informed by static and dynamic metrics achieved robust classification of MCI, AD, and HC groups, demonstrating the diagnostic potential of time-varying connectivity.


[22] 2509.02651

Bias Detection in Emergency Psychiatry: Linking Negative Language to Diagnostic Disparities

The emergency department (ED) is a high-stress environment with increased risk of clinician bias exposure. In the United States, Black patients are more likely than other racial/ethnic groups to receive their first diagnosis of schizophrenia (SCZ), a highly stigmatizing disorder, in the ED. Therefore, understanding the link between clinician bias exposure and psychiatric outcomes is critical for promoting nondiscriminatory decision-making in the ED. This study examines the association between clinician bias exposure and psychiatric diagnosis using a sample of patients with anxiety, bipolar, depression, trauma, and SCZ diagnoses (N=29,005) from a diverse, large medical center. Clinician bias exposure was quantified as the ratio of negative sentences to the total number of sentences in psychiatric notes, labeled using a large language model (Mistral). We utilized logistic regression to predict SCZ diagnosis when controlling for patient demographics, risk factors, and negative sentence ratio (NSR). A high NSR significantly increased a patient's odds of obtaining a SCZ diagnosis and attenuated the effects of patient race. Black male patients with high NSR had the highest odds of being diagnosed with SCZ. Our findings suggest sentiment-based metrics can operationalize clinician bias exposure with real-world data and reveal disparities beyond race or ethnicity.
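The regression design described (a continuous NSR covariate predicting a binary diagnosis, summarized as an odds ratio) can be sketched on synthetic data. The data-generating values below (intercept $-2$, NSR coefficient $+3$) are assumptions for illustration only, not the study's estimates, and the fit uses plain gradient descent rather than a statistics package.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic illustration: predict a binary SCZ diagnosis from a negative
# sentence ratio (NSR) covariate. Generating coefficients (-2, +3) are
# assumptions for illustration, not the study's estimates.
n = 5000
nsr = rng.uniform(0.0, 1.0, size=n)
p = 1.0 / (1.0 + np.exp(-(-2.0 + 3.0 * nsr)))
y = (rng.uniform(size=n) < p).astype(float)

# Fit logistic regression by gradient descent on the mean log-loss
X = np.column_stack([np.ones(n), nsr])    # intercept + NSR
w = np.zeros(2)
for _ in range(20_000):
    preds = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 0.1 * X.T @ (preds - y) / n

odds_ratio = np.exp(w[1])   # multiplicative change in odds per unit of NSR
print(f"NSR coefficient: {w[1]:.2f}, odds ratio: {odds_ratio:.2f}")
```

In the actual study, demographic and risk-factor covariates would enter as additional columns of the design matrix, which is how the attenuation of the race effect by NSR could be assessed.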


[23] 2601.08056

The embodied brain: Bridging the brain, body, and behavior with neuromechanical digital twins

Animal behavior reflects interactions between the nervous system, body, and environment. Therefore, biomechanics and environmental context must be considered to understand algorithms for behavioral control. Neuromechanical digital twins, namely computational models that embed artificial neural controllers within realistic body models in simulated environments, are a powerful tool for this purpose. Here, we review advances in neuromechanical digital twins while highlighting emerging opportunities. We first show how these models enable inference of biophysical variables that are difficult to measure experimentally. Systematic perturbation of these models can also generate new experimentally testable hypotheses. We then examine how neuromechanical twins facilitate the exchange between neuroscience, robotics, and machine learning, and showcase their applications in healthcare. We envision that coupling experimental studies with active probing of their neuromechanical twins will significantly accelerate progress in neuroscience.


[24] 2601.09634

Human Ancestries Simulation and Inference: a Review of Ancestral Recombination Graph-Based Approaches

There is little debate about the importance of the ancestral recombination graph (ARG) in population genetics. Although it is an important theoretical tool, its widespread use has been hindered mainly by the computational cost required to match the ever-increasing scale of the data being analyzed. Many of these difficulties have been overcome in the past two decades, which have consequently seen the development of increasingly sophisticated ARG simulation and inference software. Nonetheless, challenges remain, especially in the area of ancestry inference. This paper is a comprehensive review of ARG samplers that have emerged in the past three decades to meet the need for scalable and flexible ancestry simulation and inference solutions. It specifically focuses on their performance, usability, and the biological realism of the underlying algorithms, and aims primarily to provide a technical overview of the field for researchers seeking to write their own coalescent-with-recombination sampler. As a complement to this article, we have compiled links to software, source code and documentation and made them available at this https URL.


[25] 2601.15689

Resting-State Functional Connectivity Correlates of Emotional Memory Control Under Cognitive Load in Subclinical Anxiety

Volitional memory control supports adaptive cognition by enabling intentional suppression of goal-irrelevant, interfering memories and recall of goal-relevant memories. Neural mechanisms of suppression and recall have been studied largely in isolation, and their operation under concurrent working memory load in the context of subclinical anxiety remains unclear. We examined control of emotionally valenced memories in 47 healthy participants with varying levels of subclinical anxiety under dual-task conditions involving directed suppression and recall while concurrently performing a secondary task imposing visual working memory load. Cognitive efficiency in controlling dual-task memory-linked interference, measured by the Balanced Integration Score (BIS), showed no differences between suppression and recall, across emotions, or by anxiety. Intrinsic functional brain networks measured by seed-to-voxel resting-state functional connectivity (rsFC) revealed dissociable rsFC profiles linked to cognitive control across emotional valences, moderated by anxiety. Efficient suppression of positive memories correlated with reduced connectivity between anterior cingulate cortex and posterior perceptual-midline regions, and diminished hippocampal-frontal pole coupling. Efficient suppression of negative memories correlated with increased posterior parietal to lateral occipital connectivity. Anxiety moderated associations between cognitive control and prefrontal connectivity during suppression of positive memories and recall of positive and neutral memories. Direct comparisons revealed stronger hippocampal-thalamic rsFC during suppression versus recall of positive memories. Together, these findings delineate neural correlates of volitional emotional memory control under cognitive load and suggest that subclinical anxiety shapes these networks selectively.


[26] 2601.19019

Embedding of Low-Dimensional Sensory Dynamics in Recurrent Networks: Implications for the Geometry of Neural Representation

Neural population activity in sensory cortex is organized on low-dimensional manifolds, but why such manifolds arise and what determines their geometry remain unclear. We model cortical populations as recurrent circuits driven by low-dimensional regular sensory dynamics (circles, tori). Combining generalized synchronization and delay-embedding theory, we show that contracting recurrent networks generically develop smooth internal manifolds embedding the sensory dynamics. The dimensional requirement is modest: N>2d suffices, where d is the intrinsic sensory dimension (compatible with Whitney and Takens bounds). We prove a prediction-separation result linking representational geometry to predictive performance without assuming contraction: accurate prediction forces state separation up to a resolution set by prediction error, yielding categorical boundaries, metameric equivalence, and discrimination thresholds. Numerical experiments with trained tanh RNNs recover ring- and torus-shaped hidden manifolds; state separation improves sharply at the 2d+1 threshold. Training pushes networks beyond strict contraction, yet embedding persists, indicating sufficient but not necessary conditions. These results provide a mechanistic account of why sensory manifolds emerge in recurrent circuits and how prediction constrains their resolution.
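The generalized-synchronization picture in this abstract can be checked numerically in a few lines: drive a weakly coupled (hence contracting) tanh network with a point moving on a circle, and the hidden states settle onto a ring. The sketch below uses an untrained random network rather than the trained RNNs of the paper, and N = 20 comfortably exceeds the N > 2d bound for a circle (d = 1):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 20                                   # hidden units; circle has intrinsic dim d = 1
W = 0.3 * rng.standard_normal((N, N)) / np.sqrt(N)   # weak recurrence: contracting
B = rng.standard_normal((N, 2))

theta = np.linspace(0, 12 * np.pi, 3000, endpoint=False)   # 6 loops, period 500 steps
u = np.column_stack([np.cos(theta), np.sin(theta)])        # circular sensory drive

x = np.zeros(N)
states = []
for t in range(len(theta)):
    x = np.tanh(W @ x + B @ u[t])
    states.append(x.copy())
states = np.array(states[500:])          # discard the transient

# Generalized synchronization: the state becomes a function of the drive,
# so the trajectory closes into a 1-D ring embedded in R^N.
Xc = states - states.mean(axis=0)
sv = np.linalg.svd(Xc, compute_uv=False) ** 2
explained_2d = sv[:2].sum() / sv.sum()   # the ring lies almost in a 2-D plane
```

After the transient the trajectory repeats exactly once per drive period, and two principal components capture most of the variance, i.e. a ring-shaped hidden manifold as the abstract describes for trained networks.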


[27] 2604.01475

Interpretable Electrophysiological Features of Resting-State EEG Capture Cortical Network Dynamics in Parkinson's Disease

Parkinson's disease (PD) alters cortical neural dynamics, yet reliable non-invasive electrophysiological biomarkers remain elusive. This study examined whether interpretable EEG features capturing complementary aspects of neural dynamics can discriminate Parkinsonian neural states. A comprehensive set of interpretable features was extracted and grouped into Standard descriptors (spectral power, phase synchronization, time-domain statistics) and Dynamical descriptors (aperiodic activity, cross-frequency coupling, scale-free dynamics, neuronal avalanche statistics, and instantaneous frequency measures). A multi-head attention transformer classifier was trained using strict LOSO validation. Group-level comparisons were performed to identify electrophysiological differences associated with disease and medication state. Standard feature sets achieved the strongest performance in discriminating medication states (PDoff vs PDon), whereas Dynamical descriptors performed competitively in contrasts between PD patients and healthy controls. Random feature ablation analyses indicated that Dynamical descriptors provide complementary information distributed across features, while correlation analysis revealed low redundancy within both feature sets. Group-level comparisons revealed medication-sensitive reductions in delta power and voltage variance, modulation of neuronal avalanche statistics, persistent increases in theta phase synchronization in PD patients, and disease-related alterations in cross-frequency interactions. Traditional spectral and synchronization features primarily reflect medication-related neural modulation, whereas dynamical descriptors reveal broader alterations in cortical network organization associated with disease but also with medication. These findings support multivariate EEG representations as a promising framework for developing non-invasive biomarkers of PD.
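Of the Standard descriptors, spectral band power is the simplest to illustrate. A minimal numpy sketch on a synthetic signal, using a plain periodogram rather than whatever spectral estimator the paper uses, shows how a delta-band feature of the kind reported (e.g. medication-sensitive delta power) is computed:

```python
import numpy as np

fs = 250.0                          # sampling rate in Hz, a typical EEG value
t = np.arange(0, 20, 1 / fs)
rng = np.random.default_rng(3)
# Toy channel: a strong 2 Hz (delta) rhythm, a weaker 6 Hz (theta) rhythm, noise.
sig = (2.0 * np.sin(2 * np.pi * 2 * t)
       + 0.5 * np.sin(2 * np.pi * 6 * t)
       + 0.3 * rng.standard_normal(t.size))

def band_power(x, fs, lo, hi):
    """Mean periodogram power in [lo, hi) Hz: a crude spectral-power feature."""
    freqs = np.fft.rfftfreq(x.size, 1 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / x.size
    band = (freqs >= lo) & (freqs < hi)
    return psd[band].mean()

delta_power = band_power(sig, fs, 1, 4)
theta_power = band_power(sig, fs, 4, 8)
# delta_power exceeds theta_power here by construction of the toy signal
```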


[28] 2604.04069

The physical basis of information flow in neural matter: a thermocoherent perspective on cognitive dynamics

Information flow is central to contemporary accounts of cognition, yet its physical basis in living neural matter remains poorly specified. Here, we develop a multiscale resource-theoretical framework motivated by the \textit{thermocoherent effect}, where heat flow is reciprocally coupled to a delocalized information flow carried by shared coherence and not reducible to local subsystem variables. Extending this line of work in light of recent results on correlation-enabled Mpemba-type thermal relaxation, we argue that the operational relevance of correlations depends less on their taxonomy than on their dynamical accessibility under the underlying interaction geometry. Relational structure encoded in the state of a single composite system -- including quantum entanglement, quantum discord, and classical correlations -- may therefore act as a usable physical resource that remains hidden from local subsystem descriptions. We propose that electrical, chemical, ionic, and thermal transport processes in neural matter may, under suitable microscopic conditions, generate or transduce partially hidden relational resources whose mutual coupling can progressively build larger-scale thermocoherent organization across spatial or spatiotemporal partitions in neural tissue. Ion-channel interfaces, hydrogen-bonded proton networks, aromatic $\pi$-electron architectures, and phosphate-rich motifs emerge as plausible substrate classes in which such resources may arise, become transiently accessible under environmental coupling, and leave coarse-grained signatures in neural dynamics. The resulting picture is neither a claim of macroscopic quantum cognition nor a reduction of cognition to abstract coding, but a falsifiable framework in which microscopic relational resources can bias transport, relaxation, signaling, and cross-scale neural coordination.


[29] 2604.08587

Covariant quantum error correction in a three-layer quantum brain model: computational analysis of layer-specific coherence dynamics

Quantum brain proposals require coherence on behaviorally relevant timescales, yet the gap between spin coherence times and neural decision windows has remained a quantitative obstacle. We evaluate approximate covariant quantum error correction (CQEC) -- a purification protocol constrained by the Eastin-Knill theorem -- across two radical-pair proteins parameterized by \textit{ab initio} spin Hamiltonians: monoamine oxidase~A (MAO-A) and cryptochrome (CRY, PDB~4I6G). Both share a three-layer architecture (${}^{31}$P nuclear spin memory, electron spin interface, classical electrochemistry) and identical hyperfine coupling ($A = 200$~MHz), but differ 16-fold in nuclear $T_2$: 3.2~ms (MAO-A) versus 52~ms (CRY). We test whether CQEC preserves coherence over the 200~ms Schultze-Kraft veto window by mapping each protein's $T_2$ gap onto a simulation decoherence rate ($\gamma_\mathrm{veto} = T_2~\text{gap}/2T_\mathrm{sim}$): 3.08 for MAO-A, 0.19 for CRY. At $\gamma_\mathrm{veto} = 0.19$, CQEC maintains tunneling coherence of 0.83 (95\% CI [0.76, 0.79]; versus 0.12 without correction, $\times$6.9 improvement). At $\gamma_\mathrm{veto} = 3.08$, coherence collapses to 0.012 even with CQEC. A $T_2$ sensitivity analysis confirms robustness: at $T_2 = 26$~ms (half the CRY estimate), CQEC-protected coherence remains 0.69. A classical Markov baseline produces only monotonic relaxation, confirming that CQEC-maintained oscillatory dynamics are genuinely quantum. However, no single protein optimizes both layers: CRY's shorter $T_2^e$ (0.53~ns versus 1.1~ns) worsens Layer~2 fidelity. This layer-protein tradeoff, together with unresolved challenges in state preparation and entanglement distribution, defines the next targets for quantum brain research.


[30] 2409.06565

Statistical inference for a multiscale stochastic model of enzyme kinetics via propagation of chaos

We study a class of Stochastic Differential Equations (SDEs) with jumps modeling multistage Michaelis--Menten enzyme kinetics, in which a substrate is sequentially transformed into a product via a cascade of intermediate complexes. These networks are typically high-dimensional and exhibit multiscale behavior with a strong coupling between different components, posing substantial analytical and computational challenges. In particular, the problem of statistical inference of reaction rates is significantly difficult and becomes even more intricate when direct observations of system states are unavailable and only a random sample of product formation times is observed. We address this problem in two stages. First, in a suitable scaling regime consistent with the Quasi-Steady State Approximation (QSSA), we rigorously establish a stochastic averaging principle yielding a reduced model for the product-substrate dynamics. Guided by the reduced-order dynamics, we next construct a novel Interacting Particle System (IPS) that approximates the product-substrate process at the particle level. This IPS plays a pivotal role in the inference methodology, and we prove several non-asymptotic bounds and limiting results for this system. These results facilitate the construction of an estimator based on a product-form approximate likelihood requiring only a random sample of product formation times. This approach does not need access to the system states, and we rigorously prove consistency of the estimator.
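The observation model here, a random sample of product formation times only, is easy to generate with a standard Gillespie simulation. The sketch below simulates the classic one-stage Michaelis-Menten network rather than the paper's multistage cascade, and the rate constants are arbitrary:

```python
import numpy as np

def gillespie_mm(S0=50, E0=10, k1=0.01, k2=0.1, k3=0.1, seed=4):
    """Gillespie simulation of S + E <-> C -> P + E.

    Returns the product formation times, the only observable the abstract
    assumes is available to the inference procedure.
    """
    rng = np.random.default_rng(seed)
    S, E, C, P = S0, E0, 0, 0
    t, product_times = 0.0, []
    while P < S0:
        rates = np.array([k1 * S * E, k2 * C, k3 * C])
        total = rates.sum()
        t += rng.exponential(1 / total)         # time to next reaction
        r = rng.choice(3, p=rates / total)      # which reaction fires
        if r == 0:            # binding: S + E -> C
            S, E, C = S - 1, E - 1, C + 1
        elif r == 1:          # unbinding: C -> S + E
            S, E, C = S + 1, E + 1, C - 1
        else:                 # catalysis: C -> P + E
            E, C, P = E + 1, C - 1, P + 1
            product_times.append(t)
    return product_times

times = gillespie_mm()    # one random sample of product formation times
```

Data of exactly this shape (times only, no system states) is what the product-form approximate likelihood in the paper is built on.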


[31] 2507.03951

Structure from Noise: Confirmation Bias in Particle Picking in Structural Biology

The computational pipelines of single-particle cryo-electron microscopy (cryo-EM) and cryo-electron tomography (cryo-ET) include an early particle-picking stage, in which a micrograph or tomogram is scanned to extract candidate particles, typically via template matching or deep-learning-based techniques. The extracted particles are then passed to downstream tasks such as classification and 3D reconstruction. Although it is well understood empirically that particle picking can be sensitive to the choice of templates or learned priors, a quantitative theory of the bias introduced by this stage has been lacking. Here, we develop a mathematical framework for analyzing bias in template matching-based detection with concrete applications to cryo-EM and cryo-ET. We study this bias through two downstream tasks: (i) maximum-likelihood estimation of class means in a Gaussian mixture model (GMM) and (ii) 3D volume reconstruction from the extracted particle stack. We show that when template matching is applied to pure noise, then under broad noise models, the resulting maximum-likelihood estimates converge asymptotically to deterministic, noise-dependent transforms of the user-specified templates, yielding a structure from noise effect. We further characterize how the resulting bias depends on the noise statistics, sample size, dimension, and detection threshold. Finally, controlled experiments using standard cryo-EM software corroborate the theory, demonstrating reproducible structure from noise artifacts in low-SNR data.
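The structure-from-noise effect is easy to reproduce in one dimension: template-match pure noise, keep the best-scoring window from each "micrograph", and average. The toy below (all sizes and counts arbitrary) is not the paper's cryo-EM analysis, but it exhibits the same confirmation bias the theory characterizes:

```python
import numpy as np

rng = np.random.default_rng(5)
template = np.sin(np.linspace(0, 2 * np.pi, 32))   # the "particle" searched for
template /= np.linalg.norm(template)

picks = []
for _ in range(500):                # 500 "micrographs" containing PURE noise
    noise = rng.standard_normal(1024)
    scores = np.correlate(noise, template, mode="valid")  # match at every offset
    best = int(scores.argmax())
    picks.append(noise[best:best + 32])           # extract the best-matching window

reconstruction = np.mean(picks, axis=0)
corr = np.corrcoef(reconstruction, template)[0, 1]
# corr is close to 1: averaging the top matches from noise reproduces the template
```

Selecting the maximum-correlation window biases each pick toward the template, so the average converges to a transform of the template even though no signal was ever present, precisely the bias the paper formalizes.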


[32] 2510.07286

Evolutionary Profiles for Protein Fitness Prediction

Predicting the fitness impact of mutations is central to protein engineering but constrained by limited assays relative to the size of sequence space. Protein language models (pLMs) trained with masked language modeling (MLM) exhibit strong zero-shot fitness prediction; we provide a unifying view by interpreting natural evolution as implicit reward maximization and MLM as inverse reinforcement learning (IRL), in which extant sequences act as expert demonstrations and pLM log-odds serve as fitness estimates. Building on this perspective, we introduce EvoIF, a lightweight model that integrates two complementary sources of evolutionary signal: (i) within-family profiles from retrieved homologs and (ii) cross-family structural-evolutionary constraints distilled from inverse folding logits. EvoIF fuses sequence-structure representations with these profiles via a compact transition block, yielding calibrated probabilities for log-odds scoring. On ProteinGym (217 mutational assays; >2.5M mutants), EvoIF and its MSA-enabled variant achieve state-of-the-art or competitive performance while using only 0.15% of the training data and fewer parameters than recent large models. Ablations confirm that within-family and cross-family profiles are complementary, improving robustness across function types, MSA depths, taxa, and mutation depths. The code will be made publicly available.
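The log-odds scoring rule that links MLM probabilities to fitness estimates can be written down directly. The probability table below is random stand-in data (a real pLM would supply p(amino acid | masked context) per position); only the scoring rule itself is from the abstract:

```python
import numpy as np

AA = "ACDEFGHIKLMNPQRSTVWY"
rng = np.random.default_rng(6)
# Stand-in for pLM output: p(amino acid | masked context) at each position.
probs = rng.dirichlet(np.ones(20), size=5)          # 5 positions x 20 residues

def log_odds_score(probs, pos, wt, mut):
    """Zero-shot fitness estimate: log p(mut) - log p(wt) at a masked position."""
    return float(np.log(probs[pos, AA.index(mut)])
                 - np.log(probs[pos, AA.index(wt)]))

s = log_odds_score(probs, pos=2, wt="A", mut="V")
# s > 0 means the model assigns the mutant higher likelihood than the wild type
```

In the IRL reading of the abstract, this score is the estimated reward difference between mutant and wild-type residues; EvoIF's contribution is calibrating the probabilities that feed it.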


[33] 2510.20792

BadGraph: A Backdoor Attack Against Latent Diffusion Model for Text-Guided Graph Generation

The rapid progress of graph generation has raised new security concerns, particularly regarding backdoor vulnerabilities. While prior work has explored backdoor attacks in image diffusion and unconditional graph generation, conditional, especially text-guided, graph generation remains largely unexamined. This paper proposes BadGraph, a backdoor attack method against latent diffusion models for text-guided graph generation. BadGraph leverages textual triggers to poison training data, covertly implanting backdoors that induce attacker-specified subgraphs during inference when triggers appear, while preserving normal performance on clean inputs. Extensive experiments on four benchmark datasets (PubChem, ChEBI-20, PCDes, MoMu) demonstrate the effectiveness and stealth of the attack: a poisoning rate of less than 10% achieves a 50% attack success rate, while 24% suffices for a success rate of over 80%, with negligible performance degradation on benign samples. Ablation studies further reveal that the backdoor is implanted during VAE and diffusion training rather than pretraining. These findings reveal security vulnerabilities in latent diffusion models for text-guided graph generation, highlight serious risks in applications of these models, such as drug discovery, and underscore the need for robust defenses against backdoor attacks in such diffusion models.


[34] 2602.07131

Behavior Score Prediction in Resting-State Functional MRI by Deep State Space Modeling

Early clinical assessment of Alzheimer's disease relies on behavior scores that measure a subject's language, memory, and cognitive skills. On the medical imaging side, functional magnetic resonance imaging has provided invaluable insights into the neural pathways underlying Alzheimer's disease. While prior studies have used resting-state functional MRI by extracting functional connectivity matrices, these approaches neglect the temporal dynamics inherent in functional data. In this work, we present a deep state space modeling framework that directly leverages the blood-oxygenation-level-dependent time series to learn a sparse collection of brain regions to predict behavior scores. Our model extracts temporal features that encapsulate nuanced patterns of intrinsic brain activity, thereby enhancing predictive performance compared to traditional connectivity methods. We identify specific brain regions that are most predictive of cognitive impairment through experiments on data provided by the Michigan Alzheimer's Disease Research Center, providing new insights into the neural substrates of early Alzheimer's pathology. These findings have important implications for the possible development of risk monitoring and intervention strategies in Alzheimer's disease.


[35] 2602.20218

Robust Glioblastoma Segmentation Without T2-FLAIR: External Validation of Targeted Dropout Training

Objectives: To determine whether targeted T2 fluid-attenuated inversion recovery (T2-FLAIR) dropout training improves robustness of glioblastoma MRI tumor segmentation and whole-tumor volumetry when T2-FLAIR is unavailable, without degrading performance when T2-FLAIR is available. Materials and Methods: In this retrospective multi-dataset study, 3D nnU-Net models were trained on a subset of the BraTS 2021 cohort (n=848) and externally validated on the University of Pennsylvania glioblastoma cohort (n=403). Models were trained with no dropout or targeted T2-FLAIR dropout (dropout probability r=0.35 or 0.50) by replacing only the T2-FLAIR channel with zeros during training. Testing used prespecified T2-FLAIR-present and T2-FLAIR-absent scenarios, with the absent scenario simulated by zeroing the T2-FLAIR channel at inference. The primary endpoint was per-patient overall region-wise Dice similarity coefficient (DSC), secondary endpoints were region-specific DSC, 95th percentile Hausdorff distance and Bland-Altman whole-tumor volume bias. Results: With T2-FLAIR present, overall median DSC was 94.8% (interquartile range [IQR] 90.0%-97.1%) with dropout (r=0.35) and 95.0% (IQR 90.3%-97.1%) without dropout, supporting equivalence (p<0.001). With T2-FLAIR absent, overall median DSC improved from 81.0% (IQR 75.1%-86.4%) without dropout to 93.4% (IQR 89.1%-96.2%) with dropout (r=0.35). Whole-tumor DSC improved from 60.4% to 92.6%, whole tumor 95th percentile Hausdorff distance improved from 17.24 mm to 2.45 mm, and whole-tumor volume bias improved from -45.6 mL to 0.83 mL. Conclusions: In a simulated T2-FLAIR-unavailable scenario, targeted T2-FLAIR dropout preserved segmentation performance when T2-FLAIR was available and substantially reduced whole-tumor segmentation error and volumetric bias when T2-FLAIR was absent.
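The targeted-dropout augmentation is simple to sketch: during training, only the T2-FLAIR channel is replaced with zeros, with probability r. The toy below also includes the Dice similarity coefficient used as the primary endpoint; the channel index and array shapes are illustrative assumptions, not the paper's nnU-Net configuration:

```python
import numpy as np

def targeted_flair_dropout(x, r=0.35, flair_channel=3, rng=None):
    """Training-time augmentation: zero ONLY the T2-FLAIR channel with prob. r.

    x: (channels, *spatial) multi-sequence MRI array; the channel index is an
    illustrative assumption. Inference-time absence is simulated the same way.
    """
    if rng is None:
        rng = np.random.default_rng()
    if rng.random() < r:
        x = x.copy()                 # leave the caller's array untouched
        x[flair_channel] = 0.0
    return x

def dice(pred, target, eps=1e-8):
    """Dice similarity coefficient between two binary masks."""
    inter = np.logical_and(pred, target).sum()
    return 2.0 * inter / (pred.sum() + target.sum() + eps)

a = np.zeros((8, 8), dtype=bool); a[2:6, 2:6] = True
b = np.zeros((8, 8), dtype=bool); b[3:7, 3:7] = True
# two 16-voxel masks with a 3x3 overlap: Dice = 2*9 / 32 = 0.5625
```

Because only the T2-FLAIR channel is ever zeroed, the network sees both complete and FLAIR-absent inputs during training, which is the mechanism behind the preserved performance in both test scenarios.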