A class of multiple-timescale asymptotic solutions to the equations of the susceptible-infected-recovered (SIR) model is presented for the case of a high basic reproduction number, with the inverse of the latter employed as the expansion parameter. High values of the basic reproduction number, a coefficient defined as the ratio of the infection and recovery rates built into the SIR model equations, are associated with escalating epidemics. Combinations of fast and slow timescales in the suggested multiple-timescale solutions prove adequate to reflect the acknowledged epidemic paradigm, characterized by a sharp outbreak followed by a protracted plateau. Explicit solutions for the sizes of the susceptible, infected, and recovered compartments of the SIR model are derived via the asymptotic treatment, and the epidemic peak timing and magnitude are assessed on this basis. The asymptotic results agree closely with numerical simulations of the SIR model.
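As a quick numerical illustration of the regime discussed above, the sketch below integrates the SIR equations for a large basic reproduction number and compares the observed epidemic peak with the standard closed-form SIR peak relation i_max = 1 - (1 + ln R0)/R0, valid for s(0) -> 1 and i(0) -> 0. This is the textbook identity, not the paper's multiple-timescale expansion, and the parameter values are arbitrary.

```python
# Illustrative only: numerical SIR integration at high R0, compared with the standard
# closed-form peak relation i_max = 1 - (1 + ln R0)/R0 (valid for s(0) -> 1, i(0) -> 0).
import numpy as np
from scipy.integrate import solve_ivp

R0, gamma = 15.0, 1.0            # high basic reproduction number, unit recovery rate
beta = R0 * gamma                # infection rate

def sir(t, y):
    s, i, r = y
    return [-beta * s * i, beta * s * i - gamma * i, gamma * i]

sol = solve_ivp(sir, (0.0, 30.0), [1.0 - 1e-6, 1e-6, 0.0], rtol=1e-9, atol=1e-12)
k_peak = np.argmax(sol.y[1])
print(f"numerical peak  : i = {sol.y[1][k_peak]:.4f} at t ~ {sol.t[k_peak]:.2f}")
print(f"closed-form peak: i = {1.0 - (1.0 + np.log(R0)) / R0:.4f}")
```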
The cryopreservation of biological materials is a highly complex process, as it involves numerous factors such as the cooling and thawing procedures, the administration of cryoprotective agents (CPAs), and the type and composition of the cells. While theoretical work has yielded a better understanding of the processes occurring during cryopreservation, the design of cryopreservation protocols and their parameters is currently based predominantly on heuristic optimization. Here, we propose a mathematical method to optimise the cooling dynamics of slow-cooling protocols and thereby reduce the risk of injury. We derive our method from first principles and provide computational predictions. Moreover, we assess these predictions against data obtained from the literature as well as against novel experimental results. Overall, we provide a generic computational approach to generate improved slow-cooling profiles for the cryopreservation of cells in suspension.
Many cellular components are present in such low numbers that individual stochastic production and degradation events lead to significant fluctuations in molecular abundances. Although feedback control can, in principle, suppress such low-copy-number fluctuations, general rules have emerged that suggest fundamental performance constraints on feedback control in biochemical systems. In particular, previous work has conjectured that reducing abundance fluctuations in one component requires at least one sacrificial component with increased variability in arbitrary reaction networks of any size. Here, we present an exact and general proof of this statement based on probability current decompositions of mutual information rates between molecular abundances. This suggests that variability in cellular components is necessary for cellular control and that fluctuating components do not necessarily generate cellular "noise" but may correspond to control molecules that are involved in removing "noise" from other cellular components.
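A minimal stochastic-simulation sketch of the idea, assuming a toy two-species negative-feedback loop (X2 senses X1 and represses its production) with hypothetical rates: time-averaged Fano factors from a Gillespie run illustrate how suppressing fluctuations in the controlled species X1 is accompanied by elevated variability in the controller X2. This only illustrates the phenomenon, not the paper's general proof.

```python
# Toy Gillespie (SSA) run for a negative-feedback loop with hypothetical rates:
# X1 production is repressed by X2, and X2 production senses X1.
import numpy as np

rng = np.random.default_rng(0)
k, K, a = 50.0, 10.0, 5.0        # max X1 production, repression scale, X2 sensing rate
d1, d2 = 1.0, 1.0                # degradation rates

def ssa(T=1000.0, burn_in=100.0):
    t, x1, x2 = 0.0, 0, 0
    records = []                 # (dwell time, x1, x2) for time-weighted statistics
    while t < T:
        rates = [k / (1.0 + x2 / K), d1 * x1, a * x1, d2 * x2]
        total = sum(rates)
        dt = rng.exponential(1.0 / total)
        if t > burn_in:
            records.append((dt, x1, x2))
        t += dt
        u = rng.random() * total
        if u < rates[0]:
            x1 += 1                       # X1 production
        elif u < rates[0] + rates[1]:
            x1 -= 1                       # X1 degradation
        elif u < rates[0] + rates[1] + rates[2]:
            x2 += 1                       # X2 production
        else:
            x2 -= 1                       # X2 degradation
    return np.array(records)

rec = ssa()
for col, name in [(1, "X1 (controlled)"), (2, "X2 (controller)")]:
    mean = np.average(rec[:, col], weights=rec[:, 0])
    var = np.average((rec[:, col] - mean) ** 2, weights=rec[:, 0])
    print(f"{name}: Fano factor = {var / mean:.2f}")
```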
Can machine learning models identify which chemist made a molecule from structure alone? If so, models trained on literature data may exploit chemist intent rather than learning causal structure-activity relationships. We test this by linking ChEMBL assays to publication authors and training a 1,815-class classifier to predict authors from molecular fingerprints, achieving 60% top-5 accuracy under scaffold-based splitting. We then train an activity model that receives only a protein identifier and an author-probability vector derived from structure, with no direct access to molecular descriptors. This author-only model achieves predictive power comparable to a simple baseline that has access to structure. This reveals a "Clever Hans" failure mode: models can predict bioactivity largely by inferring chemist goals and favorite targets without requiring a lab-independent understanding of chemistry. We analyze the sources of this leakage, propose author-disjoint splits, and recommend dataset practices to decouple chemist intent from biological outcomes.
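A minimal sketch of the two-stage setup, assuming hypothetical data columns and toy molecules: Morgan fingerprints feed a multi-class author classifier, and the activity model then sees only the resulting author-probability vector, never the structure itself.

```python
# Sketch: (1) predict authors from Morgan fingerprints, (2) feed only the resulting
# author-probability vector into an activity model. Molecules, labels, and the model
# choices below are placeholders, not the paper's actual pipeline.
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestRegressor

def fingerprint(smiles, n_bits=2048):
    mol = Chem.MolFromSmiles(smiles)
    return np.array(AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=n_bits))

# In the real setting these come from a ChEMBL extract with scaffold-based
# (and author-disjoint) splits; toy values shown here.
smiles_train = ["CCO", "c1ccccc1O", "CC(=O)Nc1ccc(O)cc1"]
authors_train = [0, 1, 1]
activity_train = [5.2, 6.1, 4.8]

X = np.stack([fingerprint(s) for s in smiles_train])
author_clf = LogisticRegression(max_iter=1000).fit(X, authors_train)

# Stage 2: the activity model never sees the fingerprint, only P(author | structure).
author_probs = author_clf.predict_proba(X)
activity_model = RandomForestRegressor(n_estimators=100, random_state=0)
activity_model.fit(author_probs, activity_train)
```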
Human language processing relies on the brain's capacity for predictive inference. We present a machine learning framework for decoding neural (EEG) responses to dynamic visual language stimuli in Deaf signers. Using coherence between neural signals and optical flow-derived motion features, we construct spatiotemporal representations of predictive neural dynamics. Through entropy-based feature selection, we identify frequency-specific neural signatures that differentiate interpretable linguistic input from linguistically disrupted (time-reversed) stimuli. Our results reveal distributed left-hemispheric and frontal low-frequency coherence as key features in language comprehension, with experience-dependent neural signatures correlating with age. This work demonstrates a novel multimodal approach for probing experience-driven generative models of perception in the brain.
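A minimal sketch of the core coherence feature, using synthetic stand-in signals: magnitude-squared coherence between one EEG channel and an optical-flow-derived motion time series, averaged over a low-frequency band. Channel selection, the entropy-based feature ranking, and the actual stimulus features are omitted.

```python
# Coherence between an EEG channel and a motion (optical-flow) regressor,
# band-averaged at low frequencies. All signals here are synthetic stand-ins.
import numpy as np
from scipy.signal import coherence

fs = 250.0                                   # assumed EEG sampling rate (Hz)
t = np.arange(0, 60, 1 / fs)
motion = np.sin(2 * np.pi * 1.5 * t)         # optical-flow magnitude (stand-in)
eeg = 0.4 * motion + np.random.default_rng(0).normal(size=t.size)

f, cxy = coherence(eeg, motion, fs=fs, nperseg=int(4 * fs))
low_band = (f >= 0.5) & (f <= 8.0)           # low-frequency band of interest
print("band-averaged coherence:", cxy[low_band].mean())
```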
Multiple cellular processes are triggered when the concentration of a regulatory protein reaches a critical threshold. Previous analyses have characterized timing statistics for single-gene systems. However, many biological timers are based on cascades of genes that activate each other sequentially. Here, we develop an analytical framework to describe the timing precision of such cascades using a burst-dilution hybrid stochastic model. We first revisit the single-gene case and recover the known result of an optimal activation threshold that minimizes first-passage-time (FPT) variability. Extending this concept to two-gene cascades, we identify three distinct optimization regimes determined by the ratio of intrinsic noise levels and the protein dilution rate, defining when coupling improves or worsens timing precision compared to a single-gene strategy. Generalizing to cascades with an arbitrary number of genes, we obtain a simple mathematical condition that determines when a new gene in the cascade can decrease the timing noise, based on its intrinsic noise and protein dilution rate. In the specific case of a cascade of identical genes, our analytical results predict suppression of FPT noise with increasing cascade length and the existence of a mean time that decreases relative timing fluctuations. Together, these results define the intrinsic limits of timekeeping precision in gene regulatory cascades and provide a minimal analytical framework to explore timing control in biological systems.
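A toy simulation of the single-gene threshold-crossing problem, with illustrative (not fitted) parameters: protein arrives in exponentially sized bursts at Poisson times and dilutes deterministically in between, and scanning thresholds shows how first-passage-time variability depends on the activation threshold.

```python
# Toy threshold-crossing under burst production and dilution; parameters are illustrative.
import numpy as np

rng = np.random.default_rng(1)
k_burst, mean_burst, gamma = 2.0, 20.0, 0.1   # burst rate, mean burst size, dilution rate

def first_passage_time(threshold):
    t, x = 0.0, 0.0
    while True:
        dt = rng.exponential(1.0 / k_burst)    # waiting time to the next burst
        t += dt
        x = x * np.exp(-gamma * dt) + rng.exponential(mean_burst)
        if x >= threshold:
            return t

for thr in (50, 100, 150, 200, 250):
    fpts = np.array([first_passage_time(thr) for _ in range(2000)])
    print(f"threshold {thr:3d}: mean FPT {fpts.mean():6.2f}, CV {fpts.std() / fpts.mean():.3f}")
```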
This technical note considers the sampling of outcomes that provide the greatest amount of information about the structure of underlying world models. This generalisation furnishes a principled approach to structure learning under a plausible set of generative models or hypotheses. In active inference, policies - i.e., combinations of actions - are selected based on their expected free energy, which comprises expected information gain and value. Information gain corresponds to the KL divergence between predictive posteriors with, and without, the consequences of action. Posteriors over models can be evaluated quickly and efficiently using Bayesian Model Reduction, based upon accumulated posterior beliefs about model parameters. The ensuing information gain can then be used to select actions that disambiguate among alternative models, in the spirit of optimal experimental design. We illustrate this kind of active selection or reasoning using partially observed discrete models; namely, a 'three-ball' paradigm used previously to describe artificial insight and 'aha moments' via (synthetic) introspection or sleep. We focus on the sample efficiency afforded by seeking outcomes that resolve the greatest uncertainty about the world model, under which outcomes are generated.
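A small worked sketch of the quantity being optimized, for a discrete toy problem: given a prior over candidate world models and each model's outcome likelihood under a policy, the expected information gain is the expected KL divergence between the posterior over models and the prior (equivalently, the mutual information between models and outcomes). The numbers are toy values, and Bayesian Model Reduction itself is not shown.

```python
# Expected information gain about which model generated outcomes, for a discrete toy example.
import numpy as np

def expected_information_gain(prior_m, likelihood):        # likelihood[m, o] = P(o | m, policy)
    p_o = prior_m @ likelihood                             # predictive P(o | policy)
    posterior = (prior_m[:, None] * likelihood) / p_o      # P(m | o, policy)
    kl = np.sum(posterior * np.log(posterior / prior_m[:, None]), axis=0)
    return np.sum(p_o * kl)                                # expected KL = I(model; outcome)

prior = np.array([1 / 3, 1 / 3, 1 / 3])                    # three candidate world models
policy_A = np.array([[0.9, 0.1],                           # outcomes sharply separate models
                     [0.1, 0.9],
                     [0.5, 0.5]])
policy_B = np.array([[0.5, 0.5],                           # outcomes uninformative
                     [0.5, 0.5],
                     [0.5, 0.5]])
for name, lik in [("A", policy_A), ("B", policy_B)]:
    print(f"policy {name}: expected information gain = {expected_information_gain(prior, lik):.3f} nats")
```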
Population-scale pangenome analysis increasingly requires representations that unify single-nucleotide and structural variation while remaining scalable across large cohorts. Existing formats are typically sequence-centric, path-centric, or sample-centric, and often obscure population structure or fail to exploit carrier sparsity. We introduce the H1 pan-graph-matrix, an allele-centric representation that encodes exact haplotype membership using adaptive per-allele compression. By treating alleles as first-class objects and selecting optimal encodings based on carrier distribution, H1 achieves near-optimal storage across both common and rare variants. We further introduce H2, a path-centric dual representation derived from the same underlying allele-haplotype incidence information that restores explicit haplotype ordering while remaining exactly equivalent in information content. Using real human genome data, we show that this representation yields substantial compression gains, particularly for structural variants, while remaining equivalent in information content to pangenome graphs. H1 provides a unified, population-aware foundation for scalable pangenome analysis and downstream applications such as rare-variant interpretation and drug discovery.
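An illustrative sketch of the adaptive per-allele idea, with hypothetical codecs and thresholds: each allele's haplotype carriers are stored either as a packed bitmap (dense, suited to common alleles) or as a sorted index list (sparse, suited to rare alleles), whichever is smaller. The actual H1 encodings may differ; this only conveys the principle.

```python
# Per-allele encoding choice driven by carrier sparsity. Codec details are hypothetical.
import numpy as np

def encode_allele(carrier_ids, n_haplotypes):
    bitmap_bytes = (n_haplotypes + 7) // 8
    sparse_bytes = 4 * len(carrier_ids)                     # assuming 32-bit haplotype indices
    if sparse_bytes < bitmap_bytes:
        return "sparse", np.asarray(sorted(carrier_ids), dtype=np.uint32)
    bitmap = np.zeros(n_haplotypes, dtype=bool)
    bitmap[list(carrier_ids)] = True
    return "bitmap", np.packbits(bitmap)

n_hap = 100_000
rare = encode_allele([12, 4057, 88_123], n_hap)             # 3 carriers -> index list
common = encode_allele(range(0, n_hap, 2), n_hap)           # 50% carriers -> bitmap
print(rare[0], rare[1].nbytes, "bytes;", common[0], common[1].nbytes, "bytes")
```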
Acute Myeloid Leukemia (AML) remains a clinical challenge due to its extreme molecular heterogeneity and high relapse rates. While precision medicine has introduced mutation-specific therapies, many patients still lack effective, personalized options. This paper presents a novel, end-to-end computational framework that bridges the gap between patient-specific transcriptomics and de novo drug discovery. By analyzing bulk RNA sequencing data from the TCGA-LAML cohort, the study utilized Weighted Gene Co-expression Network Analysis (WGCNA) to prioritize 20 high-value biomarkers, including metabolic transporters like HK3 and immune-modulatory receptors such as SIGLEC9. The physical structures of these targets were modeled using AlphaFold3, and druggable hotspots were quantitatively mapped via the DOGSiteScorer engine. We then developed a novel, reaction-first evolutionary metaheuristic algorithm, coupled with multi-objective optimization, that assembles novel ligands from fragment libraries, guided by spatial alignment to the identified hotspots. The generative model produced structurally unique chemical entities with a strong bias toward drug-like space, as evidenced by QED scores peaking between 0.5 and 0.7. Validation through ADMET profiling and SwissDock molecular docking identified high-confidence candidates, such as Ligand L1, which achieved a binding free energy of -6.571 kcal/mol against the A08A96 biomarker. These results demonstrate that integrating systems biology with metaheuristic molecular assembly can produce pharmacologically viable, patient-tailored leads, offering a scalable blueprint for precision oncology in AML and beyond.
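A minimal sketch of the drug-likeness screen mentioned above, using RDKit's QED implementation on placeholder SMILES strings (the fragment-assembly step would supply the real candidates); the 0.5-0.7 window simply mirrors the reported QED peak.

```python
# QED-based drug-likeness filter on placeholder candidate SMILES.
from rdkit import Chem
from rdkit.Chem import QED

candidates = ["CC(=O)Oc1ccccc1C(=O)O",        # aspirin, standing in for a generated ligand
              "CCN(CC)C(=O)c1ccc(N)cc1",
              "CCCCCCCCCCCCCCCC"]             # long alkane, expected to score lower

for smi in candidates:
    mol = Chem.MolFromSmiles(smi)
    score = QED.qed(mol)
    keep = 0.5 <= score <= 0.7                # window consistent with the reported QED peak
    print(f"{smi:32s} QED = {score:.2f}  keep = {keep}")
```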
Computational neuroscience relies on large-scale dynamical-systems models of neurons with vast numbers of parameters tuned offline, before simulation, and with models often tied to their brain simulators. These fixed parameters lead to stiff models that show unnatural behaviour when introduced to new environments or combined into larger networks. In contrast to offline tuning, cells in biology continuously adapt via homeostatic plasticity to stay in desired dynamical regimes. In this work, we aim to introduce such online tuning of cellular parameters into brain simulation. We show that the sensitivity equation of a biorealistic neural model has the same form as a general neuron model and can therefore be simulated within existing brain simulators. Via co-simulation with the sensitivity equation, we enable both offline and online tuning of the activity of arbitrary biophysically realistic brain models. Furthermore, we show that this opens the possibility of studying the biological mechanisms underlying homeostatic plasticity, both by meta-learning plasticity mechanisms and by treating online tuning as a black-box plasticity mechanism. Given the generality of our methods, we hope that other computational science fields can capitalize on the similarity between a simulated model and its gradient system.
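A minimal sketch of the co-simulation idea, using a toy leaky integrator rather than a biophysically realistic neuron: the forward sensitivity s = dx/da obeys an equation of the same form as the model itself, so it can be integrated alongside it and used for a slow, online homeostatic update of the parameter. All names, rates, and the target are illustrative.

```python
# Toy model: leaky integrator dx/dt = -a*x + I; its sensitivity s = dx/da obeys
# ds/dt = -a*s - x, an equation of the same form, so both can run in one simulator.
# The sensitivity drives a slow online update of 'a' toward a target activity level.
dt, I, x_target = 1e-3, 2.0, 5.0
a, lr = 1.0, 0.005               # tunable leak parameter and (slow) learning rate
x, s = 0.0, 0.0

for _ in range(400_000):         # 400 units of simulated time
    dx = -a * x + I
    ds = -a * s - x              # forward sensitivity dynamics
    x += dt * dx
    s += dt * ds
    a -= dt * lr * (x - x_target) * s   # online gradient step on 0.5*(x - x_target)^2

print(f"tuned a = {a:.3f} (analytic optimum I / x_target = {I / x_target:.3f}), x = {x:.3f}")
```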
Aging involves both continuous, gradual decline driven by microscopic mechanisms and major deficit onset events such as morbidity, disability, and ultimately death. These deficit events are stochastic, obscuring the connection between aging mechanisms and overall health. We propose a framework for modelling both the gradual effects of aging and health deficit onset events, as reflected in the frailty index (FI) - a quantitative measure of overall age-related health. We model damage and repair dynamics of the FI from individual health transitions within two large longitudinal studies of aging health, the Health and Retirement Study (HRS) and the English Longitudinal Study of Ageing (ELSA), which together included N=47,592 individuals. We find that both damage resistance (robustness) and damage recovery (resilience) rates decline smoothly with increasing age and with increasing FI, for both sexes. This leads to two distinct dynamical states: a robust and resilient young state of stable good health (low FI) and an older state that drifts towards poor health (high FI). These two health states are separated by a sharp transition near age 75. Since the risk of FI accumulation accelerates dramatically across this tipping point, ages 70-80 are crucial for understanding and managing late-life decline in health.
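A toy stochastic sketch of the damage/repair picture, with hypothetical rate functions that are not fitted to HRS or ELSA: each individual carries a set of binary deficits whose damage rate rises, and whose repair rate falls, with both age and the current FI, which is enough to reproduce the qualitative shift from a stable low-FI state to drift toward high FI.

```python
# Toy model with hypothetical, unfitted rates: binary deficits whose damage rate rises
# and whose repair rate falls with both age and current FI. Illustrative only.
import numpy as np

rng = np.random.default_rng(2)
n_deficits = 40                                    # deficits tracked per person
ages = np.arange(20.0, 100.0, 1.0)                 # one-year time steps

def simulate_fi():
    deficits = np.zeros(n_deficits, dtype=bool)
    fi_track = []
    for age in ages:
        fi = deficits.mean()
        damage = 0.002 * np.exp(0.05 * (age - 20)) * (1 + 2 * fi)   # robustness declines
        repair = 0.500 * np.exp(-0.04 * (age - 20)) * (1 - fi)      # resilience declines
        deficits = np.where(deficits,
                            rng.random(n_deficits) > repair,        # deficit persists unless repaired
                            rng.random(n_deficits) < damage)        # new deficit acquired
        fi_track.append(deficits.mean())
    return np.array(fi_track)

fi = simulate_fi()
for a in (50, 70, 90):
    print(f"FI at age {a}: {fi[ages == a][0]:.2f}")
```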
Motivation: Externalizing behaviors in children, such as aggression, hyperactivity, and defiance, are influenced by complex interplays between genetic predispositions and environmental factors, particularly parental behaviors. Unraveling these intricate causal relationships can benefit from the use of robust data-driven methods. Methods: We developed a method called Hillclimb-Causal Inference, a causal discovery approach that integrates the Hill Climb Search algorithm with a customized Linear Gaussian Bayesian Information Criterion (BIC). This method was applied to data from the Adolescent Brain Cognitive Development (ABCD) Study, which included parental behavior assessments, children's genotypes, and externalizing behavior measures. We performed dimensionality reduction to address multicollinearity among parental behaviors and assessed children's genetic risk for externalizing disorders using polygenic risk scores (PRS), which were computed based on GWAS summary statistics from independent cohorts. Once the causal pathways were identified, we employed structural equation modeling (SEM) to quantify the relationships within the model. Results: We identified prominent causal pathways linking parental behaviors to children's externalizing outcomes. Parental alcohol misuse and broader behavioral issues exhibited notably stronger direct effects (0.33 and 0.20, respectively) compared to children's polygenic risk scores (0.07). Moreover, when considering both direct and indirect paths, parental substance misuse (alcohol, drug, and tobacco) collectively resulted in a total effect exceeding 1.1 on externalizing behaviors. Bootstrap and sensitivity analyses further validated the robustness of these findings.
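A sketch of the customized linear-Gaussian BIC at the heart of the search, assuming hypothetical column names and simulated data: each node is scored by regressing it on its candidate parents and penalizing the Gaussian log-likelihood by (number of parameters) x log(n) / 2; a hill-climbing search over edge additions, removals, and reversals would then maximize the summed score.

```python
# Linear-Gaussian BIC for one node given candidate parents; columns and data are toy values.
import numpy as np
import pandas as pd

def linear_gaussian_bic(data, node, parents):
    y = data[node].to_numpy()
    n = len(y)
    X = np.column_stack([np.ones(n)] + [data[p].to_numpy() for p in parents])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma2 = (y - X @ beta).var()
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)   # Gaussian MLE log-likelihood
    k = X.shape[1] + 1                                     # coefficients + noise variance
    return loglik - 0.5 * k * np.log(n)

rng = np.random.default_rng(0)
df = pd.DataFrame({"prs": rng.normal(size=500), "parent_alcohol": rng.normal(size=500)})
df["externalizing"] = 0.07 * df.prs + 0.33 * df.parent_alcohol + rng.normal(size=500)

print(linear_gaussian_bic(df, "externalizing", ["parent_alcohol", "prs"]))
print(linear_gaussian_bic(df, "externalizing", []))        # worse: parents omitted
```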
Medically uncontrolled epileptic seizures affect nearly 15 million people worldwide, resulting in enormous economic and psychological burdens. Treatment of medically refractory epilepsy is essential for patients to achieve remission, improve psychological functioning, and enhance social and vocational outcomes. Here, we present a state-of-the-art method that stabilizes fractional dynamical networks modeled from intracranial EEG data, effectively suppressing seizure activity in 34 of 35 spontaneous seizure episodes from patients at the University of Pennsylvania and the Mayo Clinic. We perform a multi-scale analysis and show that the fractal behavior and stability properties of these data distinguish between four epileptic states: interictal, pre-ictal, ictal, and post-ictal. Furthermore, the simulated controlled signals exhibit substantial amplitude reduction ($49\%$ on average). These findings highlight the potential of fractional dynamics to characterize seizure-related brain states and demonstrate its capability to suppress epileptic activity.
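An illustrative sketch of the kind of discrete-time fractional-order model involved, with toy coupling and fractional exponents rather than fitted iEEG parameters: each channel carries a long-memory Grünwald-Letnikov difference, and a simple linear state feedback is applied to damp otherwise growing activity. The paper's stabilization procedure may differ; this only conveys the modeling ingredients.

```python
# Toy discrete-time fractional-order linear system with state feedback u = -K x.
import numpy as np

def gl_coeffs(alpha, K):
    # psi[j] = (-1)^j * binom(alpha, j), computed recursively
    psi = np.empty(K + 1)
    psi[0] = 1.0
    for j in range(1, K + 1):
        psi[j] = psi[j - 1] * (1 - (alpha + 1) / j)
    return psi

def simulate(A, alpha, K_fb=None, steps=200):
    n = A.shape[0]
    psi = np.array([gl_coeffs(a_ch, steps) for a_ch in alpha])   # per-channel memory kernels
    X = np.zeros((steps + 1, n))
    X[0] = np.ones(n)
    for k in range(steps):
        u = -K_fb @ X[k] if K_fb is not None else np.zeros(n)
        memory = sum(psi[:, j] * X[k + 1 - j] for j in range(1, k + 2))
        X[k + 1] = A @ X[k] + u - memory                         # Delta^alpha x[k+1] = A x[k] + u
    return X

A = np.array([[0.30, 0.15], [0.10, 0.35]])                       # toy unstable coupling
alpha = np.array([0.7, 0.8])                                     # toy fractional exponents
free = simulate(A, alpha)
controlled = simulate(A, alpha, K_fb=0.6 * np.eye(2))
print("max |x| without control:", np.abs(free).max())
print("max |x| with feedback  :", np.abs(controlled).max())
```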
We investigate how enzymatic binding kinetics regulate diffusion-driven instabilities in a two-step metabolic pathway. Starting from a mechanistic description in which the substrate reversibly binds to the first enzyme before catalytic conversion, we formulate two reaction-diffusion models: a simplified system with effective kinetics and an extended model that explicitly includes the enzyme-substrate complex. The latter exhibits a structural degeneracy at critical parameter values due to a continuous family of homogeneous equilibria. To enable direct comparison and analytical progress, we introduce a reduced non-degenerate formulation via a quasi-equilibrium closure that encodes the influence of complex formation into effective reaction terms while preserving the nonlinear coupling between catalytic turnover and spatial transport. We show that explicit enzyme-substrate binding shifts the homogeneous steady state, modifies relaxation dynamics, and substantially alters the size and location of the Turing instability region relative to the simplified model. Numerical simulations are in close agreement with weakly nonlinear predictions, illustrating how reversible binding reshapes pattern selection and slows the development of spatial heterogeneity. These results establish a quantitative link between enzyme-substrate binding kinetics, diffusion-driven instabilities, and mesoscale spatial organization, including structures associated with liquid-liquid phase separation (LLPS). The proposed framework provides a mechanistic route by which association, dissociation, and catalytic rates jointly regulate the robustness and structure of spatial metabolic patterns, and can be extended to broader classes of compartmentalized biochemical networks.
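A generic sketch of the linear (Turing) analysis underlying such results, using a toy activator-inhibitor Jacobian rather than the paper's enzyme-substrate model: scan wavenumbers and check whether some spatial mode grows while the homogeneous mode remains stable.

```python
# Dispersion-relation check for diffusion-driven (Turing) instability with a toy Jacobian.
import numpy as np

J = np.array([[0.5, -1.0],      # toy activator-inhibitor kinetics Jacobian at the steady state
              [1.0, -1.5]])
D = np.diag([1.0, 20.0])        # fast-diffusing inhibitor

k = np.linspace(0.0, 3.0, 600)
growth = np.array([np.linalg.eigvals(J - (kk ** 2) * D).real.max() for kk in k])

homogeneous_stable = growth[0] < 0
turing_unstable = homogeneous_stable and growth.max() > 0
k_star = k[growth.argmax()]
print(f"Turing instability: {turing_unstable}, fastest-growing wavenumber ~ {k_star:.2f}")
```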
Coherence in language requires the brain to satisfy two competing temporal demands: gradual accumulation of meaning across extended context and rapid reconfiguration of representations at event boundaries. Despite their centrality to language and thought, how these processes are implemented in the human brain during naturalistic listening remains unclear. Here, we tested whether these two processes can be captured by annotation-free drift and shift signals and whether their neural expression dissociates across large-scale cortical systems. These signals were derived from a large language model (LLM) and formalized contextual drift and event shifts directly from the narrative input. To enable high-precision voxelwise encoding models with stable parameter estimates, we densely sampled one healthy adult across more than 7 hours of listening to thirteen crime stories while collecting ultra high-field (7T) BOLD data. We then modeled the feature-informed hemodynamic response using a regularized encoding framework validated on independent stories. Drift predictions were prevalent in default-mode network hubs, whereas shift predictions were evident bilaterally in the primary auditory cortex and language association cortex. Furthermore, activity in default-mode and parietal networks was best explained by a signal capturing how meaning accumulates and gradually fades over the course of the narrative. Together, these findings show that coherence during language comprehension is implemented through dissociable neural regimes of slow contextual integration and rapid event-driven reconfiguration, offering a mechanistic entry point for understanding disturbances of language coherence in psychiatric disorders.
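A minimal sketch of the encoding framework, with synthetic stand-ins for the LLM-derived drift and shift regressors and for the BOLD signal: features are convolved with a canonical HRF, a ridge model is fit per voxel, and prediction is evaluated on held-out time points (held-out stories in the actual analysis).

```python
# Ridge encoding model with HRF-convolved drift/shift regressors; all signals are synthetic.
import numpy as np
from scipy.stats import gamma
from sklearn.linear_model import Ridge

tr, n_trs = 2.0, 300
t = np.arange(0, 30, tr)
hrf = gamma.pdf(t, 6) - 0.35 * gamma.pdf(t, 16)          # simple double-gamma HRF
hrf /= hrf.sum()

rng = np.random.default_rng(0)
drift = np.cumsum(rng.normal(size=n_trs)) * 0.05          # slowly accumulating context signal
shift = (rng.random(n_trs) < 0.05).astype(float)          # sparse event-boundary signal
X = np.column_stack([np.convolve(f, hrf)[:n_trs] for f in (drift, shift)])

bold = 1.2 * X[:, 0] + 0.8 * X[:, 1] + rng.normal(scale=0.5, size=n_trs)   # one "voxel"

train, test = slice(0, 200), slice(200, n_trs)            # stand-in for a held-out story
model = Ridge(alpha=10.0).fit(X[train], bold[train])
r = np.corrcoef(model.predict(X[test]), bold[test])[0, 1]
print(f"held-out prediction correlation: {r:.2f}")
```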
Premise. Patterns of electrical brain activity recorded via electroencephalography (EEG) offer immense value for scientific and clinical investigations. The inability of supervised EEG encoders to learn robust EEG patterns and their over-reliance on expensive signal annotations have sparked a transition towards general-purpose self-supervised EEG encoders, i.e., EEG foundation models (EEG-FMs), for robust and scalable EEG feature extraction. However, the real-world readiness of early EEG-FMs and the rubrics for long-term research progress remain unclear. Objective. In this work, we conduct a review of ten early EEG-FMs to capture common trends and identify key directions for future development of EEG-FMs. Methods. We comparatively analyze each EEG-FM using three fundamental pillars of foundation modeling, namely the representation of input data, self-supervised modeling, and the evaluation strategy. Based on this analysis, we present a critical synthesis of EEG-FM methodology, empirical findings, and outstanding research gaps. Results. We find that most EEG-FMs adopt a sequence-based modeling scheme that relies on transformer-based backbones and the reconstruction of masked temporal EEG sequences for self-supervision. However, model evaluations remain heterogeneous and largely limited, making it challenging to assess their practical off-the-shelf utility. In addition to adopting standardized and realistic evaluations, future work should demonstrate more substantial scaling effects and make principled and trustworthy choices throughout the EEG representation learning pipeline. Significance. Our review indicates that the development of benchmarks, software tools, technical methodologies, and applications in collaboration with domain experts may advance the translational utility and real-world adoption of EEG-FMs.