Learning and memory require a balance between plasticity and stability: synaptic connections must encode new information without collapsing, saturating, or erasing previously useful structure. Associative-memory models can appear to learn successfully when fixed background connectivity already carries part of the task, making it difficult to distinguish genuine recall dynamics from structural assistance. We test this issue using an order-sensitive adaptive-plasticity benchmark for staged associative recall. The benchmark compares a quantum-like associative-memory model with matched real-valued no-phase and Markov-rate controls under the same task schedule, perturbation profiles, weak-support conditions, and plasticity settings. Here, "quantum-like" refers to the modeling formalism, not to a biological claim about quantum computation. We first screen weak structural support and then fix a conservative operating point for factorial comparisons across model families and plasticity mechanisms. The useful weak-support regime is narrow and non-monotonic. Weak structure alone does not rescue recall in the no-plasticity ablation, whereas most useful recall gains arise from adaptive plasticity, especially homeostatic stabilization. The Markov-rate control often achieves stronger raw recall, but the quantum-like model more consistently preserves order sensitivity and stage-dependent organization. These results do not support a universal quantum-like advantage. Instead, they show that model classes are better distinguished by a multi-objective profile combining recall, temporal organization, and context sensitivity than by any single recall score. The benchmark therefore provides a controlled framework for studying context-sensitive memory dynamics under weak support, regulated plasticity, and matched classical comparison.
Performing statistical inference is an essential component of data science. Our focus in this work is on two inference techniques, viz. regression and interpolation. We propose a reaction network based approach that can implement linear regression (both univariate and multivariate) and linear interpolation. We do this by encoding the steady state concentration of species as the output of these inference techniques. Towards this, we use a novel generalized division module that can handle division of negative numbers. We verify our results by comparing them with in-silico implementation on standard synthetic datasets.
Evolutionary accumulation models (EvAMs), also known as cancer progression models (CPMs), infer dependencies in the order of accumulation of mutations during tumor progression from cross-sectional data. It has been suggested that EvAMs could be used to identify therapeutic targets, but there is no procedure in the literature for how to extract predictions under intervention from these models. A simple approach of conditioning on the absence of a mutation gives incorrect predictions. We address this gap by formalizing what ``intervene'' means for all currently available EvAM methods (OT, OncoBN, CBN, H-ESBCN, MHN, HyperHMM, HyperTraPS), using Pearl's do operator and conditional interventions. For each model, we show how to implement the intervention (in most cases as specific parameter modifications), identify equivalent implementation procedures, and analyze whether the modularity assumption -- required for the intervention to be well-defined -- is justified. Drawing on individual-level causal DAGs that make fitness an explicit variable, we distinguish two types of intervention (killing and inactivating) that are conflated in standard EvAM representations. Since the goal is to prioritize intervention candidates, we recast the problem as one of ranking: we define three intervention objectives and provide a protocol for evaluating how well EvAMs rank targets. Our framework is not specific to cancer or EvAMs; it applies wherever fitted computational models can be interpreted as structural causal models. Code available from this https URL.
This study investigates the terminal breakdown of human neurophysiological function through the lens of non-linear dynamics by analyzing the multifractal spectrum. Using Multifractal Detrended Fluctuation Analysis (MF-DFA), we quantify the temporal evolution of complexity in synchronized electroencephalogram (EEG) and electrocardiogram (ECG) time series from patients in the terminal stage. Our results reveal a marked divergence in multifractal spectrum width: while neural activity exhibits a collapse of multifractality toward a more constrained state, cardiac signals undergo anomalous spectral broadening, indicating increased non-linear fluctuations and dynamical instability. A negative correlation between these spectral widths suggests effective functional decoupling and the emergence of anti-correlated dynamics between neural and cardiac systems. Rather than reflecting a uniform physiological decline, this divergence is consistent with a body-to-brain breakdown in which peripheral dysfunction progressively overwhelms central regulatory processes. In a broader context, the observed opposing trends resemble patterns reported in other body-driven adaptive processes, suggesting that inverse dynamics across coupled systems may emerge when constraints originate from peripheral rather than central mechanisms. Ultimately, the dying process appears to represent an extreme form of cross-system disintegration, marked by the collapse of the hierarchical coordination that normally sustains integrated physiological function.
Neural assemblies, transiently coordinated groups of neurons observed in the hippocampus, are thought to underlie the formation of episodic memories. Acetylcholine (ACh), a neuromodulator received by the hippocampus, plays a critical role in memory and learning. A well-supported hypothesis suggests that high levels of ACh during active exploration and rapid eye movement (REM) sleep promote memory encoding, while low levels during quiet waking and slow-wave sleep (SWS) support memory consolidation. We study this bidirectional role of ACh in neural assembly formation through its effect on the synchrony among neurons. We consider a network model of pyramidal neurons, each equipped with a slow, voltage-dependent, non-inactivating potassium current (M-current), which is downregulated in the presence of ACh. Neural assemblies are represented as cluster solutions to this system. Using a one-dimensional phase model reduction of a pair of weakly coupled pyramidal neurons under different levels of the M-current, we predict the symmetric cluster solutions that may emerge in larger networks equipped with all-to-all globally homogeneous, symmetric distance-dependent, and nearest-neighbour coupling architectures. We find that under low ACh conditions, the network can fully synchronize, whereas high levels can desynchronize the network into multiple stable symmetric cluster solutions representing distinct neural assemblies.
A key question in theoretical biology is how effectively biological systems preserve information about their inputs while operating under physical and functional constraints. We examine that question at the neuromuscular junction (NMJ) by studying how neurotransmitter concentration is transformed into current at both cholinergic and glutamatergic NMJs. An information maximization analysis was used to derive a theoretical distribution over neurotransmitter concentrations based on biological understandings of dose-response relationships. These theoretical distributions were compared to an experimentally derived distribution obtained from a Drosophila NMJ. The theoretical and experimental distributions showed very little agreement, indicating that the Drosophila NMJ does not shape its distribution of synaptic vesicle release probabilities in order to maximize information transmission from nervous system to muscle. Predictions for cholinergic systems are provided.
Computational design of nanobodies that bind user-specified protein epitopes could transform therapeutic development, but current methods either rely on stochastic sampling requiring days of GPU computation or inverse folding approaches unable to target epitopes directly. Here we present EasyNano, a practical pipeline for rapid, epitope-targeted nanobody complementarity-determining region (CDR) design that operates in approximately 10-20 minutes on a high-end personal workstation. EasyNano optimizes CDR residue logits via gradient descent through the ESMFold2 pairwise distance distogram, using the lightweight ESMFold2-Fast model (721M) as a differentiable oracle guided by a composite loss including a dedicated epitope proximity term. A full ESMFold2 (1.3B) CA-coordinate structure prior prevents framework pose drift. The wild-type logit initialization bias emerges as a critical practical parameter controlling CDR mutability. Across six target-framework pairs spanning self-recovery and de novo design scenarios, EasyNano improves ipTM by up to +0.559 -- from 0.143 to 0.702 (Ty1/RBD) -- and achieves a 4.6-fold improvement (ipTM 0.117 to 0.538) on a manually docked AQP4-targeting framework, while preserving ipTM on already-strong binders. Random CDR baselines (n=30 per target) confirm statistical significance (5.7 sigma above random mean for Ty1). Multi-seed analysis reveals diverse local minima, underscoring the importance of replicate runs. Kabsch cross-validation against crystal structures confirms that designed CDRs preserve the framework pose basin. EasyNano demonstrates that ESMFold2-based differentiable optimization provides a fast, practical, and epitope-specific approach to nanobody CDR design.
Predicting single-cell transcriptional responses to genetic, chemical and cytokine perturbations is a fundamental challenge in computational biology and AI Virtual Cell (AIVC) modeling, with direct implications for drug discovery and the elucidation of gene regulatory networks. Existing approaches often rely on auxiliary cell-state encoders, hierarchical variational autoencoders, dedicated Transformer encoder-decoder modules, or gene-interaction priors to compress high-dimensional expression profiles into latent representations. While effective, these designs increase architectural complexity and may limit scalability and generalizability. This paper introduces OCOO-T, a minimalist flow-matching-based AIVC model for transcriptional perturbation response prediction. OCOO-T utilizes a vanilla Transformer stack that operates directly on continuous gene expression profiles and formulates perturbation response prediction as a continuous-time denoising process. Perturbation embeddings, dosage information, and cell-line/cell-type specificity are integrated through adaptive layer normalization and in-context tokens. Comprehensive evaluations on Tahoe100M, Replogle, and PBMC benchmarks demonstrate that OCOO-T achieves state-of-the-art performance across diverse perturbations and cell types while effectively scaling to long transcriptional profiles through patching and depatching of cellular contexts. By leveraging the simplicity of Transformer-based denoising for single-cell omics, OCOO-T provides an effective and scalable framework for in-silico cellular simulation.
Automated sleep staging is a fundamental application of passive Brain-Computer Interfaces (pBCI), decoding spontaneous neural states to enable closed-loop interventions independent of user intent. This study evaluates criticality features derived from Detrended Fluctuation Analysis (DFA) for the specific identification of deep sleep (N3). We analyzed $347,232$ EEG epochs from $290$ older women using UMAP manifold learning to visualize state transitions. Subsequently, six classifiers were benchmarked via 10-fold cross-validation, using balanced accuracy to determine the optimal "state-sensing" engine for pBCIs. Naive Bayes achieved the highest mean balanced accuracy ($87.17\% \pm 0.24\%$), significantly outperforming a fully connected deep neural network (FNN: $81.58\%$) and Random Forest ($80.97\%$). Linear models (LDA: $57.21\%$; SVM: $51.01\%$) performed poorly, indicating that DFA-derived criticality features reside on a distinct, non-linear manifold. Probabilistic decoding of EEG criticality provides a high-accuracy sensing mechanism for pBCIs. This robust classification pipeline supports the development of state-dependent neurofeedback, such as targeted auditory stimulation, to enhance cognitive recovery.
Low-frequency, low-intensity ultrasound (LIUS) has emerged as a promising physical modality capable of inducing selective apoptosis of cancer cells, while sparing healthy epithelial cells and fibroblasts. Hitherto, the mechanism underlying this selectivity has been unclear, but we now propose and develop a theoretical framework linking the distinct mechanical behaviours of cancer versus healthy cells to their differential responses to LIUS. We point out that cancer cells exhibit inhomogeneous ventral stress-fiber networks, which can produce irregular focal adhesion geometry and inward membrane curvature near focal adhesions under LIUS. These curvature irregularities can favor loose packing of Piezo1 channels, thereby preserving their activity. In contrast, healthy epithelial cells and fibroblasts display more homogeneous cytoskeletal organization, which can result in more regular curvature profiles adjacent to focal adhesions. This leads to curvature-driven cholesterol redistribution, which alters the spatial organization of Piezo1 clusters, reduces coordinated channel activity, and allows cells to remain in their active, proliferative state when exposed to LIUS. Based on theoretical modeling and previous experimental findings, we propose that differences in cytoskeletal organization and membrane curvature can contribute to distinct Piezo1 activation patterns between healthy and cancerous cells. Our analysis identifies curvature-mediated Piezo1 redistribution as a potential physical basis for LIUS selectivity and provides a mechanistic foundation for designing ultrasound-based therapies to exploit the intrinsic cytoskeletal vulnerabilities of cancer cells.
AI decision-support systems can benefit from anticipating biases in human decision-making. Many such biases may arise from human cognitive limitations. The policy compression framework models decision-making as a trade-off between reward maximization and the cognitive cost of encoding state-dependent action policies, formalized as the mutual information between states and actions (policy complexity). We argue that this account is incomplete because it treats conditional entropy--the irreducible uncertainty about which action should be selected given a state--as costless, even though empirical evidence suggests that it modulates reaction times. We therefore extend the framework by defining cognitive cost as the sum of policy complexity and a weighted conditional-entropy term, governed by a new parameter, $\eta$. The resulting optimal policy retains the standard exponential form but becomes sharper as $\eta$ increases, allowing policy precision to vary more independently of reward sensitivity. This modification implies that the standard policy compression framework may underestimate the cognitive cost of action selection, and it has the potential to better account for biases in human decision-making. At the same time, it introduces additional complexity for fitting the model to human data, which future work will need to address.
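The abstract does not state the objective explicitly, so the following is only a sketch under assumed notation: state distribution $p(s)$, action values $Q(s,a)$, marginal action distribution $P(a)$, and an inverse-temperature-like trade-off parameter $\beta$ (all of these symbols are assumptions, only $\eta$ appears in the text). Under those assumptions, and for $0 \le \eta < 1$, the exponential form and its sharpening with $\eta$ can be read off from the stationarity condition:

```latex
\begin{align}
  \mathcal{J}[\pi]
    &= \sum_{s} p(s) \sum_{a} \pi(a \mid s)\, Q(s,a)
       - \frac{1}{\beta}\Bigl[ I(S;A) + \eta\, H(A \mid S) \Bigr], \\
  I(S;A) + \eta\, H(A \mid S)
    &= \sum_{s} p(s) \sum_{a} \pi(a \mid s)
       \Bigl[ (1-\eta) \log \pi(a \mid s) - \log P(a) \Bigr], \\
  \frac{\partial \mathcal{J}}{\partial \pi(a \mid s)} = \lambda(s)
    \;&\Longrightarrow\;
  \pi^{*}(a \mid s) \;\propto\;
    \Bigl[ P(a)\, \exp\bigl( \beta\, Q(s,a) \bigr) \Bigr]^{1/(1-\eta)} .
\end{align}
```

Setting $\eta = 0$ recovers the familiar policy-compression solution $\pi^{*}(a \mid s) \propto P(a)\exp(\beta Q(s,a))$; as $\eta$ grows toward 1 the exponent $1/(1-\eta)$ grows, which is one way to see why the policy would become sharper without changing $\beta$.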
Intentional communication has been studied extensively in primates, yet evidence from free-ranging non-ape species remains limited. Human-directed food-solicitation gestures in Hanuman langurs (Semnopithecus entellus) have recently been described, but whether these behaviours exhibit behavioural hallmarks associated with first-order intentionality remains unknown. Here, we experimentally investigated the presence of these hallmarks in free-ranging Hanuman langurs across six anthropogenic sites in southern West Bengal, India. We conducted 360 experimental and control trials and quantified behavioural markers commonly used to operationalize intentional communication. Experimental trials elicited audience checking, recipient-directed orientation, rapid approach responses, food-solicitation gestures and gestural flexibility, whereas these behaviours were rare or absent in control trials. Differences between experimental and control conditions were significant across all six study sites. Signalling also ceased following food acquisition, consistent with the stopping rule associated with an Apparently Satisfactory Outcome. Our findings demonstrate the presence of multiple behavioural hallmarks linked to first-order intentionality in the human-directed gestural communication of free-ranging Hanuman langurs. These results extend the study of intentionality beyond apes and provide new insights into the evolutionary distribution of intentionality-related traits across primates.
In recent years, neural ordinary differential equation frameworks such as Biologically-Informed Neural Networks (BINNs) have shown promise for learning mechanistic laws from sparse data. However, most existing approaches implicitly assume homoscedastic Gaussian noise, and therefore do not account for potentially meaningful structure in biological variability. Here, we present an extension to the existing BINNs framework that includes a learnable noise model, allowing discovery of the noise model directly from data. Using population growth as an example, we demonstrate that the framework accurately recovers the underlying noise structure and improves predictions of the underlying growth laws compared to existing approaches. As such, this work establishes a general likelihood-based framework for jointly learning dynamics and heteroscedastic noise within mechanistic neural network approaches.
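To make the contrast between homoscedastic and heteroscedastic likelihoods concrete, here is a minimal sketch in plain Python. It is not the BINNs implementation: the linear noise model `proportional_sigma` and its parameters `a` and `b` are hypothetical, and the toy numbers are invented purely for illustration.

```python
import math


def gaussian_nll(observed, predicted, sigmas):
    """Mean Gaussian negative log-likelihood with one standard deviation per point."""
    total = 0.0
    for y, mu, s in zip(observed, predicted, sigmas):
        total += 0.5 * math.log(2 * math.pi * s * s) + (y - mu) ** 2 / (2 * s * s)
    return total / len(observed)


def proportional_sigma(predicted, a, b):
    """Hypothetical noise model: sigma = a + b * |prediction| (a > 0 keeps sigma positive)."""
    return [a + b * abs(mu) for mu in predicted]


# Invented toy data in which the scatter grows with population size.
predicted = [1.0, 10.0, 100.0]
observed = [1.1, 11.0, 110.0]

# Homoscedastic assumption: one fixed sigma for every observation.
homo_nll = gaussian_nll(observed, predicted, [5.0] * len(predicted))

# Heteroscedastic assumption: sigma scales with the predicted value.
hetero_nll = gaussian_nll(observed, predicted, proportional_sigma(predicted, 0.01, 0.1))
```

On this toy data the heteroscedastic likelihood assigns a lower loss than the fixed-sigma one; in a learnable-noise setting, `a` and `b` (or a small network replacing them) would be optimized jointly with the dynamics.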
Lipid metabolism is a central biological process that is commonly studied using destructive mass-spectrometry experiments. A recently proposed strategy uses multiple labels to extract temporal information about lipid metabolism from a single destructive measurement. However, the computational complexity of the model-based data analysis increases rapidly with the number of labels, creating a fundamental trade-off between the information content of the measurements and the cost of analysis. Here, we examine how the number of modelled labels affects parameter estimation accuracy, trajectory recovery, and computational cost, and whether modelling fewer labels than are experimentally available can mitigate this trade-off. Using synthetic data from a five-label experiment, we find that modelling three of the five labels provides a practical balance between experimental feasibility, inferential power, and computational tractability. In an application to hepatocyte triglyceride cycling, we further show that the most cost-efficient, single-label model can yield biologically implausible predictions for unobserved species, whereas models that resolve more labels better constrain these latent dynamics. These results provide practical guidance for selecting model resolution in multi-label experiments and establish a quantitative basis for balancing inferential power against computational cost.
This work advances epidemic control beyond traditional mass vaccination models by integrating population heterogeneity, network structure, and machine-learning-based decision policies. Using the Email-Eu-core contact network, we compare classical centrality-driven vaccination strategies with graph neural network (GNN) and reinforcement learning (RL) approaches. Across 30 stochastic simulations, classical heuristics, including degree, betweenness, and layer-based vaccination, exhibit similar performance, reflecting the network's dense connectivity and modest community structure. In contrast, the GNN-based strategy substantially reduces peak infection, final epidemic size, and time to peak, demonstrating its ability to identify structurally critical nodes that classical metrics overlook. These results show that learning-based vaccination policies can significantly outperform traditional heuristics by exploiting higher-order relational patterns in real-world networks, offering a powerful framework for targeted epidemic intervention.
Human behaviour and epidemic dynamics are intertwined, yet accounting for this feedback remains one of the key challenges of epidemiological modelling. The COVID-19 pandemic was an opportunity to overcome the traditional limitations of the field, raising expectations that data-informed endogenous approaches to behaviour modelling would advance substantially. To quantify the progress made, we conducted a systematic review of SARS-CoV-2 transmission models endogenously including human behaviour in response to epidemic dynamics. The COVID-19 pandemic saw great strides in terms of the expanded use of empirical data in epi-behavioural modelling. However, it also showed shortcomings with respect to limited use of behavioural empirical data, lack of innovation in model structure, and limited engagement with other disciplines and decision-makers. Overall, our results suggest that identifying priorities in model design and behavioural data, building an adequate data collection infrastructure, leveraging AI advancements, and fostering interdisciplinarity are strategies of utmost importance for pandemic preparedness.
We develop a four-compartment susceptible--casual--addicted--resistant (SCAR) model for adolescent substance use in a high-school setting. The model divides students into susceptible non-users, casual or experimental users, students with sustained or substance-use-disorder (SUD)-level involvement, and resistant students in protective anti-use environments. It includes peer-driven initiation, escalation from casual to problematic use, protective peer influence, school disengagement, and partial re-entry after rehabilitation. Qualitative analysis and bifurcation diagrams show three main results. First, the return parameter \(\phi\) separates two regimes: when \(\phi=1\), the total population is conserved and interior equilibria may exist; when \(\phi<1\), problematic use causes net school-population loss, so positive scaled equilibria may not represent true endemic equilibria. Second, initiation and escalation are governed by distinct thresholds, meaning first use and progression to problematic use are dynamically different. Third, the model can exhibit multistability, including bistability between a substance-free state and a stable high-use state, so long-term outcomes may depend on initial conditions. These findings suggest that effective school policy should combine universal prevention, early intervention for casual users, targeted support for students at risk of problematic use, recovery-supportive environments, and strong school re-engagement pathways.
Protein language models are trained on highly imbalanced datasets, raising the question of how they represent underrepresented biological sequences. Using viral proteins as a case study across ESM model families, we identify a dominant nativeness axis in embedding space, aligned with masked reconstruction perplexity, that orders sequences from well-modeled cellular proteins through viral proteins to shuffled and random sequences. Scaling contracts this axis unevenly across viral families. Despite this, protein language model embeddings retain viral-specific signal: viral proteins remain linearly separable beyond zero-shot perplexity and shallow sequence features. Together, these results suggest that pLM representations are structured by a general notion of nativeness while preserving information specific to distinct biological groups.
Predicting how a cell's transcriptome responds to a drug it has never seen is a core, hard problem in computational cell biology: recent benchmarks show complex models often fail to beat trivial baselines once test compounds are held out by chemistry. We study one cell line and assay, THP-1 cells profiled by DRUG-seq, scored by the active-compound weighted MSE (wMSE) of the VCPI prediction contest. We propose a staged approach: dumb baselines (untreated control and mean training-compound response) that the field keeps failing to beat; non-parametric retrieval (a Tanimoto-weighted average of a held-out compound's nearest training compounds); and a fusion stage combining a frozen chemistry embedding with retrieval-support features to predict the residual over the mean, with an uncertainty head and gene programs. On the released VCPI THP-1 DRUG-seq data (14,026 training compounds), under a Bemis-Murcko scaffold split, the model ranking inverts depending on the metric. Under an inverse-variance per-gene proxy, a regularized linear regression on Morgan fingerprints appears to win over the deep models, retrieval, and ChemBERTa -- the textbook "simple baselines win" result. But under the contest's true active-set metric (per-(gene, compound) Mejia weights, validated against the official scorer; mean baseline 0.535 vs the organizers' 0.507 reference), that reverses: the deep models win, our fusion decoder significantly beats the linear fingerprint baseline (-0.012 wMSE, paired bootstrap p < 10^-4), and the proxy's winner becomes the worst chemistry-aware predictor. Picking the metric picks the winner -- to our knowledge the first demonstration on real held-out drug chemistry of the metric-calibration effect established largely on genetic perturbation. We release a reproducible pipeline wired to the official scorer that emits a valid submission over the real 1064 x 12,995 grid.
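The non-parametric retrieval stage described above (a Tanimoto-weighted average over a held-out compound's nearest training compounds) can be illustrated with a small sketch. This is not the released pipeline: fingerprints are represented here as plain Python sets of on-bit indices, and the function names, the neighbour count `k`, and the fallback to the mean training response are all assumptions made for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def retrieve_response(query_fp, train_fps, train_responses, k=3):
    """Similarity-weighted average of the k most similar training responses.

    train_responses[i] is the per-gene response vector of training compound i.
    If no neighbour shares any bit with the query, fall back to the mean
    training response (an assumed fallback, mirroring the mean baseline).
    """
    n_genes = len(train_responses[0])
    scored = sorted(
        ((tanimoto(query_fp, fp), i) for i, fp in enumerate(train_fps)),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in scored)
    if total == 0.0:
        n = len(train_responses)
        return [sum(r[g] for r in train_responses) / n for g in range(n_genes)]
    return [
        sum(sim * train_responses[i][g] for sim, i in scored) / total
        for g in range(n_genes)
    ]


# Invented toy example: three training compounds, two genes.
train_fps = [{1, 2, 3}, {1, 2}, {7, 8}]
train_responses = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
prediction = retrieve_response({1, 2, 3}, train_fps, train_responses, k=2)
```

In practice the fingerprints would come from a cheminformatics toolkit (for example Morgan fingerprints), and the prediction would be compared against the mean baseline under whichever metric is being reported.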
Machine-learning drug-discovery pipelines increasingly rely on generative models that propose molecules far from the data used to train downstream synthesizability filters. Existing filters (SAScore, SCScore, RAscore, DeepSA) are purely statistical and degrade in exactly this out-of-distribution (OOD) regime. We ask whether cheap, closed-form physical priors, used as auxiliary supervision on a graph neural network (GNN), improve OOD generalization. We add two auxiliary losses to a GINE backbone: a topological complexity regression supervised by the Bertz index, and a strain-energy soft penalty supervised by MMFF94 force-field energy. On a 65,177-molecule corpus (HIV, Tox21, COCONUT) labeled by SAScore thresholds we reproduce a strong in-distribution baseline, then evaluate a 4-way ablation (baseline / +complexity / +strain / +both) on a single-source OOD split (train on drug-like HIV+Tox21, test on COCONUT natural products), repeated over 5 seeds with paired bootstrap confidence intervals. All three physics-aware variants give a small but statistically significant OOD improvement over the baseline (mean OOD AUC 0.9774): +complexity Delta = +0.0060 (95% CI [+0.0023, +0.0102]), +strain Delta = +0.0032 ([+0.0008, +0.0052]), +both Delta = +0.0066 ([+0.0038, +0.0093]); every interval excludes zero, and the combination is best. The variants are indistinguishable in-distribution, so the effect is visible only under OOD evaluation. We are explicit that the effects are modest, and we report a cautionary methodological finding: a single-seed version of this experiment produced a qualitatively different (non-monotone) story that did not survive multi-seed evaluation.
Physics-Informed Neural Networks (PINNs) are an attractive tool for partial-observation problems in biology, where the governing dynamics are known but some compartments cannot be measured. Chemotherapy pharmacokinetics (PK) is a clean instance: drug concentration in plasma is routinely measured, but concentration in tissue -- which determines tumour kill and off-target toxicity -- is not. We benchmark a PINN against the standard clinical baseline (nonlinear least-squares on the analytical biexponential plasma solution, hereafter NLS) and a physics-agnostic neural baseline (a data-only MLP) on two PK problems. On the linear two-compartment problem, NLS is near-optimal; the PINN matches it to within a small constant factor while also producing the tissue curve in a single training pass, whereas the data-only MLP fails on tissue by roughly 10x. On a Michaelis-Menten extension (saturable elimination), the biexponential closed form no longer exists, so NLS is mis-specified and silently returns meaningless rate constants. The PINN instead exposes a deeper fact: the Michaelis-Menten two-compartment model is non-identifiable from plasma alone, and the PINN reports this honestly by converging to a basin with k12 -> 0. Adding two sparse tissue observations largely resolves identifiability: across five seeds the PINN recovers k21 to within 1% of truth and Vmax, Km to within one standard-deviation bar, while k12 moves in the correct direction (0.02 -> 0.82) but remains ~2 sigma below truth -- a recovery the closed-form NLS estimator cannot attempt at all, because its biexponential ansatz describes only plasma. Our claim is not that PINNs beat NLS. It is that PINNs offer a uniform recipe that ties the textbook estimator on the textbook problem, exposes structural identifiability that the textbook estimator hides, and absorbs heterogeneous measurements within a single loss.
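For readers unfamiliar with the "analytical biexponential plasma solution" that the NLS baseline fits, the following sketch evaluates the textbook closed form for a linear two-compartment model after an intravenous bolus. The bolus-dosing assumption, the function name, and the parameter names `dose` and `volume` are not taken from the abstract; only the rate constants k12 and k21 are named there, and `k10` is assumed here to denote linear elimination from plasma.

```python
import math


def biexponential_plasma(t, dose, volume, k10, k12, k21):
    """Plasma concentration C(t) = A*exp(-alpha*t) + B*exp(-beta*t).

    alpha and beta are the roots of s^2 - (k10 + k12 + k21)*s + k10*k21 = 0.
    Assumes an IV bolus at t = 0 and alpha != beta (true whenever k12 > 0).
    """
    total = k10 + k12 + k21
    disc = math.sqrt(total * total - 4.0 * k10 * k21)
    alpha = (total + disc) / 2.0  # fast (distribution) rate
    beta = (total - disc) / 2.0   # slow (terminal) rate
    c0 = dose / volume            # initial plasma concentration
    a = c0 * (alpha - k21) / (alpha - beta)
    b = c0 * (k21 - beta) / (alpha - beta)
    return a * math.exp(-alpha * t) + b * math.exp(-beta * t)


# Invented parameter values, for illustration only.
curve = [biexponential_plasma(t, dose=100.0, volume=10.0, k10=0.3, k12=0.5, k21=0.2)
         for t in (0.0, 1.0, 4.0, 12.0)]
```

This closed form is what becomes unavailable once elimination is saturable (Michaelis-Menten), which is the mis-specification the abstract refers to; it also describes plasma only, so it carries no direct information about the tissue compartment.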
Here we address the problem of estimating the dimensionality and nature of parametric variation in an unknown generative process directly from time-series data, without specifying or fitting a model. In particular we suppose that inter-instance variation in collections of time series is caused by parametric variation in the generating model. We hypothesize that, given a sufficiently large library of time-series features, low-dimensional parametric variation will manifest as low-dimensional structure in feature space, enabling interpretable estimators of the underlying degrees of freedom to be constructed. We test our hypothesis using a library of over 7000 diverse and interpretable time-series statistics and thirteen simulated systems with known parametric variation, spanning linear stochastic processes, nonlinear oscillators, and chaotic dynamics. Our unsupervised, data-driven approach often reconstructs the underlying parametric variation across this extensive range of simulated dynamical systems while also yielding interpretable estimators for each underlying dimension. Applied to the movement dynamics of 1143 fruit flies, we use this method to extract biologically meaningful components corresponding to sex and circadian rhythmicity. Our results pave the way for much-needed data-driven methods to bridge the gap between interpretable theoretical understanding of dynamics and the large and complex datasets that characterize modern scientific problems.
Large Language Models such as GPT-4o and GPT-5 achieve strong zero-shot performance on biomedical claim verification, but cost and opacity limit scalable use. We fine-tune three small LLMs: Phi-3-mini (3.8B), Qwen2.5-3B, and Mistral-7B, via QLoRA on SciFact and HealthVer, providing the first study of QLoRA models against GPT-4o and fine-tuned BioLinkBERT encoders. Mistral-7B QLoRA surpasses both GPT-4o and GPT-5 (up to 12% F1 gain) at a fractional cost using just 1,008 training examples. We conduct extensive in-domain and cross-domain evaluation: models trained on SciFact tested on HealthVer and vice versa, at matched sizes to isolate dataset structure from data quantity. We identify a previously unreported structural artifact in SciFact that inflates in-domain scores, and show through bidirectional out-of-domain evaluation that training on structurally sound data enables robust cross-domain transfer. We plan to release all code and adapter checkpoints.
Identifying latent dynamical systems from noisy, high-dimensional measurements is a central problem at the intersection of representation learning, system identification, and scientific discovery. We present DYSCO, a multi-view temporal contrastive learning algorithm that jointly recovers latent trajectories and the governing dynamics from such observations, by leveraging multiple independent noisy views of the same underlying process to disentangle signal from noise. By parameterizing the dynamics in a structured functional basis, our framework further enables symbolic recovery of the governing equations within an affine gauge. We offer theoretical guarantees for strong identification up to an affine indeterminacy, extending prior identifiability results to the realistic setting of noisy nonlinear observations. Empirically, we demonstrate accurate recovery of both latent trajectories and flow fields across a diverse set of dynamical regimes (e.g., chaotic, oscillatory, and metastable) under both Gaussian and Poisson observation noise, the latter being particularly relevant for neural recordings.
Lonafarnib (LNF) is an investigational drug targeting hepatitis delta virus (HDV) but not hepatitis B virus (HBV), providing a unique opportunity to model HDV kinetics and how changes in HDV affect HBV. We performed a detailed kinetic analysis and developed a mathematical model to explain serum HBV DNA, HDV RNA and hepatitis B surface antigen (HBsAg) kinetics in 15 HBV/HDV coinfected patients receiving LNF-based treatment. After a delay of 0-2 days, patients experienced a rapid 1st-phase HDV-decline followed by either a viral plateau, 2nd slower-decline phase, or viral breakthrough (VB). LNF monotherapy led to a flat-partial-response (often followed by VB), while LNF combination therapy with ritonavir or pegylated interferon-$\alpha$ (PEG-IFN$\alpha$) was associated with a biphasic HDV decline (without VB). All treatments except LNF+PEG-IFN$\alpha$ had at least one patient experiencing an increase in HBV on-treatment. Our model successfully reproduced the observed HDV and HBV kinetics. We estimated an HDV RNA half-life of 1.26 days [95% confidence interval, CI: 1.05--1.47] in serum and treatment efficacy of 94% in inhibiting HDV RNA production across all treatments [95% CI: 89%--97%], as reflected by the 1st phase HDV decline. The 2nd phase of HDV decline was explained by a time-dependent increase in efficacy, reaching a maximum of 98.9%. The model explained the increase in serum HBV DNA by a median 4-fold [interquartile range, IQR: 1--28] increase in HBV DNA production rate when HDV declined below an inhibitory threshold. The stability of serum HBsAg was explained by a constant number of HBsAg-producing cells.
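The reported half-life and efficacy can be related to the size of the first-phase decline with a short calculation. The sketch below assumes the standard first-phase approximation V(t)/V0 = 1 - eps + eps*exp(-c*t), in which treatment blocks a fraction eps of viral production and free virus is cleared at rate c. The abstract does not state the model equations, so this form and all function names are illustrative assumptions rather than the authors' model.

```python
import math


def clearance_rate(half_life_days):
    """Convert a serum half-life (days) into a first-order clearance rate (per day)."""
    return math.log(2) / half_life_days


def first_phase_log10_drop(efficacy):
    """Log10 decline reached at the end of the first phase for a given efficacy.

    Under the assumed approximation the load plateaus at (1 - efficacy) * V0.
    """
    return -math.log10(1.0 - efficacy)


def viral_load_fraction(t_days, efficacy, c):
    """V(t)/V0 under the assumed first-phase approximation."""
    return 1.0 - efficacy + efficacy * math.exp(-c * t_days)


c = clearance_rate(1.26)                 # about 0.55 per day
drop_94 = first_phase_log10_drop(0.94)   # about 1.22 log10
drop_989 = first_phase_log10_drop(0.989) # about 1.96 log10
```

Under these assumptions, an efficacy of 94% corresponds to a first-phase decline of roughly 1.2 log10, and the maximum efficacy of 98.9% to roughly 2 log10.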
Personalized health AI systems face a fundamental cold-start problem: machine learning models for physiological interpretation require weeks of individual behavioral data before they can distinguish constitutional variation from environmentally driven deviation. We propose a solution grounded in causal inference and Bayesian prior design. An individual's genomic profile serves as an exogenous genetic anchor -- a domain-informed, personalized prior that is fixed at conception, immune to reverse causation, and available before a single behavioral observation is collected. The anchor initializes a Bayesian belief state over an individual's physiological set point G-hat = mu + sum(beta_i * g_i), where beta_i are GWAS-derived effect sizes and g_i are risk-allele counts. Each incoming physiological measurement P produces a non-constitutional deviation delta = P - G-hat that separates the signal attributable to environment and state from the constitutionally fixed baseline. As behavioral data accrue, the prior decays according to G-hat_t = w(t)*G-hat_genomic + [1-w(t)]*P-bar_t, transitioning from genome-dominated to empirical-baseline-dominated inference. The same observed HRV of 55 ms generates a suppression hypothesis for a person whose prior predicts 80 ms, and an enhancement hypothesis for a person whose prior predicts 30 ms -- a reversal impossible without a personalized anchor. We develop this architecture across six physiological domains, grading genomic priors by evidence strength, distinguishing robustly replicated anchors (FTO, FADS1/2, FKBP5) from contested candidate genes (SLC6A4, MAOA, DRD2). We address the inference boundary between association, Mendelian randomization, and individual token causation, and define four constraints for deployment: evidence-graded priors, dynamic decay, ancestry-matched effect sizes, and attribution rather than deterministic output.
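The set-point, deviation, and prior-decay formulas above can be sketched in a few lines. The abstract does not specify the weight function w(t); the exponential form and the `tau_days` parameter below are hypothetical choices made only for illustration, as are all function names.

```python
import math


def genomic_set_point(mu, betas, allele_counts):
    """G-hat = mu + sum(beta_i * g_i)."""
    return mu + sum(b * g for b, g in zip(betas, allele_counts))


def prior_weight(t_days, tau_days=30.0):
    """Hypothetical w(t): exponential decay from 1 toward 0 (not from the source)."""
    return math.exp(-t_days / tau_days)


def blended_set_point(g_hat_genomic, observations, t_days, tau_days=30.0):
    """G-hat_t = w(t) * G-hat_genomic + [1 - w(t)] * P-bar_t."""
    if not observations:
        # Cold start: only the genomic prior is available.
        return g_hat_genomic
    w = prior_weight(t_days, tau_days)
    p_bar = sum(observations) / len(observations)
    return w * g_hat_genomic + (1.0 - w) * p_bar


def deviation(measurement, g_hat):
    """delta = P - G-hat; negative suggests suppression, positive enhancement."""
    return measurement - g_hat


# The HRV example from the text: the same 55 ms reading against two priors.
delta_high_prior = deviation(55.0, 80.0)  # -25.0
delta_low_prior = deviation(55.0, 30.0)   # 25.0
```

The sign of the deviation flips between the two individuals even though the measurement is identical, which is the reversal the text describes.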
We often decide how to treat friends based on observations of their past behavior, whereas actions toward strangers are typically guided by their public reputations. These two kinds of information underlie two classical mechanisms for the evolution of cooperation, direct and indirect reciprocity, which have largely been studied in isolation. They are not interchangeable: we can recall the past actions of only a small circle of close contacts, whereas for the far larger pool of strangers we must rely on public reputations. Here we develop a mathematical framework built on this distinction. Each individual engages in direct reciprocity in local games within a finite neighborhood of friends, whose actions they observe directly, and in indirect reciprocity in global games with a large population of strangers, known only by reputation. Separating local and global interactions allows us to address two questions. First, can cooperation persist under a cognitively simple norm of judgment? We show that combining direct and indirect reciprocity resolves the scoring dilemma: conditional cooperators resist invasion by both unconditional cooperators and unconditional defectors, where indirect reciprocity alone would fail. Second, how should one treat a friend whose past behavior conflicts with their public reputation? We find that the strategies that maximize cooperation are forgiving, overlooking whichever piece of information is unfavorable, and that these forgiving strategies can often remain robust to invasion. By distinguishing between local and global scales of interaction and integrating information across them, our framework offers a more cognitively realistic account of how reciprocity sustains cooperation.
Glaucoma treatment relies on effective delivery of therapeutics to the anterior chamber; however, conventional approaches such as topical administration and intracameral injection are limited by rapid clearance and low intraocular bioavailability. In this study, a Computational Fluid Dynamics (CFD) framework was developed to comparatively evaluate drug transport, retention, and spatial distribution across three delivery strategies: intracameral injection, drug-eluting implants, and topical delivery via contact lens.
The study of biological conductivity has evolved from its classical focus on the ionic fluxes underpinning cardiac and neuronal excitability to a view of bioelectricity as a multifaceted regulator of cellular physiology. Traditional approaches for probing electrical events in living matter focused largely on action potential recordings. However, bioelectricity in non-excitable cells governs key phenomena, including developmental patterning, tissue homeostasis, and disease progression. Pioneering studies implicated endogenous bioelectrics in many aspects of morphogenesis, wound healing, regeneration, and cancer. Early findings laid the groundwork for viewing bioelectricity as a means to influence cell fate, cell cycle progression, differentiation, and senescence. More recently, spatial variations in membrane potential within tumor microenvironments were found to correlate with metastatic potential. In parallel, substantial breakthroughs have been achieved in designing advanced bioelectrical interfaces for the study of neuronal networks and cardiac function. This perspective bridges the engineering and biological domains by examining how such technologies might enable new insights into electrical events in non-excitable cells at different scales of operation, with the ultimate aim of manipulating cellular pathways in cancer reprogramming, anti-aging interventions, and gene expression modulation.
Reconstructing latent state-space geometry from time series provides a powerful route to studying nonlinear dynamics across complex systems. Delay-coordinate embedding provides the theoretical basis but assumes long, noise-free recordings, which many domains violate. In many real-world domains, recordings are short, noisy, and coarsely sampled; in neuroimaging, for example, fMRI additionally contains autocorrelated background structure that can obscure oscillatory components and destabilize embeddings. We propose bootstrap Monte Carlo singular spectrum analysis (BMC-SSA), which combines Monte Carlo SSA with bootstrap stability to retain oscillatory modes that are statistically supported and reproducible across resampled data. This produces reconstructions that emphasize reliable oscillatory structure, enhancing determinism and stabilizing subsequent embeddings. Our results show that BMC-SSA improves the reliability of functional measures and uncovers differences in state-space dynamics in fMRI, offering a general framework for robust embedding of noisy, finite signals.
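For readers unfamiliar with delay-coordinate embedding, the basic construction can be sketched as follows. This shows only the plain embedding step that BMC-SSA is meant to stabilize, not the BMC-SSA procedure itself; the function name and the `dim` and `lag` parameters are illustrative assumptions.

```python
def delay_embed(series, dim, lag):
    """Map a scalar time series to delay vectors.

    Each vector is (x[i], x[i + lag], ..., x[i + (dim - 1) * lag]).
    """
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    n_vectors = len(series) - (dim - 1) * lag
    if n_vectors < 1:
        raise ValueError("series is too short for this dim and lag")
    return [
        tuple(series[i + j * lag] for j in range(dim))
        for i in range(n_vectors)
    ]


# A 5-sample series embedded in 3 dimensions with unit lag.
vectors = delay_embed([0, 1, 2, 3, 4], dim=3, lag=1)
# vectors == [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
```

The number of usable vectors shrinks by (dim - 1) * lag samples, which is one reason short recordings make embeddings fragile.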
Evolution occurs in populations of reproducing individuals. In stochastic descriptions of evolutionary dynamics, such as the Moran process, individuals are chosen randomly for birth and for death. If the same type is chosen for both steps, then the reproductive event is wasted, because the composition of the population remains unchanged. Here we introduce a new phenotype, which we call a replacer. Replacers are efficient competitors. When a replacer is chosen for reproduction, the offspring will always replace an individual of another type (if available). We determine the selective advantage of replacers in well-mixed populations and on one-dimensional lattices. We find that being a replacer substantially boosts the fixation probability of neutral and deleterious mutants. In particular, fixation probability of a single neutral replacer who invades a well-mixed population of size $N$ is of the order of $1/\sqrt N$ rather than the standard $1/N$. Even more importantly, replacers are much better protected against invasions once they have reached fixation. Therefore, replacers dominate the mutation selection equilibrium even if the phenotype of being a replacer comes at a substantial cost: curiously, for large population size and small mutation rate the relative reproductive rate of a successful replacer can be as low as $1/e$.
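The $1/\sqrt N$ scaling for a neutral replacer can be checked numerically with the standard birth-death fixation formula. The sketch below assumes a well-mixed Moran process in which the parent is drawn uniformly, a resident's offspring replaces a uniformly chosen individual (possibly one of its own type), and a replacer's offspring always replaces a resident. With $i$ replacers this gives $T^+ = i/N$ and $T^- = \frac{N-i}{N}\cdot\frac{i}{N}$. These update details are assumptions for illustration; the abstract does not spell them out.

```python
def fixation_probability_replacer(n):
    """Fixation probability of one neutral replacer in a population of size n.

    Uses rho = 1 / (1 + sum_{k=1}^{n-1} prod_{i=1}^{k} gamma_i),
    with gamma_i = T-/T+ = (n - i) / n under the assumed update rule.
    """
    if n < 2:
        raise ValueError("population size must be at least 2")
    total = 1.0
    product = 1.0
    for i in range(1, n):
        product *= (n - i) / n  # gamma_i for i replacers
        total += product
    return 1.0 / total


def fixation_probability_neutral(n):
    """Standard neutral Moran result, 1/n, for comparison."""
    return 1.0 / n


rho_replacer = fixation_probability_replacer(100)  # roughly 0.08
rho_neutral = fixation_probability_neutral(100)    # 0.01
```

Under these assumptions the product rho * sqrt(N) stays close to a constant as N grows, consistent with the order-$1/\sqrt N$ behavior stated above.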
Global pandemics, such as the recent COVID-19 crisis, highlight the need for stochastic epidemic models that can capture the randomness inherent in the spread of disease. Such models must be accompanied by methods for estimating parameters in order to generate fast nowcasts and short-term forecasts that can inform public health decisions. This paper presents a comparison of two advanced Bayesian inference methods: 1) pseudo-marginal particle Markov chain Monte Carlo, using an unbiased likelihood estimate obtained by a particle filter (PF), and 2) Conditional Normalizing Flows (CNF). We investigate their performance on three commonly used compartmental models: a classical Susceptible-Infected-Susceptible (SIS) model, a Susceptible-Infected-Recovered (SIR) model, and a two-variant Susceptible-Exposed-Infected-Recovered (SEIR) model, each complemented by an observation model that maps latent trajectories to empirical data. Addressing the challenges of intractable likelihoods for parameter inference in stochastic settings, our analysis highlights how these likelihood-free methods provide accurate and robust inference capabilities. The results of our simulation study further underscore the effectiveness of these approaches in capturing the stochastic dynamics of epidemics, providing prediction capabilities for the control of epidemic outbreaks. Results on an Ethiopian cohort study demonstrate operational robustness under real-world noise and irregular data sampling. To facilitate reuse and to enable building pipelines that ultimately contribute to better informed decision making in public health, we make code and synthetic datasets publicly available.
Learning, inference, memory, and emergence in biological and artificial systems are often described using disparate theoretical frameworks. Here we develop a cognitive field theory in which cognition is described as a collective nonequilibrium phenomenon governed by the geometry and relaxation spectrum of a learned cognitive manifold. Starting from a stochastic cognitive-field equation on an adaptive Riemannian cognitive manifold, we derive a memory-dressed cognitive field equation incorporating nonlocal memory kernels and retarded self-energy feedback. We show that the local stability structure of learned cognitive geometry generates a spectrum of collective relaxation modes whose distribution is characterized by a time-scale density of states (TDOS). The TDOS provides a fundamental dynamical descriptor of cognition and determines the emergent memory kernel, collective response, and infrared temporal organization of the cognitive field. The accumulation of weakly damped collective modes suppresses the cognitive forgetting gap, enhances collective susceptibility, and drives the system toward a protected near-critical regime characterized by long-time contextual persistence and collective cognitive coherence. The resulting framework provides a unified dynamical description of learning, memory, inference, selfhood, and emergent cognition in terms of the collective organization of a memory-dressed cognitive field.
Precision oncology is currently limited by the small-N, large-P paradox, where high-dimensional genomic data is abundant but pharmacological response samples are sparse. While deep learning achieves predictive accuracy, it frequently fails to provide the mechanistic clarity required for clinical adoption. We present the Contextual Invertible World Model (CIWM), a Neuro-Symbolic Agentic Framework that bridges this gap by integrating a quantitative machine learning emulator with a Large Language Model reasoning layer. Utilising a stringently curated, high-fidelity data engineering pipeline on the Sanger GDSC dataset (\( N=83 \)), we isolate true biological signals from in vitro artifacts to establish a rigorous baseline predictive correlation for complex transcriptomics (\( r=0.268 \)). Through Inverse Reasoning, we perform in silico CRISPR perturbations across the colorectal landscape. The framework autonomously overturns classical mechanistic assumptions, identifying a hierarchical dominance of mutant KRAS over the APC/Wnt-axis in driving 5-fluorouracil resistance (\( \Delta=-0.0469 \)) via a "KRAS Shield" mapped to MAPK/PI3K networks. Furthermore, the agentic layer identified a "PIK3CA Paradox", revealing that repairing PIK3CA inadvertently increases chemoresistance (\( \Delta=+0.0085 \)) by triggering a compensatory feedback loop that hyperactivates the dominant MAPK survival pathway.
Dynamic functional connectivity (dFC) derived from resting-state functional magnetic resonance imaging (fMRI) has been extensively utilized in brain science research. The sliding window correlation (SWC) method is a widely used approach for constructing dFC by computing correlation coefficients between amplitude time series of signals from pairs of brain regions. In this study, we propose an integrated approach that incorporates both amplitude and phase information of fMRI signals to improve the detection of brain disorders. Specifically, we introduce a multi-scale fusion learning framework, namely MSFL, which leverages two complementary dFC features derived from SWC and phase synchronization (PS). Here, SWC captures amplitude correlations, while PS measures phase coherence within dFC. We evaluated the efficacy of MSFL in classifying autism spectrum disorder and major depressive disorder using two publicly available datasets: ABIDE I and REST-meta-MDD, respectively. The results indicate that MSFL significantly outperforms existing comparative models. Moreover, we performed model explanation analysis using the SHAP framework, which showed that both types of dFC features from SWC and PS contribute to detecting brain disorders.
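The amplitude branch of such a pipeline, sliding window correlation, can be sketched in plain Python. This covers only the SWC feature between one pair of regions; the phase synchronization branch and the MSFL fusion model are not shown. The function names and the `window` and `step` parameters are illustrative assumptions, not the authors' settings.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def sliding_window_correlation(x, y, window, step=1):
    """Correlation between two regional time series in each sliding window."""
    if len(x) != len(y):
        raise ValueError("time series must have equal length")
    if window < 2 or window > len(x) or step < 1:
        raise ValueError("invalid window or step")
    return [
        pearson(x[start:start + window], y[start:start + window])
        for start in range(0, len(x) - window + 1, step)
    ]


# Two toy signals: perfectly correlated in every 3-sample window.
region_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
region_b = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
dfc_series = sliding_window_correlation(region_a, region_b, window=3)
```

Repeating this for every pair of regions yields one connectivity matrix per window, which is the dFC representation the text refers to.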
Motivation: Protein function prediction is a challenging task and an open problem in computational biology. The Critical Assessment of protein Function Annotation (CAFA) is a triennial, community-driven initiative that provides an independent, large-scale evaluation of computational methods for protein function prediction through time-delayed benchmarking experiments. CAFA has played a key role in highlighting high-performing methodologies and fostering detailed analysis and exchange of ideas. However, outside the periodic CAFA challenges, there is no platform for the continuous evaluation of newly developed methods and tracking performance as function annotations accumulate. Results: Here we introduce the Longitudinal Assessment of Protein Function Annotation Models server (LAFA) as a persistent benchmarking system for protein function prediction methods. LAFA provides a continuous evaluation of containerized function prediction methods, enabling up-to-date and robust comparative assessment of method performance under evolving ground truth. LAFA accelerates methodological iteration, supports reproducibility, and offers a more dynamic and fine-grained view of progress in protein function prediction. Code and Data Availability: LAFA is available at this https URL. Detailed evaluation results can be found at this https URL
The human brain represents objects in a way that is both invariant across instances and flexible enough to support different contexts and tasks. Yet it remains unknown how object representations are dynamically remapped as the same object shifts across contextual roles. Using fMRI during naturalistic movie viewing we investigated how the same objects are represented when they are passive scene elements versus targets of goal-directed actions. Action targets engaged a parietal action network centered in the supramarginal and postcentral gyri, while passive objects recruited a distributed occipito-temporal network involved in visual object recognition. Within context-selective networks, representational geometry showed a double dissociation: target objects were organized by action affordance and hand posture affordance dimensions, while passive objects aligned with semantic dimensions. Visual representational structure was invariant to context. Outside these networks, representational content retained invariance, indicating that flexibility and invariance operate at different levels of the same representational system. These findings demonstrate neural remapping of object representations depending on moment-to-moment changes in contextual roles during a naturalistic scene.
Topologically Associating Chromatin Domains (TADs) are spatially distinct chromatin regions that regulate transcription by segregating active and inactive genomic elements. Empirical studies show that their formation correlates with local patterns of epigenetic markers, yet the precise mechanisms linking 1D epigenetic landscapes to 3D chromatin folding remain unclear. Recent models represent chromatin as a spin system, where nucleosomes are treated as discrete-state variables coupled by interaction strengths derived from genomic and epigenetic data. Classical samplers struggle with these models due to high frustration and dense couplings. Here, we present a quantum annealing (QA) approach to efficiently sample chromatin states, embedding an epigenetic Ising model into the topology of D-Wave quantum processors. Rather than reconstructing exact TAD size distributions or insulation scores, our method reproduces statistical features, such as mean marker incidences and intra-/inter-nucleosome correlations, while generating configurations that exhibit TAD-like structural motifs. These results demonstrate QA as an alternative means of exploring chromatin architecture and provide a foundation for its use in epigenetic modeling.
We introduce the Urysohn Machine, an effective model of classification-oriented computation in which metric separation, frontier structure, and contraction are explicit parts of the computational state. Its basic object is a \emph{Urysohn Triple}: a support region, a target partition, and a separating classifier stored in a reusable Metric Library. The topological foundation is a constructive Urysohn Realization theorem for finite simplicial settings. It builds separators from dyadic ladders of nested polyhedral regions and equips their frontiers with a chain-level calculus: frontiers are cycles, and shells between levels have boundaries given by differences of frontiers. This construction yields two related complexity measures: decision-boundary width, the geometric measure of a single classifier's boundary, and Urysohn width, the total frontier mass represented by a library or realization. We prove an Amortized Separation Theorem showing that approximating a boundary of a given width to a given accuracy requires a number of simple basis triples proportional to boundary width and inversely proportional to resolution, under explicit boundary-footprint assumptions. We also introduce a contrastive separation operator whose graph-cut functional consistently estimates decision-boundary width from sampled metric data, while its Laplacian spectrum certifies class-component structure and conductance. Finally, we analyze the dynamic Urysohn ladder and prove four guarantees: separability under quotient collapse, stability of committed frontiers, bounded capacity under contraction, and scalability with quotient distance. Together, these results give a metric-topological account of classification complexity, amortized inference, and compositional reuse that preserves classical computability while exposing geometric structure hidden by purely symbolic descriptions.
The maximum agreement forest (MAF) problem in phylogenetics takes as input a set T of t >= 2 binary phylogenetic trees on the same set of taxa X. It asks for a partition of X into the smallest number of blocks such that the subtrees induced by these blocks are disjoint and have a common topology across all the trees in T. We produce a modified version of the well-known chain reduction rule in order to prove that, after exhaustive application of reduction rules, each tree has O( t * r * k ) leaves, where k is the natural parameter (the number of blocks) and r = min{max{k,3}, t+1}. We prove this bound for both the unrooted and rooted versions of the problem, and demonstrate that the bound r, the length to which common chains are truncated, is tight. Our results constitute the first kernels for MAF in the t>2 regime.