New articles on Quantitative Biology


[1] 2603.15702

Whole slide and microscopy image analysis with QuPath and OMERO

QuPath is open-source software for bioimage analysis. As a desktop application that is flexible and easy to install, QuPath is used by labs worldwide to visualise and analyse large and complex images. However, relying only on images stored on a local file system limits QuPath's use for larger studies. This paper describes a new extension that enables QuPath to access pixels and metadata from an OMERO server. This enhances the software by allowing it to work efficiently with images stored remotely, while also serving as a template for developers who want to connect QuPath to other image management systems.


[2] 2603.16064

Evaluating Targeted Mobility Restrictions on COVID-19 Transmission in Seoul: A Metapopulation Modeling Study Using Mobile Phone Data

Broad mobility restrictions can help control infectious disease spread, but their socioeconomic costs and the variation in transmission risks by mobility purpose, age group, and spatial connectivity highlight the need for targeted approaches. In this study, we developed an age-structured SEIR metapopulation model for COVID-19 across Seoul's 25 districts, integrating mobile phone-derived origin-destination data. We stratified mobility by age (0-19, 20-59, 60+) and purpose: residential (H), school/work (W), and other non-routine (O). Using 2024 mobility data as a baseline and incorporating pandemic-period (2020-2021) mobility deviations, we investigated counterfactual strategies under various targeting scenarios. Our results showed that W restrictions among adults aged 20-59 produced the highest per-capita reductions in infection. Spatial clustering based on population-adjusted W inflows showed that high-inflow central business districts corresponded to the fast-spreading districts identified in the simulations. Targeting W flows into and within this cluster consistently reduced epidemic size across uncertain seeding locations. Furthermore, weekday-inclusive schedules outperformed weekend-only restrictions. Overall, our findings suggest that although citywide restrictions achieve larger reductions, strategically targeting routine school/work mobility among adults aged 20-59 within fast-spreading clusters can provide substantial epidemiological benefits while reducing broader socioeconomic disruption.
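
The coupling between districts in a model of this kind can be illustrated with a minimal sketch. It is not the authors' model: it drops the age structure and the H/W/O purpose split, and the function names, the mobility matrix, and the rate values (`beta`, `sigma`, `gamma`) are illustrative assumptions rather than values from the study.

```python
def step_seir(state, mobility, beta, sigma, gamma, dt):
    """Advance a deterministic SEIR metapopulation model by one Euler step.

    state    -- list of (S, E, I, R) tuples, one per district
    mobility -- row-stochastic matrix; mobility[i][j] is the share of
                district i residents who spend their contact time in j
    """
    n = len(state)
    # People present in each district, and the infectious among them.
    present = [sum(mobility[i][j] * sum(state[i]) for i in range(n)) for j in range(n)]
    infectious = [sum(mobility[i][j] * state[i][2] for i in range(n)) for j in range(n)]
    new_state = []
    for i, (s, e, inf, r) in enumerate(state):
        # Force of infection on residents of i, weighted by where they go.
        force = beta * sum(
            mobility[i][j] * infectious[j] / present[j]
            for j in range(n)
            if present[j] > 0
        )
        new_e = force * s * dt    # S -> E
        new_i = sigma * e * dt    # E -> I
        new_r = gamma * inf * dt  # I -> R
        new_state.append((s - new_e, e + new_e - new_i, inf + new_i - new_r, r + new_r))
    return new_state


def simulate(state, mobility, beta=0.4, sigma=0.2, gamma=0.1, dt=0.1, days=200):
    """Run the model for the given number of days (all rates are hypothetical)."""
    for _ in range(round(days / dt)):
        state = step_seir(state, mobility, beta, sigma, gamma, dt)
    return state
```

In this sketch, a targeted restriction corresponds to shrinking selected off-diagonal entries of `mobility` and moving that mass back to the diagonal before calling `simulate`.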


[3] 2603.16178

Early Pre-Stroke Detection via Wearable IMU-Based Gait Variability and Postural Drift Analysis

Early identification of individuals at risk of stroke remains a major clinical challenge, as prodromal motor impairments are often subtle and transient. In this pilot study, a wearable sensor-based framework is proposed for early pre-stroke risk screening using a single inertial measurement unit mounted on the sacral region to capture pelvic motion during gait and standing tasks. The pelvis is treated as a biomechanical proxy for global motor control, enabling the quantification of gait variability and postural drift as digital biomarkers of neurological instability. Raw inertial signals are processed using a sensor fusion pipeline to estimate pelvic kinematics, from which variability and nonlinear dynamic features are extracted. These features are subsequently used to train a machine learning model for risk stratification across control, pre-stroke, and stroke groups. Progressive increases in pelvic angular variability and postural instability are observed from the control to stroke groups, with the pre-stroke cohort exhibiting intermediate characteristics. As a proof-of-concept investigation, the proposed framework demonstrates the feasibility of using a minimal wearable configuration to capture pelvic micro-instability associated with early cerebrovascular motor adaptation. The classifier achieves a macro-averaged area under the curve of 0.785, indicating preliminary discriminative capability between risk categories. While not intended for clinical diagnosis, the proposed approach provides a low-cost, non-invasive, and scalable solution for continuous community-level screening, supporting proactive intervention prior to the onset of major stroke events.


[4] 2603.16194

TPMM: Three-component Posterior Mixture Model Enables Robust Inverton Detection in Low-Depth Metagenomes and Suggests Potential Viral Invertons

Bacterial phase variation enables reversible, locus-specific phenotypic switching, often driven by DNA inversion (invertons). To identify these events, researchers commonly rely on sequencing reads that provide orientation-specific support. Metagenomic sequencing, which captures total genetic material independent of cultivation, offers a powerful platform for the comprehensive study of invertons. However, computational inverton calling from metagenomic data is difficult at low sequencing depth: hard read-support cutoffs can miss true events, while sequence-only predictors lack read-backed interpretability and uncertainty quantification. To address this, we present TPMM, a three-component posterior mixture model for inverton calling in metagenomic data. TPMM explicitly incorporates sequencing depth to formulate inverton detection as a probabilistic mixture problem. Starting from candidates flanked by inverted repeats, the model classifies the candidates into noise, low-probability, or high-probability inversion signals using read evidence. Finally, TPMM assigns posterior probabilities as soft labels and applies cumulative Bayesian False Discovery Rate control to robustly identify true invertons. On two real gut metagenomic datasets, TPMM agrees well with PhaseFinder at high depth but recovers substantially more invertons under systematic downsampling, demonstrating superior performance in sparse-data regimes. We further examine potential reversible inversion elements in viral genomes and provide supporting analyses, suggesting a broader scope for inversion-mediated regulation.
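
The final selection step, cumulative Bayesian FDR control over posterior probabilities, can be sketched generically as follows. This is the standard procedure (rank by posterior, keep the longest prefix whose mean posterior error stays below the target), not TPMM's own code; the function name and the default `alpha` are assumptions.

```python
def bayesian_fdr_select(posteriors, alpha=0.05):
    """Return indices of candidates kept under cumulative Bayesian FDR control.

    posteriors -- posterior probability that each candidate is a true event
    alpha      -- target expected false discovery rate (hypothetical default)
    """
    # Rank candidates from most to least confident.
    order = sorted(range(len(posteriors)), key=lambda i: posteriors[i], reverse=True)
    selected, expected_false = [], 0.0
    for rank, idx in enumerate(order, start=1):
        # 1 - posterior is the local probability that this call is false.
        expected_false += 1.0 - posteriors[idx]
        # The running mean never decreases on a descending ranking,
        # so the first violation ends the selection.
        if expected_false / rank > alpha:
            break
        selected.append(idx)
    return selected
```

Because the decision uses the running average rather than a per-candidate cutoff, a candidate with moderate support can still be retained when the calls ranked above it are very confident.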


[5] 2603.16288

Hippocampus mediates conceptual generalization of pain modulation

Pain is strongly influenced by expectations and learning from previous experience, such as in classical conditioning. Conditioned responses and expectations can generalize to perceptually and conceptually related cues, but how generalization influences pain experience and the neurobiological processing of pain remains unclear. We used fMRI and multilevel mediation analyses to address this question. Thirty-six human participants first learned to associate two visual cues from distinct conceptual categories (e.g., animals vs. vehicles) with high or low levels of heat pain. In a subsequent phase, they were presented novel cues (images, drawings, or words) not previously paired with pain, but which shared the conceptual category of the initial pain-predictive cues. Participants who developed explicit expectations during learning reported greater pain in response to stimuli conceptually related to high-vs. low-pain cues ('generalization stimuli'), demonstrating generalization of cue influences on pain. This effect was mediated by increased pain-related activity to generalization stimuli in the hippocampus, which correlated with individual differences in cue-evoked expectations. A broader network, including areas of the default mode network and striatum, also contributed to conceptual generalization of pain modulation, while threat-related regions such as the amygdala responded to generalization stimuli but did not mediate effects on pain ratings. These findings extend our understanding of expectancy-driven pain modulation by showing how conceptual processes can influence pain and its neurobiological substrates, offering new insight into placebo effects and maladaptive learning in chronic pain.


[6] 2603.16501

The immediate effect of kangaroo mother care on mother-infant inter-brain synchrony and infant brain function

Kangaroo mother care (KMC) is an intervention involving skin-to-skin contact that promotes physiological stability and supports long-term neurodevelopment in preterm infants. However, the underlying neurophysiological mechanisms remain unclear. We aimed to investigate the immediate effects of the first KMC on infants' brain function, mother-infant inter-brain synchrony, as well as their associations. Fifty-eight preterm infants (gestational age < 32 weeks or birth weight < 1500 g) and their mothers underwent synchronous dual-electroencephalography recording before and during the first KMC session. Infant brain function was assessed via power spectrum energy and graph theory-based network metrics, and mother-infant inter-brain synchrony was quantified using phase-locking value (PLV), from which inter-brain density and inter-brain strength were calculated. Correlation analyses were performed between infant intra-brain metrics and inter-brain synchrony. During the first KMC, preterm infants showed enhanced theta, alpha, and beta power alongside reduced relative delta power, while brain network topological metrics remained stable. Concurrently, mother-infant inter-brain synchrony was significantly enhanced across all frequency bands, as evidenced by increased inter-brain density and strength (all p < .001). Furthermore, in the alpha band, inter-brain strength correlated positively with infant local efficiency and clustering coefficient, and in the beta band, it was positively correlated with infant small-worldness. The first KMC session can immediately enhance both preterm infant single-brain activity and mother-infant inter-brain synchrony. The strength of inter-brain synchrony is associated with the infant's intra-brain network organization, suggesting that KMC may promote intra-brain development in preterm infants via enhancing mother-infant inter-brain synchrony.
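
For readers unfamiliar with the phase-locking value, the following is a minimal sketch of its usual definition: the length of the mean unit vector of the phase differences between two signals. It assumes the instantaneous phases have already been extracted (commonly by band-pass filtering and a Hilbert transform, which is not shown). The function name is hypothetical, and the study's definitions of inter-brain density and strength are not reproduced here.

```python
import cmath


def phase_locking_value(phases_a, phases_b):
    """PLV between two equally long sequences of instantaneous phases (radians)."""
    if len(phases_a) != len(phases_b) or not phases_a:
        raise ValueError("phase sequences must be non-empty and equally long")
    # Sum the unit vectors of the phase differences; the normalized length is
    # 1 for a constant lag and 0 when the differences cancel out evenly.
    vector_sum = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(vector_sum) / len(phases_a)
```

A PLV close to 1 indicates a stable phase lag between the two channels, regardless of the size of that lag.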


[7] 2603.16587

HistoAtlas: A Pan-Cancer Morphology Atlas Linking Histomics to Molecular Programs and Clinical Outcomes

We present HistoAtlas, a pan-cancer computational atlas that extracts 38 interpretable histomic features from 6,745 diagnostic H&E slides across 21 TCGA cancer types and systematically links every feature to survival, gene expression, somatic mutations, and immune subtypes. All associations are covariate-adjusted, multiple-testing corrected, and classified into evidence-strength tiers. The atlas recovers known biology, from immune infiltration and prognosis to proliferation and kinase signaling, while uncovering compartment-specific immune signals and morphological subtypes with divergent outcomes. Every result is spatially traceable to tissue compartments and individual cells, statistically calibrated, and openly queryable. HistoAtlas enables systematic, large-scale biomarker discovery from routine H&E without specialized staining or sequencing. Data and an interactive web atlas are freely available at this https URL.


[8] 2603.16770

Training a force field for proteins and small molecules from scratch

Force fields for molecular dynamics are usually developed manually, limiting their transferability and making systematic exploration of functional forms challenging. We developed a graph neural network that assigns all force field parameters for diverse molecules using continuous atom typing. The freely-available model, called Garnet, was trained on quantum mechanical, condensed phase and protein nuclear magnetic resonance data without the use of existing parameters. The resulting force field shows comparable performance to current force fields on small molecules, folded proteins, protein complexes and disordered proteins. It shows similar results to popular approaches for relative binding free energy predictions across a range of targets. Assessing different functional forms shows that the double exponential potential is a flexible and accurate alternative to the Lennard-Jones potential. Garnet provides a platform for automated, reproducible force field discovery that brings the benefits of machine learning to classical force fields.
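
To make the comparison of functional forms concrete, the sketch below evaluates the 12-6 Lennard-Jones potential next to one published parametrization of the double exponential potential. Both are written so that the minimum has depth `epsilon` at separation `r_min`. Whether Garnet uses exactly this parametrization is an assumption, and the `alpha` and `beta` values in the usage lines are illustrative only.

```python
import math


def lennard_jones(r, epsilon, r_min):
    """12-6 Lennard-Jones energy with well depth epsilon at separation r_min."""
    ratio = (r_min / r) ** 6
    return epsilon * (ratio * ratio - 2.0 * ratio)


def double_exponential(r, epsilon, r_min, alpha, beta):
    """Double exponential energy with well depth epsilon at separation r_min.

    alpha sets the steepness of the repulsive wall and beta the decay of the
    attractive tail; alpha must be larger than beta.
    """
    if alpha <= beta:
        raise ValueError("alpha must exceed beta")
    x = 1.0 - r / r_min
    repulsion = beta / (alpha - beta) * math.exp(alpha * x)
    attraction = alpha / (alpha - beta) * math.exp(beta * x)
    return epsilon * (repulsion - attraction)


# Illustrative values only: both curves share the same minimum.
print(lennard_jones(3.5, 0.2, 3.5))
print(double_exponential(3.5, 0.2, 3.5, alpha=16.0, beta=4.0))
```

Unlike the Lennard-Jones form, the double exponential stays finite as the separation approaches zero, and its two exponents allow the repulsive and attractive parts to be tuned separately.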


[9] 2603.16773

Age-dependent distribution of officially reported cases of vector-borne infections

OBJECTIVE: To propose a new approach to analyze the age-distribution of reported cases for vector-transmitted infections. METHODS: We used the officially reported numbers of cases of dengue, Zika, chikungunya, malaria and leishmaniasis for distinct geographical areas in different periods. The data were treated with a special but well-known procedure, transforming the raw data into an age-dependent density distribution and fitting a special continuous function to it. RESULTS: We found that the proportion of age-dependent cases with respect to the total number of cases in a given year (or any transmission season) is probably determined by the ecological interactions between vectors and hosts. The age-distribution of the proportion of cases for the three Aedes-related infections is essentially the same, independently of the magnitude of the outbreak and the geographical region considered. On the other hand, for the infections transmitted by other vectors, the age-distributions of the proportion of cases are entirely different. CONCLUSIONS: During specific outbreaks, the ratio between the age distribution of the proportion of officially reported cases and the total number of cases for Aedes-transmitted infections such as dengue, chikungunya and Zika is independent of the size of the outbreak, the size of the studied population, the period when the outbreak occurs, and the geographical region considered. Our results also suggest that the age-distribution of cases is mainly due to the interaction between vectors and their hosts.


[10] 2603.15711

Knowledge Graph Extraction from Biomedical Literature for Alkaptonuria Rare Disease

Alkaptonuria (AKU) is an ultra-rare autosomal recessive metabolic disorder caused by mutations in the HGD (Homogentisate 1,2-Dioxygenase) gene, leading to a pathological accumulation of homogentisic acid (HGA) in body fluids and tissues. This leads to systemic manifestations, including premature spondyloarthropathy, renal and prostatic stones, and cardiovascular complications. Being ultra-rare, the amount of data related to the disease is limited, both in terms of clinical data and literature. Knowledge graphs (KGs) can help connect the limited knowledge about the disease (basic mechanisms, manifestations and existing therapies) with other knowledge; however, AKU is frequently underrepresented or entirely absent in existing biomedical KGs. In this work, we apply a text-mining methodology based on PubTator3 for large-scale extraction of biomedical relations. We construct two KGs of different sizes, validate them using existing biochemical knowledge and use them to extract genes, diseases and therapies possibly related to AKU. This computational framework reveals the systemic interactions of the disease, its comorbidities, and potential therapeutic targets, demonstrating the efficacy of our approach in analyzing rare metabolic disorders.


[11] 2603.16185

Sample-Efficient Adaptation of Drug-Response Models to Patient Tumors under Strong Biological Domain Shift

Predicting drug response in patients from preclinical data remains a major challenge in precision oncology due to the substantial biological gap between in vitro cell lines and patient tumors. Rather than aiming to improve absolute in vitro prediction accuracy, this work examines whether explicitly separating representation learning from task supervision enables more sample-efficient adaptation of drug-response models to patient data under strong biological domain shift. We propose a staged transfer-learning framework in which cellular and drug representations are first learned independently from large collections of unlabeled pharmacogenomic data using autoencoder-based representation learning. These representations are then aligned with drug-response labels on cell-line data and subsequently adapted to patient tumors using few-shot supervision. Through a systematic evaluation spanning in-domain, cross-dataset, and patient-level settings, we show that unsupervised pretraining provides limited benefit when source and target domains overlap substantially, but yields clear gains when adapting to patient tumors with very limited labeled data. In particular, the proposed framework achieves faster performance improvements during few-shot patient-level adaptation while maintaining comparable accuracy to single-phase baselines on standard cell-line benchmarks. Overall, these results demonstrate that learning structured and transferable representations from unlabeled molecular profiles can substantially reduce the amount of clinical supervision required for effective drug-response prediction, offering a practical pathway toward data-efficient preclinical-to-clinical translation.


[12] 2603.16281

Laya: A LeJEPA Approach to EEG via Latent Prediction over Reconstruction

Electroencephalography (EEG) is a widely used tool for studying brain function, with applications in clinical neuroscience, diagnosis, and brain-computer interfaces (BCIs). Recent EEG foundation models trained on large unlabeled corpora aim to learn transferable representations, but their effectiveness remains unclear; reported improvements over smaller task-specific models are often modest, sensitive to downstream adaptation and fine-tuning strategies, and limited under linear probing. We hypothesize that one contributing factor is the reliance on signal reconstruction as the primary self-supervised learning (SSL) objective, which biases representations toward high-variance artifacts rather than task-relevant neural structure. To address this limitation, we explore an SSL paradigm based on Joint Embedding Predictive Architectures (JEPA), which learn by predicting latent representations instead of reconstructing raw signals. While earlier JEPA-style methods often rely on additional heuristics to ensure training stability, recent advances such as LeJEPA provide a more principled and stable formulation. We introduce Laya, the first EEG foundation model based on LeJEPA. Across a range of EEG benchmarks, Laya demonstrates improved performance under linear probing compared to reconstruction-based baselines, suggesting that latent predictive objectives offer a promising direction for learning transferable, high-level EEG representations.


[13] 2603.16384

Controlling Fish Schools via Reinforcement Learning of Virtual Fish Movement

This study investigates a method to guide and control fish schools using virtual fish trained with reinforcement learning. We utilize 2D virtual fish displayed on a screen to overcome technical challenges such as durability and movement constraints inherent in physical robotic agents. To address the lack of detailed behavioral models for real fish, we adopt a model-free reinforcement learning approach. First, simulation results show that reinforcement learning can acquire effective movement policies even when simulated real fish frequently ignore the virtual stimulus. Second, real-world experiments with live fish confirm that the learned policy successfully guides fish schools toward specified target directions. Statistical analysis reveals that the proposed method significantly outperforms baseline conditions, including the absence of stimulus and a heuristic "stay-at-edge" strategy. This study provides an early demonstration of how reinforcement learning can be used to influence collective animal behavior through artificial agents.


[14] 2603.16562

Understanding Cell Fate Decisions with Temporal Attention

Understanding non-genetic determinants of cell fate is critical for developing and improving cancer therapies, as genetically identical cells can exhibit divergent outcomes under the same treatment conditions. In this work, we present a deep learning approach for cell fate prediction from raw long-term live-cell recordings of cancer cell populations under chemotherapeutic treatment. Our Transformer model is trained to predict cell fate directly from raw image sequences, without relying on predefined morphological or molecular features. Beyond classification, we introduce a comprehensive explainability framework for interpreting the temporal and morphological cues guiding the model's predictions. We demonstrate that prediction of cell outcomes is possible based on the video only: our model achieves a balanced accuracy of 0.94 and an F1-score of 0.93. Attention and masking experiments further indicate that the signal predictive of the cell fate is not uniquely located in the final frames of a cell trajectory, as reliable predictions are possible up to 10 h before the event. Our analysis reveals distinct temporal distributions of predictive information in the mitotic and apoptotic sequences, as well as the role of cell morphology and p53 signaling in determining cell outcomes. Together, these findings demonstrate that attention-based temporal models enable accurate cell fate prediction while providing biologically interpretable insights into non-genetic determinants of cellular decision-making. The code is available at this https URL.


[15] 2603.16741

Bayesian Inference of Psychometric Variables From Brain and Behavior in Implicit Association Tests

Objective. We establish a principled method for inferring mental health related psychometric variables from neural and behavioral data using the Implicit Association Test (IAT) as the data generation engine, aiming to overcome the limited predictive performance (typically under 0.7 AUC) of the gold-standard D-score method, which relies solely on reaction times. Approach. We propose a sparse hierarchical Bayesian model that leverages multi-modal data to predict experiences related to mental illness symptoms in new participants. The model is a multivariate generalization of the D-score with trainable parameters, engineered for parameter efficiency in the small-cohort regime typical of IAT studies. Data from two IAT variants were analyzed: a suicidality-related E-IAT ($n=39$) and a psychosis-related PSY-IAT ($n=34$). Main Results. Our approach overcomes a high inter-individual variability and low within-session effect size in the dataset, reaching AUCs of 0.73 (E-IAT) and 0.76 (PSY-IAT) in the best modality configurations, though corrected 95% confidence intervals are wide ($\pm 0.18$) and results are marginally significant after FDR correction ($q=0.10$). Restricting the E-IAT to MDD participants improves AUC to 0.79 $[0.62, 0.97]$ (significant at $q=0.05$). Performance is on par with the best reference methods (shrinkage LDA and EEGNet) for each task, even when the latter were adapted to the task, while the proposed method was not. Accuracy was substantially above near-chance D-scores (0.50-0.53 AUC) in both tasks, with more consistent cross-task performance than any single reference method. Significance. Our framework shows promise for enhancing IAT-based assessment of experiences related to entrapment and psychosis, and potentially other mental health conditions, though further validation on larger and independent cohorts will be needed to establish clinical utility.


[16] 2603.16789

Conservative Continuous-Time Treatment Optimization

We develop a conservative continuous-time stochastic control framework for treatment optimization from irregularly sampled patient trajectories. The unknown patient dynamics are modeled as a controlled stochastic differential equation with treatment as a continuous-time control. Naive model-based optimization can exploit model errors and propose out-of-support controls, so optimizing the estimated dynamics may not optimize the true dynamics. To limit extrapolation, we add a consistent signature-based MMD regularizer on path space that penalizes treatment plans whose induced trajectory distribution deviates from observed trajectories. The resulting objective minimizes a computable upper bound on the true cost. Experiments on benchmark datasets show improved robustness and performance compared to non-conservative baselines.


[17] 2603.16801

A low-data, low-cost, and open-source workflow for 3D printing lithographs for digital accessibility of microscopy images

Describe an animal without using the verb look. Can you effectively provide an alternative method for interpreting complex microscopy images while preserving the length scale? The world is filled with features too small for our eyes to see: the setae on a gecko's feet, the cuticles covering a rat's whisker, or the fuzziness of a bat's wing. Furthermore, these structures are non-homogeneous, often shifting from stiff to soft. We provide a workflow for producing low-data, low-cost, and open-source lithograph files, allowing tactile accessibility in microscopy images. The lithographs made with this workflow can be printed on a 350 USD 3D printer using 3D files under 100 Mb, for a total cost per print of 0.75 USD. This work seeks to leverage advanced 3D printing to create tactile graphics and art that make science more accessible and enable tactile exploration of biological structures. The framework in this text is aligned with a GitHub repository that will be constantly updated, allowing tactile media to be created as 3D printing and lithography become more streamlined in the years to come.


[18] 2012.14309

General Mechanism of Evolution Shared by Proteins and Words

Complex systems, such as life and languages, are governed by principles of evolution. The analogy and comparison between biology and linguistics provide a computational foundation for characterizing and analyzing protein sequences, human corpora, and their evolution. However, no general mathematical formula has been proposed so far to illuminate the origin of quantitative hallmarks shared by life and language. Here we show several new statistical relationships shared by proteins and words, which inspire us to establish a general mechanism of evolution with explicit formulations that can incorporate both old and new characteristics. We found natural selection can be quantified via the entropic formulation by the principle of least effort to determine the sequence variation that survives in evolution. Besides, the origin of power law behavior and how changes in the environment stimulate the emergence of new proteins and words can also be explained via the introduction of function connection network. Our results demonstrate not only the correspondence between genetics and linguistics over their different hierarchies but also new fundamental physical properties for the evolution of complex adaptive systems. We anticipate our statistical tests can function as quantitative criteria to examine whether an evolution theory of sequence is consistent with the regularity of real data. In the meantime, their correspondence broadens the bridge to exchange existing knowledge, spurs new interpretations, and opens Pandora's box to release several potentially revolutionary challenges. For example, does linguistic arbitrariness conflict with the dogma that structure determines function?


[19] 2211.13231

Predicting Biomedical Interactions with Probabilistic Model Selection for Graph Neural Networks

Heterogeneous molecular entities and their interactions, commonly depicted as a network, are crucial for advancing our systems-level understanding of biology. With recent advancements in high-throughput data generation and a significant improvement in computational power, graph neural networks (GNNs) have demonstrated their effectiveness in predicting biomedical interactions. Since GNNs follow a neighborhood aggregation scheme, the number of graph convolution (GC) layers (i.e., depth) determines the neighborhood orders from which they can aggregate information, thereby significantly impacting the model's performance. However, determining an appropriate GNN depth for a given biomedical network often relies on heuristics or extensive experimentation. These methods can be unreliable or result in expensive computational overhead. Moreover, GNNs with more GC layers tend to exhibit poor calibration, leading to high confidence in incorrect predictions. To address these challenges, we propose a Bayesian model selection framework to jointly infer the most plausible number of GC layers supported by the data, apply dropout regularization, and learn network parameters. Experiments on four biomedical interaction datasets demonstrate that our method achieves superior performance over competing methods, providing well-calibrated predictions by allowing GNNs to adapt their depths to accommodate interaction information from various biomedical networks. Source code and data are available at: this https URL


[20] 2503.18855

Boundary effects in biological planar networks: pentagons dominate Pyropia marginal cells

The topological and geometrical features at the boundary zone of planar polygonal networks remain poorly understood. Based on observations and mathematical proofs, we propose that marginal cells in the thalli of Pyropia haitanensis, a two-dimensional (2D) biological polygonal network, have an average edge number of approximately five. We demonstrate that this number is maintained by specific division patterns. Furthermore, we observe that both marginal cells and inner cells follow the trends predicted by the Lewis law and Aboav-Weaire law, but each cell type requires its own set of correlation parameters to more accurately describe its topological and geometrical features. The boundary effects are also evident in the differences between marginal cells and inner cells in terms of the distributions of interior angles and edge lengths. Similar to inner cells, cell division tends to occur in marginal cells with large sizes and transects a pair of unconnected edges. In particular, this study finds that the division of marginal cells preferentially transects the marginal edge. These specific topological and geometrical features of marginal cells and division patterns may inform the development of modelling algorithms for boundary conditions in biological 2D cellular networks.


[21] 2507.19416

Dual Mechanisms for Heterogeneous Responses of Inspiratory Neurons to Noradrenergic Modulation

Respiration is an essential involuntary function necessary for survival. This poses a challenge for the control of breathing. The preBötzinger complex (preBötC) is a heterogeneous neuronal network responsible for driving the inspiratory rhythm. While neuromodulators such as norepinephrine (NE) allow it to be both robust and flexible for all living beings to interact with their environment, the basis for how neuromodulation impacts neuron-specific properties remains poorly understood. In this work, we examine how NE influences different preBötC neuronal subtypes by modeling its effects through modulating two key parameters: calcium-activated nonspecific cationic current gating conductance ($g_{\rm CAN}$) and inositol-triphosphate ($\rm IP_3$), guided by experimental studies. Our computational model captures the experimentally observed differential effects of NE on distinct preBötC bursting patterns. We show that this dual mechanism is critical for inducing conditional bursting and identify specific parameter regimes where silent neurons remain inactive in the presence of NE. Furthermore, using methods of dynamical systems theory, we uncover the mechanisms by which NE differentially modulates burst frequency and duration in NaP-dependent and CAN-dependent bursting neurons. These results align well with previously reported experimental findings and provide a deeper understanding of cell-specific neuromodulatory responses within the respiratory network.


[22] 2509.24657

Modelling the control of West Nile virus using mosquito reduction methods, vaccination of equids, and human behavioral adaptation to the usage of personal protective equipment

West Nile virus (WNV) is a mosquito-borne virus in the genus Flavivirus that circulates between mosquitoes and birds, whereas humans, equids, and other mammals are dead-end hosts. Since its emergence in Germany in 2018, the virus has spread across the country, emphasising the need for effective intervention strategies. However, it remains unclear how different strategies should be combined and timed to effectively reduce WNV transmission under temperature-driven dynamics. In this study, we develop a temperature-dependent, process-based model to evaluate the effectiveness of WNV control strategies, such as mosquito reduction methods, equid vaccination, and the use of personal protective equipment (PPE). Human behavioural responses to infection risk are incorporated through imitation dynamics that capture how individuals adopt PPE based on perceived infection risk and social influence. We formulate and study an optimal control problem to determine the seasonal timing of mosquito controls under temperature forcing. Results suggest that mosquito control efforts initiated in early spring and intensified in early May may reduce the August peak in the infectious bird population. Moreover, a combined scenario of mosquito control methods, human PPE adoption, and equid vaccination could be the best strategy among dead-end hosts. The analysis of various combinations of constant controls is available as an interactive application, allowing users to explore intervention strategies under different temperature projections corresponding to the low-mitigation (SSP126), intermediate (SSP245), and high-emission (SSP585) scenarios.


[23] 2603.06819

Modeling Metabolic State Transitions in Obesity Using a Time-Varying Lambda-Omega Framework

Obesity does not emerge abruptly; rather, it develops gradually over extended periods. The gradual progression often prevents early recognition of physiological changes until excess adiposity is established. A common belief is that weight reduction can be achieved simply by "eating less and moving more". Although reductions in caloric intake and increases in physical activity are fundamental principles of weight management, this perspective oversimplifies a complex and adaptive biological system. Metabolic rate, hormonal regulation, behavioral factors, and compensatory physiological responses all influence the body's resistance to changes in weight. During weight loss, reduced metabolic rate and increased efficiency make maintaining a caloric deficit increasingly difficult. Conversely, during periods of overfeeding, resting metabolic rate, the thermic effect of food, and non-exercise activity thermogenesis increase with rising body weight, partially offsetting the caloric surplus and slowing weight gain. However, these compensatory responses are asymmetrical, with stronger and more persistent adaptations to underfeeding than to overfeeding. This asymmetry helps explain why weight gain often occurs gradually and why sustained weight loss is biologically challenging. In this work, we employ a lambda-omega model from dynamical systems theory to describe metabolic regulation in response to lifestyle perturbations. We introduce time-varying parameters that allow the regulatory coefficients to evolve gradually under sustained environmental and physiological stressors. By allowing lambda(t) and omega(t) to vary over time, the model captures progressive shifts in the metabolic set-point and deformation of the underlying dynamical landscape. This framework enables exploration of transitions between metabolic states and long-term adaptations that shape trajectories of weight gain and loss.
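
As a rough illustration of the kind of system described above, the following Python sketch integrates a generic lambda-omega oscillator, du/dt = λu − ωv and dv/dt = ωu + λv, with time-varying coefficients. The radial term λ = a(t) − r², the drift functions `a_of_t` and `omega_of_t`, and all numerical values are hypothetical choices made for illustration; the abstract does not specify the functional forms used in the paper.

```python
import math

def lam(r, a):
    # Hypothetical radial term: growth inside the cycle, decay outside it.
    # With constant a > 0 the stable limit cycle has radius sqrt(a).
    return a - r * r

def simulate(a_of_t, omega_of_t, u0=0.1, v0=0.0, t_end=50.0, dt=0.01):
    """Integrate du/dt = lam*u - omega*v, dv/dt = omega*u + lam*v with RK4."""
    def rhs(t, u, v):
        r = math.hypot(u, v)
        l, w = lam(r, a_of_t(t)), omega_of_t(t)
        return l * u - w * v, w * u + l * v

    t, u, v = 0.0, u0, v0
    for _ in range(int(round(t_end / dt))):
        k1u, k1v = rhs(t, u, v)
        k2u, k2v = rhs(t + dt / 2, u + dt / 2 * k1u, v + dt / 2 * k1v)
        k3u, k3v = rhs(t + dt / 2, u + dt / 2 * k2u, v + dt / 2 * k2v)
        k4u, k4v = rhs(t + dt, u + dt * k3u, v + dt * k3v)
        u += dt / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += dt
    return u, v

# Fixed coefficients: the trajectory settles on a cycle of radius 1.
u_fixed, v_fixed = simulate(lambda t: 1.0, lambda t: 1.0)

# Slowly drifting a(t) (assumed linear drift): the cycle radius follows sqrt(a(t)),
# a toy analogue of a gradually shifting set-point.
u_drift, v_drift = simulate(lambda t: 1.0 + 0.02 * t, lambda t: 1.0)
```

In this toy version, a slow change in a(t) moves the attracting cycle itself rather than the state alone, which is one simple way to picture a progressive shift in a regulated set-point.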


[24] 2603.09531

Association of Progressive PPFE and Mortality in Lung Cancer Screening Cohorts

Background: Pleuroparenchymal fibroelastosis (PPFE) is an upper lobe predominant fibrotic lung abnormality associated with increased mortality in established interstitial lung disease. However, the clinical significance of radiologic PPFE progression in lung cancer screening (LCS) populations remains unclear. Methods: We analysed longitudinal low-dose CT scans and clinical data from two LCS studies: the National Lung Screening Trial (NLST; n=7,980) and the SUMMIT study (n=8,561). An automated algorithm quantified PPFE volume on baseline and follow-up scans. Annualised change in PPFE was derived and dichotomised using a distribution-based threshold to define progressive PPFE. Associations between progressive PPFE and mortality were evaluated using Cox proportional hazards models adjusted for demographic and clinical variables. In the SUMMIT cohort, associations between progressive PPFE and clinical outcomes were assessed using incidence rate ratios (IRR) and odds ratios (OR). Findings: Progressive PPFE was independently associated with mortality in both LCS cohorts (NLST: Hazard Ratio (HR)=1.25, 95% Confidence Interval (CI): 1.01--1.56, p=0.042; SUMMIT: HR=3.14, 95% CI: 1.66--5.97, p<0.001). Within SUMMIT, progressive PPFE was strongly associated with higher respiratory admissions (IRR=2.79, p<0.001) and increased antibiotic and steroid use (IRR=1.55, p=0.010), and showed a trend towards higher modified Medical Research Council scores (OR=1.40, p=0.055). Interpretation: Radiologic PPFE progression is independently associated with mortality across two large LCS cohorts and is associated with adverse clinical outcomes. Quantitative assessment of PPFE progression may provide a clinically relevant imaging biomarker to identify individuals at increased risk of respiratory morbidity within LCS programmes.


[25] 2603.11872

ELISA: An Interpretable Hybrid Generative AI Agent for Expression-Grounded Discovery in Single-Cell Genomics

Translating single-cell RNA sequencing (scRNA-seq) data into mechanistic biological hypotheses remains a critical bottleneck, as agentic AI systems lack direct access to transcriptomic representations while expression foundation models remain opaque to natural language. Here we introduce ELISA (Embedding-Linked Interactive Single-cell Agent), an interpretable framework that unifies scGPT expression embeddings with BioBERT-based semantic retrieval and LLM-mediated interpretation for interactive single-cell discovery. An automatic query classifier routes inputs to gene marker scoring, semantic matching, or reciprocal rank fusion pipelines depending on whether the query is a gene signature, a natural language concept, or a mixture of both. Integrated analytical modules perform pathway activity scoring across 60+ gene sets, ligand--receptor interaction prediction using 280+ curated pairs, condition-aware comparative analysis, and cell-type proportion estimation, all operating directly on embedded data without access to the original count matrix. Benchmarked across six diverse scRNA-seq datasets spanning inflammatory lung disease, pediatric and adult cancers, organoid models, healthy tissue, and neurodevelopment, ELISA significantly outperforms CellWhisperer in cell type retrieval (combined permutation test, $p < 0.001$), with particularly large gains on gene-signature queries (Cohen's $d = 5.98$ for MRR). ELISA replicates published biological findings (mean composite score 0.90) with near-perfect pathway alignment and theme coverage (0.98 each), and generates candidate hypotheses through grounded LLM reasoning, bridging the gap between transcriptomic data exploration and biological discovery. Code available at: this https URL (If you use ELISA in your research, please cite this work).
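
For readers unfamiliar with reciprocal rank fusion, which the abstract names as the pipeline for mixed queries, a minimal Python sketch of the generic technique follows. It is not ELISA's implementation: the function name, the example cell-type rankings, and the smoothing constant k = 60 (a commonly used default for this method) are assumptions made for illustration.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists: each item scores sum(1 / (k + rank)) over the lists."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical outputs of a gene-marker ranker and a semantic-text ranker.
gene_ranking = ["T cell", "B cell", "NK cell"]
text_ranking = ["B cell", "NK cell", "T cell"]
fused = reciprocal_rank_fusion([gene_ranking, text_ranking])
print(fused)
```

Because only ranks enter the score, the two rankers do not need comparable score scales, which is the usual reason for choosing this kind of fusion when combining heterogeneous retrieval signals.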


[26] 2603.12253

Binding Free Energies without Alchemy

Absolute Binding Free Energy (ABFE) methods are among the most accurate computational techniques for predicting protein-ligand binding affinities, but their utility is limited by the need for many simulations of alchemically modified intermediate states. We propose Direct Binding Free Energy (DBFE), an end-state ABFE method in implicit solvent that requires no alchemical intermediates. DBFE outperforms OBC2 double decoupling on a host-guest benchmark and performs comparably to OBC2 MM/GBSA on a protein-ligand benchmark. Since receptor and ligand simulations can be precomputed and amortized across compounds, DBFE requires only one complex simulation per ligand compared to the many lambda windows needed for double decoupling, making it a promising candidate for virtual screening workflows. We publicly release the code for this method at this https URL.


[27] 2603.12662

Dual-Laws Model for a theory of artificial consciousness

Objectively verifying the generative mechanism of consciousness is extremely difficult because of its subjective nature. As long as theories of consciousness focus solely on its generative mechanism, developing a theory remains challenging. We believe that broadening the theoretical scope and enhancing theoretical unification are necessary to establish a theory of consciousness. This study proposes seven questions that theories of consciousness should address: phenomena, self, causation, state, function, contents, and universality. The questions were designed to examine the functional aspects of consciousness and its applicability to system design. We then examine how our proposed Dual-Laws Model (DLM) can address these questions. Based on our theory, we anticipate two unique features of a conscious system: autonomy in constructing its own goals and cognitive decoupling from external stimuli. We contend that systems with these capabilities differ fundamentally from machines that merely follow human instructions. This makes a design theory that enables highly moral behavior indispensable.


[28] 2603.15217

A multiscale discrete-to-continuum framework for structured population models

Mathematical models of biological populations commonly use discrete structure classes to capture trait variation among individuals (e.g. age, size, phenotype, intracellular state). Upscaling these discrete models into continuum descriptions can improve analytical tractability and scalability of numerical solutions. Common upscaling approaches based solely on Taylor expansions may, however, introduce ambiguities in truncation order, uniform validity and boundary conditions. To address this, here we introduce a discrete multiscale framework to systematically derive continuum approximations of structured population models. Using the method of multiple scales and matched asymptotic expansions applied to discrete systems, we identify regions of structure space for which a continuum representation is appropriate and derive the corresponding partial differential equations. The leading-order dynamics are given by a nonlinear advection equation in the bulk domain and advection-diffusion processes in small inner layers about the leading wavefronts and stagnation point. We further derive discrete boundary layer descriptions for regions where a continuum representation is fundamentally inappropriate. Finally, we demonstrate the method on a simple lipid-structured model for early atherosclerosis and verify consistency between the discrete and continuum descriptions. The multiscale framework we present can be applied to other heterogeneous systems with discrete structure in order to obtain appropriate upscaled dynamics with asymptotically consistent boundary conditions.


[29] 2405.08979

drGT: Attention-Guided Gene Assessment of Drug Response Utilizing a Drug-Cell-Gene Heterogeneous Network

For translational impact, both accurate drug response prediction and biological plausibility of predictive features are needed. We present drGT, a heterogeneous graph deep learning model over drugs, genes, and cell lines that couples prediction with mechanism-oriented interpretability via attention coefficients (ACs). We assess both predictive generalization (random, unseen-drug, unseen-cell, and zero-shot splits) and biological plausibility (use of text-mined PubMed gene-drug co-mentions and comparison to a structure-based DTI predictor) on GDSC, NCI60, and CTRP datasets. Across benchmarks, drGT consistently delivers top regression performance while maintaining competitive classification accuracy for drug sensitivity. Under random 5-fold cross-validation, drGT attains an AUROC of up to 0.945 (3rd overall) and an $R^2$ up to 0.690, outperforming all baselines on regression. In leave-one-out tests for unseen cell lines and drugs, drGT achieves AUROCs of 0.706 and 0.844, and $R^2$ values of 0.692 and 0.022, the only model yielding positive $R^2$ for unseen drugs. In zero-shot prediction, drGT achieves an AUROC of 0.786 and a regression $R^2$ of 0.334, both representing the highest scores among all models. For interpretability, AC-derived drug-gene links recover known biology: among 976 drugs with known DTIs, 36.9% of predicted links match established DTIs, and 63.7% are supported by either PubMed abstracts or a structure-based predictive model. Enrichment analyses of AC-prioritized genes reveal drug-perturbed biological processes, providing pathway-level explanations. drGT advances predictive generalization and mechanism-centered interpretability, offering state-of-the-art regression accuracy and literature-supported biological hypotheses that demonstrate the use of graph learning from heterogeneous input data for biological discovery. Code: this https URL


[30] 2407.19892

Making Multi-Axis Gaussian Graphical Models Scalable to Millions of Cells

Motivation: Networks underlie the generation and interpretation of many biological datasets: gene networks shed light on the regulatory structure of the genome, and cell networks can capture structure of the tumor micro-environment. However, most methods that learn such networks make the faulty 'independence assumption'; to learn the gene network, they assume that no cell network exists. 'Multi-axis' methods, which do not make this assumption, fail to scale beyond a few thousand cells or genes. This limits their applicability to only the smallest datasets. Results: We develop a multi-axis method capable of processing million-cell datasets within minutes. This was previously impossible, and unlocks the use of such methods on modern scRNA-seq datasets, as well as more complex datasets. We show that our method yields novel biological insights from real single-cell data, and compares favorably to the existing hdWGCNA methodology. In particular, it identifies long non-coding RNA genes that potentially have a regulatory or functional role in neuronal development. Availability and implementation: Our methodology is available as a Python package GmGM on PyPI (this https URL). The code for all experiments performed in this paper is available on GitHub (this https URL). Contact: sceba@leeds.this http URL Supplementary information: Our proofs, and some additional experiments, are available in the supplementary material. Keywords: gaussian graphical models, multi-axis models, transcriptomics, multi-omics, scalability


[31] 2504.13727

High-dimensional dynamics in low-dimensional networks

Many networks in nature and applications have an approximate low-rank structure in the sense that their connectivity structure is dominated by a few dimensions. It is natural to expect that dynamics on such networks would also be low-dimensional. Indeed, theoretical results show that low-rank networks produce low-dimensional dynamics whenever the network is isolated from external perturbations or input. However, networks in nature are rarely isolated. Here, we study the dimensionality of dynamics in recurrent networks with low-dimensional structure driven by high-dimensional inputs or perturbations. We find that dynamics in such networks can be high- or low-dimensional, and we derive mathematical conditions on the network structure under which dynamics are high-dimensional. In many low-rank networks, dynamics are suppressed in directions aligned with the network's low-rank structure, a phenomenon we term "low-rank suppression." We show that several low-rank network structures arising in nature satisfy the conditions for generating high-dimensional dynamics and low-rank suppression. Our results clarify important but counterintuitive relationships between a recurrent network's connectivity structure and the structure of its response to external input.


[32] 2505.21777

Memorization to Generalization: Emergence of Diffusion Models from Associative Memory

Dense Associative Memories (DenseAMs) are generalizations of Hopfield networks, which have superior information storage capacity and can store training data points (memories) at local minima of the energy landscape. When the amount of training data exceeds the critical memory storage capacity of these models, new local minima, which are different from the training data, emerge. In Associative Memory, these emergent local minima are called $\textit{spurious states}$, which hinder memory retrieval. In this work, we examine diffusion models (DMs) through the DenseAM lens, viewing their generative process as an attempt at memory retrieval. In the small-data regime, DMs create distinct attractors for each training sample, akin to DenseAMs below the critical memory storage. As the training data size increases, they transition from memorization to generalization. We identify a critical intermediate phase, predicted by DenseAM theory -- the spurious states. In generative modeling, these states are no longer negative artifacts but rather are the first signs of generative capabilities. We characterize the basins of attraction, energy landscape curvature, and computational properties of these previously overlooked states. Their existence is demonstrated across a wide range of architectures and datasets.


[33] 2509.25386

Spatial correlations in SIS processes on random regular graphs

In network-based SIS models of infectious disease transmission, infection can only occur between directly connected individuals. This constraint naturally gives rise to spatial correlations between the states of neighboring nodes, as the infection status of connected individuals becomes interdependent. Although mean-field approximations and the standard pairwise model are commonly used to simplify disease forecasting on networks, they inadequately capture spatial correlations; mean-field frameworks assume that populations are well-mixed, while the pairwise model neglects correlations beyond nearest-neighbor connections, which leads to inaccurate predictions of infection numbers over time. As such, the development of approximations that account for higher order spatially correlated infections is of great interest, as they offer a compromise between accurate disease forecasting and analytic tractability. Here, we use existing corrections to mean-field theory on the regular lattice to construct a more general framework for equivalent corrections on random regular graph topologies. We derive and simulate a hierarchical system of ordinary differential equations for the time evolution of the spatial correlation function at various geodesic distances on random networks. Solving these equations allows us to predict the time-dependent global infection density, which agrees well with numerical simulations. Our results substantially improve on existing corrections to mean-field theory for infectious individuals in SIS processes and provide an in-depth characterization of how structural randomness in networks affects the dynamical trajectories of infectious diseases on networks.
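
To make the baseline concrete, the following Python sketch integrates the standard well-mixed mean-field SIS equation on a k-regular graph, dρ/dt = βkρ(1 − ρ) − γρ, which is the approximation the abstract describes as inadequate. It is not the paper's hierarchical correlation model; the parameter names (`beta` per-edge infection rate, `gamma` recovery rate, `k` degree) and the forward-Euler scheme are assumptions chosen for brevity.

```python
def sis_mean_field(beta, gamma, k, rho0=0.01, t_end=200.0, dt=0.01):
    """Forward-Euler integration of d(rho)/dt = beta*k*rho*(1 - rho) - gamma*rho."""
    rho = rho0
    for _ in range(int(round(t_end / dt))):
        rho += dt * (beta * k * rho * (1.0 - rho) - gamma * rho)
    return rho

def endemic_density(beta, gamma, k):
    """Mean-field steady state: 1 - gamma/(beta*k) above threshold, else 0."""
    return max(0.0, 1.0 - gamma / (beta * k))

# Above the mean-field threshold (beta*k > gamma) the density approaches 0.5.
rho_endemic = sis_mean_field(beta=0.5, gamma=1.0, k=4)

# Below the threshold the infection dies out.
rho_extinct = sis_mean_field(beta=0.2, gamma=1.0, k=4)
```

This baseline ignores correlations between neighbouring nodes entirely, so comparing its curve with stochastic simulations on an actual random regular graph is one way to see the gap that correlation-aware corrections aim to close.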


[34] 2603.14691

A Unified Variational Principle for Branching Transport Networks: Wave Impedance, Viscous Flow, and Tissue Metabolism

The branching geometry of biological transport networks is characterized by a diameter scaling exponent $\alpha$. Two structural attractors compete: impedance matching ($\alpha \sim 2$) for pulsatile flow and viscous-metabolic minimization ($\alpha = 3$) for steady flow. Neither predicts the empirically observed $\alpha_{\mathrm{exp}} = 2.70 \pm 0.20$ in mammalian arterial trees. Incorporating sub-linear vessel-wall scaling $h(r) \propto r^p$ ($p = 0.77$) into a three-term metabolic cost rigorously breaks Murray's cubic law -- via Cauchy's functional equation -- bounding the static optimum to $\alpha_t \in [2.90, 2.94]$. We formulate a unified network-level Lagrangian balancing wave-reflection penalties against transport-metabolic costs. Because the operational duty cycle $\eta$ is uncertain over developmental timescales, we cast the optimization as a zero-sum game between network architecture and environment. Von Neumann's minimax theorem -- proved constructively via strict monotonicity of the cost curves -- yields a unique saddle point $(\alpha^*, \eta^*)$ satisfying an exact equal-cost condition. We further prove $N = 2$ uniquely maximizes the network stiffness ratio $\kappa_{\mathrm{eff}}(N)$, deriving binary branching as a structural consequence of the framework. For the porcine coronary tree ($G = 11$ generations), $\alpha^* = 2.72$, within $0.1\sigma$ of morphometric data. Sensitivity analysis confirms $|\Delta\alpha^*| < 0.01$ across physiological metabolic ranges; the prediction depends critically only on the histological exponent $p$ -- a zero-parameter derivation from fundamental scaling principles.
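
As a small worked example of what a diameter scaling exponent means, the Python sketch below assumes the usual generalised Murray relation at a junction, r_parent^α = Σ r_daughter^α, and computes the daughter-to-parent radius ratio for symmetric branching together with a bisection fit of α from measured radii. The function names and the bracket [1, 10] are hypothetical; the abstract's variational derivation of α* is not reproduced here.

```python
def daughter_ratio(alpha, n_branches=2):
    """Radius ratio r_daughter / r_parent for symmetric branching under
    the assumed relation r_parent**alpha = n_branches * r_daughter**alpha."""
    return n_branches ** (-1.0 / alpha)

def fit_alpha(parent, daughters, lo=1.0, hi=10.0, tol=1e-10):
    """Solve sum((d / parent)**alpha) = 1 for alpha by bisection.
    Assumes every daughter radius is smaller than the parent radius,
    so the left-hand side decreases monotonically with alpha."""
    def excess(alpha):
        return sum((d / parent) ** alpha for d in daughters) - 1.0

    if excess(lo) < 0 or excess(hi) > 0:
        raise ValueError("no exponent found inside the bracket [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Symmetric binary branching for the exponents quoted in the abstract.
for alpha in (2.0, 2.72, 3.0):
    print(alpha, round(daughter_ratio(alpha), 4))

# Recover the exponent from a synthetic symmetric bifurcation.
r = daughter_ratio(2.72)
alpha_fit = fit_alpha(1.0, [r, r])
```

Under this relation, a smaller α implies narrower daughters relative to the parent, so the quoted exponents between 2 and 3 correspond to symmetric daughter ratios between roughly 0.71 and 0.79.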


[35] 2603.15080

Open Biomedical Knowledge Graphs at Scale: Construction, Federation, and AI Agent Access with Samyama Graph Database

Biomedical knowledge is fragmented across siloed databases -- Reactome for pathways, STRING for protein interactions, this http URL for study registries, DrugBank for drug vocabularies, DGIdb for drug-gene interactions, SIDER for side effects. We present three open-source biomedical knowledge graphs -- Pathways KG (118,686 nodes, 834,785 edges from 5 sources), Clinical Trials KG (7,774,446 nodes, 26,973,997 edges from 5 sources), and Drug Interactions KG (32,726 nodes, 191,970 edges from 3 sources) -- built on Samyama, a high-performance graph database written in Rust. Our contributions are threefold. First, we describe a reproducible ETL pattern for constructing large-scale KGs from heterogeneous public data sources, with cross-source deduplication, batch loading (Python Cypher and Rust native loaders), and portable snapshot export. Second, we demonstrate cross-KG federation: loading all three snapshots into a single graph tenant enables property-based joins across datasets. Third, we introduce schema-driven MCP server generation for LLM agent access, evaluated on a new BiomedQA benchmark (40 pharmacology questions): domain-specific MCP tools achieve 98% accuracy vs. 0% for text-to-Cypher and 75% for standalone GPT-4o. All data sources are open-license. The combined federated graph (7.9M nodes, 28M edges) loads in approximately 3 minutes on commodity cloud hardware, and cross-KG queries complete in 80ms-4s.