New articles on Quantitative Biology


[1] 2606.18295

Archetypal Microbiome Profiles as Indicators of Nitrous Oxide Emission States in Activated Sludge

Nitrous oxide (N2O) emissions from water resource recovery facilities (WRRFs) fluctuate over time and can arise from multiple microbial pathways, making source attribution and full-scale prediction difficult. The difficulty is compounded by the high dimensionality of activated sludge microbiomes, whose complex and dynamic community structure can obscure relationships with N2O emission patterns. This study evaluated whether interpretable, low-dimensional representations of activated sludge microbiomes can be correlated with N2O emission states. Temporal 16S rRNA gene amplicon profiles and N2O emission metrics were collected from two full-scale WRRFs in Switzerland. Genus-level relative-abundance profiles were summarized using archetypal analysis (AA), which represents each sample as a convex combination of a small number of interpretable community profiles. In both WRRFs, three archetypes captured most explainable variation in community composition (63%--73%) and defined a simplex state space in which samples clustered near vertices and edges, indicating that community compositions were organized around distinct archetypal states and their mixtures. Without using emission labels while training, the archetypal state space aligned strongly with binary N2O emission states: high-emission observations in both plants concentrated around a specific archetype, and temporal trajectories showed consistent high weights of this archetype during high-emission periods. Functional summaries suggested site-specific but pathway-relevant interpretations of the high-N2O archetype. Temperature further structured the archetypal state space, indicating seasonal forcing of microbiome configurations associated with elevated N2O. Overall, AA provides an interpretable framework to track microbiome regime shifts and may support operational tracking of high-N2O emission states in full-scale WRRFs.
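
As a rough illustration of what "a convex combination of a small number of interpretable community profiles" means, the sketch below recovers the archetype weights of one sample when the archetypes are already known. It is not the authors' pipeline: the archetype matrix, the sample, and the function names are hypothetical, and the solver (projected gradient descent onto the probability simplex) is one simple choice among several.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    u = np.sort(v)[::-1]                      # entries sorted in descending order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index that stays positive
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def archetype_weights(sample, archetypes, n_iter=2000):
    """Weights a minimizing ||sample - a @ archetypes||^2 with a on the simplex."""
    k = archetypes.shape[0]
    a = np.full(k, 1.0 / k)                           # start at the simplex center
    step = 1.0 / np.linalg.norm(archetypes, 2) ** 2   # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = archetypes @ (a @ archetypes - sample)
        a = project_to_simplex(a - step * grad)
    return a

# Hypothetical archetypes: three relative-abundance profiles over four genera.
archetypes = np.array([
    [0.7, 0.2, 0.1, 0.0],
    [0.1, 0.6, 0.2, 0.1],
    [0.0, 0.1, 0.3, 0.6],
])
true_weights = np.array([0.2, 0.5, 0.3])
sample = true_weights @ archetypes            # a sample lying inside the simplex
weights = archetype_weights(sample, archetypes)
```

The weight vector is the sample's position in the simplex state space: a weight near 1 places the sample at a vertex, and two dominant weights place it along an edge.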


[2] 2606.18302

Protein-Based Fish Species Identification: Dataset, Models, and Insights from Native Bangladeshi Fish

Correct identification of fish species is important for food security, economic development, and climate resilience in Bangladesh. Protein sequences directly reflect functional and evolutionary constraints, which are important for species authentication and biodiversity monitoring. Yet no benchmark exists for identifying native Bangladeshi fish species from protein sequences. In this study, we address this gap by introducing the first curated dataset of 2845 high-quality protein sequences covering nine native Bangladeshi fish species. We also establish the first protein sequence classification baseline for this domain through a systematic benchmarking of seven architectural paradigms. Moreover, we propose a novel, realistically deployable hybrid architecture combining MotifCNN and a Transformer with Terminal-Aware Positional Encoding (MotifCNN-Transformer+TA-PE). Our architecture achieves 79.80% accuracy with a macro-F1 of 0.80. The highest accuracy, 83.04%, is achieved by the fine-tuned protein language model ProtBERT, which has 420M parameters and requires dual 16GB GPUs for inference. According to McNemar's test, ProtBERT's 3.24-percentage-point accuracy gain over MotifCNN-Transformer+TA-PE is not statistically significant (p = 0.1120). Our architecture outperforms ProtBERT on six of the nine classes in per-class identification. In addition, MotifCNN-Transformer+TA-PE is approximately 5x faster and 42x smaller than ProtBERT, supports a 16x larger batch size, and runs inference without a GPU, making it more practical for deployment in resource-constrained areas such as rural Bangladesh. Beyond this, our foundational work shows the effects of phylogenetic relationships on sequence similarity and establishes pathways for fisheries management, food authentication, and biodiversity conservation in South Asia's protein-dependent economy.
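
For readers unfamiliar with McNemar's test, the sketch below shows how two classifiers evaluated on the same test set can be compared using only the items on which they disagree. The abstract reports neither the discordant counts nor which variant of the test was used, so the exact binomial form and the counts here are assumptions chosen purely for illustration.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts.

    b: test items model A classified correctly and model B incorrectly
    c: test items model B classified correctly and model A incorrectly
    """
    n = b + c
    if n == 0:
        return 1.0                      # the models never disagree
    # Under the null hypothesis, each disagreement favors either model with probability 0.5.
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

# Hypothetical counts for illustration only; the abstract does not report them.
p_value = mcnemar_exact(b=30, c=44)
```

Items that both models classify correctly, or both misclassify, do not enter the statistic, which is why a gap of a few accuracy points on a modest test set can fail to reach significance.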


[3] 2606.18523

DART: A design-aware microfluidic chip paradigm for real-time live-cell image analysis

High-throughput microfluidic live-cell imaging generates rich single-cell data. Yet semi-automated procedures for locating regions of interest (RoIs), each containing one cell population, and removing surrounding microfluidic structures from recorded images, scale with the number of RoIs. This prevents real-time image analysis and delays time-to-insight by hours to days. We introduce the Design-Aware and Real-Time capable (DART) paradigm for microfluidic cultivation chips, which aligns the CAD blueprint with the physical chip and thereby enables throughput-independent localization of all RoIs and fully automated image processing across diverse RoI geometries and chip layouts. DART establishes this alignment through embedded fiducial markers and deep-learning-based marker detection. We validate DART using the Swiss Army Knife chip, which combines eight structurally distinct RoI designs across 1164 RoI locations. DART localizes all RoIs in five minutes, removes microfluidic structures from raw microscopy images in 40 ms, and performs fully automated image analysis, including cell segmentation, in under 1.1 s per image. Together, these capabilities establish DART as an end-to-end hardware-software paradigm with real-time-capable analysis that paves the way toward closed-loop and outcome-driven smart microscopy.
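
The idea of aligning a CAD blueprint with the imaged chip can be sketched as fitting a coordinate transform from matched fiducial markers and then applying it to every RoI in the design. The abstract does not state which transform model DART uses, so the 2D affine fit, the function names, and all coordinates below are assumptions for illustration.

```python
import numpy as np

def fit_affine(design_xy, image_xy):
    """Least-squares 2D affine map taking design coordinates to image coordinates."""
    design_xy = np.asarray(design_xy, dtype=float)
    image_xy = np.asarray(image_xy, dtype=float)
    homogeneous = np.hstack([design_xy, np.ones((len(design_xy), 1))])
    params, *_ = np.linalg.lstsq(homogeneous, image_xy, rcond=None)
    return params                                   # shape (3, 2)

def design_to_image(points_xy, params):
    """Apply the fitted affine map to design-space points."""
    points_xy = np.asarray(points_xy, dtype=float)
    return np.hstack([points_xy, np.ones((len(points_xy), 1))]) @ params

# Hypothetical marker positions in the design and their detected image positions,
# generated here from a known rotation, scale, and shift.
theta, scale, shift = 0.05, 2.0, np.array([100.0, 50.0])
rotation = scale * np.array([[np.cos(theta), -np.sin(theta)],
                             [np.sin(theta),  np.cos(theta)]])
markers_design = np.array([[0.0, 0.0], [1000.0, 0.0], [0.0, 800.0], [1000.0, 800.0]])
markers_image = markers_design @ rotation.T + shift

params = fit_affine(markers_design, markers_image)
roi_design = np.array([[250.0, 400.0]])             # one RoI center taken from the blueprint
roi_image = design_to_image(roi_design, params)
```

Once the transform is fitted, the cost of locating RoIs no longer depends on how many there are, since each one is a single matrix product.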


[4] 2606.18575

Adaptive COVID-19 Trajectory Forecasting Using MAB-Inspired Ensemble Weighting

Forecasting epidemic trajectories is important for public health decision-making, but no single model is consistently reliable across epidemic phases and forecasting settings. We evaluate Multi-Armed Bandit (MAB)-inspired adaptive weighting strategies for combining epidemic forecasting models when component-model performance changes over time. Using U.S. COVID-19 incidence data from three epidemic waves, we compare UCB, EXP3, and epsilon-greedy weighting rules under fixed short-window and growing calibration windows, with both deterministic and stochastic ensemble variants. The model pool includes SIR, SEIR, GLM, Gompertz, Richards, ARIMA, random walk with drift, simple exponential smoothing, Holt's linear trend method, and exponential growth. Adaptive ensembles are compared with individual models and with naive, unweighted, and inverse-WIS weighted ensemble benchmarks. Forecast performance is assessed using RMSE, weighted interval score (WIS), 95% prediction-interval coverage, and mean 95% prediction-interval width. Across waves, calibration windows, and forecast horizons, EXP3Stoch, EXP3Det, and EPSStoch achieved the lowest mean forecast WIS. The main gains were in probabilistic forecast quality, especially WIS and interval coverage, rather than uniformly lower point forecast error. Simple benchmarks, including the unweighted and inverse-WIS ensembles, remained competitive in several settings. These results suggest that MAB-inspired adaptive weighting is a useful complementary tool for epidemic forecasting, especially when model skill is time-varying and forecast uncertainty is substantial.
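
To make the scoring concrete, the sketch below computes the weighted interval score for one observation using the standard definition (absolute error of the median plus alpha-weighted interval scores) and derives inverse-WIS ensemble weights from per-model mean scores. The function names and example numbers are hypothetical, and the MAB-inspired rules evaluated in the paper are not reproduced here.

```python
import numpy as np

def weighted_interval_score(y, median, intervals):
    """WIS for one observation.

    intervals maps alpha -> (lower, upper) for the central (1 - alpha) interval.
    """
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        interval_score = (
            (upper - lower)                              # sharpness: interval width
            + (2.0 / alpha) * max(lower - y, 0.0)        # penalty if y falls below
            + (2.0 / alpha) * max(y - upper, 0.0)        # penalty if y falls above
        )
        total += (alpha / 2.0) * interval_score
    return total / (len(intervals) + 0.5)

def inverse_wis_weights(mean_wis):
    """Ensemble weights proportional to 1 / mean WIS of each component model."""
    inv = 1.0 / np.asarray(mean_wis, dtype=float)
    return inv / inv.sum()

# Hypothetical forecast: median 10 with a 95% interval of [8, 12].
covered = weighted_interval_score(10.0, 10.0, {0.05: (8.0, 12.0)})
missed = weighted_interval_score(13.0, 10.0, {0.05: (8.0, 12.0)})
weights = inverse_wis_weights([1.0, 3.0])
```

Because WIS penalizes both interval width and misses, an ensemble can improve WIS and coverage without lowering point-forecast error, which matches the pattern the abstract reports.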


[5] 2606.18660

Effects of spatial environmental noise on evolution of cooperation

We investigate the effects of environmental noise on cooperation in a spatial evolutionary game model with variable population size. Building on a one-dimensional lattice model in which vacancies promote cooperation through spatial selection, we add random noise to the environmental quality parameter and consider two distinct types: annealed noise, where the environmental quality fluctuates independently at each site and each time step, and quenched noise, where each site is assigned a permanently fixed random value. For annealed noise, we develop a mean-field theory by replacing the noise-dependent death probabilities with their distribution averages, and find that increasing the noise intensity shifts both the cooperator-defector phase boundary and the absorbing boundary upward in the parameter space, simultaneously expanding the cooperative regime and the extinction region. These predictions are confirmed by numerical simulations. In contrast, quenched noise leaves the phase boundary nearly unchanged across all noise levels, exerting only a weak effect on cooperator frequency. Together, these results demonstrate that temporal fluctuations, rather than static spatial heterogeneity, are the primary driver of noise-induced shifts in the cooperative phase structure.


[6] 2606.18667

Can neurons speak? Semantic narration of vision at single-cell resolution

Identifying what individual neurons encode in higher-order visual cortex is an open problem. Responses resist intuitive parameterization, and the deep-network embeddings used in their place are black boxes. Here, we introduce NEURRATOR, a framework that decodes spiking activity into free-form natural-language narration of the viewed scene at single-neuron resolution. A learned encoder maps spike trains from arbitrary subsets of simultaneously recorded neurons into the patch-embedding space of a frozen CLIP model, from which a multimodal language model and a sparse autoencoder generate and validate a description with no language-side training. Applied to Neuropixels recordings of mouse visual cortex during natural-movie viewing, NEURRATOR narrates from thousands of neurons, single cortical regions, local populations, or molecularly defined cell types. We use this property to (i) quantify how decoding fidelity scales with population size and cortical region, and (ii) "neurrate", in plain language, what individual neurons and genetically tagged inhibitory cell types contribute to visual representation. This recasts cell identity from a classification target into a functional probe of the visual system, providing a new unit of biological insight in neural systems.


[7] 2606.19081

Retrieval-Based Brain Decoding by Alignment, not Complexity

A prominent theory in cognitive science suggests that concepts in the brain are organized as high-dimensional vectors, with semantic meaning captured by directions and relative angles in this space. Brain decoding is the effort of reconstructing or retrieving stimuli (or their representations) from neural activity and involves finding a function that approximates how the brain represents concepts. This motivates the investigation of contrastive objectives as biologically plausible candidates to reverse the brain loss function. In this work, we study how functional MRI (fMRI) activity can generally be mapped with the embedding spaces of foundation models in vision, language, and audio. Although neural computations are highly non-linear at the microscale, fMRI measurements average signals across space and time, further smoothed by noise, effectively linearizing the observable representation. Consistent with these views, our experiments across multiple datasets demonstrate that linear contrastive decoders consistently outperform ridge regression and standard non-linear alternatives, and that these results generalize across images, text, and sound. These findings indicate that decoding gains arise more from the choice of training objective than from architectural complexity, pointing to contrastive-linear models as a principled strategy for brain decoding.
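
A linear contrastive decoder of the kind described can be sketched as a single matrix applied to brain features, scored with a symmetric InfoNCE loss against the stimulus embeddings. The abstract does not give the exact loss, temperature, or training procedure, so everything below is an assumption: the data are synthetic, the names are hypothetical, and the decoder matrix is obtained by least squares as a stand-in for actual contrastive training.

```python
import numpy as np

def l2_normalize(a):
    return a / np.linalg.norm(a, axis=1, keepdims=True)

def log_softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)   # for numerical stability
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def info_nce(pred, target, temperature=0.1):
    """Symmetric InfoNCE: row i of pred should match row i of target."""
    logits = l2_normalize(pred) @ l2_normalize(target).T / temperature
    diag = np.arange(len(logits))
    loss_pred_to_target = -log_softmax(logits)[diag, diag].mean()
    loss_target_to_pred = -log_softmax(logits.T)[diag, diag].mean()
    return 0.5 * (loss_pred_to_target + loss_target_to_pred)

def top1_retrieval(pred, target):
    """Fraction of rows whose most similar target is the matching one."""
    sims = l2_normalize(pred) @ l2_normalize(target).T
    return float((sims.argmax(axis=1) == np.arange(len(sims))).mean())

rng = np.random.default_rng(0)
targets = np.eye(4)                               # hypothetical stimulus embeddings
brain = targets @ rng.standard_normal((4, 6))     # synthetic features, linear in the embeddings
W = np.linalg.pinv(brain) @ targets               # stand-in for a trained linear decoder
pred = brain @ W

loss = info_nce(pred, targets)
accuracy = top1_retrieval(pred, targets)
```

The loss rewards the correct pairing relative to all other stimuli in the batch rather than an exact match in embedding space, which is the distinction from ridge regression that the abstract emphasizes.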


[8] 2606.19280

CollaboratoR: A scalable workflow for collaborative data entry and management

Effective collaborative data entry and transparency are foundational for building robust databases and high-quality data synthesis. Yet researchers often face inconsistent data entries, inadvertently introducing errors, misreadings, and inconsistencies that compromise data integrity. Despite the growing use of open-source tools, many still rely on inefficient formats or costly commercial platforms, while fewer adopt complex open-source solutions. These inefficiencies slow workflows and hinder researchers' ability to build foundational databases for synthesis research, including meta-analyses. To address this, we developed CollaboratoR, a customizable R package that automates data validation and aggregation, ensuring consistency and transparency and adhering to FAIR data principles, while optionally using Google Sheets for collaborative data entry and GitHub for version control. CollaboratoR fills the gap between ad-hoc spreadsheets and complex systems for data extraction in meta-analyses. Data are entered into shared Google Sheets, validated, and pushed to GitHub for version control, then re-validated after verification to ensure accuracy before finalizing. Tested in two case studies, plant competition and avian interaction databases, CollaboratoR proved effective at managing large collaborative datasets. In both, automated validation flagged common entry and formatting issues early, improving traceability and reducing time spent on post-hoc cleaning. This framework applies across disciplines where data synthesis informs data-driven decision-making, such as social science, ecology, and medical and pharmaceutical research. Ultimately, CollaboratoR offers guidance for efficient, transparent, and reproducible collaborative data management, enhancing research synthesis across fields and industries alike.


[9] 2606.18277

Multi-network comparison of between-farm contacts for infectious disease surveillance in swine production

Understanding how swine farms are interconnected, directly and indirectly, is essential to characterizing infectious disease transmission. This study aimed to describe the connectivity of swine farms across 11 network types, including vehicle movements (i.e., trucks and trailers), animal movements, and distance-based farm-to-farm contacts, to identify links among production types and farms likely to be consistently characterized as super-spreaders. Truck and trailer movement networks were the most densely connected, particularly for feed transport, showing connectivity levels between 98.7% and 99.7% higher than those of pig movement and distance-based networks. These networks also exhibited the highest degree and frequency of connections between farms, while the aggregated truck network, which included all truck types, showed the greatest potential to act as a bridge connecting farms. Finisher farms were highly interconnected with other farm types across all networks. Sow farms were frequently reached by other farm types, especially through feed truck movements, representing up to 8.7% of these links. We demonstrated that in vehicle movements and proximity networks, finisher farms played a major role as super-spreaders. When comparing the top 50 farms ranked by super-spreader score in each network, vehicle-based networks showed the highest similarity, with up to 89% of top-ranked farms shared between vehicle networks. In contrast, pig movement and distance-based networks identified largely distinct sets of top-ranked farms, sharing at most 4% and 8%, respectively, with other contact networks. Overall, each network exhibited a distinct connectivity structure, resulting in different sets of high-risk farms, particularly regarding potential transmission to breeding farms. These findings support the integration of multiple transmission pathways into disease surveillance.


[10] 2606.18390

MOLAR: Learning Multimodal Molecular Representations from Noisy Labels

Motivation: Noisy labels are a common challenge in molecular property prediction because molecular annotations are often obtained from assays, curated databases, or weak annotation pipelines rather than directly observed clean biological states. Treating recorded labels as reliable supervision can cause models to memorize corrupted observations and learn misleading molecular evidence. In multimodal molecular representation learning, this issue can be amplified by graph-text fusion or alignment, which may propagate label-induced errors across modalities. Results: We propose MOLAR, a noise-aware framework for learning multimodal molecular representations from noisy labels. MOLAR separates latent clean-property inference from recorded-label observation: graph and text views contribute residual evidence to a clean-property distribution, and a categorical label-observation channel maps this distribution to recorded labels for training. This formulation derives posterior label reliability and modality-specific molecular evidence from the model. Experiments on naturally noisy molecular benchmarks and controlled label-flipping benchmarks show that MOLAR consistently outperforms representative baselines. Visualization analyses further show that MOLAR provides interpretable reliability and modality-evidence diagnostics.


[11] 2606.18420

Measurement noise limits the advantage of nonlinear models over linear models in biomedical prediction

On biomedical tabular data, flexible models such as deep networks, gradient-boosted trees, and kernel methods are repeatedly matched or beaten by linear and logistic regression given the same features. The usual reaction is to treat this as a model-side shortfall, to be fixed with more data, a better architecture, or tuning, on the assumption that the nonlinear structure is there and the model has failed to capture it. We argue that these fixes cannot help when the binding limit is the measurement rather than the model, as it frequently is in biomedicine. Additive noise blurs the population-optimal predictor, and because blurring removes a function's fine, rapidly varying detail before its broad shape, it erases nonlinear structure faster than linear structure. A degree-$k$ interaction is attenuated by the $k$-th power of feature reliability, while the linear part is attenuated only once. At the reliabilities typical of biomedical measurement, the nonlinear advantage can vanish even when the underlying biology is strongly nonlinear, and what the noise removes cannot be recovered by a larger cohort or a more flexible model, only by better measurement. The nonlinearity is hidden, not absent, and a tie between linear and flexible models is not by itself a verdict on the biology. These pieces are classical, drawn from measurement-error statistics, psychometrics, and Gaussian analysis, and we assemble them into an exact excess-risk identity. Measurement reliability is one of three conditions, alongside sample size and feature representation, that must align for a flexible model to help, and together they leave only a narrow window that most biomedical tasks fall outside. Across 140 UK Biobank tasks, the gap between flexible and linear models, where it exists, carries the predicted noise signature, and the three conditions can be separated by intervention but not by a benchmark alone.
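
The claimed attenuation can be checked with a small simulation under strong simplifying assumptions that are not taken from the paper: independent standard-normal features, independent additive Gaussian noise, and an interaction defined as the product of k distinct features. With reliability r = var(true) / var(observed), the regression slope of the true product on the observed product should be close to r^k. The function name and settings are hypothetical.

```python
import numpy as np

def interaction_slope(k, reliability, n=200_000, seed=0):
    """Slope from regressing a true degree-k product on its noisy counterpart."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, k))                 # true features, variance 1
    noise_var = 1.0 / reliability - 1.0             # gives var(x) / var(w) = reliability
    w = x + np.sqrt(noise_var) * rng.standard_normal((n, k))
    target = x.prod(axis=1)                         # interaction of the true features
    observed = w.prod(axis=1)                       # same interaction of the noisy features
    # Both variables have mean zero, so a no-intercept least-squares slope suffices.
    return float(target @ observed / (observed @ observed))

reliability = 0.7
slopes = {k: interaction_slope(k, reliability) for k in (1, 2, 3)}
```

At a reliability of 0.7 the expected slopes are 0.7, 0.49, and 0.343 for k = 1, 2, 3, so the third-order term keeps roughly a third of its signal while the linear term keeps more than two thirds.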


[12] 2606.18495

Bayesian Sampling of Structural Ensembles: The Role of Ensemble-Counting Measures

Structural ensemble refinement is widely used to integrate molecular simulations with experimental measurements. While most applications focus on the maximum-a-posteriori (MAP) ensemble, Bayesian sampling of the posterior distribution can provide uncertainty estimates and posterior averages for arbitrary observables. A notable step in this direction was introduced by the Bayesian Energy Landscape Tilting (BELT) framework, where sampling is performed on a family of maximum-entropy ensembles parametrized by Lagrange multipliers. Here, we show that Bayesian sampling in this setting requires an explicit choice of ensemble-counting measure. In particular, the flat measure in Lagrange-multiplier space used in the original BELT formulation leads to a posterior distribution that is formally non-normalizable for finite reference trajectories. We propose the Jeffreys measure as an invariant ensemble-counting prescription, restoring normalizability in the finite-sample situations considered here, and providing a consistent definition of posterior averages. Using both an analytically tractable Gaussian model and maximum-entropy refinement of RNA oligomer simulations, we compare different ensemble-counting measures and show that they can significantly affect Bayesian estimates. The resulting methodology has been implemented in the MDRefine software package.
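
For orientation, the textbook form of a Jeffreys measure on a maximum-entropy family can be written out as follows. The notation is an assumption introduced here ($p_0$ for the reference ensemble, $s(x)$ for the observables, $\lambda$ for the Lagrange multipliers), and the paper's own definition may differ in detail.

```latex
p_{\lambda}(x) = \frac{p_0(x)\, e^{-\lambda \cdot s(x)}}{Z(\lambda)},
\qquad
Z(\lambda) = \int p_0(x)\, e^{-\lambda \cdot s(x)}\, \mathrm{d}x

I_{ij}(\lambda)
  = \frac{\partial^2 \log Z(\lambda)}{\partial \lambda_i\, \partial \lambda_j}
  = \operatorname{Cov}_{p_{\lambda}}\!\left[ s_i, s_j \right]

\mu_{\mathrm{J}}(\lambda)\, \mathrm{d}\lambda \;\propto\; \sqrt{\det I(\lambda)}\; \mathrm{d}\lambda
```

A flat measure would instead take $\mu(\lambda) \propto 1$. Unlike the flat measure, the Jeffreys form is invariant under reparametrization of $\lambda$, and because it is built from the covariance of the observables in the tilted ensemble, it changes as that covariance changes across $\lambda$.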


[13] 2606.18640

MetaboNet-Bench: A Multi-modal Benchmark for Glucose Forecasting in Type 1 Diabetes

Glucose forecasting algorithms are an important aspect of glycemic control management in type 1 diabetes. So far, the research community has developed numerous algorithms and models for forecasting. However, it is well recognized that the lack of standardized benchmarks for evaluating model performance makes fair comparison difficult and hinders further innovation, so benchmark standardization is urgently needed. Furthermore, many published glucose forecasting algorithms are limited to CGM data alone, ignoring other signals such as insulin dosing and carbohydrate intake. Here, we introduce MetaboNet-Bench, a benchmark for multimodal glucose forecasting for patients with type 1 diabetes that provides an extensible open-source evaluation framework for comparing glucose forecasting algorithms that leverage glucose, insulin, and carbohydrate data. We then demonstrate its utility by benchmarking several recently published glucose forecasting models and a custom multimodal time-series model, representing different model architectures. The results show that the benefit of adding data modalities is conditioned on the complexity of the model and that incorporating more clinical metrics helps identify meaningful gaps to fill for future research.


[14] 2606.18672

scGTN: Deep Siamese Graph Transformer Network for Single-cell RNA Sequencing Clustering

Single-cell RNA sequencing (scRNA-seq) plays a pivotal role in characterizing gene expression at the cellular level, enabling the identification of cell types and advancing the understanding of cellular heterogeneity. Despite significant progress in scRNA-seq data clustering, we argue that current methods largely ignore the sparsity and noise, as well as the complex intercellular structural information, inherent in scRNA-seq data. To this end, we propose a novel scRNA-seq clustering framework based on a deep Siamese Graph Transformer Network (termed scGTN), which explicitly integrates gene expression profiles and intercellular structural dependencies for cell clustering. In particular, we formulate scRNA-seq data as a graph and construct two augmented graph views that serve as dual views to capture complementary intercellular information. Then, a Siamese graph transformer network is employed to explicitly incorporate shortest-path information and node-wise distances for capturing richer structural relationships between cells. Finally, we employ an optimal transport strategy to guide the cell clustering in a self-supervised manner. Extensive experiments on multiple benchmark scRNA-seq datasets demonstrate that scGTN consistently outperforms existing methods. Our code is available at this https URL.


[15] 2606.18703

Contextualizing Biological Language Models across Modalities via Logit-Space Contrastive Alignment

Pretrained biological language models expose per-token probability distributions through masked-token prediction, providing the likelihood interface central to sequence design, variant scoring, and mechanistic interpretation. Yet these distributions are learned from broad unlabeled corpora and are not naturally conditioned on task-specific biological contexts such as interaction partners, cellular environments, or therapeutic interventions. Existing contextual matching methods often distort this interface through pooled embeddings, contrastive latent spaces, or task-specific prediction heads. We introduce LOGICA (Logit-space Contrastive Alignment), a framework for context-conditioned prediction that performs contrastive learning directly in output-logit space. Using gated cross-modal adapters compatible with each model's native token head, LOGICA preserves the pretrained likelihood interface and converts contextualized token log-likelihoods into matching scores. Alignment is defined through context-sensitive token probabilities rather than proximity in a shared embedding space, enabling learning from sparse paired data across models with distinct vocabularies, without a shared tokenizer or decoder. LOGICA is particularly effective for mutation-local variant ranking, where comparisons reduce to context-conditioned likelihoods of mutant tokens at perturbed sites. Across protein--ligand binding, TCR--peptide activity, and drug-conditioned resistance prediction, LOGICA improves over prior state-of-the-art methods, including matched latent-contrastive and conditional MLM baselines, while retaining a token-level interface for interpretation and generation. On held-out-gene single-mutation drug-resistance prediction, LOGICA improves AUC from near-random latent-space baselines of $\sim$0.55 to $\sim$0.65.


[16] 2505.13373

State- versus Reaction-Based Information Processing in Biochemical Networks

Trajectory mutual information is frequently used to quantify information transfer in biochemical systems. Tractable solutions of the trajectory mutual information can be obtained via the widely used Linear-Noise Approximation (LNA) using Gaussian channel theory. This approach is expected to be accurate for sufficiently large systems. However, recent observations show that there are cases where the mutual information obtained this way differs qualitatively from results derived using an exact Markov jump process formalism, and that the differences remain even in the large copy number regime. In this letter, we show that these differences can be explained by introducing the notion of reaction- versus state-based descriptions of trajectories. In chemical systems, the information is encoded in the sequence of reaction events, and the reaction-based trajectories of Markov jump processes capture this information. We show that within the Gaussian formalism, trajectories can be defined either based on individual reaction channels, or on a state-based level, where different reaction channels are summarised into a single noise term. While both definitions agree in terms of copy number fluctuations, state-based trajectories in general contain less information than reaction-based trajectories. The commonly used Gaussian mutual information via the Linear-Noise Approximation is consistent with a state-based trajectory notion, which causes a systematic loss of information independent of system size. We show that an alternative, reaction-based variant of the Gaussian mutual information prevents this loss of information. We illustrate the consequences of different trajectory descriptions for two common cellular reaction motifs and discuss their connection with Berg-Purcell and Maximum-Likelihood sensing.


[17] 2508.02400

Assimilation of machine learning-predicted nitrate to improve the quality of phytoplankton forecasting in the shelf sea environment

We demonstrate that assimilating Neural Network (NN)-predicted surface nitrate leads to a major improvement in phytoplankton short-range (1-5 day) dynamical model forecasts for the North-West European Shelf (NWES) seas. We show that assimilation of only ocean color chlorophyll-$a$ in the current Met Office NWES operational system can lead to excess surface nitrate concentrations in the post-Spring bloom period and these are a major reason behind some known, fast-growing biases in NWES phytoplankton forecasts during late Spring and Summer. Assimilating observations of nitrate would potentially help address this, but NWES nitrate data are typically not available in sufficient abundance to be effectively assimilated. We have therefore used a recently developed and validated neural network (NN) model predicting surface nitrate concentrations from a range of observable variables and assimilated the NN-predicted nitrate within a research and development version of the Met Office's NWES operational forecasting system. As a result of nitrate assimilation the phytoplankton 5-day forecast skill improves by up to 30%. We show that although much of this improvement can be achieved by using a weekly nitrate climatology predicted by the NN model, there is a clear advantage in using flow-dependent nitrate data. We discuss the impacts of this improvement on a range of additional eutrophication indicators, such as dissolved inorganic phosphorus and sea bottom oxygen. We argue that it should be feasible to upgrade this approach to a fully hybrid machine learning - data assimilation within the near-real time NWES operational forecasting system.


[18] 2508.10178

Estimating carbon pools in the European Shelf sea environment: replacing reanalysis by model-informed machine learning?

Shelf seas are important for the economy and the carbon cycle, but shelf sea observations for carbon pools are often sparse, or highly uncertain. An alternative can be provided by carbon reanalyses (whether assimilating proxy variables, such as chlorophyll-$a$, or directly carbon), but these are often expensive to run. We propose to use a computationally cheap ensemble of neural networks (i.e. deep ensemble) to learn the relationship between the directly observable (atmospheric, riverine and ocean) variables and marine carbon pools from a coupled physics-biogeochemistry model. The deep ensemble was trained on a North-West European Shelf (NWES) physical-biogeochemistry model free run simulation. After training, the deep ensemble was run using inputs from the NWES reanalysis instead of the free run, demonstrating that it can efficiently predict several NWES carbon pools (e.g., detritus, zooplankton, heterotrophic bacteria) in much better agreement with the reanalysis than the free run, while also providing uncertainty information. We further show that the deep ensemble performs similarly well when it is driven directly by the observations assimilated into the reanalysis, with the limitation that carbon pools can then be predicted only at the observed locations and times. We focus on explainability of the results and demonstrate potential use of the deep ensembles for future climate what-if scenarios. We suggest that model-informed machine learning presents a viable alternative to expensive reanalyses and could complement observations, wherever they are missing and/or highly uncertain.


[19] 2511.14555

DecNefSimulator: A Modular, Interpretable Framework for Decoded Neurofeedback Simulation Using Generative Models

Decoded Neurofeedback (DecNef) is a promising non-invasive approach to brain modulation with wide-ranging applications in neuromedicine and cognitive neuroscience. However, progress in DecNef research remains constrained by subject-dependent learning variability, reliance on indirect measures to quantify progress, and the high cost and time demands of experimentation. We present DecNefSimulator, a modular and interpretable simulation framework that formalizes DecNef as a machine learning problem. Beyond providing a virtual laboratory, DecNefSimulator enables researchers to model, analyze and understand neurofeedback dynamics. Using latent variable generative models as simulated participants, DecNefSimulator allows direct observation of internal cognitive states and systematic evaluation of how different protocol designs and subject characteristics influence learning. We demonstrate how this approach can (i) reproduce empirical phenomena of DecNef learning, (ii) identify conditions under which DecNef feedback fails to induce learning, and (iii) guide the design of more robust and reliable DecNef protocols in silico before human implementation. In summary, DecNefSimulator bridges computational modeling and cognitive neuroscience, offering a principled foundation for methodological innovation, robust protocol design, and ultimately, a deeper understanding of DecNef-based brain modulation.


[20] 2601.12805

SciHorizon-GENE: Benchmarking LLM for Life Sciences Inference from Gene Knowledge to Functional Understanding

Large language models (LLMs) have shown growing promise in biomedical research, particularly for knowledge-driven interpretation tasks. However, their ability to reliably reason from gene-level knowledge to functional understanding, a core requirement for knowledge-enhanced cell atlas interpretation, remains largely underexplored. To address this gap, we introduce SciHorizon-GENE, a large-scale gene-centric benchmark constructed from authoritative biological databases. The benchmark integrates curated knowledge for over 190K human genes and comprises more than 540K questions covering diverse gene-to-function reasoning scenarios relevant to cell type annotation, functional interpretation, and mechanism-oriented analysis. Motivated by behavioral patterns observed in preliminary examinations, SciHorizon-GENE evaluates LLMs along four biologically critical perspectives: research attention sensitivity, hallucination tendency, answer completeness, and literature influence, explicitly targeting failure modes that limit the safe adoption of LLMs in biological interpretation pipelines. We systematically evaluate a wide range of state-of-the-art general-purpose and biomedical LLMs, revealing substantial heterogeneity in gene-level reasoning capabilities and persistent challenges in generating faithful, complete, and literature-grounded functional interpretations. Our benchmark establishes a systematic foundation for analyzing LLM behavior at the gene scale and offers insights for model selection and development, with direct relevance to knowledge-enhanced biological interpretation.


[21] 2603.27465

Poisoning the Genome: Targeted Backdoor Attacks on DNA Foundation Models

Foundation models trained on DNA sequences have achieved strong performance across biological tasks including variant effect prediction and genome design. These models rely on massive public genomic datasets comprising trillions of nucleotide tokens. Unlike natural language, DNA sequences lack semantic transparency, making corrupted or adversarially crafted entries difficult to detect during data curation. We present the first systematic study of training data poisoning in genomic language models, targeting both pre-training and fine-tuning stages. At pre-training, using Evo 2 and GENERator architectures, we show that less than 1% adversarially crafted sequences in the training corpus can selectively degrade generative performance on targeted genomic contexts while leaving unrelated sequences unaffected. We evaluate three scenarios: corruption of TATA-box promoter motifs, disruption of CTCF binding sites, and insertion of synthetic sequences absent from all training genomes. At fine-tuning, we demonstrate two additional attacks. First, poisoning a subset of CTCF sites in a ClinVar-derived corpus installs a conditional backdoor in a LoRA-adapted model that activates almost exclusively when the trigger sequence is present. Second, using frozen Evo 2 7B embeddings, targeted label corruption of downstream training data selectively compromises a clinically relevant variant classification task, demonstrated on BRCA1 variant effect prediction. These results show genomic foundation models are susceptible to targeted data poisoning with minimal footprint. We urge the field to adopt data provenance tracking, integrity verification, and adversarial robustness evaluation as standard components of the genomic model development pipeline.


[22] 2605.01056

Numerical Reliability of Logistic Gene Regulatory Network Models: Preventing Expression Shutdown and Robust Integration of Boolean-Derived ODE Systems

Gene regulatory networks are routinely translated from Boolean update rules into large continuous ODE systems integrated numerically for attractor identification, sensitivity analysis, and control design. The reliability of that integration depends critically on the sigmoidal kernel representing regulation. This simulation study shows that the Hill function -- the near-universal choice -- is a generically unreliable kernel, while the logistic function is a robust replacement. Two failure modes are demonstrated. First, because the Hill function vanishes at zero input, bistable circuits acquire an absorbing off-state: with experimentally grounded \textit{E. coli} galactose-operon autoregulation parameters, a Hill model stays trapped below the unstable separatrix, whereas the logistic model -- whose basal rate is strictly positive by construction -- escapes in about $44$~minutes through basal production alone, matching an analytical estimate of ${\approx}58$~min. A saddle-node analysis characterises the bistable window via an explicit transcendental equation and identifies the threshold $\lambda\theta=2$ separating monostable from bistable regimes. Second, when the Hill exponent is non-integer -- as in dose-response fits -- the power law $x^n=e^{n\ln x}$ turns complex-valued whenever a solver overshoots into negative concentrations. On an $80$-gene Boolean-derived benchmark with $n\approx3.509$, the Hill solver is silently contaminated by complex values from $t\approx52.64$, yielding smooth but spurious trajectories, whereas the logistic formulation completes $t\in[0,200]$ without a single warning. Because the logistic vector field is globally Lipschitz with explicit constant, we further prove an a priori global-error bound of classical order -- a guarantee structurally unavailable to the Hill formulation.
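
The two failure modes described above can be reproduced in a few lines. The sketch below is only illustrative: the standard Hill form $x^n/(\theta^n+x^n)$ and the logistic form $1/(1+e^{-\lambda(x-\theta)})$ are assumed parameterizations (the abstract does not state the exact ones used), and the values of `theta` and `lam` are arbitrary. Only the exponent $n\approx3.509$ is taken from the text.

```python
import math


def hill(x, theta, n):
    """Assumed Hill activation x^n / (theta^n + x^n); equals 0 at x = 0."""
    xn = x ** n  # for x < 0 and non-integer n, Python returns a complex number
    return xn / (theta ** n + xn)


def logistic(x, theta, lam):
    """Assumed logistic activation; strictly positive for every real x."""
    return 1.0 / (1.0 + math.exp(-lam * (x - theta)))


theta, n, lam = 0.5, 3.509, 8.0  # theta and lam are illustrative values

# Failure mode 1: the Hill kernel has no basal production at zero input.
hill_at_zero = hill(0.0, theta, n)
logistic_at_zero = logistic(0.0, theta, lam)

# Failure mode 2: a small solver overshoot into negative concentration.
hill_negative = hill(-1e-3, theta, n)
logistic_negative = logistic(-1e-3, theta, lam)

print(hill_at_zero, logistic_at_zero)
print(type(hill_negative).__name__, type(logistic_negative).__name__)
```

Plain Python silently promotes the negative-base power to a complex number, which mirrors the silent contamination the abstract describes. Array libraries may instead return NaN with a warning, depending on their configuration.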


[23] 2502.02904

ScholaWrite: A Dataset of End-to-End Scholarly Writing Process

Writing is a cognitively demanding activity that requires constant decision-making, heavy reliance on working memory, and frequent shifts between tasks with different goals. To build writing assistants that truly align with writers' cognition, we must capture and decode the complete thought process behind how writers transform ideas into final texts. We present ScholaWrite, the first dataset of end-to-end scholarly writing, tracing the multi-month journey from initial drafts to final manuscripts. We contribute three key advances: (1) a Chrome extension that unobtrusively records keystrokes on Overleaf, enabling the collection of realistic, in-situ writing data; (2) a novel corpus of full scholarly manuscripts, enriched with fine-grained annotations of cognitive writing intentions, which includes \LaTeX-based edits from five computer science preprints and captures nearly 62K text changes over four months; and (3) analyses and insights into the micro-dynamics of scholarly writing, highlighting gaps between human writing processes and the current capabilities of large language models (LLMs) in providing meaningful assistance. ScholaWrite underscores the value of capturing end-to-end writing data to develop future writing assistants that support, not replace, the cognitive work of scientists.


[24] 2506.13506

Stimulus Motion Perception Studies Imply Specific Neural Computations in Human Visual Stabilization

Even during fixation, the human eye is constantly in low-amplitude motion, jittering over small angles in random directions at up to 100 Hz. This motion results in all features of the image on the retina constantly traversing a number of cones, yet objects that are stable in the world are perceived to be stable, and any object that is moving in the world is perceived to be moving. A series of experiments carried out over a dozen years revealed the psychophysics of visual stabilization to be more nuanced than might be assumed, say, from the mechanics of stabilization of camera images, or from what might be assumed to be the simplest solution from an evolutionary perspective. The psychophysics revealed by the experiments strongly implies a specific set of operations on retinal signals resulting in the observed stabilization behavior. The presentation is at two levels. First is a functional description of the action of the mechanism that is very likely responsible for the experimentally observed behavior. Second is a more speculative proposal of circuit-level neural elements that might implement the functional behavior.


[25] 2510.12614

Modeling Epidemics on Multiplex Networks: Epidemic Threshold and Basic Reproduction Number

Accurate epidemic forecasting requires models that account for the layered and heterogeneous nature of real social interactions. The basic reproduction number $\mathcal R_0$, as calculated from models that assume homogeneous mixing or single-layer contact structures, has limited applicability to complex social systems. Here, we derive an expression for $\mathcal R_0$ in the context of multiplex networks, enabling the analysis of disease transmission across multiple social layers. We adapt the Degree-Based Mean-Field (DBMF) SIR model for single-layer complex networks to the multiplex setting, where each layer is characterized by its own degree distribution and infection rate. Using the Next Generation Matrix method, we derive an analytical expression for the basic reproduction number $\mathcal R_0$. Numerical integration of the multiplex DBMF equations shows that $\mathcal R_0=1$ marks the epidemic threshold and governs the behavior of key outbreak indicators as expected. In addition to the exact expression for $\mathcal R_0$, we introduce an approximation, denoted by $\tau$, which is simpler to compute and admits a more transparent interpretation in terms of the epidemiological and topological parameters of the system. Stochastic agent-based simulations support these findings, demonstrating a direct correspondence between $\tau$ and the average number of secondary infections generated during the early stages of an outbreak, consistent with the epidemiological interpretation of $\mathcal R_0$. This work provides a robust generalization of $\mathcal R_0$ for layered contact structures, offering a more realistic basis for epidemic forecasting and the design of intervention strategies.
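
For readers unfamiliar with the Next Generation Matrix method, the generic recipe is $\mathcal R_0=\rho(FV^{-1})$, the spectral radius of the matrix of new infections $F$ times the inverse of the transition matrix $V$. The sketch below applies that recipe to a hypothetical two-group system with a diagonal $V$. The rates in `beta` and `gamma` are made-up values, and this is not the multiplex expression derived in the paper, which the abstract does not state.

```python
def next_generation_matrix(beta, gamma):
    """K = F V^{-1} for a diagonal V: K[i][j] = beta[i][j] / gamma[j]."""
    size = len(beta)
    return [[beta[i][j] / gamma[j] for j in range(size)] for i in range(size)]


def spectral_radius(matrix, iterations=200):
    """Dominant eigenvalue of a positive matrix via power iteration."""
    size = len(matrix)
    vector = [1.0] * size
    value = 0.0
    for _ in range(iterations):
        product = [
            sum(matrix[i][j] * vector[j] for j in range(size)) for i in range(size)
        ]
        value = max(abs(entry) for entry in product)  # max-norm growth factor
        vector = [entry / value for entry in product]
    return value


# Hypothetical transmission rates between two groups and their recovery rates.
beta = [[0.3, 0.1], [0.1, 0.2]]
gamma = [0.2, 0.2]

r0 = spectral_radius(next_generation_matrix(beta, gamma))
print(f"R0 = {r0:.4f}; epidemic possible: {r0 > 1}")
```

Power iteration is used here only to keep the example dependency-free. A linear algebra library's eigenvalue routine would serve equally well for larger systems.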


[26] 2511.05221

ActiTect: A Generalizable Machine Learning Pipeline for REM Sleep Behavior Disorder Screening through Standardized Actigraphy

Isolated rapid eye movement sleep behavior disorder (iRBD) is a major prodromal marker of $\alpha$-synucleinopathies, often preceding the clinical onset of Parkinson's disease, dementia with Lewy bodies, or multiple system atrophy. While wrist-worn actimeters hold significant potential for detecting RBD in large-scale screening efforts by capturing abnormal nocturnal movements, they become inoperable without a reliable and efficient analysis pipeline. This study presents ActiTect, a fully automated, open-source machine learning tool to identify RBD from actigraphy recordings. To ensure generalizability across heterogeneous acquisition settings, our pipeline includes robust preprocessing and automated sleep-wake detection to harmonize multi-device data and extract physiologically interpretable motion features characterizing activity patterns. Model development was conducted on a cohort of 78 individuals, yielding strong discrimination under nested cross-validation (AUROC = 0.95). Generalization was confirmed on a blinded local test set (n = 31, AUROC = 0.86) and on two independent external cohorts (n = 113, AUROC = 0.84; n = 57, AUROC = 0.94). To assess real-world robustness, leave-one-dataset-out cross-validation across the internal and external cohorts demonstrated consistent performance (AUROC range = 0.84-0.89). A complementary stability analysis showed that key predictive features remained reproducible across datasets, supporting the final pooled multi-center model as a robust pre-trained resource for broader deployment. By being open-source and easy to use, our tool promotes widespread adoption and facilitates independent validation and collaborative improvements, thereby advancing the field toward a unified and generalizable RBD detection model using wearable devices.
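
The AUROC values reported above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that pairwise definition. The scores and labels are invented for illustration and are unrelated to the study's data, and `auroc` is a hypothetical helper, not part of ActiTect.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up classifier scores and ground-truth labels (1 = positive case).
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 1, 0, 1, 0]
value = auroc(scores, labels)
print(f"AUROC = {value:.3f}")
```

The pairwise form is quadratic in the number of samples, which is acceptable for cohorts of the size mentioned here. A rank-based formulation would be preferable for much larger datasets.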


[27] 2603.04939

When minor issues matter: symmetries, pluralism, and polarization in similarity-based opinion dynamics

Understanding how opinions evolve through social interactions is crucial for mitigating polarization. Existing opinion-dynamics models incorporate both attractive and repulsive interactions but typically assume that all issues are equally important. We develop and analyze a stochastic agent-based model where issues carry heterogeneous weights that influence both social affinity and the likelihood of opinion change. Surprisingly, introducing even a single issue with arbitrarily small weight can destabilize otherwise stable states, increasing convergence times by orders of magnitude. To explain these dynamics, we derive a mean-field approach and characterize the equilibrium symmetries governing consensus, polarization, and persistent pluralism. A complete classification of these symmetries for up to five issues reveals that polarization increases when importance is concentrated on a small number of issues. Conversely, distributing importance more broadly across issues promotes diversity of opinions and reduces polarization. Our symmetry-based framework highlights how issue salience and social tolerance jointly shape collective opinion evolution.