New articles on Quantitative Biology


[1] 2503.03773

A Phylogenetic Approach to Genomic Language Modeling

Genomic language models (gLMs) have shown mostly modest success in identifying evolutionarily constrained elements in mammalian genomes. To address this issue, we introduce a novel framework for training gLMs that explicitly models nucleotide evolution on phylogenetic trees using multispecies whole-genome alignments. Our approach integrates an alignment into the loss function during training but does not require it for making predictions, thereby enhancing the model's applicability. We applied this framework to train PhyloGPN, a model that excels at predicting functionally disruptive variants from a single sequence alone and demonstrates strong transfer learning capabilities.


[2] 2503.03783

Passive Heart Rate Monitoring During Smartphone Use in Everyday Life

Resting heart rate (RHR) is an important biomarker of cardiovascular health and mortality, but tracking it longitudinally generally requires a wearable device, limiting its availability. We present PHRM, a deep learning system for passive heart rate (HR) and RHR measurements during everyday smartphone use, using facial video-based photoplethysmography. Our system was developed using 225,773 videos from 495 participants and validated on 185,970 videos from 205 participants in laboratory and free-living conditions, representing the largest validation study of its kind. Compared to reference electrocardiogram, PHRM achieved a mean absolute percentage error (MAPE) < 10% for HR measurements across three skin tone groups of light, medium and dark pigmentation; MAPE for each skin tone group was non-inferior versus the others. Daily RHR measured by PHRM had a mean absolute error < 5 bpm compared to a wearable HR tracker, and was associated with known risk factors. These results highlight the potential of smartphones to enable passive and equitable heart health monitoring.
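The two error metrics named in this abstract can be illustrated with a minimal sketch. The function names and the sample readings below are hypothetical and invented for illustration; they are not data or code from the study.

```python
def mape(reference, estimate):
    """Mean absolute percentage error (in percent) of estimate against reference."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(e - r) / abs(r) for r, e in zip(reference, estimate))
    return 100.0 * total / len(reference)


def mae(reference, estimate):
    """Mean absolute error, in the same units as the inputs (here bpm)."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - r) for r, e in zip(reference, estimate)) / len(reference)


# Hypothetical heart-rate readings in bpm (illustrative only).
ecg_bpm = [60.0, 72.0, 80.0, 100.0]    # reference, e.g. electrocardiogram
video_bpm = [63.0, 72.0, 76.0, 95.0]   # estimate, e.g. camera-based measurement

print(f"MAPE = {mape(ecg_bpm, video_bpm):.2f}%")
print(f"MAE  = {mae(ecg_bpm, video_bpm):.2f} bpm")
```

MAPE normalizes each error by the reference value, so it is comparable across resting and elevated heart rates, whereas MAE stays in bpm.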


[3] 2503.03784

Neural Models of Task Adaptation: A Tutorial on Spiking Networks for Executive Control

Understanding cognitive flexibility and task-switching mechanisms in neural systems requires biologically plausible computational models. This tutorial presents a step-by-step approach to constructing a spiking neural network (SNN) that simulates task-switching dynamics within the cognitive control network. The model incorporates biologically realistic features, including lateral inhibition, adaptive synaptic weights through unsupervised Spike Timing-Dependent Plasticity (STDP), and precise neuronal parameterization within physiologically relevant ranges. The SNN is implemented using Leaky Integrate-and-Fire (LIF) neurons, which represent excitatory (glutamatergic) and inhibitory (GABAergic) populations. We utilize two real-world datasets as tasks, demonstrating how the network learns and dynamically switches between them. Experimental design follows cognitive psychology paradigms to analyze neural adaptation, synaptic weight modifications, and emergent behaviors such as Long-Term Potentiation (LTP), Long-Term Depression (LTD), and Task-Set Reconfiguration (TSR). Through a series of structured experiments, this tutorial illustrates how variations in task-switching intervals affect performance and multitasking efficiency. The results align with empirically observed neuronal responses, offering insights into the computational underpinnings of executive function. By following this tutorial, researchers can develop and extend biologically inspired SNN models for studying cognitive processes and neural adaptation.
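As a rough illustration of the Leaky Integrate-and-Fire unit mentioned in this abstract, the sketch below integrates a single neuron with the forward Euler method. All parameter values and the function name are assumptions chosen for illustration; the tutorial's own network (populations, lateral inhibition, STDP) is not reproduced here.

```python
def simulate_lif(input_current, dt=0.1, tau_m=10.0, v_rest=-65.0,
                 v_reset=-65.0, v_thresh=-50.0, resistance=10.0):
    """Euler integration of tau_m * dV/dt = -(V - v_rest) + R * I.

    Units are assumed to be ms, mV, megaohm and nA.
    Returns the voltage trace and the list of spike times.
    """
    v = v_rest
    voltages, spike_times = [], []
    for step, current in enumerate(input_current):
        v += dt / tau_m * (-(v - v_rest) + resistance * current)
        if v >= v_thresh:                 # threshold crossing: emit a spike
            spike_times.append(step * dt)
            v = v_reset                   # hard reset after the spike
        voltages.append(v)
    return voltages, spike_times


# 100 ms of constant drive at two hypothetical amplitudes.
_, spikes_strong = simulate_lif([2.0] * 1000)  # steady state -45 mV, above threshold
_, spikes_weak = simulate_lif([1.0] * 1000)    # steady state -55 mV, below threshold
print(len(spikes_strong), len(spikes_weak))
```

The weaker drive settles below threshold and never fires, which is the leak term at work.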


[4] 2503.03786

Self is the Best Learner: CT-free Ultra-Low-Dose PET Organ Segmentation via Collaborating Denoising and Segmentation Learning

Organ segmentation in Positron Emission Tomography (PET) plays a vital role in cancer quantification. Low-dose PET (LDPET) provides a safer alternative by reducing radiation exposure. However, the inherent noise and blurred boundaries make organ segmentation more challenging. Additionally, existing PET organ segmentation methods rely on co-registered Computed Tomography (CT) annotations, overlooking the problem of modality mismatch. In this study, we propose LDOS, a novel CT-free ultra-LDPET organ segmentation pipeline. Inspired by Masked Autoencoders (MAE), we reinterpret LDPET as a naturally masked version of Full-Dose PET (FDPET). LDOS adopts a simple yet effective architecture: a shared encoder extracts generalized features, while task-specific decoders independently refine outputs for denoising and segmentation. By integrating CT-derived organ annotations into the denoising process, LDOS improves anatomical boundary recognition and alleviates the PET/CT misalignments. Experiments demonstrate that LDOS achieves state-of-the-art performance with mean Dice scores of 73.11% (18F-FDG) and 73.97% (68Ga-FAPI) across 18 organs in 5% dose PET. Our code is publicly available.
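The Dice score reported in this abstract measures overlap between a predicted and a reference mask. A minimal sketch follows, using flat label lists and hypothetical organ IDs; returning 1.0 when both masks are empty is a convention assumed here, not something stated in the abstract.

```python
def dice_score(pred_mask, true_mask):
    """Dice coefficient 2|A and B| / (|A| + |B|) for two binary masks."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * overlap / total


def mean_dice(pred_labels, true_labels, organ_ids):
    """Average the per-organ Dice over the given label IDs."""
    scores = [
        dice_score([p == organ for p in pred_labels],
                   [t == organ for t in true_labels])
        for organ in organ_ids
    ]
    return sum(scores) / len(scores)


# Hypothetical flattened label maps (0 = background, 1 and 2 = two organs).
pred = [0, 1, 1, 2, 2, 2, 0, 0]
true = [0, 1, 1, 1, 2, 2, 2, 0]
print(f"mean Dice = {100 * mean_dice(pred, true, [1, 2]):.2f}%")
```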


[5] 2503.03790

DDCSR: A Novel End-to-End Deep Learning Framework for Cortical Surface Reconstruction from Diffusion MRI

Diffusion MRI (dMRI) plays a crucial role in studying brain white matter connectivity. Cortical surface reconstruction (CSR), including the inner white matter (WM) and outer pial surfaces, is one of the key tasks in dMRI analyses such as fiber tractography and multimodal MRI analysis. Existing CSR methods rely on anatomical T1-weighted data and map them into the dMRI space through inter-modality registration. However, due to the low resolution and image distortions of dMRI data, inter-modality registration faces significant challenges. This work proposes a novel end-to-end learning framework, DDCSR, which for the first time enables CSR directly from dMRI data. DDCSR consists of two major components: (1) an implicit learning module to predict a voxel-wise intermediate surface representation, and (2) an explicit learning module to predict the 3D mesh surfaces. Compared to several baseline and advanced CSR methods, we show that the proposed DDCSR can largely increase both accuracy and efficiency. Furthermore, we demonstrate a high generalization ability of DDCSR to data from different sources, despite the differences in dMRI acquisitions and populations.


[6] 2503.03913

Proton Flows, Proton Gradients and Subcellular Architecture in Biological Energy Conversion

Hydrogen ions, or protons, provide the medium by which energy is stored and converted in biological systems. Such pre-eminence relies on the interplay between interfacial and bulk chemical transformations, according to mechanisms that are shared by organisms in all phyla of life. The present work provides an introduction to the fundamental aspects of biological energy management by focusing on the relationship between vectorial proton flows and the geometry of energy producing organelles in eukaryotes. The leading models of proton-mediated energy conversion, the delocalised proton (or chemiosmotic) model and the localised proton model, are presented in a complementary perspective. While the delocalised model provides a description that relies on equilibrium thermodynamics, the localised model addresses dynamic processes that are better described using out-of-equilibrium thermodynamics. The work reviews the salient aspects of such mechanisms, traces the development of our present understanding, and highlights areas that are open to future developments.


[7] 2503.03950

The Nature of Organization in Living Systems

Living systems are thermodynamically open but closed in their organization. In other words, even though their material components turn over constantly, a material-independent property persists, which we call organization. Moreover, organization comes from within organisms themselves, which requires us to explain how this self-organization is established and maintained. In this paper we propose a mathematical and conceptual framework to understand the kinds of organized systems that living systems are, aiming to explain how self-organization emerges from more basic elemental processes. Additionally, we map our own notions to existing traditions in theoretical biology and philosophy, aiming to bring the main formal ideas into conceptual congruence.


[8] 2503.03989

Integrating Protein Dynamics into Structure-Based Drug Design via Full-Atom Stochastic Flows

The dynamic nature of proteins, influenced by ligand interactions, is essential for comprehending protein function and progressing drug discovery. Traditional structure-based drug design (SBDD) approaches typically target binding sites with rigid structures, limiting their practical application in drug development. While molecular dynamics simulation can theoretically capture all the biologically relevant conformations, the transition rate is dictated by the intrinsic energy barrier between them, making the sampling process computationally expensive. To overcome the aforementioned challenges, we propose to use generative modeling for SBDD considering conformational changes of protein pockets. We curate a dataset of apo and multiple holo states of protein-ligand complexes, simulated by molecular dynamics, and propose a full-atom flow model (and a stochastic version), named DynamicFlow, that learns to transform apo pockets and noisy ligands into holo pockets and corresponding 3D ligand molecules. Our method uncovers promising ligand molecules and corresponding holo conformations of pockets. Additionally, the resultant holo-like states provide superior inputs for traditional SBDD approaches, playing a significant role in practical drug discovery.


[9] 2503.04069

Integrating network pharmacology, metabolomics, and gut microbiota analysis to explore the effects of Jinhong tablets on chronic superficial gastritis

Chronic superficial gastritis (CSG) severely affects quality of life and can progress to worse gastric pathologies. Traditional Chinese Medicine (TCM) effectively treats CSG, as exemplified by Jinhong Tablets (JHT) with known anti-inflammatory properties, though their mechanism remains unclear. This study integrated network pharmacology, untargeted metabolomics, and gut microbiota analyses to investigate how JHT alleviates CSG. A rat CSG model was established and evaluated via H&E staining. We identified JHT's target profiles and constructed a multi-layer biomolecular network. Differential metabolites in plasma were determined by untargeted metabolomics, and gut microbiota diversity/composition in fecal and cecal samples was assessed via 16S rRNA sequencing. JHT markedly reduced gastric inflammation. Network pharmacology highlighted metabolic pathways, particularly lipid and nitric oxide metabolism, as essential to JHT's therapeutic effect. Metabolomics identified key differential metabolites including betaine (enhancing gut microbiota), phospholipids, and citrulline (indicating severity of CSG). Pathway enrichment supported the gut microbiota's involvement. Further microbiota analysis showed that JHT increased betaine abundance, improved short-chain fatty acid production, and elevated Faecalibaculum and Bifidobacterium, thereby alleviating gastric inflammation. In conclusion, JHT alleviates CSG via diverse metabolic processes, especially lipid and energy metabolism, and influences metabolites like betaine alongside gut microbes such as Faecalibaculum and Bifidobacterium. These findings underscore JHT's therapeutic potential and deepen our understanding of TCM's role in CSG management.


[10] 2503.04200

DeepSilencer: A Novel Deep Learning Model for Predicting siRNA Knockdown Efficiency

Background: Small interfering RNA (siRNA) is a promising therapeutic agent due to its ability to silence disease-related genes via RNA interference. While traditional machine learning and early deep learning methods have made progress in predicting siRNA efficacy, there remains significant room for improvement. Advanced deep learning techniques can enhance prediction accuracy, reducing the reliance on extensive wet-lab experiments and accelerating the identification of effective siRNA sequences. This approach also provides deeper insights into the mechanisms of siRNA efficacy, facilitating more targeted and efficient therapeutic strategies. Methods: We introduce DeepSilencer, an innovative deep learning model designed to predict siRNA knockdown efficiency. DeepSilencer utilizes advanced neural network architectures to capture the complex features of siRNA sequences. Our key contributions include a specially designed deep learning model, an innovative online data sampling method, and an improved loss function tailored for siRNA prediction. These enhancements collectively boost the model's prediction accuracy and robustness. Results: Extensive evaluations on multiple test sets demonstrate that DeepSilencer achieves state-of-the-art performance using only siRNA sequences and basic physicochemical properties. Our model surpasses several other methods and shows superior predictive performance, particularly when incorporating thermodynamic parameters. Conclusion: The advancements in data sampling, model design, and loss function significantly enhance the predictive capabilities of DeepSilencer. These improvements underscore its potential to advance RNAi therapeutic design and development, offering a powerful tool for researchers and clinicians.


[11] 2503.04339

Reproductive system and interaction with fauna in a Mediterranean Pyrophyte shrub

The ULEX model, in its present state, describes the biomass and population of the shrub Ulex parviflorus Pourret. Although it is a dynamic model, it is static in the sense that it does not account for the appearance of new specimens of this plant. To complement the dynamic and spatial aspects of the ULEX model, and with a view to extending it, the authors introduce, from a biological and statistical point of view, four characteristics of this species: flowering, pollination, fructification, and seed dispersal, with special attention to the role played by the pollinators (bees).


[12] 2503.04477

Exact first passage time distribution for nonlinear chemical reaction networks II: monomolecular reactions and an A + B → C type of second-order reaction with arbitrary initial conditions

In biochemical reaction networks, the first passage time (FPT) of a reaction quantifies the time it takes for the reaction to first occur, from the initial state. While the mean FPT historically served as a summary metric, a far more comprehensive characterization of the dynamics of the network is contained within the complete FPT distribution. The relatively uncommon theoretical treatments of the FPT distribution that have been given in the past have been confined to linear systems, with zero and first-order processes. Recently, we presented theoretically exact solutions for the FPT distribution within nonlinear systems involving two-particle collisions, such as A + B → C. Although this research yielded invaluable results, it was based upon the assumption of initial conditions in the form of a Poisson distribution. This somewhat restricts its relevance to real-world biochemical systems, which frequently display intricate behaviour and initial conditions that are non-Poisson in nature. Our current study extends prior analyses to accommodate arbitrary initial conditions, thereby expanding the applicability of our theoretical framework and providing a more adaptable tool for capturing the dynamics of biochemical reaction networks.
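The FPT distribution described here can also be estimated empirically by stochastic simulation. The sketch below uses a Gillespie-style simulation of a small assumed network (zero-order production and first-order degradation of A and B, plus the bimolecular step) and records the time at which the bimolecular reaction first fires. The network, rate names and values are hypothetical; this is a Monte Carlo illustration, not the exact solution derived in the paper.

```python
import random


def sample_fpt(n_a, n_b, k_bind, k_in_a=0.0, k_in_b=0.0,
               k_out_a=0.0, k_out_b=0.0, rng=random):
    """Return one sample of the time until the A + B -> C reaction first fires.

    n_a and n_b are the initial copy numbers; any initial distribution can be
    handled by drawing them before calling this function.
    """
    t = 0.0
    while True:
        rates = [k_in_a, k_in_b, k_out_a * n_a, k_out_b * n_b, k_bind * n_a * n_b]
        total = sum(rates)
        if total == 0.0:
            return float("inf")          # nothing can happen any more
        t += rng.expovariate(total)      # waiting time to the next event
        pick = rng.random() * total      # choose which event occurred
        if pick < rates[0]:
            n_a += 1
        elif pick < rates[0] + rates[1]:
            n_b += 1
        elif pick < sum(rates[:3]):
            n_a -= 1
        elif pick < sum(rates[:4]):
            n_b -= 1
        else:
            return t                     # first A + B -> C event


# Sanity check: with one A, one B and no other reactions, the FPT is
# exponential with rate k_bind, so the sample mean should be near 1 / k_bind.
rng = random.Random(0)
samples = [sample_fpt(1, 1, 1.0, rng=rng) for _ in range(20000)]
mean_fpt = sum(samples) / len(samples)
print(f"mean FPT = {mean_fpt:.3f}")
```

A histogram of `samples` approximates the full FPT density, which can then be compared with an analytical result for the same initial condition.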


[13] 2503.04648

Assessing the performance of compartmental and renewal models for learning $R_{t}$ using spatially heterogeneous epidemic simulations on real geographies

The time-varying reproduction number ($R_t$) gives an indication of the trajectory of an infectious disease outbreak. Commonly used frameworks for inferring $R_t$ from epidemiological time series include those based on compartmental models (such as the SEIR model) and renewal equation models. These inference methods are usually validated using synthetic data generated from a simple model, often from the same class of model as the inference framework. However, in a real outbreak the transmission processes, and thus the infection data collected, are much more complex. The performance of common $R_t$ inference methods on data with similar complexity to real world scenarios has been subject to less comprehensive validation. We therefore propose evaluating these inference methods on outbreak data generated from a sophisticated, geographically accurate agent-based model. We illustrate this proposed method by generating synthetic data for two outbreaks in Northern Ireland: one with minimal spatial heterogeneity, and one with additional heterogeneity. We find that the simple SEIR model struggles with the greater heterogeneity, while the renewal equation model demonstrates greater robustness to spatial heterogeneity, though is sensitive to the accuracy of the generation time distribution used in inference. Our approach represents a principled way to benchmark epidemiological inference tools and is built upon an open-source software platform for reproducible epidemic simulation and inference.
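To make the renewal-equation idea concrete, the sketch below uses the relation $I_t = R_t \sum_{s \ge 1} w_s I_{t-s}$, where $w_s$ is the generation time distribution, and inverts it with a naive plug-in estimator. The function names, the generation time distribution and the synthetic incidence are assumptions for illustration; practical inference frameworks, including those assessed in the paper, add smoothing and uncertainty quantification that this sketch omits.

```python
def infection_pressure(incidence, gen_dist, t):
    """Sum of past incidence weighted by the generation time distribution."""
    return sum(w * incidence[t - s]
               for s, w in enumerate(gen_dist, start=1) if t - s >= 0)


def naive_rt(incidence, gen_dist):
    """Plug-in estimate R_t = I_t / sum_s w_s I_{t-s}; None where undefined."""
    estimates = []
    for t in range(len(incidence)):
        pressure = infection_pressure(incidence, gen_dist, t)
        estimates.append(incidence[t] / pressure if pressure > 0 else None)
    return estimates


# Hypothetical generation time distribution over lags of 1, 2 and 3 days.
gen_dist = [0.2, 0.5, 0.3]

# Synthetic, noise-free incidence generated with a constant R_t of 1.5.
incidence = [10.0]
for t in range(1, 15):
    incidence.append(1.5 * infection_pressure(incidence, gen_dist, t))

rt = naive_rt(incidence, gen_dist)
print([round(r, 3) for r in rt[1:]])
```

Re-running `naive_rt` with a different `gen_dist` from the one used to generate the data is a quick way to see the sensitivity to the generation time distribution that the abstract mentions.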


[14] 2406.12723

BIOSCAN-5M: A Multimodal Dataset for Insect Biodiversity

As part of an ongoing worldwide effort to comprehend and monitor insect biodiversity, this paper presents the BIOSCAN-5M Insect dataset to the machine learning community and establishes several benchmark tasks. BIOSCAN-5M is a comprehensive dataset containing multi-modal information for over 5 million insect specimens, and it significantly expands existing image-based biological datasets by including taxonomic labels, raw nucleotide barcode sequences, assigned barcode index numbers, and geographical and size information. We propose three benchmark experiments to demonstrate the impact of the multi-modal data types on the classification and clustering accuracy. First, we pretrain a masked language model on the DNA barcode sequences of the BIOSCAN-5M dataset, and demonstrate the impact of using this large reference library on species- and genus-level classification performance. Second, we propose a zero-shot transfer learning task applied to images and DNA barcodes to cluster feature embeddings obtained from self-supervised learning, to investigate whether meaningful clusters can be derived from these representation embeddings. Third, we benchmark multi-modality by performing contrastive learning on DNA barcodes, image data, and taxonomic information. This yields a general shared embedding space enabling taxonomic classification using multiple types of information and modalities. The code repository of the BIOSCAN-5M Insect dataset is available at https://github.com/bioscan-ml/BIOSCAN-5M.


[15] 2503.04221

Random search with stochastic resetting: when finding the target is not enough

In this paper we consider a random search process with stochastic resetting and a partially accessible target $\mathcal{U}$. That is, when the searcher finds the target by attaching to its surface $\partial \mathcal{U}$ it does not have immediate access to the resources within the target interior. After a random waiting time, the searcher either gains access to the resources within or detaches and continues its search process. We also assume that the searcher requires an alternating sequence of periods of bulk diffusion interspersed with local surface interactions before being able to attach to the surface. The attachment, detachment and target entry events are the analogs of adsorption, desorption and absorption of a particle by a partially reactive surface in physical chemistry. In applications to animal foraging, the resources could represent food or shelter while resetting corresponds to an animal returning to its home base. We begin by considering a Brownian particle on the half-line with a partially accessible target at the origin $x=0$. We calculate the non-equilibrium stationary state (NESS) in the case of reversible adsorption and obtain the corresponding first passage time (FPT) density for absorption when adsorption is only partially reversible. We then reformulate the stochastic process in terms of a pair of renewal equations that relate the probability density and FPT density for absorption in terms of the corresponding quantities for irreversible adsorption. The renewal equations allow us to incorporate non-Markovian models of absorption and desorption. They also provide a useful decomposition of quantities such as the mean FPT (MFPT) in terms of the number of desorption events and the statistics of the waiting time density. Finally, we consider various extensions of the theory, including higher-dimensional search processes and an encounter-based model of absorption.


[16] 2503.04347

Large Language Models for Zero-shot Inference of Causal Structures in Biology

Genes, proteins and other biological entities influence one another via causal molecular networks. Causal relationships in such networks are mediated by complex and diverse mechanisms, through latent variables, and are often specific to cellular context. It remains challenging to characterise such networks in practice. Here, we present a novel framework to evaluate large language models (LLMs) for zero-shot inference of causal relationships in biology. In particular, we systematically evaluate causal claims obtained from an LLM using real-world interventional data. This is done over one hundred variables and thousands of causal hypotheses. Furthermore, we consider several prompting and retrieval-augmentation strategies, including large, and potentially conflicting, collections of scientific articles. Our results show that with tailored augmentation and prompting, even relatively small LLMs can capture meaningful aspects of causal structure in biological systems. This supports the notion that LLMs could act as orchestration tools in biological discovery, by helping to distil current knowledge in ways amenable to downstream analysis. Our approach to assessing LLMs with respect to experimental data is relevant for a broad range of problems at the intersection of causal learning, LLMs and scientific discovery.


[17] 2503.04362

A Generalist Cross-Domain Molecular Learning Framework for Structure-Based Drug Discovery

Structure-based drug discovery (SBDD) is a systematic scientific process that develops new drugs by leveraging the detailed physical structure of the target protein. Recent advancements in pre-trained models for biomolecules have demonstrated remarkable success across various biochemical applications, including drug discovery and protein engineering. However, in most approaches, the pre-trained models primarily focus on the characteristics of either small molecules or proteins, without delving into their binding interactions which are essential cross-domain relationships pivotal to SBDD. To fill this gap, we propose a general-purpose foundation model named BIT (an abbreviation for Biomolecular Interaction Transformer), which is capable of encoding a range of biochemical entities, including small molecules, proteins, and protein-ligand complexes, as well as various data formats, encompassing both 2D and 3D structures. Specifically, we introduce Mixture-of-Domain-Experts (MoDE) to handle the biomolecules from diverse biochemical domains and Mixture-of-Structure-Experts (MoSE) to capture positional dependencies in the molecular structures. The proposed mixture-of-experts approach enables BIT to achieve both deep fusion and domain-specific encoding, effectively capturing fine-grained molecular interactions within protein-ligand complexes. Then, we perform cross-domain pre-training on the shared Transformer backbone via several unified self-supervised denoising tasks. Experimental results on various benchmarks demonstrate that BIT achieves exceptional performance in downstream tasks, including binding affinity prediction, structure-based virtual screening, and molecular property prediction.


[18] 2503.04365

A Protocol to Exposure Path Analysis for Multiple Stressors Associated with Cardiovascular Disease Risk: A Novel Approach Using NHANES Data

Background: Multiple medical and non-medical stressors, along with the complexity of their exposure pathways, have posed significant challenges to the epidemiological interpretation of non-communicable diseases, including cardiovascular disease (CVD). Objective: To develop a protocol for deconstructing the complex exposure pathways linking various stressors to adverse outcomes and to elucidate in depth the sequential determinants contributing to CVD risk. Methods: We developed a Path-Lasso approach, rooted in Adaptive Lasso regression, to construct the network and paths that interpret the determinants of CVD in depth, using data from the National Health and Nutrition Examination Survey (NHANES). Univariate logistic regression was first employed to screen for all potential factors influencing CVD. A programmed approach using the Path-Lasso technique then stratified covariates and established a causal network to predict CVD risk. Results: Age, smoking and waist circumference were identified as the most significant predictors of CVD risk. Other factors, such as race, marital status, physical activity, cadmium exposure and diabetes, acted as intermediary or proximal variables. All these stressors (nodes) formed a network with paths (edges) linking to CVD, in which the latent layer variables causally associated with the outcome are formed as linear combinations of the stressors in each layer. Discussion: The Path-Lasso approach revealed the epidemiological pathways linking covariates to CVD risk, which is instrumental in elucidating the inter-covariate transitions in their prediction of the outcome and in providing a hierarchical network as a foundation for assessing CVD risk and beyond.


[19] 2503.04483

InfoSEM: A Deep Generative Model with Informative Priors for Gene Regulatory Network Inference

Inferring Gene Regulatory Networks (GRNs) from gene expression data is crucial for understanding biological processes. While supervised models are reported to achieve high performance for this task, they rely on costly ground truth (GT) labels and risk learning gene-specific biases, such as class imbalances of GT interactions, rather than true regulatory mechanisms. To address these issues, we introduce InfoSEM, an unsupervised generative model that leverages textual gene embeddings as informative priors, improving GRN inference without GT labels. InfoSEM can also integrate GT labels as an additional prior when available, avoiding biases and further enhancing performance. Additionally, we propose a biologically motivated benchmarking framework that better reflects real-world applications such as biomarker discovery and reveals learned biases of existing supervised methods. InfoSEM outperforms existing models by 38.5% across four datasets when using a textual-embedding prior and further boosts performance by 11.1% when integrating labeled data as priors.


[20] 2503.04490

Large Language Models in Bioinformatics: A Survey

Large Language Models (LLMs) are revolutionizing bioinformatics, enabling advanced analysis of DNA, RNA, proteins, and single-cell data. This survey provides a systematic review of recent advancements, focusing on genomic sequence modeling, RNA structure prediction, protein function inference, and single-cell transcriptomics. Meanwhile, we also discuss several key challenges, including data scarcity, computational complexity, and cross-omics integration, and explore future directions such as multimodal learning, hybrid AI models, and clinical applications. By offering a comprehensive perspective, this paper underscores the transformative potential of LLMs in driving innovations in bioinformatics and precision medicine.


[21] 2503.04527

The nexus between disease surveillance, adaptive human behavior and epidemic containment

Epidemics exhibit interconnected processes that operate at multiple time and organizational scales, a hallmark of complex adaptive systems. Modern epidemiological modeling frameworks incorporate feedback between individual-level behavioral choices and centralized interventions. Nonetheless, the realistic operational course for disease detection, planning, and response is often overlooked. Disease detection is a dynamic challenge, shaped by the interplay between surveillance efforts and transmission characteristics. It serves as a tipping point that triggers emergency declarations, information dissemination, adaptive behavioral responses, and the deployment of public health interventions. Evaluating the impact of disease surveillance systems as triggers for adaptive behavior and public health interventions is key to designing effective control policies. We examine the multiple behavioral and epidemiological dynamics generated by the feedback between disease surveillance and the intertwined dynamics of information and disease propagation. Specifically, we study the intertwined dynamics between: $(i)$ disease surveillance triggering health emergency declarations, $(ii)$ risk information dissemination producing decentralized behavioral responses, and $(iii)$ centralized interventions. Our results show that robust surveillance systems that quickly detect a disease outbreak can trigger an early response from the population, leading to large epidemic sizes. The key result is that the response scenarios that minimize the final epidemic size are determined by the trade-off between the risk information dissemination and disease transmission, with the triggering effect of surveillance mediating this trade-off. Finally, our results confirm that behavioral adaptation can create a hysteresis-like effect on the final epidemic size.


[22] 2503.04572

Social Imitation Dynamics of Vaccination Driven by Vaccine Effectiveness and Beliefs

Declines in vaccination coverage for vaccine-preventable diseases, such as measles and chickenpox, have enabled their surprising comebacks and pose significant public health challenges in the wake of growing vaccine hesitancy. Vaccine opt-outs and refusals are often fueled by beliefs concerning perceptions of vaccine effectiveness and exaggerated risks. Here, we quantify the impact of competing beliefs -- vaccine-averse versus vaccine-neutral -- on social imitation dynamics of vaccination, alongside the epidemiological dynamics of disease transmission. These beliefs may be pre-existing and fixed, or coevolving attitudes. This interplay among beliefs, behaviors, and disease dynamics demonstrates that individuals are not perfectly rational; rather, they base their vaccine uptake decisions on beliefs, personal experiences, and social influences. We find that the presence of a small proportion of fixed vaccine-averse beliefs can significantly exacerbate the vaccination dilemma, making the tipping point in the hysteresis loop more sensitive to changes in individuals' perceived costs of vaccination and vaccine effectiveness. However, in scenarios where competing beliefs spread concurrently with vaccination behavior, their double-edged impact can lead to self-correction and alignment between vaccine beliefs and behaviors. The results show that coevolution of vaccine beliefs and behaviors makes populations more sensitive to abrupt changes in perceptions of vaccine cost and effectiveness compared to scenarios without beliefs. Our work provides valuable insights into harnessing the social contagion of even vaccine-neutral attitudes to overcome vaccine hesitancy.


[23] 2503.04659

Predicting Heteropolymer Phase Separation Using Two-Chain Contact Maps

Phase separation in polymer solutions often correlates with single-chain and two-chain properties, such as the single-chain radius of gyration, Rg, and the pairwise second virial coefficient, B22. However, recent studies have shown that these metrics can fail to distinguish phase-separating from non-phase-separating heteropolymers, including intrinsically disordered proteins (IDPs). Here we introduce an approach to predict heteropolymer phase separation from two-chain simulations by analyzing contact maps, which capture how often specific monomers from the two chains are in physical proximity. Whereas B22 summarizes the overall attraction between two chains, contact maps preserve spatial information about their interactions. To compare these metrics, we train phase-separation classifiers for both a minimal heteropolymer model and a chemically specific, residue-level IDP model. Remarkably, simple statistical properties of two-chain contact maps predict phase separation with high accuracy, vastly outperforming classifiers based on Rg and B22 alone. Our results thus establish a transferable and computationally efficient method to uncover key driving forces of IDP phase behavior based on their physical interactions in dilute solution.


[24] 2503.04677

Capacitive response of biological membranes

We present a minimal model to analyze the capacitive response of a biological membrane subjected to a step voltage via blocking electrodes. Through a perturbative analysis of the underlying electrolyte transport equations, we show that the leading-order relaxation of the transmembrane potential is governed by a capacitive timescale, ${\tau_{\rm C} =\dfrac{\lambda_{\rm D}L}{D}\left(\dfrac{2+\Gamma\delta^{\rm M}/L}{4+\Gamma\delta^{\rm M}/\lambda_{\rm D}}\right)}$, where $\lambda_{\rm D}$ is the Debye screening length, $L$ is the electrolyte width, $\Gamma$ is the ratio of the dielectric permittivity of the electrolyte to the membrane, $\delta^{\rm M}$ is the membrane thickness, and $D$ is the ionic diffusivity. This timescale is considerably shorter than the traditional RC timescale ${\lambda_{\rm D} L / D}$ for a bare electrolyte due to the membrane's low dielectric permittivity and finite thickness. Beyond the linear regime, however, salt diffusion in the bulk electrolyte drives a secondary, nonlinear relaxation process of the transmembrane potential over a longer timescale ${\tau_{\rm L} =L^2/4\pi^2 D}$. A simple equivalent-circuit model accurately captures the linear behavior, and the perturbation expansion remains applicable across the entire range of observed physiological transmembrane potentials. Together, these findings underscore the importance of the faster capacitive timescale and nonlinear effects on the bulk diffusion timescale in determining transmembrane potential dynamics for a range of biological systems.
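The two timescales quoted in this abstract can be evaluated numerically. The sketch below codes the expression for $\tau_{\rm C}$ as written and reads $\tau_{\rm L} = L^2/4\pi^2 D$ as $L^2/(4\pi^2 D)$, which is an assumption about the intended grouping. All parameter values are hypothetical order-of-magnitude choices, not values from the paper.

```python
import math


def tau_rc(debye_len, width, diffusivity):
    """Bare-electrolyte RC timescale lambda_D * L / D."""
    return debye_len * width / diffusivity


def tau_capacitive(debye_len, width, diffusivity, gamma, membrane_thickness):
    """Capacitive timescale tau_C from the abstract's closed-form expression."""
    factor = ((2.0 + gamma * membrane_thickness / width)
              / (4.0 + gamma * membrane_thickness / debye_len))
    return tau_rc(debye_len, width, diffusivity) * factor


def tau_bulk(width, diffusivity):
    """Bulk diffusion timescale, read here as L^2 / (4 * pi^2 * D)."""
    return width ** 2 / (4.0 * math.pi ** 2 * diffusivity)


# Hypothetical parameters in SI units (illustrative only).
params = dict(debye_len=1e-9, width=1e-6, diffusivity=1e-9)
t_rc = tau_rc(**params)
t_c = tau_capacitive(gamma=40.0, membrane_thickness=4e-9, **params)
t_l = tau_bulk(params["width"], params["diffusivity"])
print(f"tau_RC = {t_rc:.2e} s, tau_C = {t_c:.2e} s, tau_L = {t_l:.2e} s")
```

For these values the ordering is $\tau_{\rm C} < \tau_{\rm RC} < \tau_{\rm L}$, consistent with the abstract's description of a fast capacitive relaxation followed by a slower bulk diffusion process.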


[25] 2503.04716

Optimal Cell Shape for Accurate Chemical Gradient Sensing in Eukaryote Chemotaxis

Accurate gradient sensing is crucial for efficient chemotaxis in noisy environments, but the relationship between cell shape deformations and sensing accuracy is not well understood. Using a theoretical framework based on maximum likelihood estimation, we show that the receptor dispersion, quantified by the convex hull of the cell shape, fundamentally limits gradient sensing accuracy. Cells with a concave shape and isotropic error space achieve optimal performance in gradient detection. This concave shape, resulting from active protrusions or contractions, can significantly improve gradient sensing accuracy at the cost of increased energy expenditure. By balancing sensing accuracy and deformation cost, we predict that a concave, three-branched shape is optimal for cells in shallow gradients. To achieve efficient chemotaxis, our theory suggests that a cell should adopt a repeating "run-and-expansion" cycle. Our theoretical predictions align well with experimental observations, implying that the fast amoeboid cell motion is optimized near the physical limit for chemotaxis. This study highlights the crucial role of active cell shape deformation in facilitating accurate chemotaxis.