Truncus arteriosus (TA) is a rare and severe congenital heart disease. Quadricuspid valve morphology occurs in 25% of all TA patients and is linked to regurgitation and increased risk of re-operation. It remains unclear how hemodynamic changes after TA repair alter valve performance. This study simulated pre- and postoperative conditions in a neonatal TA patient to investigate valve performance without direct intervention. We hypothesize that valve performance before and after truncal repair can be predicted in-silico, matching in-vivo imaging and identifying mechanisms by which hemodynamic changes after repair reduce valve regurgitation without direct intervention. Pre- and postoperative CT images of a neonatal patient with a quadricuspid valve were segmented. Free edge length and geometric height from the patient's echocardiogram were used to model the valve. For the preoperative condition, ventricular pressures were set equal, modeling an unrestricted ventricular septal defect. Systemic and pulmonary resistances were tuned based on the patient's Qp:Qs ratio. For the postoperative condition, boundary conditions were modified to mimic patient-specific hemodynamics after TA repair. The preoperative simulation confirmed mild valve regurgitation seen in-vivo. Interaction between the asymmetric flow and the surrounding vessel resulted in asymmetric opening and closing. Poor central coaptation led to a central regurgitant jet toward the septum. Altered postoperative hemodynamics improved coaptation and eliminated regurgitation, as seen in-vivo. This modeling approach reproduced in-vivo pre- and postoperative valve performance and identified mechanisms that improve coaptation after TA repair. TA repair led to elimination of regurgitation due to enhanced central coaptation. Thus, altered postoperative hemodynamic conditions after TA repair may improve valve performance without direct leaflet intervention.
Epithelial cells regulate ion concentrations and volume through coordinated membrane pumps, ion channels, and paracellular pathways, which can be modeled by classical single-compartment pump-leak equations (PLEs). Many epithelial functions, however, depend on the interaction between a cell and an enclosed luminal space, a geometry that cannot be captured by classical PLEs. To address this, we develop a two-compartment model consisting of an intracellular compartment coupled to a luminal compartment through the apical membrane, with both compartments interfacing with an infinite extracellular bath and connected to it through the basolateral membrane and a paracellular pathway. Building on the five-dimensional single-cell PLEs, we formulate a ten-dimensional PLE system for this geometry and derive analytical equilibria and steady-state formulas for both the passive system and the Na+/K+-ATPase (NKA)-driven active system. We characterize how these equilibria depend on physiologically relevant parameters, analyze local stability across wide parameter ranges, and apply global sensitivity and robustness methods to identify the principal determinants of ion and volume homeostasis. The model reveals fundamental differences between basolateral and apical placement of the NKA, including the onset of luminal volume blow-up when apical potassium recycling is insufficient. More broadly, this framework provides a mathematically tractable and physiologically grounded foundation for studying epithelial transport and for predicting conditions under which pump localization and conductance changes lead to stable function or pathological lumen expansion.
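To make the modeling ingredients concrete, here is a minimal Python sketch of a classical single-compartment pump-leak system with a saturating NKA-like pump, i.e., the building block the abstract extends to two compartments; the parameter values, the pump form, and the quasi-steady membrane potential are illustrative assumptions and not the paper's ten-dimensional model.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

# All parameter values below are illustrative placeholders, not the paper's.
RT_F = 8.314 * 310.0 / 96485.0               # thermal voltage (V)
ext = {"Na": 140.0, "K": 5.0, "Cl": 145.0}   # fixed bath concentrations (mM)
g = {"Na": 0.01, "K": 0.3, "Cl": 0.2}        # leak conductances (arbitrary units)
z = {"Na": 1, "K": 1, "Cl": -1}
p_max, Km_Na = 0.05, 10.0                    # NKA strength and Na affinity
X = 130.0                                    # impermeant intracellular solute (amount)
Lp = 0.1                                     # water permeability (arbitrary units)

def nernst(ion, c_in):
    return RT_F / z[ion] * np.log(ext[ion] / c_in)

def pump(Na_i):
    """Na/K-ATPase cycle rate, saturating in intracellular Na."""
    return p_max * (Na_i / (Na_i + Km_Na)) ** 3

def membrane_potential(conc):
    """Quasi-steady V from zero total membrane current (3Na:2K pump -> +1 charge out)."""
    def I_total(V):
        leak = sum(g[i] * (V - nernst(i, conc[i])) for i in conc)
        return leak + pump(conc["Na"])
    return brentq(I_total, -0.3, 0.3)

def rhs(t, y):
    nNa, nK, nCl, w = y                      # ion amounts and cell volume
    conc = {"Na": nNa / w, "K": nK / w, "Cl": nCl / w}
    V = membrane_potential(conc)
    # leak fluxes into the cell (current divided by valence), plus pump stoichiometry
    dn = {i: -g[i] * (V - nernst(i, conc[i])) / z[i] for i in conc}
    dn["Na"] -= 3 * pump(conc["Na"])
    dn["K"] += 2 * pump(conc["Na"])
    dw = Lp * ((nNa + nK + nCl + X) / w - sum(ext.values()))  # osmotic water flow
    return [dn["Na"], dn["K"], dn["Cl"], dw]

y0 = [10.0, 140.0, 10.0, 1.0]                # initial ion amounts and volume (illustrative)
sol = solve_ivp(rhs, (0.0, 200.0), y0, method="LSODA")
print("final concentrations (mM):", sol.y[:3, -1] / sol.y[3, -1])
print("final volume:", sol.y[3, -1])
```

The two-compartment formulation of the paper adds a second (luminal) set of such balance equations coupled through apical and paracellular fluxes; this sketch only illustrates the single-cell backbone.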
Cerebellar-like networks, in which input activity patterns are separated by projection to a much higher-dimensional space before classification, are a recurring neurobiological motif, present in the cerebellum, dentate gyrus, insect olfactory system, and electrosensory system of the electric fish. Their relatively well-understood design presents a promising test-case for probing principles of biological learning. The circuits' expansive projections have long been modelled as random, enabling effective general purpose pattern separation. However, electron-microscopy studies have discovered interesting hints of structure in both the fly mushroom body and mouse cerebellum. Recent numerical work suggested that this non-random connectivity enables the circuit to prioritise learning of some, presumably natural, tasks over others. Here, rather than numerical results, we present a robust mathematical link between the observed connectivity patterns and the cerebellar circuit's learning ability. In particular, we extend a simplified kernel regression model of the system and use recent machine learning theory results to relate connectivity to learning. We find that the reported structure in the projection weights shapes the network's inductive bias in intuitive ways: functions are easier to learn if they depend on inputs that are oversampled, or on collections of neurons that tend to connect to the same hidden layer neurons. Our approach is analytically tractable and pleasingly simple, and we hope it continues to serve as a model for understanding the functional implications of other processing motifs in cerebellar-like networks.
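The inductive-bias statement can be made concrete in standard kernel-regression terms (a generic sketch, not the paper's specific derivation). Writing the expansion-layer kernel through its eigendecomposition and expanding a target function in the same basis,

\[
K(\mathbf{x},\mathbf{x}') \;=\; \sum_k \lambda_k\, \phi_k(\mathbf{x})\,\phi_k(\mathbf{x}'),
\qquad
f^{*}(\mathbf{x}) \;=\; \sum_k c_k\, \phi_k(\mathbf{x}),
\]

targets whose weight $c_k$ is concentrated on modes with large eigenvalues $\lambda_k$ are learned from fewer examples; structured (non-random) projection weights reshape the $\lambda_k$ and $\phi_k$, which is the sense in which connectivity determines which functions are easy to learn.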
There is little debate about the importance of the ancestral recombination graph (ARG) in population genetics. Although it is an important theoretical tool, the main obstacle to its widespread usage is the computational cost required to match the ever-increasing scale of the data being analyzed. Many of these difficulties have been overcome in the past two decades, which have consequently seen the development of increasingly sophisticated ARG simulation and inference software. Nonetheless, challenges remain, especially in the area of ancestry inference. This paper is a comprehensive review of ARG samplers that have emerged in the past three decades to meet the need for scalable and flexible ancestry simulation and inference solutions. It focuses specifically on their performance, usability, and the biological realism of the underlying algorithms, and aims primarily to provide a technical overview of the field for researchers seeking to write their own coalescent-with-recombination sampler. As a complement to this article, we have compiled links to software, source code, and documentation and made them available at this https URL.
Efficient resolution of neuroinflammation and debris clearance is a key determinant of successful central nervous system regeneration. Regenerative vertebrates such as Danio rerio often exhibit faster immune resolution and debris clearance than mammals, yet the molecular determinants underlying these differences remain incompletely understood. TAM receptor tyrosine kinases (Tyro3, Axl, and Mertk) and their ligands Gas6 and Protein S are central regulators of phagocytosis and immune resolution in the nervous system, but whether intrinsic structural properties of these receptor-ligand complexes contribute to regenerative efficiency has not been systematically explored. Here, we present a comparative in silico analysis of TAM receptors and ligands from zebrafish, human, and mouse, integrating sequence evolution, high-confidence structural modeling, interface characterization, and electrostatic analysis. Despite substantial sequence divergence, ligand-binding domains display strong structural conservation, supporting a conserved global mode of TAM-ligand engagement. At the interface level, zebrafish complexes show enhanced electrostatic contributions and increased salt-bridge density, particularly in the Tyro3-Protein S interaction. Residue-level electrostatic analysis reveals clustered interface hotspots that are spatially conserved across species despite evolutionary rewiring of individual contacts. Together, these results suggest that TAM receptor-ligand interfaces are evolutionarily tuned through subtle electrostatic and geometric optimization rather than large-scale structural changes, providing a conserved yet adaptable framework for species-specific modulation of phagocytic signaling.
Diffusion models have emerged as a powerful class of generative models for molecular design, capable of capturing complex structural distributions and achieving high fidelity in 3D molecule generation. However, their widespread use remains constrained by long sampling trajectories, stochastic variance in the reverse process, and limited structural awareness in denoising dynamics. The Directly Denoising Diffusion Model (DDDM) mitigates these inefficiencies by replacing stochastic reverse MCMC updates with a deterministic denoising step, substantially reducing inference time. Yet, the theoretical underpinnings of such deterministic updates have remained opaque. In this work, we provide a principled reinterpretation of DDDM through the lens of the Reverse Transition Kernel (RTK) framework of Huang et al. (2024), unifying deterministic and stochastic diffusion under a shared probabilistic formalism. By expressing the DDDM reverse process as an approximate kernel operator, we show that the direct denoising process implicitly optimizes a structured transport map between noisy and clean samples. This perspective elucidates why deterministic denoising achieves efficient inference. Beyond theoretical clarity, this reframing resolves several long-standing bottlenecks in molecular diffusion. The RTK view ensures numerical stability by enforcing well-conditioned reverse kernels, improves sample consistency by eliminating stochastic variance, and enables scalable and symmetry-preserving denoisers that respect SE(3) equivariance. Empirically, we demonstrate that RTK-guided deterministic denoising achieves faster convergence and higher structural fidelity than stochastic diffusion models, while preserving chemical validity across the GEOM-DRUGS dataset. Code, models, and datasets are publicly available in our project repository.
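As a point of reference for what a deterministic reverse update looks like, the DDIM-style map (shown purely for orientation; DDDM's own update need not take this exact form) is

\[
x_{t-1} \;=\; \sqrt{\bar{\alpha}_{t-1}}\;\hat{x}_0(x_t,t)
\;+\; \sqrt{1-\bar{\alpha}_{t-1}}\;\hat{\epsilon}_\theta(x_t,t),
\qquad
\hat{x}_0(x_t,t) \;=\; \frac{x_t - \sqrt{1-\bar{\alpha}_t}\,\hat{\epsilon}_\theta(x_t,t)}{\sqrt{\bar{\alpha}_t}},
\]

so that each reverse step is a deterministic function of the current state, which can be read as a point-mass (degenerate) reverse transition kernel.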
Interval-censored covariates are frequently encountered in biomedical studies, particularly in time-to-event data or when measurements are subject to detection or quantification limits. Yet, the estimation of regression models with interval-censored covariates remains methodologically underdeveloped. In this article, we address the estimation of generalized linear models when one covariate is subject to interval censoring. We propose a likelihood-based approach, GELc, that builds upon an augmented version of Turnbull's nonparametric estimator for interval-censored data. We prove that the GELc estimator is consistent and asymptotically normal under mild regularity conditions, with available standard errors. Simulation studies demonstrate favorable finite-sample performance of the estimator and satisfactory coverage of the confidence intervals. Finally, we illustrate the method using two real-world applications: the AIDS Clinical Trials Group Study 359 and an observational nutrition study on circulating carotenoids. The proposed methodology is available as an R package at this http URL.
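One generic way to set up such a likelihood (a sketch of the overall construction; the exact GELc augmentation may differ) is to marginalize the GLM density over an estimated distribution of the censored covariate within each observed interval:

\[
\hat{L}(\beta) \;=\; \prod_{i=1}^{n} \int_{L_i}^{R_i} f\!\left(y_i \mid x,\, \mathbf{z}_i;\, \beta\right)\, d\hat{F}(x),
\]

where $[L_i, R_i]$ is the censoring interval for subject $i$'s covariate, $\mathbf{z}_i$ collects the fully observed covariates, and $\hat{F}$ is a Turnbull-type nonparametric estimate of the covariate distribution.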
Fisher's fundamental theorem describes the change caused by natural selection as the change in gene frequencies multiplied by the partial regression coefficients for the average effects of genes on fitness. Fisher's result has generated extensive controversy in biology. I show that the theorem is a simple example of a general partition for change in regression predictions across altered contexts. By that rule, the total change in a mean response is the sum of two terms. The first ascribes change to the difference in predictor variables, holding constant the regression coefficients. The second ascribes change to altered context, captured by shifts in the regression coefficients. This general result follows immediately from the product rule for finite differences applied to a regression equation. Economics widely applies this same partition, the Oaxaca-Blinder decomposition, as a fundamental tool that, in appropriate situations, can be used for causal analysis. Recognizing the underlying mathematical generality clarifies Fisher's theorem, provides a useful tool for causal analysis, and reveals connections across disciplines.
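Written out for a linear prediction $\bar{y} = \sum_j \beta_j \bar{x}_j$, the partition follows from the product rule for finite differences (one standard grouping of the cross terms is shown; the other is obtained by swapping which factor is held at its new value):

\[
\Delta \bar{y} \;=\; \sum_j \beta_j\, \Delta \bar{x}_j \;+\; \sum_j \bar{x}_j'\, \Delta \beta_j,
\]

where the first sum holds the regression coefficients at their original values (change attributable to the predictors, Fisher's selection term) and the second weights the coefficient shifts by the new predictor means (change attributable to altered context).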
Analysis of learned representations has a blind spot: it focuses on $similarity$, measuring how closely embeddings align with external references, but similarity reveals only what is represented, not whether that structure is robust. We introduce $geometric$ $stability$, a distinct dimension that quantifies how reliably representational geometry holds under perturbation, and present $Shesha$, a framework for measuring it. Across 2,463 configurations in seven domains, we show that stability and similarity are empirically uncorrelated ($\rho \approx 0.01$) and mechanistically distinct: similarity metrics collapse after removing the top principal components, while stability retains sensitivity to fine-grained manifold structure. This distinction yields actionable insights: for safety monitoring, stability acts as a functional geometric canary, detecting structural drift nearly 2$\times$ more sensitively than CKA while filtering out the non-functional noise that triggers false alarms in rigid distance metrics; for controllability, supervised stability predicts linear steerability ($\rho = 0.89$-$0.96$); for model selection, stability dissociates from transferability, revealing a geometric tax that transfer optimization incurs. Beyond machine learning, stability predicts CRISPR perturbation coherence and neural-behavioral coupling. By quantifying $how$ $reliably$ systems maintain structure, geometric stability provides a necessary complement to similarity for auditing representations across biological and computational systems.
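To illustrate the distinction (a toy sketch only, not the Shesha implementation), similarity compares one representation against another, whereas a stability score asks how much a representation's own geometry changes under small perturbations; the function names, the perturbation scheme, and the toy model below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear_cka(X, Y):
    """Linear CKA similarity between two embedding matrices (samples x dims)."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    hsic = np.linalg.norm(Yc.T @ Xc, "fro") ** 2
    return hsic / (np.linalg.norm(Xc.T @ Xc, "fro") * np.linalg.norm(Yc.T @ Yc, "fro"))

def pairwise_geometry(X):
    """Vector of pairwise distances, used here as a simple geometric summary."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return d[np.triu_indices(len(X), k=1)]

def geometric_stability(embed, inputs, noise=0.05, trials=20):
    """Toy stability score: correlation of the embedding's pairwise geometry
    before vs. after small input perturbations (higher = more stable)."""
    base = pairwise_geometry(embed(inputs))
    scores = []
    for _ in range(trials):
        perturbed = inputs + noise * rng.standard_normal(inputs.shape)
        scores.append(np.corrcoef(base, pairwise_geometry(embed(perturbed)))[0, 1])
    return float(np.mean(scores))

# Hypothetical "model": a fixed random nonlinear embedding of 3D inputs into 16D
W = rng.standard_normal((3, 16))
embed = lambda x: np.tanh(x @ W)
X = rng.standard_normal((50, 3))

print("similarity to itself (CKA):", linear_cka(embed(X), embed(X)))
print("geometric stability under input noise:", geometric_stability(embed, X))
```

The point of the toy example is only that the two quantities answer different questions: CKA measures agreement between two representations, while the stability score measures how reliably a single representation's geometry persists under perturbation.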
Sweepstakes reproduction may be generated by chance matching of reproduction with favorable environmental conditions. Gene genealogies generated by sweepstakes reproduction are in the domain of attraction of multiple-merger coalescents where a random number of lineages merges at such times. We consider population genetic models of sweepstakes reproduction for haploid panmictic populations of both constant ($N$) and varying population size, evolving in a random environment. We construct our models so that we can recover the observed number of new mutations in a given sample without requiring strong assumptions regarding the population size or the mutation rate. Our main results are {\it (i)} continuous-time coalescents that are either the Kingman coalescent or specific families of Beta- or Poisson-Dirichlet coalescents; when combining the results, the parameter $\alpha$ of the Beta-coalescent ranges from 0 to 2, and the Beta-coalescents may be incomplete due to an upper bound on the number of potential offspring an arbitrary individual may produce; {\it (ii)} in large populations we measure time in units proportional to either $ N/\log N$ or $N$ generations; {\it (iii)} incorporating fluctuations in population size leads to time-changed multiple-merger coalescents where the time-change does not depend on $\alpha$; {\it (iv)} using simulations we show that in some cases approximations of functionals of a given coalescent do not match the ones of the ancestral process in the domain of attraction of the given coalescent; {\it (v)} approximations of functionals obtained by conditioning on the population ancestry (the ancestral relations of all gene copies at all times) are broadly similar (for the models considered here) to the approximations obtained without conditioning on the population ancestry.
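For reference, the multiple-merger ($\Lambda$-)coalescents appearing here have merger rates of the standard form

\[
\lambda_{b,k} \;=\; \int_0^1 x^{k-2}(1-x)^{b-k}\, \Lambda(dx),
\qquad
\Lambda \;=\; \mathrm{Beta}(2-\alpha,\, \alpha),
\quad 0 < \alpha < 2,
\]

giving the rate at which any particular $k$ of $b$ ancestral lineages merge; the Kingman coalescent is recovered as $\Lambda$ concentrates at zero (the boundary case $\alpha \to 2$).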
In this work, we characterized the material properties of an animal model of the rotator cuff tendon using full volume datasets of both its intact and injured states by capturing internal strain behavior throughout the tendon. Our experimental setup, involving tension along the fiber direction, activated volumetric, tensile, and shear mechanisms due to the tendon's complex geometry. We implemented an approach to model inference that we refer to as variational system identification (VSI) to solve the weak form of the stress equilibrium equation using these full volume displacements. Three constitutive models were used for parameter inference: a neo-Hookean model, a modified Holzapfel-Gasser-Ogden (HGO) model with higher-order terms in the first and second invariants, and a reduced polynomial model consisting of terms based on the first, second, and fiber-related invariants. Inferred parameters were further refined using an adjoint-based partial differential equation (PDE)-constrained optimization framework. Our results show that the modified HGO model captures the tendon's deformation mechanisms with reasonable accuracy, while the neo-Hookean model fails to reproduce key internal features, particularly the shear behavior in the injured tendon. Surprisingly, the simplified polynomial model performed comparably to the modified HGO formulation using only three terms. These findings suggest that while current constitutive models do not fully replicate the complex internal mechanics of the tendon, they are capable of capturing key trends in both intact and damaged tissue, using a homogeneous modeling approach. Continued model development is needed to bridge this gap and enable clinical-grade, predictive simulations of tendon injury and repair.
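The weak form being solved can be written generically (neglecting body forces and inertia; the paper's exact residual and traction handling may differ) as

\[
\int_{\Omega_0} \mathbf{P}(\mathbf{F};\,\boldsymbol{\theta}) : \nabla \mathbf{w}\; dV
\;-\; \int_{\partial \Omega_0^{t}} \mathbf{T}\cdot \mathbf{w}\; dA \;=\; 0
\qquad \forall\, \mathbf{w},
\]

where $\mathbf{F}$ is the deformation gradient computed from the measured full-volume displacements, $\mathbf{P}$ is the first Piola-Kirchhoff stress implied by a candidate constitutive model with parameters $\boldsymbol{\theta}$, $\mathbf{T}$ is the applied traction, and $\mathbf{w}$ is a test function; inference then seeks the $\boldsymbol{\theta}$ that minimizes the residual of this equation.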
Classifying Antimicrobial Peptides (AMPs) from the vast collection of peptides derived from metagenomic sequencing offers a promising avenue for combating antibiotic resistance. However, most existing AMP classification methods rely primarily on sequence-based representations and fail to capture the spatial structural information critical for accurate identification. Although recent graph-based approaches attempt to incorporate structural information, they typically construct residue- or atom-level graphs that introduce redundant atomic details and increase structural complexity. Furthermore, the class imbalance between the small number of known AMPs and the abundant non-AMPs significantly hinders predictive performance. To address these challenges, we employ lightweight OmegaFold to predict the three-dimensional structures of peptides and construct peptide graphs using C$\alpha$ atoms to capture their backbone geometry and spatial topology. Building on this representation, we propose the Spatial GNN-based AMP Classifier (SGAC), a novel framework that leverages Graph Neural Networks (GNNs) to extract structural features and generate discriminative graph representations. To handle class imbalance, SGAC incorporates Weight-enhanced Contrastive Learning to cluster structurally similar peptides and separate dissimilar ones through adaptive weighting, and applies Weight-enhanced Pseudo-label Distillation to generate high-confidence pseudo labels for unlabeled samples, achieving balanced and consistent representation learning. Experiments on publicly available AMP and non-AMP datasets demonstrate that SGAC achieves state-of-the-art performance, significantly outperforming baseline methods.
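A minimal sketch of the graph-construction step described above is shown below; the distance cutoff, the node features, and the use of pre-supplied coordinates in place of OmegaFold predictions are illustrative assumptions.

```python
import numpy as np
import torch
from torch_geometric.data import Data

def peptide_graph(ca_coords, residue_types, cutoff=8.0):
    """Build a C-alpha graph: nodes are residues, edges connect residues whose
    C-alpha atoms are within `cutoff` angstroms (cutoff value is illustrative)."""
    coords = np.asarray(ca_coords, dtype=np.float32)        # (n_res, 3)
    dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
    src, dst = np.where((dist < cutoff) & (dist > 0.0))     # exclude self-loops
    edge_index = torch.tensor(np.stack([src, dst]), dtype=torch.long)
    x = torch.tensor(residue_types, dtype=torch.long).unsqueeze(-1)  # residue IDs
    return Data(x=x, pos=torch.tensor(coords), edge_index=edge_index)

# Toy example: a 5-residue peptide with made-up coordinates and residue types
coords = [[0, 0, 0], [3.8, 0, 0], [7.6, 0, 0], [11.4, 0, 0], [15.2, 0, 0]]
graph = peptide_graph(coords, residue_types=[0, 3, 7, 3, 12])
print(graph)
```

A GNN classifier such as the one described in the abstract would then operate on graphs of this form rather than on raw sequences or full atomic structures.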
The ethical and legal imperative to share research data without causing harm requires careful attention to privacy risks. While mounting evidence demonstrates that data sharing benefits science, legitimate concerns persist regarding the potential leakage of personal information that could lead to reidentification and subsequent harm. We reviewed metadata accompanying neuroimaging datasets from heterogeneous studies openly available on OpenNeuro, involving participants across the lifespan, from children to older adults, with and without clinical diagnoses, and including associated clinical score data. Using metaprivBIDS (this https URL), a software application for BIDS-compliant tsv/json files that computes and reports different privacy metrics (k-anonymity, k-global, l-diversity, SUDA, PIF), we found that privacy is generally well maintained, with serious vulnerabilities being rare. Nonetheless, issues were identified in nearly all datasets and warrant mitigation. Notably, clinical score data (e.g., neuropsychological results) posed minimal reidentification risk, whereas demographic variables (age, sex assigned at birth, sexual orientation, race, income, and geolocation) represented the principal privacy vulnerabilities. We outline practical measures to address these risks, enabling safer data sharing practices.
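As a toy illustration of the k-anonymity idea behind one of the reported metrics (not the metaprivBIDS implementation), k is the size of the smallest group of participants sharing the same combination of quasi-identifiers; the table below is made up.

```python
import pandas as pd

# Hypothetical participants.tsv-style demographic table (made-up values)
df = pd.DataFrame({
    "age":  [24, 24, 67, 67, 31, 31, 31, 24],
    "sex":  ["F", "F", "M", "M", "F", "F", "M", "F"],
    "site": ["A", "A", "B", "B", "A", "A", "A", "A"],
})

def k_anonymity(frame, quasi_identifiers):
    """Smallest equivalence-class size over the chosen quasi-identifiers."""
    return int(frame.groupby(quasi_identifiers).size().min())

k = k_anonymity(df, ["age", "sex", "site"])
print(f"k-anonymity = {k}")   # k = 1 means at least one participant is unique
```

Low k on combinations of demographic variables, rather than on clinical scores, is exactly the kind of vulnerability the abstract highlights.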
Turning rich neuroimaging data into mechanistic insight remains challenging. Statistical models capture associations but remain largely agnostic to underlying mechanisms. Biophysical models embody candidate mechanisms but remain difficult to deploy without specialized expertise. Here, we present a hypothesis-first framework that recasts model specifications as testable mechanistic hypotheses and streamlines the procedure for rejecting inappropriate hypotheses before moving to typical analyses. The key innovation is an expectation of model behavior under feature generalization constraints: we compute the model's expected $Y$ output across the parameter space based on the likelihood for a broader/distinct feature $Z$. Mirror statistical models are derived from these expected outputs and compared to the empirical ones with standard statistics. In synthetic experiments, our framework rejected mis-specified hypotheses and penalized unnecessary degrees of freedom while retaining valid hypotheses. These results demonstrate a practical hypothesis-driven approach for using mechanistic models in neuroimaging without requiring advanced training, complementing traditional analyses.
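In a generic, hedged form, the expectation described above can be written as the model's output marginalized over the parameter distribution implied by the broader feature $Z$,

\[
\mathbb{E}\!\left[\,Y \mid Z\,\right]
\;=\; \int Y(\boldsymbol{\theta})\; p(\boldsymbol{\theta} \mid Z)\; d\boldsymbol{\theta},
\qquad
p(\boldsymbol{\theta} \mid Z) \;\propto\; p(Z \mid \boldsymbol{\theta})\, p(\boldsymbol{\theta}),
\]

with mirror statistical models then derived from these expected outputs and compared against their empirical counterparts.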
Cardiac muscle tissue exhibits highly non-linear hyperelastic and orthotropic material behavior during passive deformation. Traditional constitutive identification protocols therefore combine multiple loading modes and typically require multiple specimens and substantial handling. In soft living tissues, such protocols are challenged by inter- and intra-sample variability and by manipulation-induced alterations of mechanical response, which can bias inverse calibration. In this work we exploit spatially heterogeneous full-field kinematics as an information-rich alternative to multimodal testing. We adapt EUCLID, an unsupervised method for the automated discovery of constitutive models, towards Bayesian parameter inference for highly nonlinear, orthotropic constitutive models. Using synthetic myocardial tissue slabs, we demonstrate that a single heterogeneous biaxial experiment, combined with sparse reaction-force measurements, enables robust recovery of Holzapfel-Ogden parameters with quantified uncertainty, across multiple noise levels. The inferred responses agree closely with ground-truth simulations and yield credible intervals that reflect the impact of measurement noise on orthotropic material model inference. Our work supports single-shot, uncertainty-aware characterization of nonlinear orthotropic material models from a single biaxial test, reducing sample demand and experimental manipulation.
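For reference, the Holzapfel-Ogden myocardium energy being calibrated is commonly written (conventions vary slightly, and the fiber and sheet terms are often activated only in tension) as

\[
\Psi \;=\; \frac{a}{2b}\, e^{\,b(I_1-3)}
\;+\; \sum_{i\in\{f,s\}} \frac{a_i}{2b_i}\!\left[e^{\,b_i (I_{4i}-1)^2}-1\right]
\;+\; \frac{a_{fs}}{2b_{fs}}\!\left[e^{\,b_{fs}\, I_{8fs}^2}-1\right],
\]

with isotropic, fiber, sheet, and fiber-sheet contributions, so that eight parameters $(a, b, a_f, b_f, a_s, b_s, a_{fs}, b_{fs})$ are the targets of the Bayesian inference.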
Mapping habitat suitability, based on factors like host availability and environmental suitability, is a common approach to determining which locations are important for the spread of a species. Mapping habitat connectivity takes geographic analyses a step further, evaluating the potential roles of locations in biological invasions, pandemics, or species conservation. Locations with high habitat suitability may play a minor role in species spread if they are geographically isolated. Yet, a location with lower habitat suitability may play a major role in a species' spread if it acts as a bridge between regions that would otherwise be physically fragmented. Here we introduce the geohabnet R package, which evaluates the potential importance of locations for the spread of species through habitat landscapes. geohabnet incorporates key factors such as dispersal probabilities and habitat suitability in a network framework to better understand habitat connectivity for host-dependent species, such as pathogens, arthropod pests, or pollinators. geohabnet uses publicly available or user-provided datasets, six network centrality metrics, and a user-selected geographic scale. We provide examples using geohabnet for surveillance prioritization of emerging plant pests in Africa and the Americas. These examples illustrate how users can apply geohabnet for their species of interest and generate maps of the estimated importance of geographic locations for species spread. geohabnet provides a quick, open-source, and reproducible baseline to quantify a species' habitat connectivity across a wide range of geographic scales and to evaluate potential scenarios for the expansion of a species through habitat landscapes. geohabnet supports biosecurity programs, invasion science, and conservation biology when prioritizing management efforts for transboundary pathogens, pests, or endangered species.
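geohabnet itself is an R package; purely as an illustration of the underlying idea (a hypothetical Python sketch, not the package's API), habitat connectivity can be phrased as node centrality on a graph whose nodes are locations weighted by habitat suitability and whose edges are weighted by dispersal probability.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(1)

# Hypothetical raster cells: (x, y) location and habitat suitability in [0, 1]
locations = rng.uniform(0, 100, size=(30, 2))
suitability = rng.uniform(0, 1, size=30)

G = nx.Graph()
for i, s in enumerate(suitability):
    G.add_node(i, suitability=float(s))

beta = 0.05                                    # illustrative dispersal-kernel decay
for i in range(len(locations)):
    for j in range(i + 1, len(locations)):
        d = np.linalg.norm(locations[i] - locations[j])
        w = suitability[i] * suitability[j] * np.exp(-beta * d)
        if w > 1e-3:                           # drop negligible links
            G.add_edge(i, j, weight=w, cost=-np.log(w))

# Betweenness over least-cost paths (cost = -log dispersal weight); one of
# several possible centrality summaries of a location's bridging role
centrality = nx.betweenness_centrality(G, weight="cost")
top = sorted(centrality, key=centrality.get, reverse=True)[:5]
print("locations most likely to bridge spread:", top)
```

This mirrors the abstract's point that a moderately suitable but well-placed location can rank higher for spread than a highly suitable yet isolated one.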
This paper gives an in-depth theoretical analysis of the direction and speed selectivity properties of idealized models of the spatio-temporal receptive fields of simple cells and complex cells, based on the generalized Gaussian derivative model for visual receptive fields. According to this theory, the receptive fields are modelled as velocity-adapted affine Gaussian derivatives for different image velocities and different degrees of elongation. By probing such idealized receptive field models of visual neurons with moving sine waves of different angular frequencies and image velocities, we characterize the computational models with a probing method structurally similar to that used for characterizing the direction and speed selective properties of biological neurons. By comparison to results of neurophysiological measurements of direction and speed selectivity for biological neurons in the primary visual cortex, we find that our theoretical results are consistent with (i) velocity-tuned visual neurons that are sensitive to particular motion directions and speeds, and (ii) different visual neurons having broader vs. sharper direction and speed selective properties. Our theoretical results in combination with results from neurophysiological characterizations of motion-sensitive visual neurons are also consistent with a previously formulated hypothesis that the simple cells in the primary visual cortex ought to be covariant under local Galilean transformations, so as to enable processing of visual stimuli with different motion directions and speeds.
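The probe stimuli used in such characterizations are moving sine waves, which in a generic parameterization (the paper's exact convention may differ) can be written as

\[
f(x, y, t) \;=\; \sin\!\bigl(\omega\,(x \cos\theta + y \sin\theta - v\,t) + \varphi\bigr),
\]

a grating with angular frequency $\omega$, orientation $\theta$, phase $\varphi$, and image velocity $v$ along its normal; direction and speed tuning curves then follow from the receptive field's response amplitude as $\theta$ and $v$ are swept.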
During the COVID-19 crisis, mechanistic models have guided evidence-based decision making. However, time-critical decisions in a dynamic environment limit the time available to gather supporting evidence. We address this bottleneck by developing a graph neural network (GNN) surrogate of an age-structured and spatially resolved mechanistic metapopulation simulation model. This combined approach complements classical modeling approaches, which are mostly mechanistic, and purely data-driven machine learning approaches, which are often black-box. Our design of experiments spans outbreak and persistent-threat regimes, up to three contact change points, and age-structured contact matrices on a spatial graph with 400 nodes representing German counties. We benchmark multiple GNN layers and identify an ARMAConv-based architecture that offers a strong accuracy-runtime trade-off. Across simulation and prediction horizons of 30-90 days, allowing up to three contact change points, the surrogate model attains 10-27\% mean absolute percentage error (MAPE) while delivering (near) constant runtime with respect to the forecast horizon. Our approach accelerates evaluation by up to 28,670 times compared with the mechanistic model, allowing responsive decision support in time-critical scenarios and straightforward web integration. These results show how GNN surrogates can translate complex metapopulation models into immediate, reliable tools for pandemic response.
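A minimal sketch of an ARMAConv-based surrogate in PyTorch Geometric is shown below; the layer sizes, input features, and output handling are illustrative assumptions rather than the paper's architecture.

```python
import torch
from torch import nn
from torch_geometric.nn import ARMAConv

class EpiSurrogate(nn.Module):
    """Toy GNN surrogate: maps per-county features (e.g., age-stratified
    compartment states and contact covariates) to predicted future states."""
    def __init__(self, in_dim, hidden=64, out_dim=8):
        super().__init__()
        self.conv1 = ARMAConv(in_dim, hidden, num_stacks=2, num_layers=2)
        self.conv2 = ARMAConv(hidden, hidden, num_stacks=2, num_layers=2)
        self.head = nn.Linear(hidden, out_dim)

    def forward(self, x, edge_index):
        h = torch.relu(self.conv1(x, edge_index))
        h = torch.relu(self.conv2(h, edge_index))
        return self.head(h)                       # (num_nodes, out_dim)

# Toy graph: 400 nodes (counties), random mobility edges, 16 input features
num_nodes, in_dim = 400, 16
edge_index = torch.randint(0, num_nodes, (2, 3000))
x = torch.rand(num_nodes, in_dim)

model = EpiSurrogate(in_dim)
pred = model(x, edge_index)
print(pred.shape)   # torch.Size([400, 8])
```

Once trained on mechanistic simulation output, such a surrogate returns predictions in a single forward pass, which is the source of the runtime advantage reported in the abstract.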