Infectious diseases continue to pose significant public health challenges worldwide, requiring effective prevention and control strategies to mitigate their impact. Infectious diseases can be broadly classified into two groups: vaccine-preventable diseases (e.g., measles, polio, influenza, hepatitis B, pneumonia) and vaccine-non-preventable diseases (e.g., HIV/AIDS). Models of vaccine-preventable diseases are essential tools for understanding infectious disease dynamics, evaluating intervention strategies, and guiding public health policies. In this review article, we explore recent advances in modeling two particular vaccine-preventable infectious diseases. We consider both deterministic and stochastic models to comprehensively capture the complexity of disease transmission, vaccine efficacy, and population-level immunity. We highlight the application of these models to two forms of pneumonia: bacterial pneumonia caused by Streptococcus pneumoniae (S. pneumoniae) and viral pneumonia caused by respiratory syncytial virus (RSV). Pneumonia carries a substantial global burden, and modeling has played a crucial role in assessing vaccine impacts and optimizing immunization strategies to minimize the disease burden. By synthesizing recent methodologies and findings, this review provides valuable insights for future research and policy decisions aimed at improving vaccine-preventable disease control for pneumonia caused by S. pneumoniae and RSV.
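As a minimal illustration of the deterministic models such reviews survey, the sketch below integrates a generic SVIR (susceptible-vaccinated-infectious-recovered) system with a leaky vaccine; all parameter values and the leaky-vaccine assumption are illustrative, not drawn from any specific pneumonia model.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Minimal deterministic SVIR model: susceptibles are vaccinated at rate nu,
# and the vaccine is "leaky" (sigma scales residual susceptibility).
beta, gamma, nu, sigma = 0.4, 0.1, 0.01, 0.2  # illustrative values

def svir(t, y):
    S, V, I, R = y
    N = S + V + I + R
    dS = -beta * S * I / N - nu * S
    dV = nu * S - sigma * beta * V * I / N
    dI = beta * (S + sigma * V) * I / N - gamma * I
    dR = gamma * I
    return [dS, dV, dI, dR]

sol = solve_ivp(svir, (0, 365), [990, 0, 10, 0], dense_output=True)
print("peak infected:", sol.y[2].max())
```

A stochastic counterpart would replace these rates with event propensities in a Gillespie-type simulation, capturing the demographic noise the deterministic system averages out.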
The onset of life is often framed around membrane-bound compartments and encoded metabolism, leaving unresolved how spatial organization arose before stable boundaries. In this context, environmental gradients are usually treated as boundary conditions rather than as variables structuring chemical dynamics. We ask whether spatial localization and functional coupling can emerge under realistic environmental gradients in the absence of membranes, proposing that spatial variations in energy availability act as organizing variables that bias transport and reaction. We introduce a reaction-diffusion model in which interacting chemical species evolve within an externally imposed activity landscape defined by coupled gradients in pH, redox potential, and temperature, integrating diffusion, gradient-driven drift, and position-dependent reaction kinetics. We performed simulations across a range of gradient strengths representative of hydrothermal-vent-like conditions. Our results suggest that sufficiently strong gradients induce spontaneous accumulation of reactants, spatial alignment of reaction maxima, and the emergence of stable, confined chemical states. Localization arises above a threshold at which gradient-driven transport overcomes diffusive and degradative losses. We conclude that spatially structured energy landscapes can support organized chemical dynamics without predefined compartments, providing a mechanism for coupling and persistence in continuous media. Potential applications include experimental platforms for studying prebiotic chemistry, microfluidic systems with controlled gradients, and the design of chemically responsive materials.
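A minimal numerical caricature of such a model, in one dimension with periodic boundaries (the drift profile v(x), the kinetics k(x), and all parameter values are assumptions for illustration, not the paper's landscape):

```python
import numpy as np

# 1D sketch: du/dt = D u_xx - (v(x) u)_x + k(x) u (1 - u)
# v(x): drift down an imposed energy gradient; k(x): position-dependent kinetics.
L, nx, dt, nt = 1.0, 200, 1e-4, 20000
x = np.linspace(0, L, nx, endpoint=False)
dx = x[1] - x[0]
D = 1e-4
v = 0.05 * np.sin(2 * np.pi * x)       # gradient-driven drift (assumed form)
k = 1.0 + 0.5 * np.cos(2 * np.pi * x)  # position-dependent rate (assumed form)
u = 0.1 * np.ones(nx)

for _ in range(nt):
    lap = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2     # periodic diffusion
    adv = (np.roll(v * u, -1) - np.roll(v * u, 1)) / (2 * dx)  # centred drift flux
    u += dt * (D * lap - adv + k * u * (1 - u))

print("max/min concentration:", u.max(), u.min())
```

Sweeping the drift amplitude in such a scheme is one way to probe the localization threshold the abstract describes: below it, diffusion flattens the profile; above it, reactants pile up where drift converges.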
Cell proliferation and cell movement are fundamentally stochastic processes which lead to variability in the growth and spatial structure of cell populations in many biological settings, such as cell invasion, wound healing, and tumour growth. We develop stochastic, on-lattice agent-based models (ABMs) which incorporate volume exclusion, random movement, and multi-stage representations of the cell cycle. The multi-stage framework enables a more realistic representation of true cell cycle time distributions. We also introduce a novel form of myopic behaviour, in which cells sense their local environment when attempting to proliferate. For each ABM, we derive a corresponding continuum partial differential equation (PDE) description under the mean-field approximation. Using numerical simulations, we investigate how different proliferation mechanisms influence population-level dynamics in both the discrete and continuum models. In particular, we consider biologically relevant contexts of growth-to-confluence assays (using uniform initial conditions) and travelling wave behaviour associated with cell invasion. We examine how the PDE solutions compare with the behaviour of the corresponding ABMs averaged over many realisations.
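To see why the multi-stage representation matters, note that modelling the cell cycle as $K$ sequential exponential stages of rate $K/\tau$ yields an Erlang cycle-time distribution with mean $\tau$ and coefficient of variation $1/\sqrt{K}$, far narrower than the single exponential implicit in standard proliferation ABMs. A minimal sketch (stage counts and $\tau$ are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# K sequential exponential stages of rate K/tau give an Erlang(K, K/tau)
# cycle-time distribution: same mean tau, coefficient of variation 1/sqrt(K).
tau, n_cells = 20.0, 100_000
for K in (1, 4, 16):
    cycle_times = rng.exponential(tau / K, size=(n_cells, K)).sum(axis=1)
    cv = cycle_times.std() / cycle_times.mean()
    print(f"K={K:2d}: mean={cycle_times.mean():.1f}, CV={cv:.2f}")
```

With $K = 1$ the CV is 1 (pure exponential); increasing $K$ concentrates cycle times around the mean, matching measured distributions far better.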
The integration of single-cell proteomic data is often hindered by the fragmented nature of targeted antibody panels. To address this limitation, we introduce scpFormer, a transformer-based foundation model designed for single-cell proteomics. Pre-trained on over 390 million cells, scpFormer replaces standard index-based tokenization with a continuous, sequence-anchored approach. By combining Evolutionary Scale Modeling (ESM) with value-aware expression embeddings, it dynamically maps variable panels into a shared semantic space without artificial discretization. We demonstrate that scpFormer generates global cell representations that perform competitively in large-scale batch integration and unsupervised clustering. Moreover, its open-vocabulary architecture facilitates in silico panel expansion, assisting in the reconstruction of biological manifolds in sparse clinical datasets. Finally, this learned protein co-expression logic is transferable to bulk-omics tasks, supporting applications like cancer drug response prediction. scpFormer provides a versatile, panel-agnostic framework to facilitate scalable biomarker discovery and precision oncology.
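The tokenization idea can be caricatured in a few lines: each marker contributes an identity embedding (standing in for an ESM-derived protein embedding) modulated by continuous functions of its measured value, so variable panels map into one space without binning. Everything below (dimensions, markers, the value features, `W_val`) is an illustrative assumption, not the model's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64

# Sketch of value-aware tokens: a fixed identity embedding per marker
# (stand-in for an ESM-derived protein embedding) plus a smooth function
# of the continuous expression value -- no artificial discretization.
protein_emb = {"CD3": rng.normal(size=d), "CD19": rng.normal(size=d)}
W_val = rng.normal(size=(d, 2)) * 0.1  # learned in a real model; random here

def token(marker, value):
    v = np.array([value, np.log1p(value)])  # continuous value features
    return protein_emb[marker] + W_val @ v

cell_tokens = np.stack([token("CD3", 4.2), token("CD19", 0.3)])
print(cell_tokens.shape)  # (panel_size, d): any panel maps to one space
```

Because the token is anchored by the protein's sequence identity rather than a panel-specific index, a marker absent from training panels can still be embedded, which is what enables open-vocabulary panel expansion.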
Virtual cell modeling predicts molecular state changes under genetic perturbations in silico, which is essential for biological mechanism studies. However, existing approaches suffer from unconstrained reasoning, uninterpretable predictions, and retrieval signals that are weakly aligned with regulatory topology. To address these limitations, we propose AROMA, an Augmented Reasoning Over a Multimodal Architecture for virtual cell genetic perturbation modeling. AROMA integrates textual evidence, graph-topology information, and protein sequence features to model perturbation-target dependencies, and is trained with a two-stage optimization strategy to yield predictions that are both accurate and interpretable. We also construct two knowledge graphs and a perturbation reasoning dataset, PerturbReason, containing more than 498k samples, as reusable resources for the virtual cell domain. Experiments show that AROMA outperforms existing methods across multiple cell lines, and remains robust under zero-shot evaluation on an unseen cell line, as well as in knowledge-sparse, long-tail scenarios. Overall, AROMA demonstrates that combining knowledge-driven multimodal modeling with evidence retrieval provides a promising pathway toward more reliable and interpretable virtual cell perturbation prediction. Model weights are available at this https URL. Code is available at this https URL.
Biases in molecular evolution can significantly influence evolutionary trajectories. They have been described in a variety of contexts, such as development and mutation, but not for acquiring new functions (i.e., emergence). Here, we formalize the term emergence bias as the molecular predisposition that, upon mutation, biases a genetic sequence towards or against gaining new functions or causing new phenotypes. These biases have been observed in previous studies for the emergence of promoters, enhancers, and de novo proteins, but never formally characterized as such. In this Perspective piece, we describe these studies and synthesize their findings through the prism of a unifying term, emergence bias, to provide support for this new concept, and speculate on its molecular underpinnings. We believe that emergence biases may play an important role in evolutionary innovations.
Designing regulatory DNA elements with precise cell-type-specific activity is broadly relevant for cell engineering and gene therapy. Deep generative models can generate functional gene-regulatory elements, but existing methods struggle to achieve high specificity against undesired cell types while adhering to the genome's natural regulatory grammar. Here, we introduce DNA-CRAFT, a generative framework that integrates class-conditioned discrete diffusion with Monte Carlo tree search to design cell-type-specific and biologically faithful regulatory elements. We first train a discrete diffusion model on the ENCODE registry of 3.2 million candidate regulatory elements. Second, we condition the model to learn class-specific regulatory grammars of naturally occurring DNA sequences, including enhancers and promoters. Third, we employ conditional Monte Carlo tree guidance, an inference-time alignment algorithm designed to maximize the differential regulatory activity between desired and undesired cell types. By benchmarking DNA-CRAFT on regulatory sequence design tasks for human cell lines and immune cell types, we demonstrate that our model generates sequences with high predicted cell-type-specific activity and biological fidelity, achieving the best trade-offs compared to methods that use diffusion, autoregressive models, and gradient-based optimization.
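The inference-time guidance step can be caricatured with a much simpler, tree-free procedure: score candidate continuations with a differential-activity reward (desired-cell-type activity minus penalized undesired activity) and keep the best. The sketch below uses a toy motif-counting oracle in place of the learned activity predictors and greedy best-of-k selection in place of Monte Carlo tree search; every name and value in it is an illustrative stand-in:

```python
import numpy as np

rng = np.random.default_rng(0)
ALPHABET = list("ACGT")

def activity(seq, cell_type):
    # toy deterministic oracle; the real system uses learned activity predictors
    motif = {"T-cell": "GATA", "hepatocyte": "CCCC"}[cell_type]
    return seq.count(motif) / max(len(seq), 1)

def reward(seq, lam=1.0):
    # differential regulatory activity: high in the desired cell type only
    return activity(seq, "T-cell") - lam * activity(seq, "hepatocyte")

# Greedy best-of-k extension: a deliberately simplified, tree-free stand-in
# for conditional Monte Carlo tree guidance during diffusion denoising.
seq, k = "", 8
for _ in range(40):
    candidates = [seq + rng.choice(ALPHABET) for _ in range(k)]
    seq = max(candidates, key=reward)
print(seq, round(reward(seq), 3))
```

The tree search replaces this one-step greedy choice with lookahead over partial denoisings, trading compute for better optima of the same differential-activity objective.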
Lateral predictive coding (LPC) is a simple theoretical framework for understanding feature detection in biological neural circuits. Recent theoretical work [Huang et al., Phys. Rev. E 112, 034304 (2025)] successfully constructed optimal LPC networks capable of extracting non-Gaussian hidden input features by imposing a tradeoff between energetic cost and information robustness, but the resulting dynamical systems of recurrent interactions can be very slow to respond to external inputs. In the present paper, we investigate how to reduce this response time. We find that the characteristic response time of the LPC system can be reduced to values closely approaching the lower bound without compromising the mean predictive error (energetic cost) or the information robustness of signal transmission. We further demonstrate that optimal LPC networks with a modular structural organization and an extensively reduced number of lateral interactions perform as well as fully connected (all-to-all) networks in terms of feature detection performance, response time, energetic cost, and information robustness.
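The response-time issue has a simple linear caricature: for dynamics $\dot x = -x + Wx$, the slowest relaxation timescale is set by the smallest real part of the eigenvalues of $I - W$, so a small spectral gap means slow responses even when the fixed point is stable. A generic illustration (random coupling, not the paper's specific LPC construction):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200

# Linearized recurrent dynamics dx/dt = -x + W x relax with slowest timescale
# tau = 1 / min Re eig(I - W); strong recurrence shrinks the gap and slows
# the response without making the system unstable.
W = rng.normal(scale=0.9 / np.sqrt(N), size=(N, N))  # coupling strength 0.9
gap = np.min((1 - np.linalg.eigvals(W)).real)
print("spectral gap:", gap, "-> response time ~", 1 / gap)
```

Minimizing response time under fixed predictive error then amounts to shaping the interaction spectrum so the gap approaches its upper bound.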
Modern optical microscopes are fully motorised; however, transforming them into truly smart systems requires real-time adjustment of acquisition settings in response to detected objects and dynamic biological events. At the core are classification algorithms that commonly depend on customised software and are generally designed for narrowly defined biological applications. In addition, they often require substantial annotated datasets for effective training. We introduce a semi-supervised generative adversarial network (SGAN) for robust cell-cycle stage classification under low-resource conditions, adaptable to diverse cellular structures. The framework combines unlabelled microscopy images with synthetically generated samples to mitigate limited annotation, while preserving stable performance even when the unlabelled subset is class-imbalanced. Tested on the Mitocheck dataset, which features five mitosis classes, the model achieved $93 \pm 2\%$ accuracy using only 80 labelled images per class and 600 unlabelled images. The proposed algorithm is generic and can be readily adapted to new labelling schemes, classification targets, cell lines, or microscopy modalities through transfer learning. SGAN is well suited for integration into automated microscopes, enabling efficient and adaptable image analysis across diverse biological and microscopy applications.
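The semi-supervised objective can be sketched compactly in the style of Salimans et al. (2016), a standard SGAN formulation that plausibly matches the framework described (the exact losses are not given in the abstract): the discriminator has $K$ class logits, and "fake" acts as an implicit $(K+1)$-th class with logit fixed to zero.

```python
import torch
import torch.nn.functional as F

# Semi-supervised GAN discriminator loss (Salimans et al. 2016 style sketch):
# K class logits; D(real) = Z/(Z+1) with Z = sum_k exp(logit_k).
def sgan_discriminator_loss(logits_lab, labels, logits_unl, logits_fake):
    # supervised: ordinary cross-entropy on the labelled minibatch
    l_sup = F.cross_entropy(logits_lab, labels)
    # unlabelled real images: -log D(x) = -(z - softplus(z)), z = logsumexp
    z_unl = torch.logsumexp(logits_unl, dim=1)
    l_unl = -torch.mean(z_unl - F.softplus(z_unl))
    # generated images: -log(1 - D(G(z))) = softplus(z)
    l_fake = torch.mean(F.softplus(torch.logsumexp(logits_fake, dim=1)))
    return l_sup + l_unl + l_fake

K = 5  # e.g. five mitosis classes
logits = lambda n: torch.randn(n, K)  # stand-ins for discriminator outputs
loss = sgan_discriminator_loss(logits(8), torch.randint(0, K, (8,)),
                               logits(32), logits(32))
print(loss.item())
```

The unlabelled term is what lets 600 unannotated images sharpen the decision boundaries learned from only 80 labelled images per class.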
Recognizing individual animals over time is central to many ecological and conservation questions, including estimating abundance, survival, movement, and social structure. Recent advances in automated identification from images and even acoustic data suggest that this process could be greatly accelerated, yet their promise has not translated well into ecological practice. We argue that the main barrier is not the performance of the automated methods themselves, but a mismatch between how those methods are typically developed and evaluated, and how ecological data is actually collected, processed, reviewed, and used. Future progress, therefore, will depend less on algorithmic gains alone than on recognizing that the usefulness of automated identification is grounded in ecological context: it depends on what question is being asked, what data are available, and what kinds of mistakes matter. Only by centering these questions can we move toward automated identification of individuals that is not only accurate but also ecologically useful, transparent, and trustworthy.
Motivation: Protein function prediction is a challenging task and an open problem in computational biology. The Critical Assessment of protein Function Annotation (CAFA) is a triennial, community-driven initiative that provides an independent, large-scale evaluation of computational methods for protein function prediction through time-delayed benchmarking experiments. CAFA has played a key role in highlighting high-performing methodologies and fostering detailed analysis and exchange of ideas. However, outside the periodic CAFA challenges, there is no platform for the continuous evaluation of newly developed methods and tracking performance as function annotations accumulate. Results: Here we introduce the Longitudinal Assessment of Protein Function Annotation Models server (LAFA) as a persistent benchmarking system for protein function prediction methods. LAFA provides a continuous evaluation of containerized function prediction methods, enabling up-to-date and robust comparative assessment of method performance under evolving ground truth. LAFA accelerates methodological iteration, supports reproducibility, and offers a more dynamic and fine-grained view of progress in protein function prediction. Code and Data Availability: LAFA is available at this https URL. Detailed evaluation results can be found at this https URL
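For context, the headline CAFA evaluation metric that a server like LAFA would track is Fmax: the best F1 over a sweep of prediction-confidence thresholds. The sketch below is a simplified single-protein version (the official protein-centric metric averages precision over proteins with at least one prediction and recall over all benchmark proteins):

```python
import numpy as np

# Simplified CAFA-style Fmax for one protein: sweep a score threshold,
# compute precision/recall of predicted terms vs. the true annotation set,
# and keep the best F1.
def fmax(pred_scores, true_terms):
    best = 0.0
    for t in np.linspace(0.01, 1.0, 100):
        P = {g for g, s in pred_scores.items() if s >= t}
        if not P:
            continue
        prec = len(P & true_terms) / len(P)
        rec = len(P & true_terms) / len(true_terms)
        if prec + rec > 0:
            best = max(best, 2 * prec * rec / (prec + rec))
    return best

print(fmax({"GO:1": 0.9, "GO:2": 0.6, "GO:3": 0.2}, {"GO:1", "GO:3"}))
```

Under evolving ground truth, the same prediction file can be re-scored as annotations accumulate, which is precisely the longitudinal view LAFA provides.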
Generative AI is rapidly transforming how organizations create value and evaluate talent. While large language models enhance baseline output quality, they simultaneously introduce ambiguity in assessing human creativity, as observable artifacts may be partially or fully AI-generated. This paper reconceptualizes creativity as a distributional and process-based property that emerges under shared constraints and competitive incentives. We introduce a quantitative framework for measuring creativity as novelty in synthesis, operationalized through idea generation and idea transformation within embedding space. Empirical evaluation demonstrates that the proposed metrics align with intuitive judgments of creativity while capturing distinctions that surface-level quality assessments miss. We further identify a structural shift toward bimodal distributions of creative output in AI-mediated environments, with implications for hiring, leadership, and competitive strategy. The findings suggest that in the age of generative AI, distinctiveness rather than fluency becomes the primary signal of human creative capability.
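One plausible way to operationalize novelty in embedding space (an assumed form for illustration; the paper's metrics combine idea-generation and idea-transformation steps) is to score an idea by its distance to its nearest neighbours among prior submissions produced under the same constraints:

```python
import numpy as np

rng = np.random.default_rng(0)

# Novelty sketch: 1 minus the mean cosine similarity between an idea's
# embedding and its k nearest neighbours in a corpus of prior submissions.
# Corpus, dimensionality, and k are illustrative assumptions.
corpus = rng.normal(size=(500, 384))  # embeddings of prior submissions
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)

def novelty(idea_emb, k=10):
    e = idea_emb / np.linalg.norm(idea_emb)
    sims = corpus @ e
    return 1.0 - np.sort(sims)[-k:].mean()

print(novelty(rng.normal(size=384)))
```

Scoring a cohort this way yields a distribution of novelty rather than a single quality number, which is what makes the bimodal shift in AI-mediated environments detectable.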
Graph-theoretic approaches offer simplicity, interpretability, and low computational cost for molecular property prediction. Among these, the model proposed by Mukwembi and Nyabadza, based on the external activity $D(G)$ and internal activity $\zeta(G)$ indices, achieved strong results on a small flavonoid dataset. However, its ability to generalize to larger and chemically diverse datasets has not been tested. This study evaluates the baseline $D(G)$-$\zeta(G)$ polynomial model on five benchmark datasets from MoleculeNet, covering biological activity (BACE, 1,513 molecules), lipophilicity (LogP synthetic, 14,610 molecules; LogP experimental, 753 molecules), aqueous solubility (ESOL, 1,128 molecules), and hydration free energy (SAMPL, 642 molecules). The baseline model achieves an average $R^2 = 0.24$, confirming limited transferability. To address this, a systematic enhancement framework is proposed, progressively incorporating Ridge regularization, additional graph descriptors, physicochemical properties, ensemble learning with Gradient Boosting, Lasso feature selection, and a hybrid approach combining topological indices with Morgan fingerprints. The enhanced models raise the average best $R^2$ to 0.79, with individual improvements ranging from 165\% to 274\%. All improvements are statistically significant ($p < 0.001$). A direct comparison with a Graph Convolutional Network under identical experimental conditions shows that the enhanced classical models match or outperform deep learning on all five datasets. Comparison with the recent GNN+PGM hybrid of Djagba et al. further confirms competitiveness, with the enhanced models achieving the best results on two datasets and tying on one. The entire framework requires no GPU, trains in under five minutes, and uses only open-source tools, making it accessible for researchers in resource-limited settings.
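To make the hybrid pipeline concrete, here is a minimal sketch in the same spirit: Morgan fingerprint bits plus a few cheap whole-graph descriptors feeding a regularized linear model. The SMILES strings, targets, and descriptor choices are illustrative stand-ins, not the paper's $D(G)$/$\zeta(G)$ indices or benchmark data.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem, Descriptors
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Hybrid featurization: Morgan fingerprint bits + simple graph-level descriptors.
smiles = ["CCO", "c1ccccc1O", "CC(=O)Oc1ccccc1C(=O)O",
          "CCN(CC)CC", "CCCCCC", "c1ccncc1"]
y = np.array([-0.24, 1.46, 1.19, 1.45, 3.00, 0.65])  # illustrative logP-like values

def featurize(smi):
    mol = Chem.MolFromSmiles(smi)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=512)
    extra = [mol.GetNumAtoms(), mol.GetNumBonds(), Descriptors.MolWt(mol)]
    return np.concatenate([np.array(list(fp), dtype=float), extra])

X = np.stack([featurize(s) for s in smiles])
print("CV R^2:", cross_val_score(Ridge(alpha=1.0), X, y, cv=3).mean())
```

Everything here runs on a CPU in seconds, which is the accessibility argument the paper makes against GPU-bound deep models.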
Biological systems are promising substrates for computation because they naturally process environmental information through complex internal dynamics. In this study, we investigate whether bacterial metabolic models can act as physical reservoirs and whether their computational performance can be predicted from dynamical properties linked to separability and similarity. We simulated the growth dynamics of five bacterial species, one yeast species, and 29 Escherichia coli single-gene deletion mutants using dynamic flux balance analysis (dFBA), with glucose and xylose concentrations as inputs and growth curves as reservoir states. Computational performance was assessed on random nonlinear classification tasks using a linear readout, while reservoir properties linked to separability and similarity were characterised through kernel and generalisation ranks computed from growth-curve state matrices. Several microbial models achieved high classification accuracy, showing that bacterial metabolic dynamics can support nonlinear computation. Clear differences were observed between species, with some models converging more rapidly and others reaching higher maximum accuracy, revealing a trade-off between convergence speed and peak performance. In contrast, all E. coli mutants were dominated by the wild-type model, suggesting that gene deletions reduce the dynamical richness required for efficient computation. The difference between kernel and generalisation ranks was generally associated with improved accuracy, but deviations across models and sensitivity at low rank values limited its predictive power in practice. Overall, these results show that bacterial metabolic models constitute promising substrates for reservoir computing and provide a first step towards identifying microbial strains with favourable computational properties for future experimental implementations.
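The two rank diagnostics can be computed directly from growth-curve state matrices; the sketch below uses random matrices as stand-ins for dFBA outputs and a coarse numerical-rank tolerance (both assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Rows: input conditions (e.g. glucose/xylose mixtures); columns: growth-curve
# time points. Kernel rank uses well-separated inputs (separability); the
# generalisation rank uses near-identical inputs (noise robustness).
def numeric_rank(X, tol=1e-2):  # coarse relative tolerance, an assumption
    s = np.linalg.svd(X, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

X_distinct = rng.normal(size=(40, 100))                          # 40 distinct inputs
X_similar = X_distinct[:1] + 0.01 * rng.normal(size=(40, 100))   # one input + noise

kernel_rank = numeric_rank(X_distinct)  # high: states separate distinct inputs
gen_rank = numeric_rank(X_similar)      # low: states ignore small perturbations
print(kernel_rank, gen_rank, "difference:", kernel_rank - gen_rank)
```

A large kernel-generalisation difference is the property the study tests as a predictor of task accuracy, with the caveat that it proved unreliable at low rank values.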
We address a short-wave asymptotic for a class of quasi-linear second-order PDE systems involving cross-diffusion described by the so-called Patlak--Keller--Segel law. Such equations are commonly employed for modelling a predator--prey community with prey-taxis, that is, an interaction of two species of particles, cells, or other agents in which the species called "predators" moves directionally while searching for the other species, called "prey." However, we suppose the predators to be sensitive not to the prey density itself but to a driving signal produced by the prey. Additionally, the production of the driving signal is assumed to be sensitive to the intensity of an external field, which is independent of the community state. This is what we call the external signal. It can be due to the spatiotemporal inhomogeneity of the environment arising from natural or artificial causes. We assume that the external signal takes a general short-wave form and construct a complete asymptotic expansion for the short-wave solutions with no restrictions on the spatial dimension or on the kinetics of inter/intraspecific reactions. Further, we apply the short-wave asymptotic to studying the stability or instability induced by the external signal, following Kapitza's theory for the upside-down pendulum. Applying the general results to some special classes of external signals, we obtain examples of suppression of taxis-driven transport, examples of robustness of the species equilibrium to the signal, and, conversely, examples of blurring of the borderline in parameter space between the regions of stability and instability of this equilibrium. These results help fill a gap in the literature, since the theory and techniques for the asymptotic integration of such systems remain a weakly charted area.
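For orientation, a minimal Patlak--Keller--Segel-type system of the class described (an illustrative form, not the paper's exact equations): $u$ is the predator density, $w$ the prey density, and $s$ the prey-produced driving signal whose production is modulated by the external field $E(x,t)$.

```latex
% Generic kinetics f, g; signal production h(w) modulated by the external
% short-wave signal E(x,t). Illustrative structure only.
\begin{align}
  \partial_t u &= \nabla\cdot\bigl(D_u \nabla u - \chi\, u\, \nabla s\bigr) + f(u, w),\\
  \partial_t w &= D_w \Delta w + g(u, w),\\
  \partial_t s &= D_s \Delta s + h(w)\, E(x, t) - \mu\, s .
\end{align}
```

The Kapitza analogy then asks whether a rapidly oscillating $E$ can, on average, stabilize an equilibrium that is unstable without it, just as fast vibration stabilizes the inverted pendulum.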
The sequentially Markov coalescent (SMC) is a Markov jump process which models correlations in local genealogies across a chromosome. It has been used as a theoretical tool for studying linkage disequilibrium and identity-by-descent, and it also forms the basis of a class of statistical procedures for estimating population history and inferring ancestry. In this paper, we study the rate at which SMC forgets its initial condition in the pairwise setting. For the embedded jump chain, we prove geometric ergodicity in total variation, with explicit constants. For the continuous process, by contrast, the total variation distance from stationarity decays as $\asymp 1/\ell$ in genetic distance $\ell$. We obtain analogous results for the closely related SMC' process using a novel time-change argument. One application of these results is to justify heuristic approximations used in the literature that treat distant loci as evolving independently.
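In schematic form (the paper supplies the explicit constants and precise statements), the two contrasting results are:

```latex
% Embedded jump chain: geometric ergodicity in total variation.
\[
  \bigl\| \delta_x P^n - \pi \bigr\|_{\mathrm{TV}} \;\le\; C(x)\,\rho^{\,n},
  \qquad \rho \in (0,1),
\]
% Continuous process: only polynomial decay in genetic distance ell
% (T_ell denotes the SMC state at distance ell; notation schematic).
\[
  \bigl\| \mathcal{L}(T_\ell) - \pi \bigr\|_{\mathrm{TV}} \;\asymp\; \frac{1}{\ell}
  \qquad \text{as } \ell \to \infty .
\]
```

The gap between geometric decay for the jump chain and $1/\ell$ decay for the continuous process reflects long holding times at large tree heights, and it is the $1/\ell$ rate that calibrates how far apart loci must be before treating them as independent is safe.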
Batch effects are a central problem in biomedical imaging: systematic technical variations unrelated to the biological signal of interest. These batch effects critically undermine experimental reproducibility and are the primary cause of failure of deep learning systems on new experimental batches, preventing their practical use in the real world. Despite years of research, no method has succeeded in closing this performance gap for deep learning models. We propose Control-Stabilized Adaptive Risk Minimization via Batch Normalization (CS-ARM-BN), a meta-learning adaptation method that exploits negative control samples. Such unperturbed reference images are present in every experimental batch by design and serve as stable context for adaptation. We validate our novel method on Mechanism-of-Action (MoA) classification, a crucial task for drug discovery, on the large-scale JUMP-CP dataset. The accuracy of standard ResNets drops from 0.939 $\pm$ 0.005 on the training domain to 0.862 $\pm$ 0.060 on data from new experimental batches. Foundation models, even after Typical Variation Normalization, fail to close this gap. We are the first to show that meta-learning approaches close the domain gap, achieving 0.935 $\pm$ 0.018. If the new experimental batches exhibit strong domain shifts, such as being generated in a different lab, meta-learning approaches can be stabilized with control samples, which are always available in biomedical experiments. Our work shows that batch effects in bioimaging data can be effectively neutralized through principled in-context adaptation, making deep learning models practically usable and efficient on new batches.
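The core trick can be sketched in a few lines: re-estimate BatchNorm statistics on the negative-control images of the new batch before inference. This is a minimal sketch of the idea only; CS-ARM-BN additionally meta-learns the adaptation, and the tiny network below is a placeholder.

```python
import torch
import torch.nn as nn

# Control-based adaptation sketch: recompute BatchNorm running statistics
# from a new batch's negative-control images before classifying.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1),
                      nn.BatchNorm2d(16), nn.ReLU())

def adapt_bn_to_controls(model, control_images):
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.reset_running_stats()
            m.momentum = None  # None = cumulative moving average
    model.train()              # BN updates running stats in train mode
    with torch.no_grad():
        model(control_images)  # one pass over controls re-centres the stats
    model.eval()

controls = torch.randn(32, 3, 64, 64)  # stand-in for unperturbed reference wells
adapt_bn_to_controls(model, controls)
```

Because controls are unperturbed by design, shifts in their feature statistics reflect the technical batch effect rather than biology, which is what makes them a safe adaptation signal.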
Oncolytic virotherapy, utilizing genetically modified viruses to combat cancer and trigger anti-cancer immune responses, has garnered significant attention in recent years. In our previous work arXiv:2305.12386, we developed a stochastic agent-based model elucidating the spatial dynamics of infected and uninfected cells within solid tumours. Building upon this foundation, we present a novel stochastic agent-based model to describe the intricate interplay between the virus and the immune system; the agents' dynamics are coupled with a balance equation for the concentration of the chemoattractant that guides the movement of immune cells. We formally derive the continuum limit of the model and carry out a systematic quantitative comparison between this system of PDEs and the individual-based model in two spatial dimensions. Furthermore, we describe the travelling waves of the three populations, with the uninfected proliferative cells trying to escape from the infected cells while immune cells infiltrate the tumour. Simulations show good agreement between the agent-based approaches and numerical results for the continuum model. Some parameter ranges give rise to oscillations of cell numbers in both models, in line with the behaviour of the corresponding nonspatial model, which presents Hopf bifurcations. Nevertheless, in some situations the behaviours of the two models may differ significantly, suggesting that stochasticity plays a key role in the dynamics. Our results highlight that an overly rapid immune response, arriving before the infection is well established, appears to decrease the efficacy of the therapy, so some care is needed when oncolytic virotherapy is combined with immunotherapy. This further suggests the importance of clinically improving the modulation of the immune response according to the tumour's characteristics and to the immune capabilities of the patients.
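The chemoattractant coupling named above, in a generic form (illustrative notation, not the paper's exact equations): the concentration $c$ obeys a balance equation sourced, as one plausible assumption, by the infected-cell density $\rho_{\mathrm{inf}}$, and immune-cell motion is biased up its gradient.

```latex
% Diffusion, production by infected cells, and decay; chi is the
% chemotactic sensitivity of the immune agents. Schematic form only.
\[
  \partial_t c \;=\; D_c\,\Delta c \;+\; \alpha\,\rho_{\mathrm{inf}}(x,t)
  \;-\; \lambda\, c,
  \qquad
  \text{immune-cell drift} \;\propto\; \chi\,\nabla c .
\]
```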
Segmenting cytoskeletal filaments in microscopy images is essential for studying their roles in cellular processes. However, this task is highly challenging due to the fine, densely packed, and intertwined nature of these structures. Imaging limitations further complicate analysis. While deep learning has advanced segmentation of large, well-defined biological structures, its performance often degrades under such adverse conditions. Additional challenges include obtaining precise annotations for curvilinear structures and managing severe class imbalance during training. We introduce a novel noise-adaptive attention mechanism that extends the Squeeze-and-Excitation (SE) module to dynamically adjust to varying noise levels. Integrated into a U-Net decoder with residual encoder blocks, this yields ASE_Res_UNet, a lightweight yet high-performance model. We also developed a synthetic dataset generation strategy that ensures accurate annotations of fine filaments in noisy images. We systematically evaluated loss functions and metrics to mitigate class imbalance, ensuring robust performance assessment. ASE_Res_UNet effectively segmented microtubules in noisy synthetic images, outperforming its ablated variants. It also demonstrated superior segmentation compared to models with alternative attention mechanisms or distinct architectures, while requiring fewer parameters, making it efficient for resource-constrained environments. Evaluation on a newly curated real microscopy dataset and a recently reannotated dataset highlighted ASE_Res_UNet's effectiveness in segmenting microtubules beyond synthetic images. For these datasets, ASE_Res_UNet was competitive with a recent synthetic-data-driven approach that provides two pretrained cytoskeleton models. Importantly, ASE_Res_UNet showed strong transferability to other curvilinear structures (blood vessels and nerves) across diverse imaging conditions.
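A plausible minimal form of such a module, extending Squeeze-and-Excitation with a noise-conditioned input (the per-image noise proxy, layer sizes, and reduction ratio are assumptions for illustration; the paper's exact design may differ):

```python
import torch
import torch.nn as nn

# Noise-adaptive SE sketch: channel attention conditioned on both pooled
# features and a crude per-image noise estimate, so gating can adapt as
# noise levels vary.
class NoiseAdaptiveSE(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels + 1, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        b, c, _, _ = x.shape
        squeeze = x.mean(dim=(2, 3))                  # global average pool: (b, c)
        noise = x.std(dim=(1, 2, 3)).unsqueeze(1)     # crude noise proxy: (b, 1)
        w = self.fc(torch.cat([squeeze, noise], dim=1))
        return x * w.view(b, c, 1, 1)                 # noise-conditioned gating

y = NoiseAdaptiveSE(32)(torch.randn(2, 32, 64, 64))
print(y.shape)
```

The extra scalar input is cheap, which is consistent with the paper's emphasis on a lightweight model for resource-constrained settings.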
The statistics of correlations are central quantities characterizing the collective dynamics of recurrent neural networks. We derive exact expressions for the statistics of correlations of nonlinear recurrent networks in the limit of a large number $N$ of neurons, including systematic $1/N$ corrections, in the regime of Gaussian quenched disorder. Our approach uses a path-integral representation of the network's stochastic dynamics, which reduces the description to a few collective variables and enables efficient computation. This generalizes previous results on linear networks to a wide family of nonlinear activation functions, which enter as interaction terms in the path integral. These interactions can resolve the instability of the linear theory and yield a strictly positive participation dimension. We present explicit results for power-law activations, revealing scaling behavior controlled by the network coupling. In addition, we introduce a class of activation functions based on Padé approximants and provide analytic predictions for their correlation statistics. Numerical simulations confirm our theoretical results with excellent agreement. We also compare with previous works that have studied the complementary case of annealed disorder, and on this basis we propose a new self-consistent equation for the more general case of colored noise.
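The participation dimension referred to above is the standard participation ratio of the equal-time correlation matrix (notation schematic):

```latex
% C is the equal-time correlation matrix of the network activity, with
% eigenvalues lambda_a; D_part counts the effective number of dimensions
% the dynamics explore, and the nonlinear interactions keep it strictly positive.
\[
  D_{\mathrm{part}}
  \;=\;
  \frac{\bigl(\operatorname{Tr} C\bigr)^{2}}{\operatorname{Tr}\!\bigl(C^{2}\bigr)}
  \;=\;
  \frac{\Bigl(\sum_a \lambda_a\Bigr)^{2}}{\sum_a \lambda_a^{2}},
  \qquad
  C_{ij} = \langle x_i\, x_j \rangle .
\]
```

If a single eigenvalue dominates, $D_{\mathrm{part}} \to 1$; for an isotropic spectrum it approaches $N$, so a vanishing value signals the linear instability the interactions cure.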
Foundation models (FMs) are driving a prominent shift in biomedical imaging from task-specific models to unified backbone models for diverse tasks. This opens an avenue to integrate imaging, pathology, clinical records, and genomics data into a composite system. However, this vision contrasts sharply with modern medicine's trajectory toward more granular sub-specialization. This tension, coupled with data scarcity, domain heterogeneity, and limited interpretability, creates a gap between benchmark success and real-world clinical value. We argue that the immediate role of FMs lies in augmenting, not replacing, clinical expertise. To separate hype from reality, we introduce REAL-FM (Real-world Evaluation and Assessment of Foundation Models), a multi-dimensional framework for assessing data, technical readiness, clinical value, workflow integration, and responsible AI. Using REAL-FM, we find that while FMs excel in pattern recognition, they fall short in causal reasoning, domain robustness, and safety. Clinical translation is hindered by scarce representative data for model training, unverified generalization beyond oversimplified benchmark settings, and a lack of prospective outcome-based validation. We further examine FM reasoning paradigms, including sequential logic, spatial understanding, and symbolic domain knowledge. We envision that the path forward lies not in a monolithic medical oracle, but in coordinated subspecialist AI systems that are transparent, safe, and clinically grounded.
Computer simulations of complex population genetic models are an essential tool for making sense of the large-scale datasets of multiple genome sequences from a single species that are becoming increasingly available. A widely used approach for reducing computing time is to simulate populations that are much smaller than the natural populations that they are intended to represent, by using parameters such as selection coefficients and mutation rates whose products with the population size correspond to those of the natural populations. This approach has come to be known as rescaling, and is justified by the theory of the genetics of finite populations. Recently, however, there have been criticisms of this practice, which have brought to light situations in which it can lead to erroneous conclusions. This paper reviews the theoretical basis for rescaling, and relates it to current practice in population genetics simulations. It shows that some population genetic statistics are scaleable while others are not. Additionally, it shows that there are likely to be problems with rescaling when simulating large chromosomal regions, due to the non-linear relation between the physical distance between a pair of separate nucleotide sites and the frequency of recombination between them. Other difficulties with rescaling can arise in connection with simulations of selection on complex traits, and with populations that reproduce partly by self-fertilization or asexual reproduction. A number of recommendations are made for good practice in relation to rescaling.
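The core invariance behind rescaling can be stated in a few lines: simulating $N/Q$ individuals while multiplying $s$, $\mu$, and $r$ by $Q$ leaves the scaled parameters $4Ns$, $4N\mu$, and $4Nr$ unchanged. A sketch (parameter values illustrative), including two of the caveats the review raises:

```python
# Rescaling: simulate N/Q individuals with s, mu, r multiplied by Q, so the
# products 4Ns, 4Nmu, 4Nr governing diffusion-limit behaviour are preserved.
N, s, mu, r = 1_000_000, 1e-5, 1e-8, 1e-8  # illustrative values
Q = 100
N_sim, s_sim, mu_sim, r_sim = N // Q, s * Q, mu * Q, r * Q

for name, full, scaled in [("4Ns", 4 * N * s, 4 * N_sim * s_sim),
                           ("4Nmu", 4 * N * mu, 4 * N_sim * mu_sim),
                           ("4Nr", 4 * N * r, 4 * N_sim * r_sim)]:
    print(name, full, scaled)  # identical in each case

# Caveats from the text: the rescaled s_sim = 1e-3 must still satisfy s << 1,
# and over large chromosomal regions the non-linear physical-distance /
# recombination relation means a single rescaled r can misrepresent linkage.
print("rescaled selection coefficient:", s_sim)
```

Statistics that depend only on such products scale cleanly; those that do not (the paper's non-scaleable statistics) are exactly where rescaled simulations can mislead.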
Automatic extraction of retinal vascular biomarkers from color fundus images (CFI) is crucial for large-scale studies of the retinal vasculature. We present VascX, an open-source Python toolbox that extracts biomarkers from CFI artery-vein segmentations. VascX starts from vessel segmentation masks, extracts their skeletons, builds undirected and directed vessel graphs, and resolves vessel segments into longer vessels. A comprehensive set of biomarkers is derived, including vascular density, central retinal equivalents (CREs), and tortuosity. Spatially localized biomarkers may be calculated over grids placed relative to the fovea and optic disc. VascX is released via GitHub and PyPI with comprehensive documentation and examples. Our test-retest reproducibility analysis on repeat imaging of the same eye by different devices shows that most VascX biomarkers have moderate to excellent agreement (ICC > 0.5), with important differences in the level of robustness of different biomarkers. Our analyses of biomarker sensitivity to image perturbations and heuristic parameter values support these differences and further characterize VascX biomarkers. Ultimately, VascX provides an explainable and easily modifiable feature-extraction toolbox that complements segmentation to produce reliable retinal vascular biomarkers. Our graph-based biomarker computation stages support reproducible, region-aware measurements suited for large-scale clinical and epidemiological research. By enabling easy extraction of existing biomarkers and rapid experimentation with new ones, VascX supports oculomics research. Its robustness and computational efficiency facilitate scalable deployment in large databases, while open-source distribution lowers barriers to adoption for ophthalmic researchers and clinicians.
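As one concrete biomarker, a common arc-over-chord tortuosity definition is sketched below (a generic formulation for illustration; VascX's exact tortuosity variants may differ):

```python
import numpy as np

# Arc/chord tortuosity of a vessel centreline: total arc length divided by
# the straight-line distance between its endpoints. Straight vessels ~ 1.0.
def arc_chord_tortuosity(points):
    points = np.asarray(points, dtype=float)  # (n, 2) centreline coordinates
    arc = np.sum(np.linalg.norm(np.diff(points, axis=0), axis=1))
    chord = np.linalg.norm(points[-1] - points[0])
    return arc / chord

t = np.linspace(0, np.pi, 50)
wavy_vessel = np.stack([t, 0.2 * np.sin(5 * t)], axis=1)  # synthetic centreline
print(arc_chord_tortuosity(wavy_vessel))  # > 1 for a wavy vessel
```

Computing such measures per resolved vessel, rather than per disconnected segment, is exactly why the graph-building and segment-resolution stages precede biomarker extraction.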
Absolute concentration robustness (ACR) means that the concentration of certain species stays the same in all steady states. In this work, we study how conservation laws might affect non-vacuous ACR in reaction networks. The goal is to show whether non-vacuous ACR can be preserved or precluded by adding species that depend on the existing species. We have the following two main results. (i) For networks with conservation laws, we prove a criterion: for a nondegenerate network, augmenting it with one new species that depends on the original species yields a network with no non-vacuous ACR for any generic choice of rate constants involving the new species. (ii) We characterize, according to the number of distinct rows in the stoichiometric matrix, all non-redundant zero-one networks of dimension at most two that exhibit non-vacuous ACR for a generic choice of rate constants. An important finding is that if there are at least four distinct rows in the stoichiometric matrix, then the corresponding network has no non-vacuous ACR for any generic choice of rate constants, which implies that many conservation laws prevent non-vacuous ACR in non-redundant zero-one reaction networks.
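A classical example helps fix ideas: in the Shinar--Feinberg network below, which carries the conservation law $a + b = \text{const}$, every positive steady state has the same concentration of $A$, so $A$ exhibits non-vacuous ACR.

```latex
% Shinar-Feinberg example of ACR under mass-action kinetics:
\[
  A + B \xrightarrow{\;k_1\;} 2B, \qquad B \xrightarrow{\;k_2\;} A,
\]
\[
  \dot a = -k_1 a b + k_2 b, \qquad \dot b = k_1 a b - k_2 b,
\]
% At any steady state with b > 0, the value of a is pinned regardless of
% the conserved total a + b:
\[
  \dot a = 0,\; b > 0 \;\Longrightarrow\; a = \frac{k_2}{k_1}.
\]
```

The results above ask when this kind of robustness survives, or is generically destroyed, as new conserved species are grafted onto such a network.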
Identifying spatially contiguous clusters and repeated spatial patterns (RSP), characterized by similar underlying distributions that are spatially far apart, is a key challenge in modern spatial statistics. Existing constrained clustering methods enforce spatial contiguity but are limited in their ability to identify RSP. We propose a novel nonparametric framework that addresses this limitation by combining constrained clustering with a post-clustering reassignment step based on the maximum mean discrepancy (MMD) statistic. We employ a block permutation strategy within each cluster that preserves local attribute structure when approximating the null distribution of the MMD. We also show that the MMD$^2$ statistic is asymptotically consistent under second-order stationarity and spatial mixing conditions. This two-stage approach enables the detection of clusters that are spatially distant yet similar in distribution. Through simulation studies that vary spatial dependence, cluster sizes, shapes, and multivariate dimensionality, we demonstrate the robustness of our proposed framework in detecting RSP. We further illustrate its applicability through an analysis of spatial proteomics data from patients with triple-negative breast cancer. Overall, our framework presents a methodological advancement in spatial clustering, offering a flexible and robust solution for spatial datasets that exhibit repeated patterns.
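The reassignment statistic can be sketched directly: the unbiased MMD$^2$ estimate with an RBF kernel, compared against a permutation null. For brevity the sketch permutes freely; the proposed framework instead permutes in spatial blocks within clusters to preserve local attribute structure. The kernel bandwidth and data here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unbiased MMD^2 with an RBF kernel between two candidate clusters X and Y.
def rbf(a, b, gamma=0.5):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd2_unbiased(X, Y):
    m, n = len(X), len(Y)
    Kxx, Kyy, Kxy = rbf(X, X), rbf(Y, Y), rbf(X, Y)
    return ((Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
            + (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
            - 2 * Kxy.mean())

X, Y = rng.normal(0, 1, (60, 3)), rng.normal(0.3, 1, (60, 3))
obs = mmd2_unbiased(X, Y)
Z = np.vstack([X, Y])
# Simple full-permutation null (the paper uses spatial block permutations).
null = [mmd2_unbiased(*np.split(rng.permutation(Z), 2)) for _ in range(200)]
print("MMD^2 =", obs, " perm p ~", np.mean(np.array(null) >= obs))
```

Two spatially distant clusters with a small, non-significant MMD$^2$ are merged into one repeated pattern; block permutation keeps the null honest under spatial dependence.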
Early detection of malignant lung nodules remains limited by reliance on size- and growth-based screening criteria, which can delay diagnosis. We present an integrated AI system that - unlike conventional CADe or CADx approaches - jointly performs nodule detection and malignancy assessment directly at the nodule level from low-dose CT scans within a unified decision-support framework. To address limitations in dataset scale and explainability, we designed an ensemble of shallow deep-learning models and feature-based specialized models, trained and evaluated on 25,709 scans with 69,449 annotated nodules, with external validation on an independent cohort. The system achieves an area under the receiver operating characteristic curve (AUC) of 0.98 internally and 0.945 on the independent cohort, outperforming radiologists and leading AI models (Sybil, Brock, Google, Kaggle). With a sensitivity of 99.3 percent at 0.5 false positives per scan, it addresses key barriers to AI adoption and demonstrates improved performance relative to both Lung-RADS size-based triage and European volume- and VDT-based screening criteria. The model outperforms radiologists across all nodule sizes and cancer stages - excelling in stage I cancers - and across all growth-based metrics, including volume-doubling time. It also diagnoses indeterminate and slow-growing nodules up to one year earlier than radiologists.