New articles on Quantitative Biology


[1] 2606.20767

COLD-CI: A large-scale very high-resolution label polygon dataset for cocoa and non-cocoa classification in Cote d'Ivoire

Spatially explicit information on cocoa cultivation is essential for land-use planning, deforestation monitoring, environmental assessment, and supply-chain analysis. Although several cocoa map products exist, their underlying reference data are often not publicly available, limiting transparency and methodological benchmarking. Here, we present a large-scale, very high-resolution cocoa and non-cocoa label polygon dataset for Cote d'Ivoire (COLD-CI), covering the main cocoa-producing regions as well as contrasting non-cocoa landscapes. COLD-CI consists of 123,736 vector polygons corresponding to a total labelled area of 5,996 km^2, including 58,107 cocoa polygons (1,788 km^2) and 65,629 background polygons (4,208 km^2). Polygon label candidates were first generated through conservative automated filtering of polygons from the West Africa Cocoa dataset and the combination of multiple external thematic datasets. These candidates were subsequently refined and complemented through systematic visual interpretation, manual correction, and digitisation using very high-resolution (0.5 m) satellite imagery. The resulting label polygons capture cocoa planted areas and associated fine-scale internal heterogeneity, as well as a wide range of non-cocoa land-cover types. Independent validation using field-based and expert photointerpreted reference data from the Copernicus4GEOGLAM validation dataset indicated an overall agreement of 99%, with producer's and user's accuracy exceeding 98% for both cocoa and background classes. COLD-CI is released as a vector dataset with associated metadata to support transparent benchmarking, model development, and validation across a wide range of spatial resolutions.


[2] 2606.20844

Relational Gaze Transitions During Encoding Predict Episodic Recall of Naturalistic Scenes

Remembering a visual scene requires organizing distinct details into a cohesive event. This study investigates whether relation-guided gaze transitions provide a behavioural marker of this cognitive organization during episodic encoding and retrieval. By applying scene graph annotations to eye-tracking data, we measured whether gaze moved between objects that were meaningfully related within complex scenes. This approach allowed us to quantify relational scanning within naturalistic environments, moving beyond prior methods that relied on simplified displays or isolated relation types. Participants showed above-chance relational gaze during both initial viewing and blank-screen retrieval, indicating that gaze actively tracks scene structure during first viewing and at recall. Additionally, relational scanning at encoding predicted subsequent free recall of both object and relational details, even after accounting for salience, fixation frequency, meaning, and image-level differences. In contrast, relational scanning at retrieval did not predict recall success, suggesting that relational gaze is most functional to memory during its formation. Together, these findings show that relational gaze can be measured in complex scenes and may serve as a marker of episodic encoding during natural visual exploration.


[3] 2606.21351

Surveying the adaptive landscapes of 10,000 antibodies

Affinity maturation is the Darwinian process by which antibodies improve antigen binding through somatic hypermutation and selection. The adaptive landscape, which defines the set of antibody-specific mutations that improve functional characteristics like antigen binding, has been explored in only a handful of antibodies. Identifying the sites of adaptive mutations in a given antibody sequence, and how these sites vary across the antibody repertoire, can inform the design of therapeutic antibodies. We develop a parameter-free population genetic framework that leverages the statistics of convergent affinity maturation in B cell lineages sharing similar naive sequences, called public clonotypes, to identify beneficial mutations. Applying this framework to more than 10,000 public clonotypes represented by multiple lineages across 20 healthy individuals, we identify widespread signatures of clonotype-dependent selection of individual mutations. We estimate the prevalence and typical fitness effects of mutations across the V gene at the single-site level, uncovering a general tradeoff between prevalence and fitness effect. These inferred landscapes broadly reproduce the statistics of convergent mutation in antibodies specific to SARS-CoV-2 and influenza. Finally, we use our framework to benchmark predictions from existing antibody language models, and show that while these models are dominated by non-selective signatures, a simple renormalization procedure can expose signatures of clonotype-dependent positive selection consistent with our predictions.


[4] 2606.21481

Mechanistic mathematical model of the in vitro infection dynamics of Bunyamwera and Batai viruses including MOI-dependent shortening of the eclipse phase

We develop a deterministic mathematical model to quantify the distinct in vitro infection dynamics of Bunyamwera virus (BUNV) and Batai virus (BATV) in A549 cells, incorporating cell division and natural death, continued entry of virions into already-infected cells, and shortening of the eclipse phase driven by re-infection. The model parameters were estimated making use of viral decay data, growth curves at two different inoculum concentrations, and extra-cellular genome copy measurements (for BUNV) via Markov chain Monte Carlo. Genome copy measurements were essential for constraining estimates of the number of cells that can become infected per unit of infectious virus for BUNV. We found that BUNV exhibited substantially longer eclipse and infectious periods than BATV, while BATV showed a higher per-cell virus production rate. Re-infection was predicted to shorten the eclipse phase for both viruses, but the effect was markedly stronger for BUNV. Together, these results provide a quantitative comparison of the in vitro viral kinetics of BUNV and BATV and reveal substantial differences in their replication dynamics.


[5] 2606.21508

Adaptive conduction delays and phase locking in spiking Haken Lighthouse networks

We develop a theory of phase-locked activity in delayed spiking networks using the Haken Lighthouse model as an analytically tractable event-based description of neural dynamics. For networks with fixed delays, we derive self-consistency conditions for phase-locked states and an associated linear stability theory formulated directly in terms of spike-time perturbations. The framework is illustrated for a delayed autapse, a reciprocally coupled two-cell network, and spatially structured rings with distance-dependent coupling and conduction delays, where circulant symmetry allows stability to be decomposed into Fourier modes. We then introduce an activity-dependent white matter plasticity rule in which myelination modulates axonal conduction speed and hence communication delay. This leads naturally to a slow--fast system with state-dependent delays, in which frozen phase-locked branches organise the adaptive dynamics. The plasticity rule selects commensurate delay--period relationships, providing a mechanism for the emergence of synchrony, other frequency-locked states, slow switching between competing phase-locked patterns, and the organisation of heterogeneous delays into discrete delay--period classes. Direct simulations of the event-driven network support the analytical predictions and illustrate how adaptive conduction can reshape the attractor structure of a delayed spiking network and generate long-timescale transitions. These results provide a tractable mathematical framework for studying how activity-dependent myelination may regulate temporal coordination, synchrony, and communication through coherence in spiking neural systems.


[6] 2606.21703

Delay coordinates synchronization and induces abrupt transition in excitable networks

Neuronal communication is inherently time-delayed, due to the finite speed of signal propagation. Although often considered challenging or disruptive, such time delays can also endow neural circuits with useful capabilities. Here, we show that delays in excitatory connections between excitable neurons coordinate their synchronization patterns by creating self-sustained oscillations that may be out-of-phase or in-phase. The emergence of these oscillations leads to an abrupt, explosive, transition to in-phase synchronized regimes due to small changes in connection strength or time-delay. We describe the mechanism underlying these phenomena as an interaction between the neuron's excitable dynamics and the delay in signal transmission, explaining many aspects of how the oscillations emerge. We show this phenomenon in different network connectivities, neuronal models, with and without excitation, with and without noise, highlighting the generality of the mechanism.


[7] 2606.21785

Mostly-monocular responses and other visual functions in a multiscale network model of Macaque V1

Visual signals from the two eyes merge gradually as they pass through the primary visual cortex (V1). Here we use a computational model of Macaque V1 to study the first stage of this integration along the magnocellular pathway, in layer 4C$\alpha$, aiming to infer neuroanatomical origins of binocular response. It is known that neurons in layer 4C$\alpha$ are predominantly monocular, though some do exhibit varying degrees of binocularity. We find (1) the emergence of narrow binocular strips along borders of ocular dominance columns (ODC), a finding that aligns with experiments; (2) that the data are most consistent with $10-30\%$ of interactions near ODC boundaries being cross-columnar; and (3) that feedback from layer 6 is largely monocular. These results were obtained through systematic hypothesis testing using a multiscale model that is orders of magnitude faster than its biologically-detailed predecessors. We propose that multiscale modeling can be an effective tool for bridging anatomy and function.


[8] 2606.21818

Dynamic Computerized Tumbling-E Testing for Temporal Reliability of Human Sequential Perceptual Decisions

OBJECTIVES: Visual acuity and tumbling-E tasks are often treated as static threshold measures, yet sequential perceptual decisions unfold over time. A computerized tumbling-E task preserves response latency, timeouts, and stimulus-size adaptation, creating a temporal reliability dataset rather than only a chart-line score. This matters for human-AI comparison because the Temporal Hallucination Index (THI) shows how static accuracy can obscure delays, drift, persistence, and unstable convergence. METHODS: We curated trial-level human data from a computerized dynamic tumbling-E task. On each trial, a single E optotype appeared in one of four orientations, participants selected the perceived direction or timed out, and stimulus size was automatically adjusted through an adaptive staircase. Primary outcomes were reaction time, timeout rate, delay rate above a 3-second budget, and observable THI based on delay and timeout components. RESULTS: The final dataset included 1,154 valid trials from 21 human identifiers across 77 sessions. There were 1,078 non-timeout responses and 76 timeouts, giving a 6.6% timeout rate. Non-timeout reaction times centered near 1.5 seconds (mean 1546 ms; median 1506 ms; IQR 1306-1713 ms), with only 3 responses exceeding 3,000 ms. Adaptation was dominated by smaller-next-stimulus transitions (89.2%). Mean arcminutes declined from 29.42 at trial 0 to 5.04 at trial 19, supporting convergence near a 20/20-level optotype without clinical acuity diagnosis. CONCLUSIONS: This dataset converts a tumbling-E visual task into a temporally resolved human perceptual-decision benchmark. Its novel contribution is automatic capture of staircase behavior, response timing, timeouts, and trial-level reliability signals. The human data show fast timing and smooth adaptation toward threshold, establishing a human-only baseline for future comparison with artificial agents.
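The headline rates in this abstract can be re-derived from the reported counts. The short Python sketch below does that arithmetic. The variable names are illustrative, and the share of slow responses among non-timeout trials is a derived quantity that the abstract does not state.

```python
# Counts as reported in the abstract.
valid_trials = 1154
non_timeout = 1078
timeouts = 76
slow_responses = 3  # non-timeout responses exceeding 3,000 ms

# The two response categories should add up to the valid-trial total.
assert non_timeout + timeouts == valid_trials

# Timeout rate over all valid trials (reported as 6.6%).
timeout_rate = timeouts / valid_trials

# Share of non-timeout responses above the 3-second budget.
# Assumption: the denominator is non-timeout responses only.
slow_share = slow_responses / non_timeout

print(f"timeout rate: {timeout_rate:.1%}")
print(f"slow share among responses: {slow_share:.2%}")
```

Under these counts the timeout rate rounds to the reported 6.6%, and fewer than 1% of completed responses exceed the 3-second budget.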


[9] 2606.21987

Fabrication Of Bilayer Nanofibers From Poly Xylitol Dodecanedioic Acid, Poly Caprolactone, Gelatin Biological Macromolecules And Surface Modification Via Spin Coating Of Capsules For Skin Wound Treatment

In this study, poly xylitol dodecanedioic acid (PXDDA) was synthesized from xylitol and dodecanedioic acid monomers in a 1:1 molar ratio, with Mw = 4038 g/mol, using the polycondensation method. Bilayer nanofibers with optimal morphology and an average diameter of 271 ± 70 nm were fabricated by electrospinning at a voltage of 15 kV and a flow rate of 0.5 ml/h, incorporating 15% PXDDA, 15% gelatin, and 20% poly caprolactone (PCL) polymer solutions. This nanofiber diameter agreed closely with the genetic algorithm in the ANN model, with a cost value of 0.0054 and R = 0.99, rather than with the RSM model. To enhance the stability of the electrospun nanofibers, glutaraldehyde was employed as a crosslinking agent. Additionally, nitrogen-doped activated carbon nanoparticles served as carriers for clindamycin, intended for spin coating between the layers of PXDDA/Gel/PCL nanofibers. Results indicated that increasing the capsule concentration raised the contact angle of the bilayer nanofibers, while the swelling percentage increased progressively over 12 hours at the optimal CD concentration. Furthermore, drug release followed Higuchi kinetics, exhibiting high correlation values for the best-fit model. Biodegradability, antibacterial efficacy, cell culture assessments, and the MTT assay demonstrated that optimizing the drug concentration improved cell attachment and viability compared to the control sample.


[10] 2606.22001

Evolutionary Entropy Shapes Reproductive Lifespan in Age-Structured Populations

Evolutionary entropy measures the temporal organization of reproductive contributions along the life cycle of an age-structured population. We develop a mathematical and empirical framework showing that, in iteroparous animal populations represented by Leslie-type demographic matrices, reproductive windows are frequently organized near the age classes selected by entropy maximization. Evolutionary entropy complements the classical net reproductive number and asymptotic growth rate: whereas these measure lifetime replacement and growth, entropy measures the temporal dispersion of the growth-adjusted reproductive distribution. Our central result is a reduction principle: under Euler--Lotka normalization, evolutionary entropy and generation time are invariant under multiplicative rescaling of survivorship and fertility on the reproductive interval. The relevant entropy is determined not by absolute survivorship, fertility, or juvenile mortality, but by the normalized post-maturity reproductive distribution. We derive explicit entropy functionals for finite and open-group Leslie models, including geometric reproductive tails. For the geometric regime, governed by we prove a sharp critical threshold separating populations with a unique finite entropy-maximizing endpoint from those whose entropy increases toward an asymptotic value in terms solely of the age at first reproduction. The theory is tested on 130 animal species. Entropy-derived predictions, computed from the demographic matrices alone, are compared with independent life-history variables. Predicted and observed reproductive medians coincide exactly for a majority of species, over 90% are predicted within three reproductive classes, and associations remain strong after phylogenetic correction. These results identify a quantitative regularity across taxa, with geometric reproductive distributions playing a central role.
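As a rough illustration of the quantities named in this abstract, the Python sketch below computes a growth-adjusted reproductive distribution, its Shannon entropy, and a generation time from age-specific net maternity values. It assumes the commonly used definitions p_j = l_j m_j λ^(-j), S = -Σ p_j log p_j, and T = Σ j p_j, with λ solving the Euler-Lotka equation Σ p_j = 1. The abstract does not give the paper's exact functionals, so the function names, the bisection bracket, and the toy input are all assumptions.

```python
import math

def euler_lotka_lambda(ages, net_maternity, lo=1e-3, hi=1e3, iters=200):
    """Solve sum_j phi_j * lam**(-a_j) = 1 for lam by bisection on a log scale.

    Assumes the root lies in [lo, hi]; the left-hand side is strictly
    decreasing in lam when at least one phi_j is positive.
    """
    def excess(lam):
        return sum(phi * lam ** (-a) for a, phi in zip(ages, net_maternity)) - 1.0

    for _ in range(iters):
        mid = math.sqrt(lo * hi)   # geometric midpoint
        if excess(mid) > 0:        # sum too large, so lam is too small
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def reproductive_summary(ages, net_maternity):
    """Return (lam, p, S, T) for net maternity values phi_j = l_j * m_j."""
    lam = euler_lotka_lambda(ages, net_maternity)
    p = [phi * lam ** (-a) for a, phi in zip(ages, net_maternity)]
    S = -sum(q * math.log(q) for q in p if q > 0)   # entropy of p
    T = sum(a * q for a, q in zip(ages, p))          # mean age of reproduction
    return lam, p, S, T

# Hypothetical two-age-class example: equal net maternity at ages 1 and 2.
lam, p, S, T = reproductive_summary([1, 2], [0.5, 0.5])
print(lam, p, S, T)
```

In this toy case the net maternity values already sum to one, so λ is 1, the distribution is uniform over the two ages, S equals log 2, and T equals 1.5. Some formulations report the ratio S/T rather than S; which normalization the paper uses is not stated here.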


[11] 2606.22279

Inferring and Predicting Clade-Level Relative Transmission Fitness in Seasonal Influenza A Using Differential Population Growth Rate and Deep Learning

Seasonal influenza A evolves rapidly, allowing newly emerged clades to replace previously dominant lineages and complicate surveillance and vaccine evaluation. Here, we applied the Differential Population Growth Rate (DPGR) framework to GISAID-derived H3N2 and H1N1 surveillance data collected from 1 January 2014 to 12 February 2026, including the 2025-2026 influenza season, to estimate clade-level relative transmission fitness across continents and within the United States. We identified windows of co-circulation with sliding-window regression, reconstructed relative-fitness relationships among clades, and compared inferred growth advantages with independent WHO and CDC surveillance patterns. We further trained subtype-specific convolutional neural networks on complete viral genomes to predict DPGR from sequence, quantified predictive uncertainty with conformal prediction, and used SHAP to localize genomic contributors to fitness. DPGR recovered recurrent lineage turnover in both subtypes and consistently identified the emerging H3N2 subclade K as fitter than the 2025-2026 vaccine-lineage background across multiple regions. Genome-based models predicted DPGR accurately for H3N2 ($R^2 = 0.9577$) and H1N1 ($R^2 = 0.9871$), while interpretation highlighted known haemagglutinin antigenic sites together with contributions from internal genes. These results support DPGR as an interpretable surveillance signal and show that influenza fitness can be linked to genomic prediction and biological interpretation in a unified framework.


[12] 2606.22318

PROMPT: A Pre-registered Randomized Protocol for Component-Level Evaluation of Clinical AI Prompts

BACKGROUND: Prompt engineering shapes medical AI outcomes, but prompt components are rarely tested as clinical interventions. We developed PROMPT (Pre-registered Randomized Outcome Measurement for Prompt Testing), a protocol using pre-specification, randomization, matched controls, dismantling, and decision [...]. METHODS: Two pre-registered demonstrations used Claude Sonnet 4.6. Exp 1 used a synthetic tumbling-E orientation task: 630-trial main study, 480-trial dismantling study, and 1,050-trial 2x2 factorial extension. Exp 2 used the same arms on 16 label-masked CBIS-DDSM mammographic crops in four orientations: 256 confirmatory trials and a 64-trial Arm E extension. Matched controls removed the active component while preserving framing, structure, and output [...]. RESULTS: PROMPT identified beneficial, inactive, harmful, and task-dependent effects. In Exp 1, the full prompt achieved 98.6% orientation accuracy; removing the decoding rule reduced accuracy to 50.1% (difference, +48.5 pp; 95% CI, +43.2 to +53.7; P<0.001). A rule-only arm matched the full prompt (maximum difference, 2.3 pp), identifying the decoding rule as the sole measurable active component. A prohibited-reasoning block assumed to improve safety was inactive, an effect missed by whole-prompt comparison. Scaffolding without the task-specific rule underperformed the vehicle prompt, showing prompt structure alone was harmful. Exp 1 revealed a canonical-RIGHT error phenotype in no-rule arms, consistent with a RIGHT-orientation prior. In Exp 2, the phenotype recurred on mammographic images, but the rule's benefit was attenuated and did not meet the threshold (+14.1 pp; bootstrap 95% CI, -3.1 to +29.7; post-hoc mixed-model 95% CI, +3.5 to +24.6). CONCLUSION: PROMPT revealed component effects missed by whole-prompt evaluations, identifying safety vulnerabilities and performance failures before clinical AI deployment.


[13] 2606.22448

Kinetics of template-directed multistate copolymerization

We consider processes of template-directed multistate copolymerization by molecular machines such as polymerases or ribosomes, having multiple states of conformation or activation. We show that the kinetic equations of these processes can be exactly solved for the mean growth velocity, the sequence probabilities of the grown copy, and the local probabilities and fractions of monomeric units in the copy. Asymptotically, in the long-time limit, the kinetic equations are solved with a matrix factorization ansatz in terms of a backward iteration, forming an iterated matrix function system, and a complementary forward iteration, both running along the template sequence. The iterative method is very significantly faster than usual computational methods, as demonstrated with a numerical example.


[14] 2606.22452

Kinetics of multistate DNA polymerases

In the present paper, we apply the iterative mathematical method previously developed for the kinetics of template-directed multistate copolymerization to the kinetics of DNA replication by polymerases having multiple structural states. In particular, we study a two-state kinetic model for the T7 DNA polymerase. We obtain the mean velocity for the growth of the copy along the template, the error probability of DNA replication by the polymerase, and the local probabilities of base-pair formation along the template sequence. Furthermore, we show that the iterative method is more than a million times faster than usual numerical simulation methods. Results are also obtained in the approximation of homogenization of template heterogeneities.


[15] 2606.22561

quaint: An R Package for detecting introgression across a phylogeny using discordant gene tree topologies

Premise: Hybrid speciation and introgressive hybridization are increasingly recognized as important evolutionary phenomena across the tree of life. One widely used class of methods to detect introgression includes D statistics and related methods which employ the ABBA-BABA test using nucleotide site patterns. Recent studies have applied this theoretical framework to phylogenomic datasets using gene tree topologies instead, but no software packages using this method have been developed. Methods and Results: An R package was developed to facilitate the inference of introgression given a set of gene trees and a species tree. Using an ABBA-BABA framework, this package summarizes patterns of gene tree discordance to infer introgression across large phylogenies. Conclusions: Using gene tree topologies, quaint overcomes the limitations of site-based methods, enabling the detection of introgression across broad phylogenomic contexts. This R package provides an accessible and reproducible tool for researchers investigating reticulate evolution.
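The ABBA-BABA idea applied to gene tree topologies can be sketched in a few lines of Python. For a species tree ((P1,P2),P3) with an outgroup, the two discordant topologies are counted and compared, giving D = (n_ABBA - n_BABA) / (n_ABBA + n_BABA). This is a generic illustration, not the quaint package's interface: the function name, the topology labels, and the gene tree counts are hypothetical, and the z-score uses a simple normal approximation to an equal-frequency binomial null.

```python
import math
from collections import Counter

def d_from_topologies(topologies):
    """Compute a topology-based D statistic and an approximate z-score.

    Each entry labels one gene tree relative to the species tree
    ((P1,P2),P3),O:
      "ABBA" for gene trees grouping P2 with P3,
      "BABA" for gene trees grouping P1 with P3,
      anything else (e.g. "concordant") is ignored.
    """
    counts = Counter(topologies)
    n_abba, n_baba = counts["ABBA"], counts["BABA"]
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("no discordant gene trees to compare")
    d = (n_abba - n_baba) / total
    # Under the null of equal discordant frequencies, n_abba is roughly
    # Binomial(total, 0.5), which gives this normal-approximation z-score.
    z = (n_abba - n_baba) / math.sqrt(total)
    return d, z

# Hypothetical counts: 300 concordant, 60 ABBA-like, 40 BABA-like gene trees.
gene_trees = ["concordant"] * 300 + ["ABBA"] * 60 + ["BABA"] * 40
d, z = d_from_topologies(gene_trees)
print(d, z)
```

A D value near zero is what incomplete lineage sorting alone would be expected to produce, while a clear excess of one discordant topology is the pattern usually read as a signal of introgression.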


[16] 2606.22582

Upstream reciprocity versus downstream reciprocity: Catalyzing cooperation

Why would anyone help a stranger, knowing they may never meet again? Indirect reciprocity offers one of the most compelling evolutionary answers, yet its two canonical forms -- upstream reciprocity (experience-based), and downstream reciprocity (reputation-based) -- have been studied mostly in isolation. Their joint dynamics in finite and structured populations remain largely unexplored. Here, we fill this gap using agent-based simulations in which an agent is behaviourally either defector, upstream reciprocator, or downstream reciprocator, and the agents' population state is temporally updated using different evolutionary update mechanisms. We show that update mechanism plays a surprisingly decisive role in shaping the fate of downstream and especially upstream reciprocators. Whether agents' experiences and reputations are updated globally or locally can shift outcomes from rich behavioural coexistence to the dominance of downstream reciprocators alone. Intriguingly, we uncover a robust structural feature that persists across all the explored update rules and population sizes: an optimal network degree at which upstream reciprocity is maximized, reflecting a fundamental tug-of-war between cooperative clustering and exposure to defectors. Our results highlight that while downstream reciprocity can either foster or inhibit upstream reciprocity depending on the update mechanism, its net effect on cooperation remains largely positive.


[17] 2606.22695

SPIDER -- Stitched Power-spectra for Inferring Directed information flow from incomplete and asynchronous Experimental Recordings

Mapping the directed flow of information between brain regions -- their effective connectivity -- is central to understanding brain function, yet large-scale recordings sample only a fraction of the brain at a time: sessions, animals, and laboratories cover different, partially overlapping regions, usually without a shared temporal reference. Established directed-connectivity methods (Granger causality, dynamic causal modeling, partial directed coherence, PDC) require all regions to be recorded simultaneously and with a common clock. We introduce SPIDER, a non-parametric, frequency-domain framework that recovers directed information flow from such incomplete, asynchronous recordings: it stitches local power-spectral estimates from overlapping channel subsets into a global spectral matrix and obtains frequency-resolved directed interactions by canonical spectral factorization and PDC, without temporal alignment, while nuclear-norm completion fills in never-co-observed region pairs. With consistency guarantees, we validate SPIDER on simulations, two-photon calcium imaging, and the International Brain Laboratory Neuropixels dataset, recovering directed flow among 50 areas from 43 sessions in 12 laboratories never recorded together. Beyond validation, SPIDER reveals what no single recording can: brain-wide spontaneous flow is largely recurrent, but in the theta band it forms a significant feedforward hierarchy with the hippocampal formation at its source. Applied to resting human intracranial EEG (43 patients, non-overlapping coverage), it recovers the same theta-band hierarchy across species and modality. SPIDER makes whole-brain effective-connectivity analysis tractable for multi-session, multi-animal datasets previously incompatible with directed-flow inference.


[18] 2606.22949

When do correlations reflect biological similarity in ecological dynamics?

The structure of competitive ecological communities is shaped by the strength of interactions between species, which in turn reflects their biological similarity. At the same time, the stochastic forcing that drives abundance fluctuations is itself biologically grounded: species that are more similar may be expected to respond more similarly to environmental variation. This motivates the increasingly common use of correlations in abundance time series, particularly in microbial communities, as proxies for biological similarity or niche overlap. Here we analyze the relation between biological similarity and abundance correlations in stochastic community models. We require that the stochastic forcing acting on different species be correlated in proportion to their biological similarity, and ask how such forcing is reflected in abundance correlations. We show that this requirement cannot, in general, be satisfied within the widely used stochastic Lotka-Volterra framework, and that even when it is, abundance correlations carry no information about niche overlap. In contrast, consumer-resource models provide a natural framework for biologically grounded stochasticity. In this setting, however, the interpretation of abundance correlations depends strongly on the pathway through which noise enters the system: direct forcing of consumers and resource-mediated fluctuations encode different biological quantities. These results have implications both for the modeling of stochastic ecological communities and for understanding what can, and cannot, be inferred from correlations in community time series.


[19] 2606.23066

Estimating common synaptic inputs to spinal motor neurons from motor unit spike trains using openhdemg

Common synaptic input is considered a fundamental principle of motor neuron control and represents the dominant component of the neural drive transmitted from the motor neurons to muscle. Recent advances in High-Density surface Electromyography (HDsEMG) and motor unit (MU) decomposition algorithms have enabled the concurrent identification of increasingly large populations of MUs and substantially expanded the possibility of estimating common synaptic input from MU spike trains, making this approach widely used to investigate the neural control of movement in humans. However, multiple analytical approaches are currently available, each relying on different physiological assumptions, mathematical formulations, and parameter choices. The lack of practical guidelines and open-source implementations has also limited the accessibility and reproducibility of these analyses. In this tutorial, we provide a practical, physiologically grounded guide to estimating common synaptic input from populations of MU spike trains using openhdemg, an open-source Python framework. We organize the available methods into three complementary categories: time-domain approaches applied to smoothed discharge rates, frequency-domain approaches based on coherence between cumulative spike trains, and a network-information approach based on nonlinear pairwise dependencies and graph theory. For each method, we describe its physiological interpretation, step-by-step estimation, and systematically examine how key parameter choices influence the resulting estimates, providing practical recommendations for their selection. Finally, we present a complete workflow from HDsEMG decomposition and MU cleaning to common synaptic input estimation, demonstrating that decomposition quality directly affects these estimates.


[20] 2606.23114

Circadian output network can buffer period variability

Circadian rhythms are biological oscillations that govern 24-hour physiological and behavioral processes across most organisms. Recent bioimaging studies have revealed that even individual cells can exhibit circadian rhythms. The period of cellular oscillations can fluctuate due to molecular noise in the circadian clock machinery. Whether regulatory networks downstream of the clock amplify or attenuate clock-derived period fluctuations remains poorly understood. In this study, we numerically observed period variability in a self-sustained oscillator coupled to an output network. Our numerical calculations demonstrated that a serial pathway does not merely relay timing signals but actively shapes rhythmic reliability, reducing period variability. The extent of this reduction depended on parameters of both the clock and output systems. For more complex output networks, the shortest-path length from the core oscillator was a major determinant of increased oscillation precision. This noise-buffering effect saturated in long cascades. These results suggest the existence of an intrinsic precision-enhancing mechanism embedded within circadian output networks.


[21] 2606.23148

Bayesian modelling of herd-level infection dynamics in cattle: Local spread as the primary driver of Salmonella Dublin persistence on Öland

Salmonella Dublin (S. Dublin), a zoonotic serotype adapted to cattle, causes animal welfare issues and economic losses. The disease has proven particularly challenging to control in Öland, Sweden. This study uses Bayesian simulation-based inference of bulk tank milk sample results to analyse the S. Dublin infection dynamics in Öland cattle. The infection process was formulated as a dynamic state-space model and particle Markov-chain Monte Carlo methods were applied to infer the underlying infection dynamics and estimate the basic reproduction number ($R_0$) as well as the effective reproduction number ($R_t$). These metrics provide insight into transmission dynamics, enabling assessment of the effectiveness of the current S. Dublin control in Swedish cattle and identification of interventions that may reduce the prevalence. The results show that most holdings on Öland have $R_0 < 1$, indicating that infection is expected to die out after introduction. However, in a subset of holdings $R_0 > 1$, and there the risk for spread of S. Dublin is higher. Furthermore, the analysis reveals that on average, $R_t \approx 1$, suggesting a stable endemic presence unless effective interventions are implemented. In addition, the results show that it is insufficient to restrict the movements of infected cattle on Öland to bring $R_t < 1$, as local spread and within-herd transmission contribute equally to the force of infection (approximately 50% each). These findings demonstrate how Bayesian data-driven analysis can support evidence-based decision making for the control and eradication of S. Dublin in cattle.


[22] 2606.23245

Predator-associated cues promote host riding, and coupling to mobile hosts improves survival in an epizoic limpet

Epizoic limpets may reduce predation risk by riding mobile gastropod hosts, but the behavioral steps leading to host attachment and the benefits of attachment remain poorly understood. We examined these issues in the epizoic limpet $\textit{Lottia tenuisculpta}$ using a host-riding assay, paired-trajectory analyses of pre-riding movement, a survival assay, and mucus-conditioning assays. In the host-riding experiment, involving 156 limpets in 39 chambers, crab-associated cues increased attachment within the observation window from 19 of 80 individuals in the cue-absent treatment to 42 of 76 individuals in the cue-present treatment. In the same assay, the high-frequency tail of the locomotor amplitude spectrum became shallower under cue-present conditions, with the posterior median slope shifting from -0.353 to -0.287. Direct analysis of visible paired host-limpet trajectories further showed stronger distance closure under crab-associated cues. Distance closure was quantified over the final visible five minutes before riding or before the final visible paired frame. In the survival assay, based on 31 valid trials, the fitted model indicated lower survival after attachment on fixed hosts than on mobile hosts: the posterior median hazard ratio for fixed versus mobile hosts was 2.111, and posterior median survival at the end of the observation window was 0.437 on mobile hosts but 0.175 on fixed hosts. In a separate single-limpet locomotion assay, gastropod mucus-conditioned surfaces yielded narrower final cumulative-distance ranges than the no-mucus control. Together, these results indicate that predator-associated cues promote host riding, visible paired trajectories reveal a pre-riding approach component, coupling to mobile hosts improves survival, and host-associated surface cues may narrow solitary-limpet movement.


[23] 2606.23325

The adaptive nature of confirmation bias

In this paper, the phenomenon generally classified as confirmation bias is formulated on the space of square-root probabilities (or equivalently, using the structures of quantum probability). In this framework, observations are modelled by matrices, rather than random variables on a probability space. In the problem of binary hypothesis testing, an optimal evidence choice minimises the expected error probability. We show that the resulting optimal choice of evidence leads to a confirmation bias, thus revealing a surprising aspect of rationality that encompasses confirmation bias. Specifically, in sequential evidence sampling, the implicit optimality leads to two remarkable evolutionary advantages, namely, (a) the decision maker requires only the smallest memory capacity, and (b) the error probability can be reduced exponentially in sample size. A complementary approach based on the framework of active inference -- where the decision maker seeks evidence that provides maximum information -- is then considered. The resulting optimal evidence is shown to agree with the one obtained by minimising error probability. Our framework provides an easy-to-implement protocol for an active quantum inference, whereby the optimal evidence choice for making an inference is sought over the space of matrices.


[24] 2606.23470

From Lab to Landscape: Assessing the Impact of Pesticides on Pollinator Populations Based on Laboratory Data by Combining ALMaSS and BufferGUTS

Pesticides are designed to eradicate pests from crops, fulfilling an important role in the current agricultural system. However, nature conservation requires that pesticide applications are protective for non-target organisms, which in turn provide ecosystem services. Environmental risk assessment (ERA) is supposed to strike this balance, but the current use of laboratory-derived toxicity thresholds in the landscape context, without consideration of population and landscape dynamics, might be too coarse to achieve this task. Here, we propose to overcome this limitation by coupling the Animal, Landscape, and Man Simulation System (ALMaSS) with the BufferGUTS model for non-target arthropods. We conducted a case study of the solitary bee Osmia bicornis exposed to the pesticide formulation Closer (a.i. sulfoxaflor) to assess the integration. Laboratory survival data of topical and oral exposure to Closer were used to calibrate BufferGUTS models. The resulting parameters were used to parametrise model organisms in ALMaSS simulations to extrapolate the effects of sulfoxaflor at different exposure levels on population dynamics. The integration of BufferGUTS into ALMaSS landscape simulation was achieved with high numerical precision, allowing for the calculation of daily survival probabilities for model organisms in the ALMaSS framework. We found that even extreme application rates only led to negligible population effects in ALMaSS simulations, but an exploratory analysis of pesticide-driven larval mortality showed that effects might be more severe when all life stages are considered. The work demonstrates how mechanistic modelling embedded into individual-based modelling frameworks can support ERA by combining exposure and effect in systems-based ERA tools, bridging the gap between controlled laboratory experiments and realistic landscape-scale risk assessments for next-generation ERA.


[25] 2606.20733

Dissecting emerging slow rhythms in delay-coupled neural oscillators

Synaptic transmission delays are ubiquitous in neural circuits and can alter the dynamical repertoire of coupled oscillators quantitatively and qualitatively. Here, we demonstrate that delayed coupling in inhibitory networks introduces an effective slow-fast structure in the phase-difference dynamics, generating low-frequency components that are not due to intrinsic cellular properties, and we show that this behavior is not specific to a particular model structure. The origin of this generic phenomenon is analyzed by numerical continuation and bifurcation analysis, which provides a systematic approach to find such delay-induced slow modulating rhythms. We employ phase reduction based on phase response curves to derive a phase-difference model with delay for mutually inhibitory coupled oscillators, where the individual units are given by the FitzHugh-Nagumo model, the Morris-Lecar model, or a next-generation neural mass model derived from quadratic integrate-and-fire neurons. We use phase planes to study multistability and limit cycles, which correspond to slow modulation of fast oscillations in the full model. Treating the synaptic delay as a bifurcation parameter, we apply numerical continuation to construct delay-dependent bifurcation diagrams. The analysis reveals Hopf, heteroclinic, and saddle-node-of-periodics bifurcations that cause and organize slow rhythmic behavior. Our analysis provides a systematic approach to the search for limit cycles in phase-reduction models corresponding to delay-induced slow rhythms in the original model.


[26] 2606.20765

Dataset-Aware Cold-Start Active Learning for Annotation-Efficient 3D Medical Image Segmentation

Deep learning for 3D medical image segmentation requires extensive manual annotations, a major bottleneck in volumetric medical imaging. Active learning aims to reduce this burden by selecting informative samples for annotation, but most methods assume that an initial labeled set is already available. This leaves the cold-start problem largely unresolved: how to select the first volumes from a fully unlabeled pool before any task-specific model is trained. We propose CSCS, a Curriculum-Stratified Cold-Start framework that adapts initial sample selection to the structure of the unlabeled dataset. CSCS combines two self-supervised, label-free signals: local typicality, measuring representativeness in the embedding space, and reconstruction-based uncertainty, used as a proxy for sample difficulty. These signals are combined through a weighted geometric score, where the weighting is determined by a closed-form pacing rule based on the effective annotation budget and the Difficulty-Coverage Ratio, a pool-level statistic measuring the alignment between difficulty and representativeness. We evaluate CSCS on four 3D medical image segmentation benchmarks: BraTS, FeTA, Spleen, and an in-house fetal MRI dataset. Using nnU-Net as downstream segmentation model, CSCS shows consistently competitive performance across datasets and annotation budgets, with the strongest gains in low-to-mid annotation regimes. These results suggest that dataset-aware cold-start initialization can improve the robustness of active learning for 3D medical image segmentation by adapting sample selection to the geometry of the unlabeled pool.
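
As a rough illustration of the weighted geometric score described above, the sketch below combines one typicality value and one uncertainty value per volume, both assumed to be normalised to (0, 1]. The function names and the weight `w` are hypothetical: the abstract does not give the closed-form pacing rule or the definition of the Difficulty-Coverage Ratio, so `w` is simply set by hand here.

```python
import math

def weighted_geometric_score(typicality, uncertainty, w):
    """Weighted geometric mean: typicality**w * uncertainty**(1 - w).

    w in [0, 1] is a hand-set placeholder for the paper's pacing rule.
    """
    eps = 1e-12  # keeps log() finite when a signal is exactly zero
    t = min(max(typicality, eps), 1.0)
    u = min(max(uncertainty, eps), 1.0)
    return math.exp(w * math.log(t) + (1.0 - w) * math.log(u))

def select_initial_volumes(typicalities, uncertainties, w, budget):
    """Return the indices of the `budget` highest-scoring volumes."""
    scores = [
        weighted_geometric_score(t, u, w)
        for t, u in zip(typicalities, uncertainties)
    ]
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    return ranked[:budget]
```

With `w = 1.0` the selection reduces to ranking by typicality alone, and with `w = 0.0` to ranking by uncertainty alone; intermediate values trade the two signals off multiplicatively, so a volume scoring near zero on either signal is penalised strongly.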


[27] 2606.20906

MMGNN: Multi-level, multi-color graph neural networks for molecular property prediction

Molecular message-passing neural networks commonly propagate chemically diverse interactions through a single graph, which may mix interaction-specific signals and require deep propagation to capture long-range effects. We introduce the Multi-level, Multi-color Graph Neural Network (MMGNN), a hierarchical framework that decomposes a molecular graph into overlapping atom-type-pair-specific subgraphs while preserving atom-level resolution. MMGNN-2D constructs chemical-colored subgraphs from covalent connectivity, whereas MMGNN-3D constructs geometric-colored subgraphs from spatial proximity and augments their edges with distance, angular, and torsional descriptors. Both variants apply a shared communicative message-passing backbone to each subgraph and combine the resulting representations through atom-wise aggregation and molecular readout. We evaluated MMGNN on five classification and three regression benchmarks from MoleculeNet using common scaffold splits and five independent runs. MMGNN-2D achieved the highest macro-average AUC-ROC of 0.838 across the classification datasets and the lowest RMSE on ESOL (0.803). MMGNN-3D obtained the highest mean AUC-ROC on BBBP (0.956) and the lowest RMSE on FreeSolv (1.793), indicating complementary strengths of topological and geometric representations. Structural and leave-one-out analyses further illustrate how the subgraph decomposition affects learned representations and atom-type-pair sensitivities. These results support overlapping interaction-specific graph decomposition as a competitive strategy for molecular property prediction.
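
The idea of splitting a molecular graph into overlapping atom-type-pair-specific subgraphs can be sketched as follows. This is only an illustration under simple assumptions (bonds are grouped by the unordered pair of element symbols at their endpoints); the function name `color_subgraphs` is hypothetical and the actual MMGNN construction may differ.

```python
from collections import defaultdict

def color_subgraphs(atom_types, bonds):
    """Group bonds by the unordered pair of atom types they connect.

    atom_types: list of element symbols, indexed by atom id.
    bonds: list of (i, j) atom-index pairs.
    Returns a dict mapping (type_a, type_b) -> list of bonds.
    """
    subgraphs = defaultdict(list)
    for i, j in bonds:
        key = tuple(sorted((atom_types[i], atom_types[j])))
        subgraphs[key].append((i, j))
    return dict(subgraphs)

# Heavy atoms of ethanol: C-C-O
ethanol = color_subgraphs(["C", "C", "O"], [(0, 1), (1, 2)])
```

Here atom 1 belongs to both the C-C and the C-O subgraph, which is the sense in which the subgraphs overlap while atom-level resolution is kept.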


[28] 2606.21174

HERO: Hypothesis-Driven Evidence Retrieval from Omics for Multi-Task Breast Cancer Analysis

Matched multi-omics can improve WSI-based biomarker and prognosis prediction, but most existing pipelines use omics as a parallel feature stream or textual context rather than as an explicit retrieval constraint. HERO asks whether observed omics can be a testable morphology hypothesis: a sparse pathway-to-morphology prior maps DNA methylation and miRNA into a K-dimensional intent vector m (K=16), TF-IDF retrieval over structured 10 captions selects endpoint-relevant regions, and a cosine gate c=cos(m,v) triggers deterministic deficit-driven repair when $c < \tau_c$. This closed-loop design bounds VLM calls, reduces reliance on embedding-based semantic matching, and makes every retrieval and verification step lexically auditable. On TCGA-BRCA (930 WSIs, patient-level 5-fold CV), HERO sets a new state of the art across ER, PR, HER2, subtype, and risk prediction, outperforming both multimodal fusion and VLM-based baselines.
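
The cosine gate can be illustrated with a minimal sketch. It assumes that m and v are plain numeric vectors of equal length; how v is obtained, the value of the threshold, and what the repair step does are not stated in the abstract, so `tau_c` and the function name `cosine_gate` are placeholders.

```python
import math

def cosine_gate(m, v, tau_c):
    """Return (c, needs_repair) with c = cos(m, v) and needs_repair = c < tau_c.

    A zero vector is treated as having similarity 0 (an assumption).
    """
    dot = sum(a * b for a, b in zip(m, v))
    norm_m = math.sqrt(sum(a * a for a in m))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_m == 0.0 or norm_v == 0.0:
        c = 0.0
    else:
        c = dot / (norm_m * norm_v)
    return c, c < tau_c
```

For example, `cosine_gate([1.0, 0.0], [0.0, 1.0], 0.5)` flags a repair because orthogonal vectors have zero similarity, whereas parallel vectors pass the gate for any threshold below 1.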


[29] 2606.21314

Dynamic dark-field FFOCT and dynamic reflection differential phase contrast for label-free functional imaging at reflective biomaterial interfaces

Strong reflections from metallic and engineered substrates severely limit label-free functional imaging of living cells at biomaterial interfaces, neural electrodes, and implantable devices. Here we introduce two complementary approaches for recovering intracellular dynamic contrast at highly reflective interfaces. Dynamic dark-field full-field optical coherence tomography (D-dFFOCT) suppresses the dominant substrate reflection and restores intracellular visibility through selective detection of scattered light. In parallel, asymmetric illumination generates a distinct directional dynamic contrast that is most consistently interpreted as dynamic reflection differential phase contrast (D-RDPC). Both approaches reveal intracellular activity that remains poorly visible with conventional dynamic full-field optical coherence tomography. D-RDPC exhibits characteristic signatures of phase-gradient imaging, including contrast reversal upon illumination inversion, enhancement with increasing illumination asymmetry, and recovery of spatial localization through directional Hilbert-transform reconstruction. Together, these results establish new strategies for functional imaging at reflective interfaces and suggest that differential phase contrast signals can support temporal fluctuation analysis.


[30] 2606.21432

Soliton-like Waves in a Two-Dimensional Recurrent Spiking Neural Network with Weighted Spike-Timing-Dependent Plasticity

We construct a minimal but biologically plausible spiking neuron model operating in discrete time, combining multiplicative spike-timing-dependent plasticity (WSTDP), divisive normalization of synaptic integration, homeostatic threshold adaptation, and a one-step refractory period. We show that this normalization admits a biologically plausible dendritic implementation in which each binary junction operates using only locally available information. Assembling excitatory-inhibitory pairs of such neurons into a two-dimensional recurrent network and applying periodic localized stimulation, we find that the network spontaneously gives rise to stable, self-propagating wave packets with the properties of dissipative solitons: they maintain a stable spatial profile, propagate at constant speed, and annihilate upon frontal collision. Their emergence requires a geometric asymmetry between excitatory and inhibitory connection radii, and initial inhibitory synapses stronger than excitatory ones. WSTDP engraves the direction of propagation into the synaptic weight profile, so that the network learns by itself to sustain propagation in one direction while suppressing the reverse. When two sources are active simultaneously, the resulting waves annihilate upon collision, defining a semi-persistent boundary whose position encodes the relative phase and frequency of the two sources. These results provide a minimal computational framework for studying the emergence of cortical traveling waves, activity zone delimitation, and spatial memory from local plasticity rules alone.


[31] 2606.21608

CurvSegFlow: Time-Conditioned Flow Matching for Robust Segmentation of Curvilinear Structures in Noisy Biomedical Images

Accurate segmentation of curvilinear structures remains challenging in biomedical imaging due to their thin geometry, complex topology, and sensitivity to noise. This is particularly critical for microscopy images of cytoskeletal networks, where low signal-to-noise ratios and dense filament crossings often lead to fragmented or inaccurate segmentation. In this work, we propose CurvSegFlow, a segmentation framework based on time-conditioned flow matching. Instead of predicting a segmentation mask in a single pass, the method models segmentation as a dynamic process that progressively refines a noisy initialization into the target structure through a learned velocity field. The proposed model combines a U-Net backbone with a triple-term loss function and temporal embeddings to guide the refinement process across reconstruction stages. This formulation enables gradual error correction and improves the continuity of thin structures. CurvSegFlow is evaluated on multiple synthetic and real microtubule datasets, as well as on public benchmarks of retinal vessels, corneal nerves and coronary arteries. Across datasets, the method achieves competitive or superior performance compared to established segmentation models, with consistent improvements in precision and structural continuity, particularly under low signal-to-noise conditions. These results show that flow-based iterative refinement provides a robust and general framework for curvilinear structure segmentation. Overall, the proposed approach improves segmentation quality in challenging imaging conditions and generalizes effectively across modalities without architectural changes.


[32] 2606.21681

Heavy-Tailed Dispersal Kernels from Stopped Subdiffusive Fractional Brownian Motion

Subdiffusive fractional Brownian motions produce localized aggregation when particles are stopped at exponentially distributed times. In applications where clumping and long-distance dispersal events are observed simultaneously, such as in some instances of seed dispersal, this model fails to describe the tails of the data. The resulting redistribution kernel has only an exponentially decaying tail, whereas a heavier tail is needed for modeling the long-distance dispersal observed. Here we propose a model in which subdiffusive particles stop at exponentially distributed times, but with a rate parameter that is Gamma distributed. This heterogeneity in stopping rates causes the density of final radial positions to have a heavy-tailed distribution. Our model retains the strong localized clumping characteristic of subdiffusive fractional Brownian motion while simultaneously generating the heavy tails required for realistic long-distance dispersal.
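
The effect of Gamma-distributed stopping rates can be checked with a short calculation on the stopping time alone. This is a standard mixture result, not the paper's derivation of the radial kernel; the shape and rate symbols $\alpha$ and $\beta$ are notation assumed here.

```latex
T \mid \lambda \sim \mathrm{Exp}(\lambda), \qquad
\lambda \sim \mathrm{Gamma}(\alpha, \beta)
\quad\Longrightarrow\quad
\Pr(T > t)
  = \int_0^\infty e^{-\lambda t}\,
    \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,
    \lambda^{\alpha - 1} e^{-\beta \lambda}\, d\lambda
  = \left( \frac{\beta}{\beta + t} \right)^{\alpha}.
```

The stopping time therefore has a power-law (Lomax) tail instead of an exponential one, so long travel times become polynomially rather than exponentially rare, which is consistent with the heavier-tailed final positions described above.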


[33] 2606.21940

DevoTG: Temporal Graph Neural Networks for Modeling C. elegans Developmental Connectomics

Understanding how a nervous system wires itself from birth to adulthood is a fundamental challenge in developmental neuroscience. We present DevoTG, a temporal graph framework that applies Temporal Graph Neural Networks (TGNs) to two complementary representations of C. elegans neural development: a Continuous-Time Dynamic Graph (CTDG) of cell division events derived from cell lineage data, and a Discrete-Time Dynamic Graph (DTDG) of the developing synaptic connectome spanning eight reconstructed electron-microscopy datasets. On the lineage prediction task, our TGN achieves a mean test AUC of 0.839 +/- 0.007 (5 seeds; validation AUC 0.937 +/- 0.001), outperforming a static GNN with the identical architecture by 26 AUC points (0.577 +/- 0.080), demonstrating that temporal memory is the decisive factor. Applied to the connectome DTDG, DevoTG identifies three connection stability classes (stable, developmental, and variable) across 225 neurons and 858 to 2,496 connections over development (L1 birth to adult), providing a temporal-graph-theoretic complement to the individual-variability classification of Witvliet et al. Analysis of hub command interneurons AVA, AVB, and AVE reveals their persistent centrality and how their integration roles are progressively reinforced across larval stages. Accompanying interactive visualizations (3D animated networks, centrality heatmaps, and a spatiotemporal lineage graph) make developmental dynamics accessible for biological hypothesis generation. DevoTG is open-source and designed for extension to other developing nervous systems. Code is publicly available at this https URL.


[34] 2606.22138

BioMatrix: Towards a Comprehensive Biological Foundation Model Spanning the Modality Matrix of Sequences, Structures, and Language

We present BioMatrix, the first multimodal foundation model that natively integrates sequences, structures, and natural language for both molecules and proteins within a single decoder-only architecture. Existing biological foundation models pursue native multimodality and broad entity coverage separately: those that fuse multiple modalities under a shared objective remain confined to a single entity type, while those spanning multiple entity types either omit explicit structural modeling or rely on adapter-based designs in which the model cannot natively generate the very modalities it can read. BioMatrix closes this gap by mapping molecular sequences (supporting both SMILES and SELFIES notations), molecular structures, protein sequences, protein structures, and natural language into a shared discrete token space through a unified tokenization scheme, so that all modalities are consumed and produced uniformly under a single next-token prediction objective -- without external encoders, projection adapters, or modality-specific output heads. Built upon the Qwen3 language model (1.7B and 4B), BioMatrix is continually pretrained on 304.4 billion tokens spanning general and domain-specific text, sequence and structure views of molecules and proteins, and cross-modal corpora that interleave biomolecular entities with scientific text and link distinct entities through molecule-protein and protein-protein interaction data. After tuning on a comprehensive suite of downstream applications covering 80 tasks across 6 categories -- encompassing single-entity and multi-entity understanding and generation tasks across and within modalities -- BioMatrix achieves state-of-the-art or competitive performance on 77 out of 80 tasks, demonstrating that a single, natively multimodal generalist model can effectively match or surpass specialized approaches across a wide range of biological tasks.


[35] 2606.22281

Glass-based physical models for tissue mechanics

Techniques from glass art and fabrication provide a controllable physical platform for studying tissue mechanics in simple organisms. Here, we use glass-based physical models to investigate tissue deformation in the marine organism Trichoplax adhaerens. Previous studies have shown that the epithelial tissues in T. adhaerens undergo large deformations and form fracture holes under mechanical loading, exhibiting a ductile-to-brittle transition at fast loading rates. To model these behaviors in a tunable and experimentally accessible system, glass is shaped into tissue-like monolayers in a glass studio, heated to its specific process temperature, and subjected to controlled stretching. Rapid cooling arrests the deformed configurations, providing snapshots of tissue-like strain states under load. Under lateral and radial stretching, we quantify changes in the area and eccentricity of individual "cells" in the glass models, and find that eccentricity increases after stretching. We further use tensegrity-based models to quantify deformations in the cellular geometry of the glass tissues, enabling direct comparison between experiments and simulations. The model captures the principal experimental deformation patterns, but underestimates the magnitude of the observed eccentricity changes. Our results demonstrate that glass-based physical models provide an experimentally accessible platform for studying tissue-scale deformation and mechanical behavior, while supporting interdisciplinary approaches that connect methods in the arts and sciences.


[36] 2606.22440

Data-driven geometric phase in biological locomotion

Geometric phase quantifies net locomotion in dissipative media via gauge theory, but linking this theoretical quantity to noisy, sparse, and weakly periodic biological shape data is challenging. We develop a theory-guided, data-driven Koopman autoencoder to recover the limit cycle embedded in imperfect cyclic data and extract shape gaits and geometric phase from sperm and nematode data. We introduce a geometric phase sensitivity function that quantifies responses to shape perturbations and reveals mechanical information using only gauge-theoretic structure, without assuming mechanical laws.


[37] 2606.22685

Architecture for Health Initiative (Arch4Health): Computational Challenges in Health-Related Applications and the Role of Computer Architecture in Addressing Them

Recent biotechnological advances enable high-throughput, low-cost, and accurate biological data generation. This wealth of data enables unique opportunities for advancing healthcare. Despite these opportunities, efficiently analyzing large-scale biological data poses significant challenges for conventional computing systems. These systems often cannot keep up with the high-throughput rate at which data is generated, and they face additional constraints related to energy efficiency, scalability, privacy, and security. Therefore, to facilitate the wide adoption of recent advances in healthcare, there is a need to optimize the computing systems to enable high-performance, energy-efficient, low-cost, private, and secure analysis of biological data. We introduce the Architecture for Health (Arch4Health) initiative, which aims to (i) identify and analyze key computational challenges in current and future health- and life science-related applications and (ii) explore how computer architects and computing system designers can advance healthcare by addressing these challenges. In this short paper, we first present the motivations behind the Arch4Health initiative and, second, elaborate on its vision and goals, related topics, Arch4Health workshops, and future outlooks.


[38] 2606.22823

Retrieval-Augmented Multimodal Learning for Enzyme-Substrate Interaction Prediction Under Low-Homology Shift

Enzyme-substrate interaction (ESI) prediction is a fundamental computational task for biocatalyst discovery and reaction screening in large biochemical spaces. In practical settings, ESI prediction is challenged by sparse positive supervision and low-homology distribution shift, where test enzymes share limited sequence identity with those observed during training. To address these challenges, we propose RAMMESI, a retrieval-augmented multimodal framework for robust ESI prediction. RAMMESI learns explicit pairwise enzyme-substrate representations through directional cross-modal interaction modeling and adaptive fusion. To enhance robustness, RAMMESI retrieves neighboring enzymes at inference time, recombines them with the query substrate, and aggregates the resulting pairwise predictions as contextual evidence. To improve learning under sparse positive supervision, we further adopt an imbalance-aware weighted-BCE objective. Experiments on two ESI benchmarks under sequence-identity-aware splits demonstrate that RAMMESI achieves consistently strong performance, with particular advantages in more challenging low-identity regimes. In addition, the retrieval module improves multiple ESI backbones in a plug-and-play manner, suggesting that retrieval provides a general mechanism for improving robustness under homology shift.
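
A weighted binary cross-entropy of the general kind mentioned above can be sketched as follows. The abstract does not specify the weighting scheme, so the single positive-class weight `pos_weight` and the function name `weighted_bce` are assumptions made for illustration.

```python
import math

def weighted_bce(y_true, p_pred, pos_weight):
    """Mean binary cross-entropy with positives up-weighted by pos_weight.

    y_true: iterable of 0/1 labels; p_pred: predicted probabilities.
    pos_weight = 1.0 recovers the ordinary (unweighted) BCE.
    """
    eps = 1e-12  # clip probabilities so log() stays finite
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)
```

A common heuristic, again an assumption rather than the paper's choice, is to set `pos_weight` to the ratio of negative to positive training pairs so that the rare positives contribute comparably to the loss.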


[39] 2606.23122

A Matter of Time: Towards a General Theory of Agency

Agency is often invoked in research on philosophy, biology, and cognitive science without a clear account of how it originates from material organization. Building on temporally parametrized (F, A)-systems, this paper develops a graded organizational theory of agency grounded in relational biology, physical biosemiotics, and process ontology. We argue that self-referential closure cannot be adequately conceived outside time: once the constitutive processes of a semantically closed organization are associated with distinct characteristic timescales, the organization unfolds into an out-of-sync dependency structure that can be formally redescribed as a history-dependent, revisable Asynchronous Dynamic Bayesian Network. This move allows for a principled distinction between autonomy, goal-directedness, agency, and open-endedness. Autonomy arises from precarious closure to efficient causation under material openness; goal-directedness from the maintenance of viability-supporting organization; agency appears when such organization acquires an endogenous anticipatory structure that selectively modulates organism-environment coupling in light of possible futures; open-endedness begins when this anticipatory organization can reconstruct its own future space of possibilities. Our framework reconciles Rosennean anticipation with organizational closure, restricts Markov blankets and active inference to derived formal redescriptions rather than first principles, and reinterprets computational enactivism in non-Fristonian terms. By deriving weaker temporalized organizations, our contribution outlines a hierarchy from proto-agential chemical systems to fully semantically closed agents, with implications for multicellular organisms, synthetic lifeforms, and neuroscience.


[40] 2606.23253

Reduced-Alphabet QUBO/Ising Formulation for Constraint-Driven Cyclic Peptide Sequence Design

Cyclic peptide design requires balancing local residue preferences with constraints from ring-forming chemistry, residue spacing, topology, target compatibility, and developability. Here, we present a reduced-alphabet quadratic unconstrained binary optimization (QUBO)/Ising formulation for constraint-driven cyclic peptide sequence design. Amino acids are grouped into physicochemical or interaction-based residue classes, and peptide positions are represented by binary residue-class assignment variables. The objective combines one-hot sequence validity, cyclization constraints, optional target-compatibility terms, motif and composition rules, and coarse developability proxies. By modifying the relevant constraint terms, the same framework can represent head-to-tail, disulfide-bridged, stapled, and bicyclic peptide designs. A resource-aware eight-class alphabet motivated by MJ interaction-profile clustering is used as a default representation to balance coarse interaction-pattern preservation with encoding cost. The resulting QUBO/Ising objective is solver-agnostic and can be explored using classical or quantum-compatible binary optimization procedures. The model is intended as an early-stage search-space reduction and prioritization layer: it produces low-energy residue-class sequences rather than final molecular candidates, which require amino-acid decoding, cyclization-aware construction, and downstream structural or experimental validation.
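
The one-hot sequence-validity term can be made concrete with a small sketch. It assumes a standard quadratic penalty, penalty * (sum_k x[i,k] - 1)^2 for each position i, expanded using x^2 = x for binary variables; the function names and the flat variable indexing i * num_classes + k are hypothetical, and the other terms of the objective are not modelled.

```python
from itertools import product

def one_hot_qubo(num_positions, num_classes, penalty):
    """Upper-triangular QUBO for penalty * sum_i (sum_k x[i,k] - 1)^2.

    Returns (Q, offset): Q maps (a, b) with a <= b to a coefficient,
    and offset is the constant term of the expansion.
    """
    Q = {}
    for i in range(num_positions):
        for k in range(num_classes):
            a = i * num_classes + k
            Q[(a, a)] = -penalty            # linear term (x^2 = x)
            for l in range(k + 1, num_classes):
                b = i * num_classes + l
                Q[(a, b)] = 2.0 * penalty   # two classes at one position
    return Q, penalty * num_positions

def energy(Q, offset, x):
    """Evaluate the QUBO objective for a binary assignment x."""
    return offset + sum(c * x[a] * x[b] for (a, b), c in Q.items())

def ground_states(Q, offset, n):
    """Brute-force all 2**n assignments; only feasible for tiny n."""
    best, states = None, []
    for x in product((0, 1), repeat=n):
        e = energy(Q, offset, x)
        if best is None or e < best - 1e-12:
            best, states = e, [x]
        elif abs(e - best) <= 1e-12:
            states.append(x)
    return best, states
```

For two positions and three residue classes, the zero-energy assignments are exactly those that pick one class per position; in a full model this penalty would be added to the cyclization, motif, and developability terms with a weight large enough to dominate them.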


[41] 2606.23561

Electrochemical DNA Hairpin Sensors for Differentiating Small Molecule Intercalation from Minor Groove Binding

Small molecule double-stranded DNA intercalators have significant potential for therapeutic applications. However, screening for and confirming a drug candidate's intercalative behavior remains labor-intensive and costly. To address this, we investigated the sequence and biophysical parameters that affect the performance of electrochemical DNA hairpin sensors for streamlined identification of structural intercalators. These sensors utilize oligonucleotide (oligo) sequences that form hairpins upon intercalator binding. The 3′ end of the oligo is modified with alkylthiol linkers for gold electrode surface monolayer self-assembly, while the 5′ end carries a methylene blue redox reporter. Hairpin formation enhances electron transfer between methylene blue and the gold electrode, which can be detected via voltammetry. We tested seven hairpin structures varying in stem length and sequence. Our optimal oligo, HP4, features a four-base-pair stem and responds to five DNA intercalators over a broad detection range, with EC50 in close agreement with published affinity (KD) values for these interactions. We further demonstrate HP4's ability to discriminate intercalator binding from a series of minor groove binders through significant differences in signal gain upon incubation. Altogether, our strategy establishes a platform for identifying intercalative compounds that should support the development of DNA-targeting therapeutics.


[42] 2501.03247

A sub-Riemannian model of neural states in the primary motor cortex

We develop a neurogeometric model for the arm area of motor cortex, which encodes complex motor primitives, ranging from simple movement features like movement direction, to short hand trajectories, termed fragments, and ultimately to more complex patterns known as neural states (Georgopoulos, Hatsopoulos, Kadmon-Harpaz et al.). Based on the sub-Riemannian framework introduced in 2023, we model the space of fragments as a set of short curves defined by kinematic parameters. We then introduce a geometric kernel that serves as a model for cortical connectivity and use it in a differential equation to describe cortical activity. By applying a grouping algorithm to this cortical activity model, we successfully recover the neural states observed in Kadmon-Harpaz et al., which were based on measured cortical activity. This confirms that the choice of kinematic variables and the distance metric used here are sufficient to explain the phenomena of neural state formation. The modularity of our model reflects the brain's hierarchical structure, where initial groupings in the kinematic space $\mathcal{M}$ lead to more abstract representations. This approach mimics how the brain processes stimuli at different scales, extracting both local and global properties.


[43] 2507.09272

Degeneracy of Two-Dimensional Zero-One Reaction Networks with Up to Three Species

Zero-one biochemical reaction networks are widely recognized for their importance in analyzing signal transduction and cellular decision-making processes. Degenerate networks reveal non-standard behaviors and mark the boundary where classical methods fail. Their analysis is key to understanding exceptional dynamical phenomena in biochemical systems. Therefore, we focus on investigating the degeneracy of zero-one reaction networks. It is known that one-dimensional zero-one networks cannot degenerate. In this work, we identify all degenerate two-dimensional zero-one reaction networks with up to three species by an efficient algorithm. By analyzing the structure of these networks, we arrive at the following conclusion: if a two-dimensional zero-one reaction network with three species is degenerate, then its steady-state system is equivalent to a binomial system.


[44] 2507.15651

From Scarce Functional Labels to Label-Aware Generation in Homologous Protein Families

Accurately annotating and controlling protein function from sequence data remains a major challenge in protein engineering, especially when functional labels are scarce within large homologous families. Here, we study a two-stage light-supervision strategy for fine-grained functional annotation and label-aware sequence generation. First, we compare several sequence representations, including one-hot encodings, Restricted Boltzmann Machines (RBMs), and ESM2-based protein language model embeddings, for predicting intra-family specificity labels from limited supervision. By using train/test splits that explicitly reduce phylogenetic leakage, we show that ESM2-based representations do not systematically outperform family-specific RBM embeddings or even simple one-hot baselines in this regime. Second, we use the inferred annotations to train an annotation-aware RBM capable of generating artificial homologs conditioned on prescribed labels. Across several protein families, we quantify how the number and quality of available labels determine the reliability of conditional generation. Our results show that scarce annotations can support label-aware protein design when they are accurately propagated, while also highlighting the importance of phylogeny-aware evaluation for assessing functional annotation methods within homologous families.


[45] 2507.15871

Context-Dependent Autonomic Responses in Social Anxiety During Cognitive-Emotional Stress

Social anxiety disorder (SAD) is associated with heightened physiological arousal during socially evaluative situations, yet it remains unclear whether similar autonomic responses emerge during non-evaluative cognitive-emotional stress. This study investigated wearable electrodermal activity (EDA) responses in socially anxious (SA) and non-socially anxious (NSA) individuals during an emotionally salient 2-back working memory task involving facial expressions. Fifty participants (25 SA, 25 NSA) completed a resting-state baseline and task condition while EDA signals were acquired using a Shimmer3 GSR+ sensor. EDA features spanning tonic, phasic, sympathetic, spectral, and nonlinear domains were analyzed using mixed ANOVAs and complementary machine learning models. Results showed significant increases in autonomic arousal during task engagement across all participants, confirming that the task induced substantial sympathetic activation. However, no consistent between-group differences were observed, with only transient interaction effects emerging during the initial task phase. Machine learning analysis demonstrated above-chance discrimination between SA and NSA individuals using resting-state EDA (average AUC = 0.73), whereas classification performance during task engagement declined to near-chance levels (average AUC $\leq$ 0.57). These findings suggest that cognitively demanding emotional tasks, in the absence of explicit social-evaluative threat, elicit comparable autonomic responses regardless of social anxiety status and may obscure subtle resting-state physiological differences between groups. More broadly, our findings highlight the context-dependent nature of wearable autonomic biomarkers for anxiety assessment and digital mental health monitoring. With this manuscript, we release both the code and data publicly.


[46] 2509.00116

Meta-learning ecological priors from large language models explains human learning and decision making

Human cognition is profoundly shaped by the environments in which it unfolds. Yet, it remains an open question whether learning and decision making can be explained as a principled adaptation to the statistical structure of real-world tasks. We introduce ecologically rational analysis, a computational framework that unifies the normative foundations of rational analysis with ecological grounding. Leveraging large language models to generate ecologically valid cognitive tasks at scale, and using meta-learning to derive rational models optimized for these environments, we develop a new class of learning algorithms: Ecologically Rational Meta-learned Inference (ERMI). ERMI internalizes the statistical regularities of naturalistic problem spaces and adapts flexibly to novel situations, without requiring hand-crafted heuristics or explicit parameter updates. We show that ERMI captures human behavior across 15 experiments spanning function learning, category learning, and decision making, outperforming several established cognitive models in trial-by-trial prediction. Our results suggest that much of human cognition may reflect adaptive alignment to the ecological structure of the problems we encounter in everyday life.


[47] 2509.21332

Future of Brain Health: From Developmental Insights to Clinical Translation

This review highlights brain health as a dynamic process shaped by both genetic and environmental influences throughout development. Critical periods provide unique windows of heightened neural plasticity, during which genetic-environmental interactions and parental influences profoundly impact brain maturation. Frameworks such as DOHaD, ACEs, and neurosocial plasticity elucidate how early-life experiences modulate long-term cognitive and emotional outcomes. Brain health science is emerging as a field integrating neuroscience, public health, and social context. Resilience-oriented approaches and predictive processing offer renewed perspectives on adaptive brain function. Clinically, understanding critical periods and plasticity spanning from fetal life to old age has implications for early detection, targeted interventions, and resilience-oriented strategies, emphasizing the potential for lifelong optimization of mental health.


[48] 2510.02853

The Principle of Isomorphism: A Theory of Population Activity in Grid Cells and Beyond

Neural population activity organizes into low-dimensional manifolds embedded within high-dimensional state spaces, yet the principles governing the topology and geometry of these manifolds remain elusive. Here, we propose the Principle of Isomorphism (PIso), which posits that the topology of a neural manifold is constrained by the mathematical structure of the computational task it supports. We apply this framework to the mammalian grid cell system through two distinct theoretical lenses: an intrinsic neural metric, which requires a locally flat Riemannian structure, and path integration, which requires a compact connected Abelian Lie group structure. We show that these two routes are both sufficient conditions that converge on the same toroidal latent topology, and that they naturally unify within Euclidean space. Using a minimal feedforward network that constrains population activity to a torus with tunable geometry, we find that hexagonal grid fields emerge only in an intermediate geometric regime, becoming diffuse or square-like otherwise. Our work clarifies the separation between three notions: latent topology, extrinsic embedding geometry, and decoded physical geometry, and identifies the topology of the population code as the more invariant consequence of the task structure, while leaving the precise mechanism that selects hexagonal single-cell firing patterns as an open problem.


[49] 2510.12776

Quantum Generative Modeling of Single-Cell Transcriptomes: Capturing Gene-Gene and Cell-Cell Interactions

Single-cell RNA sequencing (scRNA-seq) data simulation is limited by classical methods relying on linear correlations, failing to capture nonlinear dependencies. No existing simulator jointly models gene-gene regulatory interactions and cell-cell communication. We introduce qSimCells, a quantum computing-based simulator that uses entanglement to model intra- and inter-cellular interactions, generating realistic single-cell transcriptomic data from heterogeneous cell populations. Its quantum kernel uses a parameterized circuit with CNOT gates to encode gene regulatory networks (GRNs) and cell-cell communication topologies. By programming the entanglement architecture, the simulator establishes a known generative ground truth for both regulatory and communication pathways. The resulting synthetic data exhibits dependencies arising from the joint probability structure of the quantum circuit. Notably, standard correlation-based analyses (Pearson and Spearman) fail to recover the programmed causal relationships and instead report spurious associations driven by high baseline gene-expression probabilities. Applying cell-cell communication detection serves as an internal consistency check: CellChat correctly identifies the true ligand-receptor pairs when inter-state entanglement is active, revealing a robust, up to ~98-fold relative increase in inferred communication probability. These results demonstrate that the quantum kernel produces high-fidelity benchmark datasets with known ground truth, highlighting the limitations of correlation-based inference and the need for approaches capable of capturing complex structural dependencies underlying gene regulation and cell-cell communication.


[50] 2511.14872

Maximum entropy models of neuronal populations at and off criticality

Empirical evidence of scaling behaviors in neuronal avalanches suggests that neuronal populations in the brain operate near criticality. Departure from scaling in neuronal avalanches has been used as a measure of distance to criticality and linked to brain disorders. A distinct line of evidence for brain criticality has come from thermodynamic signatures in maximum entropy (ME) models. Both of these approaches have been widely applied to the analysis of neuronal data. However, the relationship between deviations from avalanche criticality and thermodynamics of ME models of neuronal populations remains poorly understood. To address this question, we study spontaneous activity of organotypic rat cortex slice cultures in physiological and drug-induced hypo- or hyper-excitable conditions, which are classified as critical, subcritical and supercritical based on avalanche dynamics. We find that static ME models inferred from critical cultures show signatures of criticality in thermodynamic quantities, e.g. specific heat. However, such signatures are also present and equally strong in models inferred from supercritical cultures -- despite their altered dynamics and poor functional performance. On the contrary, ME models inferred from subcritical cultures do not show thermodynamic hints of criticality. Importantly, we confirm these results using an interpretable neural network model that can be tuned to and away from avalanche criticality. Our findings indicate that static maximum entropy models, although not constraining dynamical features, correctly distinguish subcritical from critical/supercritical systems. However, they may not be able to discriminate between avalanche criticality and supercriticality, suggesting that dynamics is relevant to capture the supercritical behavior and distinguish it from criticality.
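The thermodynamic signature mentioned above (a peak in specific heat) can be illustrated on a toy pairwise maximum entropy model. The sketch below assumes an Ising-type parameterization with spins in {-1, +1} and a fictitious temperature $T$ that rescales the inferred distribution as $P_T(s) \propto e^{-E(s)/T}$, with the fitted model sitting at $T = 1$. The function names and the exhaustive enumeration (practical only for small populations) are illustrative choices, not the authors' pipeline.

```python
import itertools
import math


def energy(state, h, J):
    """Ising-type energy E(s) = -sum_i h_i s_i - sum_{i<j} J_ij s_i s_j."""
    n = len(state)
    e = -sum(h[i] * state[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            e -= J[i][j] * state[i] * state[j]
    return e


def specific_heat(h, J, T):
    """Specific heat per unit, C(T) = Var(E) / (N * T^2), by exact enumeration."""
    n = len(h)
    energies = [energy(s, h, J) for s in itertools.product((-1, 1), repeat=n)]
    e_min = min(energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / T) for e in energies]
    z = sum(weights)
    mean_e = sum(w * e for w, e in zip(weights, energies)) / z
    mean_e2 = sum(w * e * e for w, e in zip(weights, energies)) / z
    return (mean_e2 - mean_e ** 2) / (n * T * T)
```

Scanning `specific_heat(h, J, T)` over a range of `T` and checking whether the maximum falls near `T = 1` is the usual way such a signature is read off; for realistic population sizes the enumeration would have to be replaced by sampling.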


[51] 2512.01073

The Modeler Schema Theory of Consciousness, with a Falsifiable Experiment

We propose that consciousness arises from a single control agent, the Modeler-schema, which monitors the brain's Modeler as it constructs and updates the internal World Model. As part of that monitoring, the Modeler-schema generates experience by converting the Modeler's outputs into qualia, which are then used for model refinement. The Human agent comprises three cooperating agents (Modeler, Controller, and Targeter), each paired with a regulatory schema agent. Our core prediction is that the Modeler-schema performs a qualia-based consistency check during saccades and may issue bottom-up attention requests when discrepancies are found. To test this prediction, we propose a saccadic change-detection experiment that distinguishes Modeler-generated from Modeler-schema-generated bottom-up attention requests. Locating qualia in the Modeler-schema ties experience, including diffuse awareness (the full sensory field), to model regulation and refinement, explains aphantasia as a selective failure of recalled-sensory quale-conversion, and offers a testable proposal toward solving the Hard Problem of consciousness.


[52] 2603.04748

SeekRBP: Leveraging Sequence-Structure Integration with Reinforcement Learning for Receptor-Binding Protein Identification

Motivation: Receptor-binding proteins (RBPs) initiate viral infection and determine host specificity, serving as key targets for phage engineering and therapy. However, the identification of RBPs is complicated by their extreme sequence divergence, which often renders traditional homology-based alignment methods ineffective. While machine learning offers a promising alternative, such approaches struggle with severe class imbalance and the difficulty of selecting informative negative samples from heterogeneous tail proteins. Existing methods often fail to balance learning from these "hard negatives" while maintaining generalization. Results: We present SeekRBP, a sequence-structure framework that models negative sampling as a sequential decision-making problem. By employing a multi-armed bandit strategy, SeekRBP dynamically prioritizes informative non-RBP sequences based on real-time training feedback, complemented by a multimodal fusion of protein language and structural embeddings. Benchmarking demonstrates that SeekRBP consistently outperforms static sampling strategies. Furthermore, a case study on Vibrio phages validates that SeekRBP effectively identifies RBPs to improve host prediction, highlighting its potential for large-scale annotation and synthetic biology applications.


[53] 2603.12662

Minimal Set of Questions for Theories of Consciousness: Toward a Unified Explanatory Framework

A central challenge in consciousness research is the lack of agreement on what a theory of consciousness should explain, which makes it difficult to compare existing theories. We propose a framework for organizing explanatory targets of theories based on a minimal set of seven questions designed to be theoretically neutral, causally and functionally relevant, and applicable across different systems. We focus particularly on the role of causation based on the argument that causal relations cannot be fully specified within standard physical descriptions alone. Introducing an asymmetric causal structure allows internal mechanisms to be represented explicitly and helps distinguish between variable- and structure-level causation. As an example, we apply the proposed framework to analyzing the Dual-Laws Model. The aim of the framework is not to propose a definitive theory but to provide a common basis for analyzing and developing theories of consciousness.


[54] 2603.17090

Intracellular Measurement-Informed Multiscale Modeling for Scalable iPSC Manufacturing

Scalable manufacturing of human induced pluripotent stem cells (iPSCs) is essential for industrial-scale production of cell therapies and regenerative medicines. However, the 3D aggregate cultures used in manufacturing exhibit substantial spatial and metabolic heterogeneity compared with the relatively homogeneous monolayer systems used in laboratory studies, complicating mechanistic understanding and predictive metabolic modeling across culture scales. To address this challenge, we developed a modular multiscale mechanistic foundation model that links molecular, cellular, and macroscopic processes while accounting for spatial and metabolic heterogeneity. The framework integrates extracellular culture dynamics, intracellular metabolic fluxes, and cellular redox states by extending a previously established monolayer kinetic network and coupling it with a biological systems-of-systems (Bio-SoS) multiscale model for aggregate cultures, incorporating explicit redox interactions. Systematic monolayer and aggregate experiments (including multiple isotopic tracers, extracellular metabolite profiling, and two-photon optical redox imaging) were used to improve and validate the model. This integrated framework unifies heterogeneous datasets across culture configurations and enables mechanistic interpretation of metabolic and redox responses across heterogeneous culture scales, providing a quantitative foundation for scalable iPSC biomanufacturing.


[55] 2603.26860

Ecological systems in a modeling perspective

May (1974, 1976) opened the debate on whether biological populations might exhibit nonlinear dynamics and chaos. However, it has in general been difficult to verify nonlinear dynamics in biological populations. There are many reports concerning problems with this issue, and some of them can be traced back to Hassell, Lawton, and May (1976) and Morris (1990). Our objective is not a discussion of the presence of nonlinear dynamics in biological populations. Instead, we analyze whether ecological census data can be used for validating nonlinearities at all. We choose our models and our situation so that as much as possible can be done rigorously with by-hand computations. We consider a clearly nonlinear chemostat-based model that is isolated. Some noise must be considered, and we choose a minimal approach: Only noise originating from the fact that ecological populations remain finite is considered, cf. Bailey (1964). Not only the interacting populations but also collected data sets tend to remain finite. Collection of long data sets might be associated with huge costs in ecology. Examples of exceptionally long and carefully studied ecological time series are those collected by Nicholson (1954) and Utida (1957). These data sets contain a few hundred data points, and we use this as a guideline for when an ecological time series should be considered exceptionally long in this chapter.


[56] 2604.16642

Geometric coherence of single-cell CRISPR perturbations reveals regulatory architecture and predicts cellular stress

Genome engineering has achieved sequence-level precision, yet predicting the transcriptomic state a cell will occupy after perturbation remains open. Single-cell CRISPR screens measure how far cells move, but effect magnitude ignores whether the cells move together. We introduce Shesha perturbation stability ($S_p$), which quantifies directional coherence as the mean cosine similarity between individual cell shift vectors and the mean perturbation direction. Across five CRISPR datasets (2,200+ perturbations), stability correlates with magnitude (Spearman $\rho = 0.75$--$0.97$), but discordant cases expose regulatory architecture: pleiotropic regulators such as CEBPA pay a "geometric tax," producing large but incoherent shifts, while lineage-specific factors such as KLF1 produce coordinated responses. $S_p$ and Song et al.'s perturbation-response score (PS) share partial overlap ($\rho_{\text{partial}} = +0.51$ after controlling for magnitude), but $S_p$ provides significant incremental prediction of UPR pathway activation beyond both PS and magnitude ($p < 10^{-18}$). In a split-half reproducibility assay, $S_p$ predicts directional reproducibility beyond magnitude ($\rho_{\text{partial}} = +0.384$) while PS does not ($\rho_{\text{partial}} = -0.193$), with the advantage consistent across all magnitude strata and both datasets. Geometric instability is independently associated with UPR activation across four datasets. $S_p$ is implemented in the open-source shesha-geometry Python package.
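The definition of $S_p$ given in this abstract (mean cosine similarity between per-cell shift vectors and the mean perturbation direction) can be sketched in a few lines. This is not the shesha-geometry API; the function name is hypothetical, and measuring each shift from the control centroid, as well as skipping cells with a zero shift, are assumptions the abstract does not specify.

```python
import math


def perturbation_stability(control_cells, perturbed_cells):
    """Mean cosine similarity between per-cell shifts and the mean shift."""
    dim = len(control_cells[0])
    # Assumed reference point: centroid of the control population.
    centroid = [sum(c[j] for c in control_cells) / len(control_cells)
                for j in range(dim)]
    shifts = [[cell[j] - centroid[j] for j in range(dim)]
              for cell in perturbed_cells]
    mean_shift = [sum(s[j] for s in shifts) / len(shifts) for j in range(dim)]
    mean_norm = math.sqrt(sum(x * x for x in mean_shift))
    if mean_norm == 0.0:
        return 0.0  # no net direction: treat as fully incoherent
    sims = []
    for s in shifts:
        norm = math.sqrt(sum(x * x for x in s))
        if norm == 0.0:
            continue  # a cell that did not move has no direction
        dot = sum(a * b for a, b in zip(s, mean_shift))
        sims.append(dot / (norm * mean_norm))
    return sum(sims) / len(sims) if sims else 0.0
```

Cells that all move along the same axis give a value of 1 regardless of how far they move, which is the sense in which coherence is separated from effect magnitude.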


[57] 2606.19405

Multi-type branching inference on contact trees with application to COVID-19

Inferring epidemiological parameters from transmission trees is essential for understanding infectious disease dynamics. Existing tree-based likelihood methods, including the multi-type birth-death models originally applied in phylodynamic settings, provide powerful tools, but most assume homogeneous mixing and rarely capture how transmission potential changes as an individual infects more of their contacts. In this work, we develop a likelihood framework that operates directly on transmission trees, in which nodes are individuals and edges are reported transmission events, with no sequence data involved. We derive a likelihood for a stochastic SIR process on a rooted contact tree in which each infected individual is characterised by the total number of effective contacts, and the number of already infected downstream contacts. We obtain closed-form ordinary differential equations for the probability that a clade goes entirely unobserved and for the probability density that it produces an observed (sampled) tip in a given state. The resulting likelihood can be evaluated for a rooted contact tree with known tip states, and we extend it to partially resolved trees by treating internal branching times as latent variables. Validation on simulated outbreaks confirms accurate parameter recovery and well-calibrated uncertainty. Application to empirical COVID-19 contact-tracing data from Karnataka, India, demonstrates the framework's utility for real epidemiological settings. By incorporating contact-degree heterogeneity in a multi-type branching likelihood, our work provides a principled baseline for inferring both transmission dynamics and contact structure from fully or partially resolved transmission trees, complementing rather than relying on sequence-based phylodynamic inference.


[58] 2407.07357

A polarity-aware multi-relational model for the signed interaction prediction in biological networks

Predicting signed interactions in biological networks is crucial for understanding drug mechanisms and facilitating drug repurposing. While deep graph models have demonstrated success in modeling complex biological systems, existing approaches often fail to distinguish between positive and negative interactions, limiting their utility for precise pharmacological predictions. In this study, we propose a novel deep graph model, PAMR (polarity-aware multi-relational model), designed to predict both polar (e.g., activation, inhibition) and non-polar (e.g., binding, affect) chemical-gene interactions. Our model integrates graph convolutional networks with tensor decomposition to enhance feature representation and incorporates a conflict-aware sampling strategy to resolve polarity ambiguities. We introduce new evaluation metrics, polarity discrimination score (PDS) and CP@100, to assess the model's ability to differentiate interaction types. Experimental results demonstrate that PAMR outperforms baseline models, achieving superior classification accuracy and improved discrimination of polar edges. Specifically, PAMR-CL attains a Macro AUROC of 0.9072 and CP@100 of 0.974, surpassing RGCN, GraphSAGE, TransE, and BioNet baselines. A case study on nicotine further identifies two novel chemical-gene suppression links, S100A6 and SPP1, that are corroborated by independent experimental literature. Furthermore, we analyze the impact of subgraph components on predictive performance, revealing that additional network structures do not always enhance accuracy. These findings highlight the importance of polarity-aware modeling in drug discovery and network pharmacology, providing a scalable computational framework for polarity-aware chemical-gene interaction prediction and network pharmacology analysis.


[59] 2507.07907

A statistical physics framework for optimal learning

Learning is a complex dynamical process shaped by a range of interconnected decisions. Careful design of hyperparameter schedules for artificial neural networks or efficient allocation of cognitive resources by biological learners can dramatically affect performance. Yet, theoretical understanding of optimal learning strategies remains sparse, especially due to the intricate interplay between evolving metaparameters and nonlinear learning dynamics. The search for optimal protocols is further hindered by the high dimensionality of the learning space, often resulting in predominantly heuristic, difficult to interpret, and computationally demanding solutions. Here, we combine statistical physics with control theory in a unified theoretical framework to identify optimal learning protocols in prototypical neural network models. In the high-dimensional limit, we derive closed-form ordinary differential equations that track online stochastic gradient descent through low-dimensional order parameters. We formulate the design of learning protocols as an optimal control problem directly on the dynamics of the order parameters with the goal of minimizing the generalization error. This formulation encompasses a variety of learning scenarios, optimization constraints, and control budgets. We apply it to representative cases, including optimal curricula, adaptive dropout regularization and noise schedules in denoising autoencoders. We find nontrivial yet interpretable strategies highlighting how optimal protocols mediate learning trade-offs. Our results establish a principled foundation for understanding and designing optimal protocols and suggest a path toward a theory of meta-learning grounded in statistical physics.


[60] 2509.23195

Structure leads and dominates comprehension in naturalistic reading

The hierarchical account and statistical or sequential account have long been framed as rival theories in explaining online comprehension. A lot of evidence has shown that both hierarchical and non-hierarchical factors can shape comprehension and the open question is no longer whether hierarchy contributes, but when and how strongly it does. We addressed the question with co-registered EEG and eye-tracking, treating syntactic depth as the variable for operationalizing hierarchical structure. On timing, hierarchical structure influenced reading before the eyes fixated a word: its neural effect emerged as early as 108 ms before fixation onset, over right-central regions, and the scanpath showed an anticipatory bias toward structurally central words. Both the transitional-probability analysis and the regression on fixation-related potentials supported this pre-fixational timing. In the transitional-probability analysis, readers preferentially moved between syntactically central words rather than following serial word order, showing that scanpaths are organized by syntactic depth rather than by linear adjacency. On strength, Bayesian network modeling showed that syntactic depth was the strongest predictor of departures from linear, word-by-word reading, outweighing lexical familiarity and surprisal. Taken together, the results indicate that hierarchical structure anticipatorily guides online comprehension at both the behavioral and neural levels, and dominates the reading path relative to statistical features.


[61] 2512.21988

Region-Specific Calibration Achieves Excellent Inter-Device Reliability for Smartphone Dermatology: A Multi-Device Benchmark on Korean Facial Skin

Background: Smartphone-based dermatology requires inter-device colorimetric reliability that holds across calibration regimes, yet quantitative multi-device benchmarks remain scarce. Materials and Methods: We analyzed matched facial images from 965 Korean subjects captured by a digital single-lens reflex (DSLR) camera, a consumer tablet, and a consumer smartphone, and evaluated two calibration methods against the DSLR reference. The methods are standard global linear Color Correction Matrix (CCM) normalization and region-specific CCM trained per anatomical region, both applied in Commission Internationale de l'Eclairage L*a*b* (CIELAB) space. Results: Linear CCM reduced inter-device color differences by 61-74% and placed both Melanin Index (MI, intraclass correlation coefficient [ICC] = 0.80) and Individual Typology Angle (ITA, ICC = 0.78) in the good reliability band. Region-specific CCM raised both indices into the excellent reliability band (MI ICC = 0.95, ITA ICC = 0.93), with anatomical region exceeding the source device as the largest pre-calibration variance contributor (analysis-of-variance $\eta^2 = 0.18$ versus 0.12). Conclusion: Consumer-device skin colorimetry therefore achieves clinically useful inter-device reliability using standard calibration, with region-aware calibration the largest remaining source of improvement.
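To make the calibration step concrete, the sketch below applies a 3x3 linear CCM to a CIELAB triple and computes the Individual Typology Angle using its standard definition, $\mathrm{ITA} = \arctan\big((L^* - 50)/b^*\big) \cdot 180/\pi$. The region names and matrix entries are hypothetical placeholders; in practice each matrix would be fitted against the DSLR reference, and the abstract does not describe that fitting procedure.

```python
import math


def apply_ccm(lab, ccm):
    """Apply a 3x3 linear color correction matrix to one (L*, a*, b*) triple."""
    return tuple(sum(ccm[i][k] * lab[k] for k in range(3)) for i in range(3))


def ita_degrees(lab):
    """Individual Typology Angle in degrees.

    atan2(L* - 50, b*) equals arctan((L* - 50) / b*) whenever b* > 0.
    """
    lightness, _, b_star = lab
    return math.degrees(math.atan2(lightness - 50.0, b_star))


# Hypothetical per-region matrices (illustrative values only).
REGION_CCM = {
    "cheek": [[1.02, 0.0, 0.0], [0.0, 0.97, 0.0], [0.0, 0.0, 1.05]],
    "forehead": [[0.99, 0.0, 0.0], [0.0, 1.01, 0.0], [0.0, 0.0, 0.96]],
}


def calibrated_ita(lab, region):
    """Correct a device measurement with its region's CCM, then compute ITA."""
    return ita_degrees(apply_ccm(lab, REGION_CCM[region]))
```

A global CCM corresponds to using the same matrix for every region; the region-specific variant only changes which matrix is looked up.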


[62] 2604.11915

Can AI Detect Life? Lessons from Artificial Life

Modern machine learning methods have been proposed to detect life in extraterrestrial samples, drawing on their ability to distinguish biotic from abiotic samples based on training models using natural and synthetic organic molecular mixtures. Here we show using Artificial Life that such methods are easily fooled into detecting life with near 100% confidence even if the analyzed sample is not capable of life. This is due to modern machine learning methods' propensity to be easily fooled by out-of-distribution samples. Because extraterrestrial samples are very likely out of the distribution provided by terrestrial biotic and abiotic samples, using AI methods for life detection is likely to yield significant false positives.


[63] 2604.12683

Brain-DiT: A Universal Multi-state fMRI Foundation Model with Metadata-Conditioned Pretraining

Current fMRI foundation models primarily rely on a limited range of brain states and mismatched pretraining tasks, restricting their ability to learn generalized representations across diverse brain states. We present Brain-DiT, a universal multi-state fMRI foundation model pretrained on 349,898 sessions from 24 datasets spanning resting, task, naturalistic, disease, and sleep states. Unlike prior fMRI foundation models that rely on masked reconstruction in the raw-signal space or a latent space, Brain-DiT adopts metadata-conditioned diffusion pretraining with a Diffusion Transformer (DiT), enabling the model to learn multi-scale representations that capture both fine-grained functional structure and global semantics. Across extensive evaluations and ablations on 7 downstream tasks, we find consistent evidence that diffusion-based generative pretraining is a stronger proxy than reconstruction or alignment, with metadata-conditioned pretraining further improving downstream performance by disentangling intrinsic neural dynamics from population-level variability. We also observe that downstream tasks exhibit distinct preferences for representational scale: ADNI classification benefits more from global semantic representations, whereas age/sex prediction comparatively relies more on fine-grained local structure.


[64] 2604.20824

Stabilizing In-Context Multi-Source Domain Adaptation for Biomedical Images Through Controls

Biomedical imaging data presents enormous potential for deep learning models to predict invaluable properties, such as diseases and drug effects. However, unavoidable alterations of the technical conditions cause batch effects: variations between groups of samples that are not due to any biological signal of interest. Batch effects greatly hinder the generalization abilities of deep learning models, preventing their practical use in the real world. Unsupervised Domain Adaptation (UDA) methods have been proposed to mitigate batch effects, but they usually assume that the data is comprised of only one source domain and one target domain, whereas biological datasets are comprised of multiple domains, both at training and at inference time. While Batch Normalization-based test-time and meta-learning adaptation methods offer a promising mechanism for domain alignment, we show that existing approaches exhibit degraded performance under the usual inference scenarios of small target batch sizes and label shift. We address these limitations by leveraging negative control samples, which are consistently present in every experimental batch in biological datasets, as stable context for adaptation. We propose CS-ARM-BN, a meta-learning BN adaptation method that uses controls both during training and inference to stabilize domain statistics. We perform a suite of experiments of Mechanism-Of-Action (MoA) classification, a crucial task for drug discovery, on the large JUMP-CP imaging dataset. Our experiments show that CS-ARM-BN substantially improves robustness to batch size and class distribution shifts, enabling practical use of deep learning models for biomedical images.
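The core idea of using negative controls as a stable context can be sketched as follows: normalization statistics are estimated from the control samples of each experimental batch and then applied to every sample in that batch. This is only the control-anchored normalization step under simplifying assumptions (plain per-feature mean and variance, no learned parameters, no meta-learning); it is not an implementation of CS-ARM-BN, and the function names are hypothetical.

```python
import math


def control_stats(controls, eps=1e-5):
    """Per-feature mean and standard deviation estimated from control samples."""
    n = len(controls)
    dim = len(controls[0])
    means = [sum(x[j] for x in controls) / n for j in range(dim)]
    stds = [
        math.sqrt(sum((x[j] - means[j]) ** 2 for x in controls) / n + eps)
        for j in range(dim)
    ]
    return means, stds


def normalize_with_controls(batch, controls, eps=1e-5):
    """Normalize every sample in a batch with statistics from its controls.

    Because the statistics do not depend on the treated samples, they are
    unaffected by how many treated samples there are or which classes they
    belong to.
    """
    means, stds = control_stats(controls, eps)
    return [[(x[j] - means[j]) / stds[j] for j in range(len(x))] for x in batch]
```

Two batches that differ only by an additive offset map to the same normalized features, since the offset shifts the controls and the treated samples alike.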


[65] 2605.10310

Positive Alignment: Artificial Intelligence for Human Flourishing

Existing alignment research is dominated by concerns about safety and preventing harm: safeguards, controllability, and compliance. This paradigm of alignment parallels early psychology's focus on mental illness: necessary but incomplete. What we call Positive Alignment is the development of AI systems that (i) actively support human and ecological flourishing in a pluralistic, polycentric, context-sensitive, and user-authored way while (ii) remaining safe and cooperative. It is a distinct and necessary agenda within AI alignment research. We argue that several existing failures of alignment (e.g., engagement hacking, loss of human autonomy, failures in truth-seeking, low epistemic humility, error correction, lack of diverse viewpoints, and being primarily reactive rather than proactive) may be better addressed through positive alignment, including cultivating virtues and maximizing human flourishing. We highlight a range of challenges, open questions, and technical directions (e.g., data filtering and upsampling, pre- and post-training, evaluations, collaborative value collection) for different phases of the LLM and agents lifecycle. We end with design principles for promoting disagreement and decentralization through contextual grounding, community customization, continual adaptation, and polycentric governance; that is, many legitimate centers of oversight rather than one institutional or moral chokepoint.