Bayesian inference provides a principled framework for understanding brain function, while neural activity in the brain is inherently spike-based. This paper bridges these two perspectives by designing spiking neural networks that simulate Bayesian inference through message passing with Bernoulli messages. To train the networks, we employ spike-timing-dependent plasticity, a biologically plausible synaptic learning mechanism based on the Hebbian rule. Our results demonstrate that the network's performance closely matches the true numerical solution. We further demonstrate the versatility of our approach by implementing a factor graph example from coding theory, illustrating signal transmission over an unreliable channel.
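As a hedged illustration of the numerical baseline such a spiking network would approximate, the sketch below runs sum-product message passing for a single Bernoulli variable on a two-node factor graph; the prior and likelihood values are placeholders, not taken from the paper.

```python
import numpy as np

# Minimal sum-product reference for Bernoulli messages on a two-node factor
# graph X -- f -- Y, the kind of numerical baseline the spiking network is
# compared against. The prior and likelihood values are placeholders.

prior_x = np.array([0.3, 0.7])           # p(X=0), p(X=1)
lik = np.array([[0.9, 0.1],              # p(Y | X): rows X=0/1, columns Y=0/1
                [0.2, 0.8]])
y_obs = 1                                # observed value of Y

msg_f_to_x = lik[:, y_obs]               # factor-to-variable Bernoulli message
belief_x = prior_x * msg_f_to_x          # combine with prior and renormalize
belief_x /= belief_x.sum()

print("posterior p(X | Y=1):", belief_x.round(4))
```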
Building multiscale biological models requires integrating independently developed submodels, which involves sharing variables and coordinating execution. Most existing tools focus on isolated mechanisms and numerical methods, but rarely specify model interfaces: which variables are read or written, how they are translated, or how updates are synchronized. We present Process Bigraphs, a framework for composing and simulating multiscale biological models. Process Bigraphs generalize architectural principles from the Vivarium software into a shared specification that defines process interfaces, hierarchical data structures, composition patterns, and orchestration patterns. The paper describes the organization of the framework and explains how it improves model clarity, reuse, and extensibility; formal definitions are provided in the Supplementary Materials. We introduce Vivarium 2.0 as an open-source implementation of the Process Bigraph framework and demonstrate its utility with Spatio-Flux, a standalone library for microbial ecosystem simulations that integrate kinetic ODEs, dynamic flux balance analysis, and spatial processes. We conclude by discussing implications for emerging standards in multiscale modeling. Availability and implementation: Vivarium 2.0 is an open-source suite of libraries including: (1) bigraph-schema for hierarchical, JSON-based data typing; (2) process-bigraph for defining process interfaces and executing composite simulations; (3) bigraph-viz for interactive visualization of system structure and data flow; and (4) spatio-flux, the reference application used in this work. Detailed descriptions are provided in the Supplementary Materials. All software is available at this https URL
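To make the notion of a process interface concrete, here is a deliberately generic Python sketch of two processes that declare inputs/outputs and are stepped by a tiny orchestrator sharing state between them. It illustrates the composition idea only and does not use the actual process-bigraph or Vivarium 2.0 API; all class and variable names are hypothetical.

```python
# Generic, library-agnostic sketch of a "process" with a declared interface
# (inputs/outputs plus an update rule) and a tiny orchestrator that shares
# state between two processes. Purely illustrative; NOT the process-bigraph
# or Vivarium 2.0 API.

class Process:
    inputs: dict = {}      # variable name -> declared type (illustrative)
    outputs: dict = {}

    def update(self, state: dict, dt: float) -> dict:
        raise NotImplementedError

class Growth(Process):
    inputs = {"biomass": "float"}
    outputs = {"biomass": "float"}

    def update(self, state, dt):
        return {"biomass": state["biomass"] * 0.1 * dt}    # additive delta

class Dilution(Process):
    inputs = {"biomass": "float"}
    outputs = {"biomass": "float"}

    def update(self, state, dt):
        return {"biomass": -state["biomass"] * 0.05 * dt}

def run(processes, state, dt, steps):
    """Each step, every process reads the shared state and returns a delta;
    deltas are summed and applied synchronously."""
    for _ in range(steps):
        deltas = [p.update({k: state[k] for k in p.inputs}, dt) for p in processes]
        for d in deltas:
            for k, v in d.items():
                state[k] += v
    return state

print(run([Growth(), Dilution()], {"biomass": 1.0}, dt=1.0, steps=10))
```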
Membrane particles such as proteins and lipids organize into zones that perform unique functions. Here, I introduce a topological and category-theoretic framework to represent particle and zone intra-scale interactions and inter-scale coupling. This involves carefully demarcating between different presheaf- or sheaf-assigned data levels to preserve functorial structure and account for particle and zone generalized poses. The framework can accommodate Hamiltonian mechanics, enabling dynamical modeling. This amounts to a versatile mathematical formalism for membrane structure and multiscale coupling.
Measurements of cell size dynamics have established the adder principle as a robust mechanism of cell size homeostasis. In this framework, cells add a nearly constant amount of size during each cell cycle, independent of their size at birth. Theoretical studies have shown that the adder principle can be achieved when cell-cycle progression is coupled to cell size. Here, we extend this framework by considering a general growth law modeled as a Hill-type function of cell size. This assumption introduces growth saturation to the model, such that very large cells grow approximately linearly rather than exponentially. Additionally, to capture the sequential nature of division, we implement a stochastic multi-step adder model in which cells progress through internal regulatory stages before dividing. From this model, we derive exact analytical expressions for the moments of cell size distributions. Our results show that stronger growth saturation increases the mean cell size in steady state, while slightly reducing fluctuations compared to exponential growth. Importantly, despite these changes, the adder property is preserved. This emphasizes that the reduction in size variability is a consequence of the growth law rather than simple scaling with mean size. Finally, we analyze stochastic clonal proliferation and find that growth saturation influences both single-cell size statistics and variability across populations. Our results provide a generalized framework for connecting multi-step adder mechanisms with proliferation dynamics, extending size control theory beyond exponential growth.
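A minimal stochastic sketch of this kind of model, under the assumption that division stages fire at a rate proportional to the instantaneous growth rate (one common way to obtain the adder property); the Hill-type saturating growth law and all parameter values below are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal stochastic sketch (not the paper's exact model): Hill-type saturating
# growth dv/dt = mu * v * K / (K + v), so small cells grow roughly exponentially
# and large cells roughly linearly, combined with an M-step adder in which each
# stage fires at a rate proportional to the instantaneous growth rate, so that
# on average a fixed size increment M/k is added per cycle. Values are illustrative.

mu, K, M, k = 1.0, 3.0, 10, 5.0      # max growth rate, saturation size, stages, stage rate
dt, n_cycles = 1e-3, 500

def growth_rate(v):
    return mu * v * K / (K + v)

v = 1.0
birth_sizes, added = [], []
for _ in range(n_cycles):
    v_birth, stage = v, 0
    while stage < M:
        g = growth_rate(v)
        v += g * dt
        if rng.random() < k * g * dt:     # stage transition rate ~ growth rate
            stage += 1
    added.append(v - v_birth)
    v /= 2.0                              # symmetric division
    birth_sizes.append(v)

print("mean birth size          :", np.round(np.mean(birth_sizes), 3))
print("mean added size per cycle:", np.round(np.mean(added), 3))
print("CV of added size         :", np.round(np.std(added) / np.mean(added), 3))
```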
Polymerase chain reaction (PCR) is fundamental to molecular biology, yet conventional thermocyclers pose significant challenges for emerging applications such as DNA data storage, where full automation, contamination control, and cost-effectiveness are critical. Here, we introduce a disruptive approach that revisits the original water bath-based PCR method and integrates it with modern robotic liquid-handling technology. Our system performs amplification entirely within sealed pipette tips using automated immersion and withdrawal in a single temperature-controlled oil bath, eliminating the need for sophisticated thermal management while enabling precise temperature control across denaturation, annealing, and extension steps. We demonstrate that this approach achieves amplification efficiency and sequencing fidelity comparable to high-performance thermocyclers when applied to DNA-encoded datasets. The platform minimizes reagent waste, reduces contamination risks through complete tip isolation, and enables full sample recovery. This modular, automation-ready design provides a scalable and cost-effective solution for PCR workflows in DNA data storage, high-throughput diagnostics, and distributed laboratory settings.
We present SeedProteo, a diffusion-based model for de novo all-atom protein design. We demonstrate how to repurpose a cutting-edge folding architecture into a powerful generative design framework by effectively integrating self-conditioning features. Extensive benchmarks highlight the model's capabilities across two distinct tasks: in unconditional generation, SeedProteo exhibits superior length generalization and structural diversity, maintaining robustness for long sequences and complex topologies; in binder design, it achieves state-of-the-art performance among open-source methods, attaining the highest in-silico design success rates, structural diversity and novelty.
Highly accurate biomolecular structure prediction is a key component of developing biomolecular foundation models, and one of the most critical aspects of building foundation models is identifying the recipes for scaling the model. In this work, we present SeedFold, a folding model that successfully scales up the model capacity. Our contributions are threefold: first, we identify an effective width-scaling strategy for the Pairformer to increase representation capacity; second, we introduce a novel linear triangular attention that reduces computational complexity to enable efficient scaling; finally, we construct a large-scale distillation dataset to substantially enlarge the training set. Experiments on FoldBench show that SeedFold outperforms AlphaFold3 on most protein-related tasks.
We develop an extended Dynamical Mean Field Theory framework to analyze gene regulatory networks (GRNs) incorporating epigenetic modifications. Building on the Hopfield network model analogy to spin glass systems, our approach introduces dynamic terms representing DNA methylation and histone modification to capture their regulatory influence on gene expression. The resulting formulation reduces high-dimensional GRN dynamics to effective stochastic equations, enabling the characterization of both stable and oscillatory states in epigenetically regulated systems. This framework provides a tractable and quantitative method for linking gene regulatory dynamics with epigenetic control, offering new theoretical insights into developmental processes and cell fate decisions.
Running is a fundamental form of human locomotion and a key task for evaluating neuromuscular control and lower-limb coordination. In recent years, muscle synergy analysis based on surface electromyography (sEMG) has become an important approach in this area. This review focuses on muscle synergies during running, outlining core neural control theories and biomechanical optimization hypotheses, summarizing commonly used decomposition methods (e.g., PCA, ICA, FA, NMF) and emerging autoencoder-based approaches. We synthesize findings on the development and evolution of running-related synergies across the lifespan, examine how running surface, speed, foot-strike pattern, fatigue, and performance level modulate synergy patterns, and describe characteristic alterations in populations with knee osteoarthritis, patellofemoral pain, and stroke. Current evidence suggests that the number and basic structure of lower-limb synergies during running are relatively stable, whereas spatial muscle weightings and motor primitives are highly plastic and sensitive to task demands, fatigue, and pathology. However, substantial methodological variability remains in EMG channel selection, preprocessing pipelines, and decomposition algorithms, and direct neurophysiological validation and translational application are still limited. Future work should prioritize standardized processing protocols, integration of multi-source neuromusculoskeletal data, nonlinear modeling, and longitudinal intervention studies to better exploit muscle synergy analysis in sports biomechanics, athletic training, and rehabilitation medicine.
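As a concrete example of the decomposition step discussed above, the sketch below applies non-negative matrix factorization (NMF) to a synthetic sEMG envelope matrix and reports variance accounted for; channel counts, synergy number, and signals are placeholders rather than real running data.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(1)

# Synthetic sEMG envelope matrix: 8 muscles x 500 time samples, built from
# 3 ground-truth synergies (muscle weightings W_true, activation primitives H_true).
n_muscles, n_samples, n_syn = 8, 500, 3
W_true = rng.random((n_muscles, n_syn))
t = np.linspace(0, 1, n_samples)
H_true = np.vstack([np.exp(-((t - c) ** 2) / 0.01) for c in (0.2, 0.5, 0.8)])
emg = W_true @ H_true + 0.01 * rng.random((n_muscles, n_samples))

# NMF decomposition: W holds muscle weightings, H holds motor primitives.
model = NMF(n_components=n_syn, init="nndsvd", max_iter=500)
W = model.fit_transform(emg)
H = model.components_

# Variance accounted for (VAF), a common criterion for choosing the synergy number.
vaf = 1 - np.sum((emg - W @ H) ** 2) / np.sum(emg ** 2)
print(f"VAF with {n_syn} synergies: {vaf:.3f}")
```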
The analysis of the interaction matrix between two distinct sets is essential across diverse fields, from pharmacovigilance to transcriptomics. Not all interactions are equally informative: a marker gene associated with a few specific biological processes is more informative than a highly expressed non-specific gene associated with most observed processes. Identifying these interactions is challenging due to background connections. Furthermore, data heterogeneity across sources precludes universal identification criteria. To address this challenge, we introduce \textsf{this http URL}, a method for identifying specificity by detecting structural breaks in entity interactions. Rank-based representation of the interaction matrix ensures invariance to heterogeneous data and allows for integrating data from diverse sources. To automatically locate the boundary between specific interactions and background activity, we employ model fitting. We demonstrate the applicability of \textsf{this http URL} on the GSE112026 dataset, transcriptional data from head and neck cancer. A computationally efficient \textsf{R} implementation is available at this https URL.
The perfect phylogeny mixture (PPM) model is useful due to its simplicity and applicability in scenarios where mutations can be assumed to accumulate monotonically over time. It is the underlying model in many tools that have been used, for example, to infer phylogenetic trees for tumor evolution and reconstruction. Unfortunately, the PPM model gives rise to substantial ambiguity -- in that many different phylogenetic trees can explain the same observed data -- even in the idealized setting where data are observed perfectly, i.e., fully and without noise. This ambiguity has been studied in this perfect setting by Pradhan et al. 2018, who proposed a procedure to bound the number of solutions given a fixed instance of observation data. Beyond this, studies have been primarily empirical. Recent work (Myers et al. 2019) proposed adding extra constraints to the PPM model to tackle ambiguity. In this paper, we first show that the extra constraints of Myers et al. 2019, called longitudinal constraints (LC), often fail to reduce the number of distinct trees that explain the observations. We then propose novel alternative constraints to limit solution ambiguity and study their impact when the data are observed perfectly. Unlike the analysis in Pradhan et al. 2018, our theoretical results regarding both the inefficacy of the LC and the extent to which our new constraints reduce ambiguity are not tied to a single observation instance. Rather, our theorems hold over large ensembles of possible inference problems. To the best of our knowledge, we are the first to study degeneracy in the PPM model in this ensemble-based theoretical framework.
Biological systems encode function not primarily in steady states, but in the structure of transient responses elicited by time-varying stimuli. Overshoots, biphasic dynamics, adaptation kinetics, fold-change detection, entrainment, and cumulative exposure effects often determine phenotypic outcomes, yet are poorly captured by classical steady-state or dose-response analyses. This paper develops an input-output perspective on such "dynamic phenotypes," emphasizing how qualitative features of transient behavior constrain underlying network architectures independently of detailed parameter values. A central theme is the role of sign structure and interconnection logic, particularly the contrast between monotone systems and architectures containing antagonistic pathways. We show how incoherent feedforward (IFF) motifs provide a simple and recurrent mechanism for generating non-monotonic and adaptive responses across multiple levels of biological organization, from molecular signaling to immune regulation and population dynamics. Conversely, monotonicity imposes sharp impossibility results that can be used to falsify entire classes of models from transient data alone. Beyond step inputs, we highlight how periodic forcing, ramps, and integral-type readouts such as cumulative dose responses offer powerful experimental probes that reveal otherwise hidden structure, separate competing motifs, and expose invariances such as fold-change detection. Throughout, we illustrate how control-theoretic concepts, including monotonicity, equivariance, and input-output analysis, can be used not as engineering metaphors, but as precise mathematical tools for biological model discrimination. Thus we argue for a shift in emphasis from asymptotic behavior to transient and input-driven dynamics as a primary lens for understanding, testing, and reverse-engineering biological networks.
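To illustrate how an incoherent feedforward motif produces a transient, adapting response of the kind described, here is a toy "sniffer"-style model (not a model from the paper): a step in the input S drives a transient overshoot in the output R that returns to an input-independent steady state. Parameter values are illustrative.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Minimal incoherent feedforward ("sniffer") toy model: the input S activates
# both the output R and a repressor X, and X removes R. A step in S produces a
# transient overshoot in R that adapts back to a steady state independent of S.

k1, k2, k3, k4 = 2.0, 2.0, 1.0, 1.0

def step_input(t):
    return 1.0 if t < 20 else 4.0     # 4-fold step at t = 20

def rhs(t, y):
    r, x = y
    s = step_input(t)
    return [k1 * s - k2 * x * r,      # output: produced by S, removed by X
            k3 * s - k4 * x]          # repressor: tracks S more slowly

sol = solve_ivp(rhs, (0, 60), [1.0, 1.0], max_step=0.05)
r = sol.y[0]
print("pre-step steady state R :", r[np.searchsorted(sol.t, 19.9)].round(3))
print("peak R after step       :", r.max().round(3))
print("post-step steady state R:", r[-1].round(3))   # adapts back toward k1*k4/(k2*k3)
```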
Sequential structure is a key feature of multiple domains of natural cognition and behavior, such as language, movement and decision-making. Likewise, it is also a central property of tasks to which we would like to apply artificial intelligence. It is therefore of great importance to develop frameworks that allow us to evaluate sequence learning and processing in a domain-agnostic fashion, whilst simultaneously providing a link to formal theories of computation and computability. To address this need, we introduce two complementary software tools: SymSeq, designed to rigorously generate and analyze structured symbolic sequences, and SeqBench, a comprehensive benchmark suite of rule-based sequence processing tasks to evaluate the performance of artificial learning systems in cognitively relevant domains. In combination, as SymSeqBench, they offer versatility in investigating sequential structure across diverse knowledge domains, including experimental psycholinguistics, cognitive psychology, behavioral analysis, neuromorphic computing and artificial intelligence. Due to its basis in Formal Language Theory (FLT), SymSeqBench provides researchers in multiple domains with a convenient and practical way to apply the concepts of FLT to conceptualize and standardize their experiments, thus advancing our understanding of cognition and behavior through shared computational frameworks and formalisms. The tool is modular, openly available and accessible to the research community.
We study the joint dynamics of membrane potential and time since the last spike in a population of integrate-and-fire neurons using a population density framework. This leads to a two-dimensional Fokker-Planck equation that captures the evolution of the full neuronal state, along with a one-dimensional hierarchy of equations for the moments of the inter-spike interval (ISI). The formalism allows us to characterize the time-dependent ISI distribution, even when the population is far from stationarity, such as under time-varying external input or during network oscillations. By performing a perturbative expansion around the stationary state, we also derive an analytic expression for the linear response of the ISI distribution to weak input modulations.
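A Monte Carlo counterpart of this population-density description, assuming a standard leaky integrate-and-fire neuron driven by white noise: many neurons are simulated while tracking membrane potential and time since the last spike, and the ISI distribution is estimated empirically. Units and parameters are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

# Monte Carlo counterpart of the population-density description: many leaky
# integrate-and-fire neurons driven by white noise are simulated while tracking
# the membrane potential and the time since the last spike, and the ISI
# distribution is estimated empirically. Units and parameters are illustrative.

n, dt, T = 5000, 1e-3, 50.0          # neurons, time step, total simulated time
tau, mu, sigma = 1.0, 1.2, 0.5       # membrane time constant, mean drive, noise
v_th, v_reset = 1.0, 0.0

v = np.zeros(n)
last_spike = np.zeros(n)             # first "interval" is measured from t = 0
isis = []

t = 0.0
while t < T:
    v += (-(v - mu) / tau) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n)
    spiking = v >= v_th
    isis.extend((t - last_spike[spiking]).tolist())
    last_spike[spiking] = t
    v[spiking] = v_reset
    t += dt

isis = np.array(isis)
print("mean ISI:", isis.mean().round(3), " CV:", (isis.std() / isis.mean()).round(3))
hist, edges = np.histogram(isis, bins=50, density=True)   # empirical ISI density
print("ISI density peaks near:", edges[hist.argmax()].round(3))
```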
High-resolution voxel-based micro-finite element ($\mu$FE) models derived from $\mu$CT imaging enable detailed investigation of bone mechanics but remain computationally challenging at anatomically relevant scales. This study presents a comprehensive $\mu$FE framework for large-scale biomechanical analysis of an intact New Zealand White (NZW) rabbit femur, integrating advanced segmentation, scalable finite element solvers, and experimental validation using predominantly open-source libraries. Bone geometries were segmented from $\mu$CT data using the MIA clustering algorithm and converted into voxel-based $\mu$FE meshes, which were solved using the open-source MFEM library with algorithms designed for large-scale linear elasticity systems. The numerical solutions were verified by comparing with a commercial finite element solver, and by evaluating the performance of full assembly and element-by-element formulations within MFEM. Models containing over $8\times10^{8}$ DOFs were solved using moderate HPC resources, demonstrating the feasibility of anatomically realistic $\mu$FE simulations at this scale. Resolution effects were investigated by comparing models with voxel sizes of 20, 40, and 80 $\mu$m, revealing that 40 $\mu$m preserves boundary displacement and principal strain distributions with minimal bias while significantly reducing computational cost. Sensitivity analyses further showed that segmentation parameters influence the global mechanical response. Finally, $\mu$FE predictions were coupled with Digital Image Correlation measurements on an NZW rabbit femur under compression to calibrate effective bone material properties at the micron scale. The results demonstrate that large-scale, experimentally informed $\mu$FE modeling can be achieved using open-source tools, providing a robust foundation for preclinical assessment of bone mechanics and treatment-related risks.
Neural circuits exhibit structured connectivity, including an overrepresentation of reciprocal connections between neuron pairs. Despite important advances, a full understanding of how such partial symmetry in connectivity shapes neural dynamics remains elusive. Here we ask how correlations between reciprocal connections in a random, recurrent neural network affect phase-space complexity, defined as the exponential proliferation rate (with network size) of the number of fixed points that accompanies the transition to chaotic dynamics. We find a striking pattern: partial anti-symmetry strongly amplifies complexity, while partial symmetry suppresses it. These opposing trends closely track changes in other measures of dynamical behavior, such as dimensionality, Lyapunov exponents, and transient path length, supporting the view that fixed-point structure is a key determinant of network dynamics. Thus, positive reciprocal correlations favor low-dimensional, slowly varying activity, whereas negative correlations promote high-dimensional, rapidly fluctuating chaotic activity. These results yield testable predictions about the link between connection reciprocity, neural dynamics and function.
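A toy numerical sketch of the setting: couplings with a prescribed reciprocal correlation eta are generated, standard rate dynamics are integrated, and the dimensionality of activity is summarized by the participation ratio. This is an illustrative experiment, not the paper's analysis; network size, gain, and simulation length are placeholders.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy illustration: build a random coupling matrix whose reciprocal entries
# J_ij and J_ji have correlation eta (eta > 0: partial symmetry, eta < 0:
# partial anti-symmetry), run rate dynamics dx/dt = -x + J tanh(x), and
# compare the dimensionality of activity via the participation ratio.

def coupling(N, g, eta):
    J = np.zeros((N, N))
    iu = np.triu_indices(N, k=1)
    a, b = rng.standard_normal(len(iu[0])), rng.standard_normal(len(iu[0]))
    J[iu] = a
    J.T[iu] = eta * a + np.sqrt(1 - eta**2) * b   # corr(J_ij, J_ji) = eta
    return g * J / np.sqrt(N)

def participation_ratio(N=400, g=2.0, eta=0.0, T=200.0, dt=0.05):
    J = coupling(N, g, eta)
    x = rng.standard_normal(N)
    samples = []
    for step in range(int(T / dt)):
        x += dt * (-x + J @ np.tanh(x))
        if step * dt > 50.0:                       # discard the initial transient
            samples.append(x.copy())
    ev = np.linalg.eigvalsh(np.cov(np.array(samples).T))
    return ev.sum() ** 2 / max((ev ** 2).sum(), 1e-30)   # guard against a frozen state

for eta in (-0.5, 0.0, 0.5):
    print(f"eta = {eta:+.1f}: participation ratio ~ {participation_ratio(eta=eta):.1f}")
```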
Fourier ptychographic microscopy (FPM) is a promising quantitative phase imaging technique that enables high-resolution, label-free imaging over a large field-of-view. Here, we present the first application of FPM for the quantitative analysis of human brain organoid slices, providing a powerful, cost-effective, and label-free enhancement to the current gold-standard fluorescence microscopy. Brain organoids, prepared as thin (5 micrometer) slices, were imaged with a custom-built FPM system consisting of a standard light microscope (4x, 0.2 NA objective) and a 7x7 LED array. This configuration achieved a synthetic numerical aperture of 0.54 and a spatial resolution of approximately 488 nm across an area of 2.077 x 3.65 mm. Fluorescence microscopy was used in parallel for neurons, astrocytes, and nuclei labeling, providing rich fluorescence imaging. Moreover, we designed an automated method to merge classical resolution fluorescence images to visualize the whole brain organoid and align it with the numerically increased space-bandwidth product FPM image. The provided alignment method enables rich phase-fluorescence correlative imaging. Based on the segmentation performed on the stitched fluorescence images, we devised a quantitative phase analysis revealing a higher mean optical thickness of the nuclei versus astrocytes and neurons. Notably, nuclei located in neurogenic regions consistently exhibited significantly higher phase values (optical path difference) compared to nuclei elsewhere, suggesting cell-type-specific biophysical signatures. The label-free, quantitative, and high-throughput capabilities of the FPM approach demonstrated here make it a powerful and accessible tool for future structural and functional studies of whole-section brain organoid development and disease modeling studies.
This study presents a large-scale predictive modeling framework for logP prediction using 426,850 bioactive compounds rigorously curated from the intersection of three authoritative chemical databases: PubChem, ChEMBL, and eMolecules. We developed a novel computational infrastructure to address the data integration challenge, reducing processing time from a projected 100+ days to 3.2 hours through a byte-offset indexing architecture, a 740-fold improvement. Our comprehensive analysis revealed critical insights into the multivariate nature of lipophilicity: while molecular weight exhibited weak bivariate correlation with logP, SHAP analysis on ensemble models identified it as the single most important predictor globally. We systematically evaluated multiple modeling approaches, discovering that linear models suffered from inherent heteroskedasticity that classical remediation strategies, including weighted least squares and Box-Cox transformation, failed to address. Tree-based ensemble methods, including Random Forest and XGBoost, proved inherently robust to this violation, achieving an R-squared of 0.765 and RMSE of 0.731 logP units on the test set. Furthermore, a stratified modeling strategy, employing specialized models for drug-like molecules (91 percent of the dataset) and extreme cases (9 percent), achieved optimal performance: an RMSE of 0.838 for the drug-like subset and an R-squared of 0.767 for extreme molecules, the highest of all evaluated approaches. These findings provide actionable guidance for molecular design, establish robust baselines for lipophilicity prediction using only 2D descriptors, and demonstrate that well-curated, descriptor-based ensemble models remain competitive with state-of-the-art graph neural network architectures.
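The byte-offset indexing idea can be sketched generically: one pass over a large record file stores the starting byte of each record, after which any record is fetched with a single seek instead of a rescan. The tab-separated format and "CID" identifiers below are hypothetical, not the authors' pipeline.

```python
import os, tempfile

# Generic illustration of byte-offset indexing for joining huge record files
# without repeated linear scans: one pass stores the starting byte of every
# record keyed by its identifier, after which any record is fetched with a
# single seek. The file format and identifiers are hypothetical.

records = "\n".join(f"CID{i}\tC{'C' * (i % 5)}\t{i * 0.1:.2f}" for i in range(1000)) + "\n"

with tempfile.NamedTemporaryFile("w", delete=False, suffix=".tsv") as f:
    f.write(records)
    path = f.name

# Pass 1: build the index (identifier -> byte offset of its line).
index = {}
with open(path, "rb") as fh:
    offset = 0
    for line in fh:
        index[line.split(b"\t", 1)[0].decode()] = offset
        offset += len(line)

# Lookup: seek directly to any record instead of rescanning the file.
def fetch(key):
    with open(path, "rb") as fh:
        fh.seek(index[key])
        return fh.readline().decode().rstrip("\n")

print(fetch("CID742"))
os.remove(path)
```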
Muller's ratchet, in its prototype version, models a haploid, asexual population whose size~$N$ is constant over the generations. Slightly deleterious mutations are acquired along the lineages at a constant rate, and individuals carrying fewer mutations have a selective advantage. In the classical variant, an individual's selective advantage is proportional to the difference between the population average and the individual's mutation load, whereas in the ratchet with {\em tournament selection} only the signs of the differences of the individual mutation loads matter. In a parameter regime which leads to slow clicking (i.e. to a loss of the currently fittest class at a rate $\ll 1/N$) we prove that the rescaled process of click times of the tournament ratchet converges as $N\to \infty$ to a Poisson process. Central ingredients in the proof are a thorough analysis of the metastable behaviour of a two-type Moran model with selection and deleterious mutation (which describes the size of the fittest class up to its extinction time) and a lower estimate on the size of the new fittest class at a click time.
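A schematic Moran-type simulation of the tournament ratchet, with ties broken at random and Poisson mutation at reproduction; the parameters are illustrative and far from the slow-clicking asymptotic regime analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(4)

# Schematic Moran-type simulation of Muller's ratchet with tournament selection:
# at each event two individuals are compared, the one carrying fewer mutations
# reproduces over the other (ties broken at random), and the offspring acquires
# Poisson(lam) new deleterious mutations. A "click" is recorded whenever the
# currently fittest class is lost. Parameters are illustrative; click frequency
# depends strongly on N and lam.

N, lam, n_events = 100, 0.3, 200_000
loads = np.zeros(N, dtype=int)
clicks, best = [], 0

for event in range(n_events):
    i, j = rng.integers(N, size=2)
    if loads[i] == loads[j]:
        winner, loser = (i, j) if rng.random() < 0.5 else (j, i)
    else:
        winner, loser = (i, j) if loads[i] < loads[j] else (j, i)
    loads[loser] = loads[winner] + rng.poisson(lam)   # reproduction with mutation
    fittest = loads.min()
    if fittest > best:                                 # fittest class lost: a click
        clicks.append(event / N)                       # time in Moran generations
        best = fittest

print("number of clicks:", len(clicks))
if len(clicks) > 1:
    print("mean inter-click time (generations):", np.round(np.mean(np.diff(clicks)), 1))
```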
We use large language models (LLMs) to uncover long-ranged structure in English texts from a variety of sources. The conditional entropy or code length in many cases continues to decrease with context length at least to $N\sim 10^4$ characters, implying that there are direct dependencies or interactions across these distances. A corollary is that there are small but significant correlations between characters at these separations, as we show from the data independent of models. The distribution of code lengths reveals an emergent certainty about an increasing fraction of characters at large $N$. Over the course of model training, we observe different dynamics at long and short context lengths, suggesting that long-ranged structure is learned only gradually. Our results constrain efforts to build statistical physics models of LLMs or language itself.
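The model-independent character correlations mentioned here can be estimated directly from data; the sketch below computes lagged mutual information between characters at separation d using plug-in counts. The repeated-sentence "corpus" is only a placeholder (and is trivially periodic); a long natural-language text would be substituted to probe genuine long-range structure.

```python
import math
from collections import Counter

# Model-free estimate of character-character dependence at separation d:
# the mutual information I(x_t ; x_{t+d}) from plug-in pair counts.

text = ("the quick brown fox jumps over the lazy dog " * 2000).lower()  # placeholder corpus

def lagged_mi(text, d):
    pairs = Counter(zip(text, text[d:]))
    singles = Counter(text)
    n_pairs, n = sum(pairs.values()), len(text)
    mi = 0.0
    for (a, b), c in pairs.items():
        p_ab = c / n_pairs
        mi += p_ab * math.log2(p_ab / ((singles[a] / n) * (singles[b] / n)))
    return mi

for d in (1, 10, 100, 1000):
    print(f"d = {d:4d}: I(x_t ; x_t+d) ~ {lagged_mi(text, d):.4f} bits")
```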
The self-simulational theory of temporal extension describes an information-theoretically formalized mechanism by which the width of subjective temporality emerges from the architecture of self-modelling. In this paper, the perspective of the free energy principle will be assumed to cast the emergence of subjective temporal extension from first principles of the physics of self-organization and to formalize subjective temporal extension using information geometry. Using active inference, a deep parametric generative model of temporal inference is simulated, which realizes the described dynamics on a computational level. Two variations of time-perception naturally emerge from the simulated computational model. This concerns the intentional binding effect (i.e., the compression of the temporal interval between voluntarily initiated actions and subsequent sensory consequences) and empirically documented alterations of subjective time experience in deep states of meditative absorption (i.e., in minimal phenomenal experience). Generally, numerous systematic and domain-specific alterations of subjective temporal experience are computationally explained in a unified manner, as enabled by integration with current active inference accounts mapping onto the respective domains. This concerns, next to attentional and central tendency effects, the temporality-modulating role of valence, impulsivity, boredom, flow-states, near-death experiences, and various psychopathologies, amongst others. The self-simulational theory of temporal extension, from the perspective of the free energy principle, explains how the width of the subjective temporal moment emerges and varies from first principles, accounting for why sometimes subjective time seems to fly, and sometimes moments feel like eternities, with the computational mechanism being readily deployable synthetically.
We consider a multitype Galton-Watson process that allows for the mutation and reversion of individual types in discrete and continuous time. In this setting, we explicitly compute the time evolution of quantities such as the mean and distributions of different types. This allows us in particular to estimate the proportions of different types in the long run, as well as the distribution of the first time of occurrence of a given type as the tree size or time increases. Our approach relies on the recursive computation of the joint distribution of types conditional on the value of the total progeny. In comparison with the literature on related multitype models, we do not rely on approximations.
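A small numerical sketch of a two-type branching process with mutation and reversion, assuming Poisson offspring numbers: the mean matrix gives the exact expected type counts over generations, and simulation estimates the distribution of the first generation in which the mutant type appears. All rates are illustrative, and the construction is not the paper's exact recursion.

```python
import numpy as np

rng = np.random.default_rng(5)

# Two-type Galton-Watson sketch with mutation (1 -> 2 w.p. u) and reversion
# (2 -> 1 w.p. v), Poisson(m) offspring per individual. Illustrative parameters.

m, u, v, T = 1.5, 0.02, 0.05, 15
M = m * np.array([[1 - u, u],
                  [v, 1 - v]])           # mean offspring matrix

mean = np.array([1.0, 0.0])              # start from one type-1 individual
for _ in range(T):
    mean = mean @ M                       # exact expected counts per generation
print("expected counts after", T, "generations:", mean.round(2))

def first_type2_generation(max_gen=50):
    pop = np.array([1, 0])
    for g in range(1, max_gen + 1):
        kids1 = rng.poisson(m, size=pop[0]).sum()   # children of type-1 parents
        kids2 = rng.poisson(m, size=pop[1]).sum()   # children of type-2 parents
        mut = rng.binomial(kids1, u)                 # type-1 children that mutate
        rev = rng.binomial(kids2, v)                 # type-2 children that revert
        pop = np.array([kids1 - mut + rev, kids2 - rev + mut])
        if pop[1] > 0:
            return g
        if pop.sum() == 0:
            return None                              # lineage went extinct first
    return None

samples = [first_type2_generation() for _ in range(2000)]
hits = [s for s in samples if s is not None]
print("P(type 2 appears within 50 generations) ~", round(len(hits) / len(samples), 3))
print("mean first-occurrence generation        ~", round(float(np.mean(hits)), 2))
```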
Photon Absorption Remote Sensing (PARS) enables label-free imaging of subcellular morphology by observing biomolecule-specific absorption interactions. Coupled with deep learning, PARS produces label-free virtual Hematoxylin and Eosin (H&E) stained images in unprocessed tissues. This study evaluates the diagnostic performance of PARS virtual H&E images in excisional skin biopsies, including Squamous Cell Carcinoma (SCC), Basal Cell Carcinoma (BCC), and normal skin. Sixteen unstained formalin-fixed paraffin-embedded skin excisions were PARS imaged, virtually H&E stained, then chemically stained and imaged at 40x. Seven fellowship-trained dermatopathologists assessed all images. Example PARS and chemical H&E whole-slide images from this study are available at the BioImage Archive (this https URL). Concordance analysis indicates 95.5% agreement between primary diagnoses from PARS versus H&E images (Cohen's k=0.93). Inter-rater reliability was near-perfect for both image types (Fleiss' k=0.89 for PARS, k=0.80 for H&E). For subtype classification, agreement was near-perfect for SCC at 91% (k=0.73) and perfect for BCC. For malignancy confinement (e.g., cancer margins), agreement was 92% between PARS and H&E (k=0.718). During assessment, dermatopathologists could not reliably distinguish image origin (PARS vs. H&E), and diagnostic confidence was equivalent. Inter-rater reliability for PARS virtual H&E was consistent with reported histologic evaluation benchmarks. These results indicate that PARS virtual histology may be diagnostically equivalent to chemical H&E staining in dermatopathology diagnostics, while enabling assessment directly from unlabeled slides. In turn, the label-free PARS virtual H&E imaging workflow may preserve tissue for downstream analysis while producing data well-suited for AI integration, potentially accelerating and enhancing skin cancer diagnostics.
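For readers unfamiliar with the concordance statistics quoted above, the snippet below computes raw agreement and Cohen's kappa on a handful of made-up diagnosis labels; it illustrates the metric only, not the study's data.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Cohen's kappa compares agreement between diagnoses rendered on two image
# types against chance agreement. The labels below are made-up placeholders.

diag_pars = np.array(["BCC", "SCC", "SCC", "normal", "BCC", "SCC", "normal", "BCC"])
diag_he   = np.array(["BCC", "SCC", "SCC", "normal", "BCC", "normal", "normal", "BCC"])

kappa = cohen_kappa_score(diag_pars, diag_he)
agreement = (diag_pars == diag_he).mean()
print(f"raw agreement: {agreement:.2f}, Cohen's kappa: {kappa:.2f}")
```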
Prenatal maternal stress alters maternal-fetal heart rate coupling, as demonstrated by the Fetal Stress Index derived from bivariate phase-rectified signal averaging. Here, we extend this framework using information-theoretical measures to elucidate underlying mechanisms. In 120 third-trimester pregnancies (58 stressed, 62 control), we computed transfer entropy (TE), entropy rate (ER), and sample entropy (SE) under multiple conditioning paradigms, employing mixed linear models for repeated measures. We identify dual coupling mechanisms at the short-term (0.5 - 2.5 s), but not long-term (2.5 - 5 s) time scales: (1) stress-invariant state-dependent synchronization, with maternal decelerations exerting approximately 60% coupling strength on fetal heart rate complexity - a fundamental coordination conserved across demographics; and (2) stress-sensitive temporal information transfer (TE), showing exploratory associations with maternal cortisol that require replication. A robust sex-by-stress interaction emerged in TE from mixed models, with exploratory female-specific coupling patterns absent in males. Universal acceleration predominance was observed in both maternal and fetal heart rates, stronger in fetuses and independent of sex or stress. We provide insight into the dependence of these findings on the sampling rate of the underlying data, identifying 4 Hz, commonly used for ultrasound-derived fetal heart rate recordings, as the necessary and sufficient sampling rate regime to capture the information flow. Information-theoretical analysis reveals that maternal-fetal coupling operates through complementary pathways with differential stress sensitivity, extending the Fetal Stress Index by elucidating causal foundations. Future studies should explore additional information-theoretical conditional approaches to resolve stress-specific and time-scale-specific differences in information flow.
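A minimal plug-in estimator of the transfer entropy used here, on median-binarized signals and one-step embeddings, applied to synthetic coupled AR(1) series; real analyses use finer embeddings and the conditioning schemes described above, and the signals below are placeholders rather than heart rate data.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(6)

# Plug-in estimator of TE(X -> Y) = I(Y_{t+1}; X_t | Y_t) on median-binarized
# series, applied to synthetic coupled AR(1) signals in which X drives Y.

n = 5000
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + rng.standard_normal()
    y[t] = 0.6 * y[t - 1] + 0.4 * x[t - 1] + rng.standard_normal()

def transfer_entropy(src, dst):
    s = (src > np.median(src)).astype(int)
    d = (dst > np.median(dst)).astype(int)
    triples = Counter(zip(d[1:], s[:-1], d[:-1]))        # (y_next, x_now, y_now)
    n_tr = sum(triples.values())
    p = {k: v / n_tr for k, v in triples.items()}
    def marg(keep):
        out = Counter()
        for k, v in p.items():
            out[tuple(k[i] for i in keep)] += v
        return out
    p_xy, p_yy, p_y = marg((1, 2)), marg((0, 2)), marg((2,))
    te = 0.0
    for (yn, xn, yo), pv in p.items():
        te += pv * np.log2(pv * p_y[(yo,)] / (p_xy[(xn, yo)] * p_yy[(yn, yo)]))
    return te

print("TE(X -> Y):", round(transfer_entropy(x, y), 3))
print("TE(Y -> X):", round(transfer_entropy(y, x), 3))   # should be near zero
```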
Unsupervised patient stratification is essential for disease subtype discovery, yet, despite growing evidence of molecular heterogeneity of non-oncological diseases, popular methods are benchmarked primarily using cancers with mutually exclusive molecular subtypes well-differentiated by numerous biomarkers. Evaluating 22 unsupervised methods, including clustering and biclustering, using simulated and real transcriptomics data revealed their inefficiency in scenarios with non-mutually exclusive subtypes or subtypes discriminated only by few biomarkers. To address these limitations and advance precision medicine, we developed UnPaSt, a novel biclustering algorithm for unsupervised patient stratification based on differentially expressed biclusters. UnPaSt outperformed widely used patient stratification approaches in the de novo identification of known subtypes of breast cancer and asthma. In addition, it detected many biologically insightful patterns across bulk transcriptomics, proteomics, single-cell, spatial transcriptomics, and multi-omics datasets, enabling a more nuanced and interpretable view of high-throughput data heterogeneity than traditionally used methods.
Recent advances in artificial neural networks for machine learning, and language modeling in particular, have established a family of recurrent neural network (RNN) architectures that, unlike conventional RNNs with vector-form hidden states, use two-dimensional (2D) matrix-form hidden states. Such 2D-state RNNs, known as Fast Weight Programmers (FWPs), can be interpreted as neural networks whose synaptic weights (called fast weights) dynamically change over time as a function of input observations and serve as short-term memory storage; the corresponding synaptic weight modifications are controlled or programmed by another network (the programmer) whose parameters are trained (e.g., by gradient descent). In this Primer, we review the technical foundations of FWPs, their computational characteristics, and their connections to transformers and state space models. We also discuss connections between FWPs and models of synaptic plasticity in the brain, suggesting a convergence of natural and artificial intelligence.
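The core fast-weight mechanism can be written in a few lines: a slow linear map produces key, value, and query vectors per input, the fast weight matrix accumulates value-key outer products as short-term memory, and outputs are read out with the query. This follows the classic additive (unnormalized linear-attention) formulation; the dimensions, random slow weights, and inputs below are placeholders.

```python
import numpy as np

rng = np.random.default_rng(7)

# Minimal Fast Weight Programmer forward pass (classic additive outer-product
# update): a "slow" linear network maps each input to key, value and query
# vectors, the fast weight matrix accumulates value-key outer products as
# short-term memory, and the output is a fast-weight read-out with the query.

d_in, d_k, d_v, T = 16, 8, 8, 20
W_k, W_v, W_q = (rng.standard_normal((d, d_in)) * 0.1 for d in (d_k, d_v, d_k))

fast_W = np.zeros((d_v, d_k))            # 2D matrix-form hidden state
outputs = []
for t in range(T):
    x_t = rng.standard_normal(d_in)      # placeholder input token
    k, v, q = W_k @ x_t, W_v @ x_t, W_q @ x_t
    fast_W = fast_W + np.outer(v, k)     # "program" the fast weights
    outputs.append(fast_W @ q)           # read out with the query

print("fast-weight state shape:", fast_W.shape)
print("last output:", np.round(outputs[-1], 3))
```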
We investigate recurrent neural networks with asymmetric interactions and demonstrate that the inclusion of self-couplings or sparse excitatory inter-module connections leads to the emergence of a densely connected manifold of dynamically accessible stable configurations. This representation manifold is exponentially large in system size and is reachable through simple local dynamics, despite constituting a subdominant subset of the global configuration space. We further show that learning can be implemented directly on this structure via a fully local, gradient-free mechanism that selectively stabilizes a single task-relevant network configuration. Unlike error-driven or contrastive learning schemes, this approach does not require explicit comparisons between network states obtained with and without output supervision. Instead, transient supervisory signals bias the dynamics toward the representation manifold, after which local plasticity consolidates the attained configuration, effectively shaping the latent representation space. Numerical evaluations on standard image classification benchmarks indicate performance comparable to that of multilayer perceptrons trained using backpropagation. More generally, these results suggest that the dynamical accessibility of fixed points and the stabilization of internal network dynamics constitute viable alternative principles for learning in recurrent systems, with conceptual links to statistical physics and potential implications for biologically motivated and neuromorphic computing architectures.