Histone H3K27M mutation status defines a clinically aggressive subgroup of pediatric diffuse midline glioma and informs prognosis and trial eligibility, but confirmation usually requires tissue sampling from eloquent midline structures. We evaluated whether radiomics from routinely available T2-weighted MRI can provide an adjunctive screening signal in a heterogeneous referral-style cohort, where scans are often acquired externally and T2-weighted imaging is the only consistently available sequence. Ninety-eight pediatric patients with tissue-confirmed status were analyzed (73 mutation-positive, 25 wild-type). Expert tumor segmentations defined the regions of interest for PyRadiomics feature extraction after isotropic resampling, dual skull stripping, and multi-scale filtering. We systematically ablated preprocessing, correlation pruning with repeated recursive feature elimination, tumor volume, and TabDDPM synthetic minority augmentation across 100 stratified train/test splits with real-only test sets. Pure radiomics achieved accuracy 0.664 and F1-score 0.784. The best pipeline used preprocessing, feature selection, and volume with CatBoost, achieving accuracy 0.730$\pm$0.068 and F1-score 0.826$\pm$0.044. TabDDPM improved TabPFN to F1-score 0.81$\pm$0.05 at 200 augmented rows. These results support T2-weighted radiomics as a moderate screening and triage aid, not a replacement for tissue-based diagnosis.
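The repeated stratified evaluation described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the 20% test fraction, the function names `stratified_split` and `f1_score`, and the seeds are assumptions, and only the class counts (73 mutation-positive, 25 wild-type) come from the abstract.

```python
import random

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices so each class keeps roughly the same share in train and test.

    test_fraction is an assumed value; the abstract does not state the split ratio.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, test = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        n_test = max(1, round(len(shuffled) * test_fraction))
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return sorted(train), sorted(test)

def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Cohort from the abstract: 73 mutation-positive (1), 25 wild-type (0).
labels = [1] * 73 + [0] * 25
# 100 different seeds give 100 stratified splits; synthetic rows, if any,
# would be added to the training indices only, never to the test indices.
splits = [stratified_split(labels, seed=s) for s in range(100)]
```

Keeping augmentation on the training side of each split is what makes the test sets "real-only" in the sense used above.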
Transcriptional gene regulatory networks (GRNs) depict the directed relationships between regulators and target genes, determining gene expression patterns in a cell-type-specific manner. Single-cell multi-omics technologies, such as single-cell RNA sequencing (scRNA-seq) and single-cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC-seq), enable high-resolution measurement of cell-type-specific gene expression and regulation in an unprecedented way. However, tools for inferring cell-type-specific GRNs and modeling their dynamics remain scarce. To facilitate the inference and analysis of cell-type-specific GRNs in contexts such as cellular development or disease progression, where cell lineage structure and dynamics are important, we developed a multi-task learning framework, single-cell Multi-Task Network Inference (scMTNI). scMTNI and its associated network analysis tools offer a comprehensive package to define cell-type-specific GRNs and examine their dynamics. This book chapter describes the scMTNI tool and demonstrates its application to an existing single-cell multi-modal dataset of cellular reprogramming to infer cell-type-specific GRNs and identify key regulators of cellular fate transitions during reprogramming.
Antibody productivity and glycosylation quality in CHO cultures arise from a dynamically evolving metabolic environment, yet models often work in isolation or at a single scale. Here, we present a multiscale mechanistic framework linking molecular, cellular, and process levels to predict how inputs shape bioprocess trajectories. The framework is grounded on a single-cell kinetic model that couples metabolic and glycosylation networks governing yield and critical quality attributes (CQAs). A stochastic single-cell model describes environment-dependent transitions among growth, production, and decline, capturing population heterogeneity. We further introduce cumulative variation in the oxygen uptake rate, integrating total metabolic adjustment over time, as a compact biomarker for predicting metabolic shifts. Unlike population-averaged approaches, the model propagates cell-resolved metabolic states (including ammonia-regulated Golgi pH, nucleotide sugar availability, manganese cofactors, and synthesis rates) into glycan processing. The framework was evaluated using CHO-K1 fed-batch cultures producing VRC01 IgG1 under targeted ammonia stress, matched control conditions, and a pyramid-feeding strategy with tighter control. It accurately predicts trajectories of cell density, metabolites, productivity, and glycosylation, including increased G0F and reduced galactosylation under ammonia stress, and quantifies how metabolic heterogeneity drives variability in productivity and CQAs. This work provides a unified foundation for predictive biomanufacturing and advanced process control.
Despite increasing scale and resolution, many biological measurements remain destructive, revealing only spatial information rather than the dynamics it encodes. By combining flexible representations with mechanistic constraints, physics-informed machine learning offers a promising route to inferring these dynamics from static snapshots. Motivated by subcellular imaging of gene expression, we ask when a static spatial pattern of molecules can identify spatially varying diffusivity, creation, destruction, and boundary exchange, and how different inference schemes perform on the task. A structural identifiability analysis shows that distributed sources are non-identifiable, whereas a point source such as a transcription site can restore identifiability. These limits are further shaped by seemingly innocuous modeling choices: the boundary conditions, the spatial regularity of the underlying dynamics, and even the stochastic calculus convention. We then adapt several physics-informed schemes, differing in how they represent the solution and enforce the governing equations, and demonstrate effective inference from a single snapshot. Physics-informed approaches can thus recover spatial heterogeneities of biological dynamics from static data, but their use should be accompanied and guided by careful identifiability analysis for meaningful interpretation of the results.
Accurately reconstructing gene regulatory networks (GRNs) is essential for understanding transcriptional processes in development and disease. MERLIN-SUITE (this https URL) is a collection of algorithmic extensions based on MERLIN (Modular regulatory network learning with per gene information), a probabilistic framework that infers gene-specific and module-specific regulatory programs of co-regulated modules, capturing both detailed and modular aspects of transcriptional networks. While expression-based inference is effective, it often aligns poorly with experimentally validated regulatory interactions. MERLIN-P addresses this by integrating external regulatory priors, such as motif, ChIP, and perturbation data, to enhance biological relevance and predictive accuracy. MERLIN-P-TFA further advances the framework by incorporating regularized estimation of latent transcription factor activity (TFA), overcoming the limitation that TF mRNA levels may not represent protein activity. By integrating expression data, prior knowledge, and activity-aware modeling, this unified approach supports robust GRN reconstruction in both bulk and single-cell datasets. This chapter presents the MERLIN-SUITE with a focus on MERLIN-P-TFA and demonstrates its use on a single-cell, multi-modal dataset of mouse cellular reprogramming to infer GRNs and identify key regulators.
Sociability toward humans is a key adaptive trait in free-ranging dogs, enabling them to access resources while navigating risks associated with human interactions. In this study, we investigated whether operant conditioning shapes sociability in Indian free-ranging dogs and whether learned responses generalize to unfamiliar individuals. We experimentally exposed 58 dog groups to either a positive or a threatening cue over five consecutive days and assessed their behaviour using approach proportion, approach latency, and demeanor across repeated interactions with a familiar experimenter, followed by a test with an unfamiliar individual. Using Bayesian generalized linear mixed models, we found that cue type and repeated exposure significantly influenced sociability. Dogs exposed to a positive cue showed increased approach behaviour and reduced approach latency over time, along with increased affiliative demeanor. In contrast, dogs exposed to a threatening cue exhibited reduced approach behaviour, increased approach latency, and a shift toward neutral and less affiliative responses across days. Importantly, positive cues partially generalized across individuals, as dogs showed increased approach toward an unfamiliar experimenter, although this was accompanied by hesitation to approach. In contrast, threatening cues did not generalize in the same way; dogs did not reduce their approach toward unfamiliar individuals but displayed increased approach latency, indicating heightened caution. Our findings demonstrate that operant conditioning plays a crucial role in shaping dog-human interactions, with asymmetric generalization of positive and threatening experiences.
Classifying heterogeneous omics data remains a fundamental challenge in computational biology, particularly in high-dimensional, small-sample settings where nonlinear interactions dominate and class imbalance further complicates reliable prediction of minority phenotypes. While traditional kernel methods rely on feature abundance, they fail to leverage the known interaction landscapes of biological systems. In this work, we propose a structured Gaussian process classification framework that integrates graph-encoded biological pathways directly into the kernel construction. By propagating information along known interaction networks and combining this with abundance-derived features, the resulting classifier captures both quantitative measurements and topological context. We benchmark our proposed methodology on three publicly available gut and fecal microbiome datasets. To address severe class imbalance, we evaluate complementary strategies, including data-level resampling, threshold calibration, and confusion-matrix-based adjustments, and report minority-class performance alongside accuracy. The hybrid approach yields a performance gain over unstructured baselines and matches the performance of established benchmarks for similar datasets. Furthermore, the probabilistic nature of the framework naturally provides calibrated predictive uncertainty, enabling robust differentiation between confident predictions and ambiguous samples.
Understanding the mechanistic function of a gene is a critical starting point for biology. However, for much of the human proteome that knowledge is scattered across thousands of primary papers or remains poorly established, while the curated databases biologists rely on can lag years behind recent literature. Large language models can now read and synthesize that literature on demand, but doing so faithfully for many genes is an expensive, non-reproducible retrieval session that does not scale across users. Here, we present Affinage, an LLM pipeline that performs this retrieval and mechanistic reasoning once per gene--from the primary literature alone--and stores the result as a reusable, structured annotation. A biologist-designed reading pass extracts only direct experimental evidence, and a synthesis pass reasons over those findings alone. Applied across the genome, Affinage annotates 19,293 human protein-coding genes. This analysis provides mechanism for thousands of genes whose UniProt function is empty or a stub, beating the curated reference on 99.1% of head-to-head genes as scored by a cross-family LLM judge. Affinage also delineates the 10% of the proteome that remains mechanistically uncharacterized and will serve as a continuously-updated, literature-grounded census of gene function. All records are released openly at this https URL . More broadly, Affinage serves as an example of how domain experts can encode their expertise into scalable LLM pipelines to improve the publicly available data that guides biological hypotheses and experimentation.
DNA methylation profiling has become a powerful approach for central nervous system (CNS) tumor classification, yet important challenges remain regarding cross-cohort transferability, methodological correctness, and robust multiclass evaluation. In this work, we propose a novel and methodologically rigorous machine-learning approach for methylation-based CNS tumor classification that combines Sparse Random Projection for dimensionality reduction with multinomial logistic regression for classification. We evaluate the proposed approach in the same general experimental setting established by a widely used reference classifier. On the 2,801-sample reference cohort, our method achieves a mean accuracy of 96\% under stratified 3-fold cross-validation. On the independent 1,104-sample clinical evaluation cohort, it reaches 86\% accuracy at the 91-class level and 93\% when predictions are evaluated at the methylation class family level. These results improve upon the corresponding state-of-the-art reference figures of 82\% class-level concordance and 88\% family-level concordance, yielding absolute gains of approximately 4 and 5 percentage points, respectively. This improvement is clinically relevant: in a diagnostic setting, a 5-point increase in correct tumor classification can directly affect cancer subtype assignment and, in turn, influence treatment selection and downstream clinical decision-making. Our results show that the proposed model, grounded in stronger methodological practice in machine learning, consistently outperforms the previous state of the art across evaluation settings and can materially improve the reliability of CNS tumor classification.
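The dimensionality-reduction step named above can be illustrated with a minimal sparse random projection. This is a sketch under stated assumptions rather than the authors' implementation: the abstract does not say which variant is used, so the code adopts one common scheme (entries of magnitude sqrt(1 / (density * n_components)), with density defaulting to 1 / sqrt(n_features)). The names `sparse_projection_matrix` and `project` are hypothetical.

```python
import math
import random

def sparse_projection_matrix(n_features, n_components, density=None, seed=0):
    """Build an n_components x n_features sparse random matrix.

    Each entry is +scale or -scale with probability density/2 each,
    and 0 otherwise. The default density is an assumed convention.
    """
    if density is None:
        density = 1.0 / math.sqrt(n_features)
    rng = random.Random(seed)
    scale = math.sqrt(1.0 / (density * n_components))
    matrix = []
    for _ in range(n_components):
        row = []
        for _ in range(n_features):
            u = rng.random()
            if u < density / 2:
                row.append(-scale)
            elif u < density:
                row.append(scale)
            else:
                row.append(0.0)
        matrix.append(row)
    return matrix

def project(sample, matrix):
    """Map one high-dimensional sample to the low-dimensional space."""
    return [sum(w * x for w, x in zip(row, sample) if w != 0.0) for row in matrix]

# Toy usage: 100 features (stand-ins for CpG probes) reduced to 8 components.
matrix = sparse_projection_matrix(n_features=100, n_components=8, seed=1)
reduced = project([0.5] * 100, matrix)
```

The projected vectors would then be passed to a multinomial logistic regression. Because the matrix does not depend on the data, the same matrix can be reused unchanged on an independent cohort.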
Quantum mechanical (QM) cluster models provide an effective framework for mechanistic studies of enzymatic reactions but remain computationally demanding. Neural network potentials (NNPs) offer a promising route to reduce this cost, but enzymes present challenges beyond small molecules, including large system sizes, implicit-solvent environments, substantial polarization, and charge transfer. Here, we present an integrated software framework for efficient NNP training for mechanistic studies of enzymes, demonstrated on QM cluster models of S-adenosyl-L-methionine-dependent methyltransferases (MTases). Our Enerzyme code introduces modular electrostatics-aware NNP architectures and combines automated QM-cluster construction with reactive dataset generation. The Enerzymette subpackage automates reaction pathway exploration at both NNP and DFT levels. We show that iterative flexible scans and nudged elastic band calculations impose stricter requirements on NNPs than conventional dataset metrics. Nevertheless, NNPs trained on fewer than 1,000 system-specific datapoints reproduce reaction energetics and transition-state structures for MTase clusters containing up to 545 atoms with near-chemical accuracy. Direct supervision of atomic charges and consistent dielectric screening substantially improve simulation stability and accuracy, while multitask-learned atomic charges capture charge transfer and polarization trends and provide chemically meaningful descriptors of reactivity. Finally, transferability across chemically diverse catechol O-methyltransferase substrates indicates that NNPs learn generalizable reactivity patterns as training data expand across multiple enzymes. Together, these results establish a foundation for accelerating enzyme mechanistic studies and guide future NNP development for biomolecular reactivity.
Deep multimodal brain-encoding models now predict fMRI responses to naturalistic video with high accuracy. Whether their predicted neural signals also forecast behavioral engagement is unknown. We run TRIBE, the winning model of the 2025 Algonauts brain-encoding challenge (Llama-3.2 + V-JEPA2 + Wav2Vec-BERT), on 48 YouTube videos and reduce its predicted cortical response to a per-second engagement curve, the global field power. Correlated against each video's "most replayed" heatmap, a passively-collected proxy for which moments viewers return to, the curve shows no evidence of predicting re-watch behavior. The pooled position-controlled partial correlation is +0.058 (95% CI [-0.04, 0.15]; one-sample t(47)=1.21, p=0.23), indistinguishable from zero and not significantly above simple loudness and motion baselines (loudness +0.04, paired p=0.74). The raw correlation is also near zero; the moderate values reported for music videos reflect a genre-specific intro/onset-replay artifact rather than content prediction, and do not generalize. The null holds across six cortical-network readouts and under an autocorrelation-preserving permutation test. We release the code, the video-ID manifest, and an acquisition method that works despite YouTube's SABR-only streaming.
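For readers unfamiliar with the reduction used above, global field power is conventionally the spatial standard deviation of the signal at each time point. The sketch below assumes that convention and a response stored as one list of per-vertex values per second. How the authors resample TRIBE's output to one value per second is not stated, and the function names are hypothetical.

```python
import math

def global_field_power(frame):
    """Spatial standard deviation across vertices at one time point."""
    n = len(frame)
    mean = sum(frame) / n
    return math.sqrt(sum((v - mean) ** 2 for v in frame) / n)

def gfp_curve(response):
    """Reduce a time x vertices response to one GFP value per time step."""
    return [global_field_power(frame) for frame in response]

# Toy predicted response: 3 seconds, 4 cortical vertices each.
response = [
    [0.0, 0.0, 0.0, 0.0],    # flat map, no spatial contrast
    [1.0, -1.0, 1.0, -1.0],  # strong spatial contrast
    [0.5, 0.5, 0.5, 0.5],    # uniform offset, still no contrast
]
curve = gfp_curve(response)
```

A uniform shift of the whole map leaves the curve unchanged, so this measure tracks spatial contrast in the predicted response rather than its mean level.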
3D plant phenotyping is notoriously complicated and low-throughput because of extensive multi-view imaging, fragile 3D reconstruction pipelines, and the additional cost of converting reconstructed geometry into phenotypic measurements. These limitations are further amplified in low-cost data acquisition, where smartphone videos or sparsely sampled multi-view images provide limited view overlap and suffer from self-occlusion. In this work, we show that the conventional 3D plant phenotyping pipeline can be streamlined and significantly accelerated with 3D Foundation Models (3DFMs) and, in particular, present one of the first cross-crop 3D phenotyping frameworks powered by 3DFMs. The framework replaces COLMAP-style sparse initialization with 3DFM-based feed-forward geometric recovery, combines geometry-constrained 3D Gaussian Splatting for dense reconstruction, enables few-view reconstruction through iterative view synthesis and refinement, and converts reconstructed geometry into measurable organs through 2D-to-3D semantic transfer, metric scale recovery, and organ instance separation. We further construct a cross-crop dataset with smartphone-based image acquisition, diverse plant morphologies, and manual annotations for segmentation and phenotypic evaluation. Experiments across 26 plant sequences show that 3D Foundation Models reduce the average reconstruction time from 6.52 minutes to 1.58 seconds while maintaining high reconstruction quality and phenotyping accuracy. These results suggest a fresh technical route for high-throughput 3D plant phenotyping, from low-cost image acquisition to fast reconstruction, perception, scale recovery, and phenotypic measurement.
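The two reconstruction times quoted above are given in different units, so the size of the speed-up is easy to misread. A small worked calculation, using only the figures from the abstract (the helper name `speedup` is hypothetical):

```python
def speedup(baseline_seconds, new_seconds):
    """Ratio of baseline time to new time, both in seconds."""
    if new_seconds <= 0:
        raise ValueError("new_seconds must be positive")
    return baseline_seconds / new_seconds

baseline_seconds = 6.52 * 60  # 6.52 minutes expressed in seconds
new_seconds = 1.58
factor = speedup(baseline_seconds, new_seconds)
print(f"average reconstruction is about {factor:.0f}x faster")
```

Under these numbers the reduction is on the order of 250-fold for the reconstruction step alone. The abstract does not break down timing for the downstream segmentation and measurement stages.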
Using molecular large language models (LLMs) as a unified framework for understanding molecular structures and functions is emerging as a new trend in tasks such as molecular design and drug discovery. However, these models struggle to fully capture the visual representation of molecular structures, limiting their potential. While existing molecular vision-language models (VLMs) show promise, they still face challenges in structural alignment and lack the necessary topological modeling for accurate molecular understanding. To address this, we propose MolSight, a graph-aware vision-language model framework designed to enhance the understanding of molecular images by VLMs. MolSight integrates a Molecular Topology Module to inject chemical-bond adjacency information into vision tokens, and a Molecular Grounding Module to align visual features with chemical symbolic semantics. Our experiments demonstrate that MolSight significantly outperforms existing VLMs, molecular LLMs, and specialized tools across multiple chemical visual understanding tasks, achieving a new level of molecular image reasoning.
Understanding how neuronal population activity changes during development and after stimulation is essential for studying neuronal network dynamics. This work examines how visual informatics can summarize high-dimensional spiking activity while retaining information that is biologically interpretable. We develop a framework based on Minimum-Distortion Embedding (MDE), and compare it with Principal Component Analysis (PCA) and t-distributed Stochastic Neighbor Embedding (t-SNE). In addition to evaluating the embeddings by visual separation, we quantify whether they preserve the cosine-shape radius within each condition and the pairwise distances between condition centroids. Our \emph{in silico} experiments show that MDE with a cosine metric captures the trajectory of simulated network maturation and preserves the contraction of the activity cloud as connectivity increases. Complementary \emph{in vitro} experiments on human cortical cultures show a coherent developmental trajectory from day in vitro 23 (DIV23) to DIV64. We also study weak and strong stimulation in simulation, and long-term potentiation stimulation in primary cortical cultures. In the stimulation experiments, MDE separates activity phases more clearly than PCA and preserves transient changes in within-phase variability that are missed by PCA. These results show that metric selection is central to dimensionality reduction of neuronal data. In particular, cosine distance between population activity vectors provides embeddings that better reflect changes in population activity patterns than Euclidean distance. The proposed framework provides a quantitative way to visualize network development and stimulation-induced changes in neuronal activity.
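The contrast between the two metrics can be seen on a toy pair of population activity vectors. This is an illustrative sketch with made-up spike counts, not data from the study: two vectors with the same firing pattern but different overall rates are far apart in Euclidean terms and essentially identical in cosine terms.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance; sensitive to overall firing-rate scale."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """One minus cosine similarity; depends only on the activity pattern."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for an all-zero vector")
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical spike counts for four neurons in two time bins.
pattern = [1.0, 2.0, 0.0, 4.0]
same_pattern_doubled = [2.0, 4.0, 0.0, 8.0]  # same pattern, twice the rate

d_euclid = euclidean_distance(pattern, same_pattern_doubled)
d_cosine = cosine_distance(pattern, same_pattern_doubled)
```

Here `d_cosine` is zero up to floating-point error while `d_euclid` is not, which is the sense in which a cosine metric emphasizes changes in pattern over changes in overall rate.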
Decellularized uterine extracellular matrix (dUECM) is promising for uterine tissue engineering because of its inherent bioactivity and structural complexity. However, transforming dUECM into porous, functional 3D constructs remains challenging. This study aimed to (1) synthesize dUECM using a modified decellularization protocol and formulate it into a hydrogel ink, and (2) fabricate 3D-printed constructs to support human uterine myometrial cell growth in vitro. Porcine uterine tissues were decellularized using 1% Triton X-100 with varying concentrations of sodium dodecyl sulfate (SDS) (0.1-1.5%) for 48-72 h. The resulting dUECM was characterized using DNA and glycosaminoglycan (GAG) quantification, Picrosirius Red-polarized light microscopy, histology, scanning electron microscopy, FTIR, Raman spectroscopy, and thermogravimetric analysis. To prepare the ink, dUECM powder was enzymatically digested with pepsin and blended with 2% and 3% alginate. Constructs were fabricated using extrusion-based 3D printing and assessed for filament fidelity, swelling, degradation, and mechanical properties. Biocompatibility was evaluated using hTERT-HM myometrial cells through MTT assays, Live/Dead staining, and alpha-SMA immunohistochemistry. The optimal protocol (1% Triton X-100 + 1% SDS for 48 h) reduced DNA to 51.3 +/- 9 ng/mg while retaining high GAGs (54.9 +/- 7.6 ug/mg). Preservation of the ECM structure was confirmed by spectroscopy. The 3% Alg + 1.5% dUECM hydrogel exhibited suitable printability (1.5 +/- 0.2), swelling (47 +/- 12%), degradation resistance (94 +/- 18% mass retention), and mechanical strength (323 to 175 kPa over 14 days), with high viability and proliferation (258 +/- 13%). The developed dUECM-based hydrogel supports 3D bioprinting with strong mechanical and biological performance, offering a promising platform for uterine tissue engineering.
Dynamical systems theory provides a mathematical framework for describing how interacting biological components evolve over time and space, from molecular oscillators to large-scale biological patterns. Such systems often involve nonlinear feedbacks, delays, and multiscale interactions, making mechanistic model construction increasingly challenging as experimental measurements become richer and higher-dimensional. This has motivated the development of data-driven approaches that infer model structure directly from data, offering alternative routes to constructing dynamical models. In this review, we discuss and compare data-driven approaches for model discovery in biological dynamical systems, focusing on three major methodological families: regression-based methods, network-based architectures, and decomposition techniques. We compare how these approaches address three core objectives: forecasting future behavior, identifying interactions between system components, and characterizing qualitative dynamical solutions such as steady states, oscillations, and transitions between them. To enable a direct comparison, representative methods are applied to a common benchmark - the Oregonator model - a minimal nonlinear oscillator that captures shared design principles of chemical and biological systems. By highlighting practical strengths, limitations, and degrees of interpretability, this review aims to guide researchers in selecting appropriate tools for analyzing complex, nonlinear, and high-dimensional biological dynamics.
Earth's gravity fundamentally shapes human behaviour. The brain encodes this force as an internal model of gravity, enabling the prediction and interpretation of gravitational effects during perception and action. Understanding how this model adapts to altered gravity is critical for predicting human performance in spaceflight. We present a computational framework for modelling neurophysiological adaptation across diverse gravitational environments. The framework has two components trained on open-access data from altered-gravity studies, particularly parabolic flights. The first component (CorticalG) employs a lightweight multilayer perceptron neural network to predict gravity-dependent changes in EEG frequency bands, estimating cortical state under different gravitational loads. The second component (PhysioG) uses independent Gaussian process models to capture broader physiological responses, including heart rate variability, electrodermal activity, and motor control. To complement the quantitative modelling, we simulated subjective experience across gravitational environments using the Large Language Model (LLM) Claude 3.5 Sonnet. Physiological outputs prompted the model to generate narratives describing alertness, bodily awareness, and cognitive state across zero gravity, partial gravity of the Moon and Mars, and hypergravity. This framework provides a novel approach for investigating human adaptation to spaceflight. It offers a predictive tool to assess performance and resilience, supporting the design of future space exploration missions.
Cultures of neurons grown on multi-electrode arrays have become a common experimental preparation for investigating developing neural networks. Experiment and simulation have shown that these developing networks eventually exhibit bursting behavior in which the entire culture participates for short periods of time, with inter-burst intervals in which the network is comparatively quiescent. This paper extends previous simulation results by examining the spatiotemporal patterns of such bursting. We show that these bursts originate at a small number of network locations and propagate as waves of activity. We demonstrate that this type of activity does not require fine tuning of neuron or network parameters. We also examine how this activity changes during development and the dependence of such activity and its triggering on both local and global network properties.
Connectome-constrained models now reproduce neural activity in several systems, yet each inherits a circuit's degree and weight statistics along with its exact wiring, leaving open which dynamical properties the wiring fixes beyond those statistics. We separate the two by running the complete larval Drosophila connectome, 2'825 neurons in its strongly connected core, as a frozen leaky-tanh rate operator with no single-neuron parameter fitted, and comparing it against a degree-and-weight-matched rewiring ensemble and further controls. All properties are the operator's, attributable to wiring and weights alone, not a measurement of larval activity. The gross dynamical signature is largely fixed by coarse statistics: the operator is non-normal and near-linear, with gain and effective dimensionality within a few percent of the ensemble. Two spatially resolved properties break this pattern. First, under sparse afferent drive the operator confines activity to a fifth of the core, against two thirds for the ensemble and all of it for a random graph, a confinement that survives a cell-class-preserving null. Second, the mushroom body concentrates the leading driving modes far beyond the ensemble, surviving size-matched, singular-subspace, and family-wise controls, while its leverage elsewhere stays diffuse. Both depend on the exact placement of synapses. A connectome's gross operator behavior is therefore largely a property of its degree and weight statistics, while the routing of input and the identity of the dominant driving modes are written into its exact wiring. We propose this decomposition as a general tool: it measures whether the exact wiring of any structured network carries information beyond its degree and weight statistics, a question any use of biological connectivity as an architectural prior eventually faces.
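To make the "frozen leaky-tanh rate operator" concrete, the sketch below iterates one plausible discrete-time form and measures how much of a toy network becomes active under sparse drive. The update rule x <- (1 - leak) * x + leak * tanh(W x + u), the leak value, the activity threshold, and all names are assumptions for illustration. The abstract does not give the exact operator, and the three-neuron weight matrix is not connectome data.

```python
import math

def leaky_tanh_step(state, weights, drive, leak=0.5):
    """One update of an assumed leaky-tanh rate model.

    weights[i][j] is the weight from neuron j onto neuron i; the weights are
    fixed ("frozen") and no single-neuron parameter is fitted.
    """
    new_state = []
    for i, row in enumerate(weights):
        total = sum(w * x for w, x in zip(row, state)) + drive[i]
        new_state.append((1.0 - leak) * state[i] + leak * math.tanh(total))
    return new_state

def active_fraction(state, threshold=1e-3):
    """Share of neurons whose rate magnitude exceeds an assumed threshold."""
    return sum(1 for x in state if abs(x) > threshold) / len(state)

# Toy wiring: neuron 0 drives neuron 1; neuron 2 receives no input at all.
weights = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
drive = [1.0, 0.0, 0.0]  # sparse afferent drive onto neuron 0 only
state = [0.0, 0.0, 0.0]
for _ in range(50):
    state = leaky_tanh_step(state, weights, drive)
fraction = active_fraction(state)
```

Comparing this fraction for the real wiring against degree-and-weight-matched rewirings is the kind of contrast the abstract reports. In this toy case activity stays confined to the two neurons reachable from the driven one.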
Self-supervised learning has become an increasingly important paradigm in the domain of machine intelligence. Furthermore, evidence for self-supervised adaptation, such as contrastive formulations, has emerged in recent computational neuroscience and brain-inspired research. Nevertheless, current work on self-supervised learning relies on biologically implausible credit assignment -- in the form of backpropagation of errors -- and feedforward inference, typically a forward-locked pass. Predictive coding, in its mechanistic form, offers a biologically plausible means to sidestep these backprop-specific limitations. However, unsupervised predictive coding rests on learning a generative model of raw input (akin to "generative AI" approaches), which entails predicting a potentially high-dimensional input; on the other hand, supervised predictive coding, which learns a mapping from inputs to target labels, requires human annotation, and thus incurs the drawbacks of supervised learning. In this work, we present a scheme for self-supervised learning, specifically for an emerging research sub-domain that we label as neuroscience-informed self-supervised learning (NeuroSSL), within a neurobiologically plausible framework that appeals to the free energy principle, constructing a new form of predictive coding that we call meta-representational predictive coding (MPC). MPC sidesteps the need for learning a generative model of sensory input (e.g., pixel-level features) by learning to predict representations of the input across parallel streams, resulting in an encoder-only learning and inference scheme. This formulation notably rests on active inference (in the form of sensory glimpsing) to drive the learning of representations, i.e., the representational dynamics are driven by sequences of decisions made by the model to sample informative portions of its sensorium.
Even during fixation, the human eye is constantly in low-amplitude motion, jittering over small angles in random directions at up to 100 Hz. This motion results in all features of the image on the retina constantly traversing a number of cones, yet objects that are stable in the world are perceived to be stable, and any object that is moving in the world is perceived to be moving. A series of experiments carried out over a dozen years revealed the psychophysics of visual stabilization to be more nuanced than might be assumed, say, from the mechanics of stabilization of camera images, or what might be assumed to be the simplest solution from an evolutionary perspective. The psychophysics revealed by the experiments strongly implies a specific set of operations on retinal signals resulting in the observed stabilization behavior. The presentation is in two levels. First is a functional description of the action of the mechanism that is very likely responsible for the experimentally observed behavior. Second is a more speculative proposal of circuit-level neural elements that might implement the functional behavior.
Many systems involve numerous interacting parts and the whole system can have properties that the individual parts do not. I take this novelty as the defining characteristic of an emergent property. Other characteristics associated with emergence discussed include universality, order, complexity, unpredictability, irreducibility, diversity, self-organisation, discontinuities, and singularities. Emergent phenomena are widespread across physics, biology, social sciences, and computing, and are central to major scientific and societal challenges. Understanding emergence involves considering the stratification of reality across different scales (energy, time, length, complexity), each with its distinct ontology and epistemology, leading to semi-autonomous scientific disciplines. A central challenge is bridging the gap between macroscopic emergent properties and microscopic component interactions. Identifying an intermediate mesoscopic scale where new, weakly interacting entities or modular structures emerge is key. Theoretical approaches, such as effective theories (describing phenomena at a specific scale) and toy models (simplified systems for analysis), are vital. The Ising model exemplifies how toy models can elucidate emergence characteristics. Emergence is central to condensed matter physics, chaotic systems, fluid dynamics, nuclear physics, quantum gravity, neural networks, protein folding, and social segregation. An emergent perspective should influence scientific strategy by shaping research questions, methodologies, priorities, and resource allocation. An elusive goal is the design and control of emergent properties.
Mental imagery vividness is a stable individual trait, yet whether imagined scenarios share relational structure across human and synthetic large language model (LLM) populations remains unknown. We applied psychological network analysis to vividness ratings from two validated questionnaires, the Vividness of Visual Imagery Questionnaire (VVIQ-2) and the Plymouth Sensory Imagery Questionnaire (PSIQ), across geographically and linguistically distinct human samples (Florida, Poland, and London; total N = 2,743) and six LLMs (Gemma3-12B/27B, their quantization-aware counterparts, Llama3.3-70B, and Llama4-16x17B). Imagination networks were constructed as regularized partial correlation graphs, with node centrality and community structure compared across populations using Pearson correlations and the Adjusted Rand Index (ARI). Human networks showed robust cross-population centrality correlations for expected influence, strength, and closeness (r = 0.31-0.93), and community detection recovered clusters aligned with VVIQ-2 scene contexts (ARI = 0.27-0.40) and PSIQ sensory modalities (ARI = 0.87-1.0). Betweenness centrality was unstable across all populations, consistent with its sensitivity to individual experiential history. LLMs failed to replicate human network structure: LLM-human centrality correlations were weak and largely non-significant after correction, and most LLM configurations produced degenerate single-cluster topologies (median ARI = 0). This failure was consistent across model architectures, parameter scales (12B-272B), and conversational conditions. We posit that human imagination networks may reflect memory organization accumulated through embodied experience, a representational structure that linguistic training alone does not reproduce, regardless of model scale or conversational memory.
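To make the community-structure comparison concrete, here is a minimal sketch of the Adjusted Rand Index computed from two label assignments using the standard pair-counting formula. The function name and the toy labels are hypothetical, not the study's data or code. It shows why a degenerate single-cluster solution scores exactly 0 against any reference partition with more than one cluster, and why agreement is unaffected by how clusters are numbered.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two partitions of the same items.

    Returns 1.0 for identical partitions (up to relabeling) and values
    near 0 for agreement no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no item pairs to disagree on
    # Pairs of items placed together in both partitions.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together within each partition separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)
    maximum = 0.5 * (together_a + together_b)
    if maximum == expected:
        return 1.0  # both partitions trivial (convention)
    return (together_both - expected) / (maximum - expected)


# Hypothetical example: 8 questionnaire items grouped into 4 reference contexts.
reference = [0, 0, 1, 1, 2, 2, 3, 3]
relabeled = [5, 5, 7, 7, 1, 1, 0, 0]   # same grouping, different cluster ids
single_cluster = [0] * 8               # degenerate one-community solution

print(adjusted_rand_index(reference, relabeled))
print(adjusted_rand_index(reference, single_cluster))
```

In practice a library routine such as scikit-learn's `adjusted_rand_score` would normally be used; the hand-written version is shown only to expose the arithmetic.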
Background: Smartphone-based dermatology requires inter-device colorimetric reliability that holds across calibration regimes, yet quantitative multi-device benchmarks remain scarce. Materials and Methods: We analyzed matched facial images from 965 Korean subjects captured by a digital single-lens reflex (DSLR) camera, a consumer tablet, and a consumer smartphone, and evaluated two calibration methods against the DSLR reference. The methods were standard global linear Color Correction Matrix (CCM) normalization and region-specific CCM trained per anatomical region, both applied in Commission Internationale de l'Eclairage L*a*b* (CIELAB) space. Results: Linear CCM reduced inter-device color differences by 61-74% and placed both Melanin Index (MI, intraclass correlation coefficient [ICC] = 0.80) and Individual Typology Angle (ITA, ICC = 0.78) in the good reliability band. Region-specific CCM raised both indices into the excellent reliability band (MI ICC = 0.95, ITA ICC = 0.93), with anatomical region exceeding the source device as the largest pre-calibration variance contributor (analysis-of-variance $\eta^2 = 0.18$ versus 0.12). Conclusion: Consumer-device skin colorimetry therefore achieves clinically useful inter-device reliability using standard calibration, with region-aware calibration the largest remaining source of improvement.
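The calibration step can be sketched as a least-squares fit of a 3x3 matrix that maps device CIELAB values onto the DSLR reference, followed by computing ITA from the corrected values. This is a minimal sketch under stated assumptions, not the study's implementation: the function names are hypothetical, the fit has no offset term, the data are synthetic, and ITA is taken as the commonly used definition arctan((L* - 50)/b*) expressed in degrees, which the abstract does not spell out.

```python
import math

import numpy as np


def individual_typology_angle(l_star, b_star):
    """ITA in degrees, using the common definition arctan((L* - 50) / b*).

    atan2 is used so that b* = 0 does not raise; for b* > 0 (typical skin)
    it equals the plain arctangent.
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))


def fit_ccm(device_lab, reference_lab):
    """Least-squares 3x3 matrix M such that device_lab @ M ~ reference_lab.

    Both inputs are (n_samples, 3) arrays of matched L*, a*, b* values.
    """
    device_lab = np.asarray(device_lab, dtype=float)
    reference_lab = np.asarray(reference_lab, dtype=float)
    matrix, *_ = np.linalg.lstsq(device_lab, reference_lab, rcond=None)
    return matrix


def apply_ccm(lab, matrix):
    """Apply a fitted correction matrix to (n_samples, 3) Lab values."""
    return np.asarray(lab, dtype=float) @ matrix


# Synthetic illustration: distort reference colors with a known matrix,
# then check that the fitted correction undoes the distortion.
rng = np.random.default_rng(0)
reference = rng.uniform([30.0, 5.0, 5.0], [80.0, 25.0, 30.0], size=(50, 3))
true_correction = np.array([[0.90, 0.02, 0.00],
                            [0.05, 1.10, 0.03],
                            [0.00, -0.04, 0.95]])
device = reference @ np.linalg.inv(true_correction)

ccm = fit_ccm(device, reference)
corrected = apply_ccm(device, ccm)
print(np.abs(corrected - reference).max())
print(individual_typology_angle(70.0, 20.0))
```

A region-specific variant would call `fit_ccm` once per anatomical region on that region's matched samples and apply each matrix only to pixels from the same region.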
Accurate sensing of chemical concentrations is essential for numerous biological processes. The accuracy of this sensing, for small numbers of molecules, is limited by shot noise. Corresponding theoretical limits on sensing precision, as a function of sensing duration, have been well studied in the context of quasi-static and randomly fluctuating concentrations. However, during development and in many other cases, concentration profiles are not random but exhibit predictable spatiotemporal patterns. We propose that leveraging prior knowledge of these structured profiles can improve and accelerate concentration sensing by utilizing information from current molecular binding events to predict future concentrations. By framing the constrained sensing problem as Bayesian inference over an allowed class of spatiotemporal profiles, we derive new theoretical limits on sensing accuracy. Our analysis reveals that maximum a posteriori (MAP) estimation can outperform the classical Berg-Purcell and maximum-likelihood (Poisson counting) limits, achieving a sensing precision of $\delta c/c = 1/\sqrt{a^2N}$, where $N$ is the number of binding events, and $a > 1$ in certain cases. Thus, knowledge of the statistical structure of concentration profiles enhances sensing precision, providing a potential explanation for the rapid yet highly accurate cell fate decisions observed during development.
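The practical meaning of the factor $a$ can be seen with a short calculation. The sketch below simply evaluates the stated formula $\delta c/c = 1/\sqrt{a^2N}$; the function names and the numerical values ($a = 2$, $N = 100$) are illustrative assumptions, and $a = 1$ is taken here to correspond to the Poisson counting limit $1/\sqrt{N}$.

```python
import math


def relative_precision(n_events, a=1.0):
    """Relative error delta_c / c = 1 / sqrt(a^2 * N).

    a = 1 is assumed to reproduce the Poisson counting limit 1 / sqrt(N).
    """
    if n_events <= 0 or a <= 0:
        raise ValueError("n_events and a must be positive")
    return 1.0 / math.sqrt(a * a * n_events)


def events_needed(target_precision, a=1.0):
    """Number of binding events N required to reach a target delta_c / c."""
    if target_precision <= 0 or a <= 0:
        raise ValueError("target_precision and a must be positive")
    return 1.0 / (a * target_precision) ** 2


# Illustrative values only: 100 binding events, with and without a = 2.
print(relative_precision(100, a=1.0))   # counting-limit error
print(relative_precision(100, a=2.0))   # error with structured prior, a = 2
print(events_needed(0.1, a=1.0))        # events for 10% error, a = 1
print(events_needed(0.1, a=2.0))        # events for 10% error, a = 2
```

Because the error scales as $1/(a\sqrt{N})$, a given precision is reached with $a^2$ times fewer binding events, which is one way to read the claim that structured priors accelerate sensing.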