We introduce a novel predictive coding framework for studying attachment theory. Building on an established model of attachment, the dynamic-maturational model (DMM), as well as the neuroanatomical Embodied Predictive Interoception Coding (EPIC) model of interoception and emotion, we not only elucidate how neural processes can shape attachment strategies, but also explore how early attachment experiences can shape those processes in the first place. Returning to John Bowlby's original vision for attachment theory, our framework is based on four simple, empirically supported principles that can readily be interpreted through predictive coding. We apply our framework to further our understanding of the attachment strategies in the DMM. Specifically, we propose that the type A strategies (analogous to "avoidant" or "dismissive" attachment) involve the suppression of interoceptive prediction errors as an adaptive response to maltreatment, relieving stress in the short term at the cost of interoceptive awareness in the long term. Furthermore, we propose that type C strategies (analogous to "ambivalent/resistant" or "preoccupied" attachment) involve the suppression of exteroceptive prediction errors to reflect the unreliability of external cues, motivating the obsessive seeking of information through increased vigilance and histrionic displays of affect. Finally, we explore the implications of our proposals, advancing several novel hypotheses with potential consequences for the treatment of attachment-related psychopathology.
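To make the type A proposal concrete, here is a minimal toy sketch (not from the paper) of precision-weighted belief updating: down-weighting the precision assigned to interoceptive prediction errors leaves the agent's beliefs insensitive to bodily signals. All names and values below are illustrative assumptions.

```python
# Toy predictive-coding step: the belief moves toward the signal in
# proportion to the precision assigned to the prediction error.
def update_belief(belief, signal, precision, lr=0.1):
    prediction_error = signal - belief
    return belief + lr * precision * prediction_error

interoceptive_signal = 1.0   # e.g., a persistent bodily distress cue
belief_secure = 0.0          # prior expectation
belief_type_a = 0.0

for _ in range(50):
    belief_secure = update_belief(belief_secure, interoceptive_signal, precision=1.0)
    belief_type_a = update_belief(belief_type_a, interoceptive_signal, precision=0.1)

print(f"secure: {belief_secure:.2f}, type A (suppressed errors): {belief_type_a:.2f}")
# The suppressed-precision agent stays far from the signal: short-term stress
# relief at the cost of interoceptive awareness, per the proposal above.
```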
This paper explores foundational questions about the relationship of qualia to natural selection. The primary result is a derivation of specific formal conditions under which structural systems subject to natural selection can convey consistent effects in an associated qualitative domain, placing theoretical and empirical constraints on theories of consciousness. To achieve this result, information-theoretic measures are developed that quantify the mutual determinability between structure and quality, i.e., the fidelity between the two domains, yielding a space of possible structure-quality relationships. The fidelities represented by that space are then incorporated into the Price Equation to yield key bounds on the transmission of selective effects between domains. Finally, the transmission of higher-order structures between domains is explored. Placement within a broader philosophical context can be found in the companion paper, Structure & Quality.
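For orientation, here is a worked instance of the classic Price equation that the paper builds on: the change in mean trait value decomposes into a selection (covariance) term and a transmission term, the second being the pathway into which structure-quality fidelities would enter. The data below are illustrative, not from the paper.

```python
# Price equation: dz_bar = Cov(w, z)/w_bar + E[w * dz]/w_bar
import numpy as np

w = np.array([1.0, 2.0, 0.5, 1.5])          # fitness of each parent type
z = np.array([0.2, 0.8, 0.1, 0.6])          # parent trait values ("structural" domain)
z_off = np.array([0.25, 0.75, 0.10, 0.70])  # mean offspring trait values
dz = z_off - z

w_bar = w.mean()
selection = np.cov(w, z, bias=True)[0, 1] / w_bar  # selection (covariance) term
transmission = np.mean(w * dz) / w_bar             # transmission-bias term
print(f"dz_bar = {selection + transmission:.4f} "
      f"(selection {selection:.4f} + transmission {transmission:.4f})")
```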
This paper explores the hard problem of consciousness from a different perspective. Instead of drawing distinctions between the physical and the mental, it examines a more foundational relationship: that between structure and quality. Information-theoretic measures are developed to quantify the mutual determinability between structure and quality, including a novel Q-S space for analyzing fidelity between the two domains. This space naturally points toward a five-fold categorization of possible relationships between structural and qualitative properties, each illustrated through conceptual and formal models. The ontological implications of each category are examined, shedding light on debates around functionalism, emergentism, idealism, panpsychism, and neutral monism. This line of inquiry establishes a framework for deriving theoretical constraints on qualitative systems undergoing evolution, which is explored in my companion paper, Qualia & Natural Selection.
This paper introduces a biomathematical model designed to describe the internal dynamics of dream formation and spontaneous cognitive processes. The model incorporates neurocognitive factors such as dissatisfaction, acceptance, forgetting, and mental activity, each of which is linked to established neural systems. We formulate a system of differential equations to simulate interactions among these variables and validate the model using simulated neural data. Our results demonstrate biologically plausible cognitive patterns consistent with findings from EEG and fMRI studies, particularly related to the default mode network (DMN), anterior cingulate cortex (ACC), and hippocampal memory mechanisms.
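Since the abstract does not reproduce the equations, the following is a minimal sketch of how a four-variable system over the named factors (dissatisfaction D, acceptance A, forgetting F, mental activity M) could be simulated. The specific couplings, rate constants, and initial conditions are illustrative placeholders, not the paper's actual model.

```python
# Illustrative coupled ODEs over the four neurocognitive factors.
import numpy as np
from scipy.integrate import solve_ivp

def dream_dynamics(t, y, k=(0.5, 0.3, 0.2, 0.4)):
    D, A, F, M = y
    k1, k2, k3, k4 = k
    dD = k1 * M - k2 * A * D   # activity raises dissatisfaction; acceptance damps it
    dA = k2 * D - k3 * A       # dissatisfaction recruits acceptance, which decays
    dF = k3 * A - k4 * F       # acceptance promotes forgetting
    dM = -k1 * D + k4 * F      # dissatisfaction suppresses, forgetting frees activity
    return [dD, dA, dF, dM]

sol = solve_ivp(dream_dynamics, t_span=(0, 50), y0=[0.5, 0.1, 0.1, 1.0],
                t_eval=np.linspace(0, 50, 500))
print(sol.y[:, -1])  # final state of (D, A, F, M)
```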
Quantum cognition has made it possible to model human cognitive processes very effectively, revealing numerous parallels between the properties of conceptual entities tested by the human mind and those of microscopic entities tested by measurement apparatuses. The success of quantum cognition has also made it possible to formulate an interpretation of quantum mechanics, called the conceptuality interpretation, which ascribes to quantum entities a conceptual nature similar to that of human concepts. The present work fits into these lines of research by analyzing a cognitive version of single-, double-, and triple-slit experiments. The data clearly show the formation of the typical interference fringes between the slits, as well as nascent secondary fringes. Our analysis also shows that while quantum entities and human concepts may share the same conceptual nature, the way they manifest it in specific contexts can be quite different. This is also evident from the significant deviation from zero observed for the Sorkin parameter, indicating the presence of strong irreducible third-order interference contributions in human decision-making.
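For reference, the Sorkin parameter mentioned above can be computed directly from the seven slit-opening probabilities; a minimal sketch follows, with placeholder probabilities rather than the study's data.

```python
def sorkin_kappa(p_abc, p_ab, p_ac, p_bc, p_a, p_b, p_c):
    """kappa = P(ABC) - P(AB) - P(AC) - P(BC) + P(A) + P(B) + P(C).
    Exactly zero under standard quantum mechanics; a significant deviation
    signals irreducible third-order interference."""
    return p_abc - p_ab - p_ac - p_bc + p_a + p_b + p_c

# Hypothetical probabilities for one detector position:
print(sorkin_kappa(p_abc=0.48, p_ab=0.30, p_ac=0.28, p_bc=0.26,
                   p_a=0.12, p_b=0.14, p_c=0.11))  # nonzero => third-order term
```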
Autonomous AI is no longer a distant concept: it enables agents to move beyond executing tasks to independently addressing complex problems, adapting to change, and handling the uncertainty of their environment. But what makes agents truly autonomous? The answer is agentic reasoning, which is crucial for foundation models to develop symbolic logic, statistical correlations, or large-scale pattern recognition in order to process information, draw inferences, and make decisions. However, it remains unclear why and how existing agentic reasoning approaches work, in comparison to biological reasoning, which is deeply rooted in neural mechanisms involving hierarchical cognition, multimodal integration, and dynamic interactions. In this work, we propose a novel neuroscience-inspired framework for agentic reasoning. Grounded in three neuroscience-based definitions and supported by mathematical and biological foundations, the framework provides a unified model of reasoning from perception to action, encompassing four core types (perceptual, dimensional, logical, and interactive) inspired by distinct functional roles observed in the human brain. We apply this framework to systematically classify and analyze existing AI reasoning methods, evaluating their theoretical foundations, computational designs, and practical limitations. We also explore its implications for building more generalizable, cognitively aligned agents in physical and virtual environments. Finally, building on our framework, we outline future directions and propose new neural-inspired reasoning methods, analogous to chain-of-thought prompting. By bridging cognitive neuroscience and AI, this work offers a theoretical foundation and practical roadmap for advancing agentic reasoning in intelligent systems. The associated project can be found at: https://github.com/BioRAILab/Awesome-Neuroscience-Agent-Reasoning .
We envision the "virtual eye" as a next-generation, AI-powered platform that uses interconnected foundation models to simulate the eye's intricate structure and biological function across all scales. Advances in AI, imaging, and multiomics provide a fertile ground for constructing a universal, high-fidelity digital replica of the human eye. This perspective traces the evolution from early mechanistic and rule-based models to contemporary AI-driven approaches, culminating in a unified model with multimodal, multiscale, dynamic predictive capabilities and embedded feedback mechanisms. We propose a development roadmap emphasizing the roles of large-scale multimodal datasets, generative AI, foundation models, agent-based architectures, and interactive interfaces. Despite challenges in interpretability, ethics, data processing, and evaluation, the virtual eye holds the potential to revolutionize personalized ophthalmic care and accelerate research into ocular health and disease.
The scarcity of high-quality multimodal biomedical data limits the ability to effectively fine-tune pretrained Large Language Models (LLMs) for specialized biomedical tasks. To address this challenge, we introduce MINT (Multimodal Integrated kNowledge Transfer), a framework that aligns unimodal large decoder models with domain-specific decision patterns from multimodal biomedical data through preference optimization. While MINT supports different optimization techniques, we primarily implement it with the Odds Ratio Preference Optimization (ORPO) framework as its backbone. This strategy enables the aligned LLMs to perform predictive tasks using text-only or image-only inputs while retaining knowledge learnt from multimodal data. MINT leverages an upstream multimodal machine learning (MML) model trained on high-quality multimodal data to transfer domain-specific insights to downstream text-only or image-only LLMs. We demonstrate its effectiveness through two key applications: (1) Rare genetic disease prediction from texts, where MINT uses a multimodal encoder model, trained on facial photos and clinical notes, to generate a preference dataset for aligning a lightweight Llama 3.2-3B-Instruct. Despite relying on text input only, the MINT-derived model outperforms models trained with SFT, RAG, or DPO, and even outperforms Llama 3.1-405B-Instruct. (2) Tissue type classification using cell nucleus images, where MINT uses a vision-language foundation model as the preference generator, containing knowledge learnt from both text and histopathological images to align downstream image-only models. The resulting MINT-derived model significantly improves the performance of Llama 3.2-Vision-11B-Instruct on tissue type classification. In summary, MINT provides an effective strategy to align unimodal LLMs with high-quality multimodal expertise through preference optimization.
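A minimal sketch of the alignment step, assuming TRL's ORPOTrainer as the ORPO backbone; the model identifier, hyperparameters, and the single preference pair below are illustrative assumptions, not the paper's training setup.

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_name = "meta-llama/Llama-3.2-3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical preference data: the upstream multimodal teacher ranks candidate
# answers so the text-only student inherits its decision patterns.
prefs = Dataset.from_dict({
    "prompt":   ["Clinical notes: coarse facial features ... Likely diagnosis?"],
    "chosen":   ["The findings are most consistent with ..."],  # teacher-preferred
    "rejected": ["No specific syndrome is suggested."],         # teacher-dispreferred
})

config = ORPOConfig(output_dir="mint-orpo", beta=0.1, per_device_train_batch_size=1)
trainer = ORPOTrainer(model=model, args=config, train_dataset=prefs,
                      processing_class=tokenizer)  # older TRL versions use tokenizer=
trainer.train()
```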
We propose a model for the evolutionary ecology of words as one attempt to extend evolutionary game theory and agent-based models by utilizing the rich linguistic expressions of Large Language Models (LLMs). Our model enables the emergence and evolution of a diverse, open-ended repertoire of interactions among agents. Within the population, each agent possesses a short word (or phrase) generated by an LLM and moves within a spatial environment. When agents become adjacent, the outcome of their interaction is determined by the LLM based on the relationship between their words, with the loser's word being replaced by the winner's. Word mutations, also based on LLM outputs, may occur. We conducted preliminary experiments under the assumption that "strong animal species" would survive. The results showed that, from an initial population consisting of well-known species, many new species emerged both gradually and in a punctuated-equilibrium manner. Each trial exhibited a unique evolutionary trajectory of diverse populations, with one broad category of species becoming dominant, such as terrestrial animals, marine life, or extinct species, many of them ecologically specialized forms adapted to diverse extreme habitats. We also conducted a long-term experiment with a large population, demonstrating the emergence and coexistence of diverse species.
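A minimal sketch of the simulation loop described above, with the LLM calls stubbed out as placeholder functions (llm_judge and llm_mutate are hypothetical names, not the paper's implementation):

```python
import random
from collections import Counter

def llm_judge(word_a, word_b):
    """Would ask an LLM which word 'wins' the interaction; stubbed as random."""
    return random.choice([word_a, word_b])

def llm_mutate(word):
    """Would ask an LLM for a variant of the word; stubbed as identity."""
    return word

GRID, N_AGENTS, MUT_RATE, STEPS = 20, 50, 0.05, 100
agents = [{"pos": (random.randrange(GRID), random.randrange(GRID)),
           "word": random.choice(["lion", "shark", "eagle"])} for _ in range(N_AGENTS)]

for _ in range(STEPS):
    for a in agents:  # random walk on a torus
        x, y = a["pos"]
        a["pos"] = ((x + random.choice([-1, 0, 1])) % GRID,
                    (y + random.choice([-1, 0, 1])) % GRID)
    for a in agents:  # co-located agents interact; loser adopts winner's word
        for b in agents:
            if a is not b and a["pos"] == b["pos"]:
                winner = llm_judge(a["word"], b["word"])
                a["word"] = b["word"] = winner
        if random.random() < MUT_RATE:
            a["word"] = llm_mutate(a["word"])

print(Counter(agent["word"] for agent in agents))  # surviving "species" counts
```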
Newcastle Disease Virus (NDV), classified as Avian orthoavulavirus 1 (avian paramyxovirus type 1), is a promising oncolytic agent that selectively targets and destroys cancer cells while sparing normal tissues. Its oncoselectivity exploits cancer-specific defects in antiviral defenses, particularly impaired Type I interferon signaling and dysregulated apoptotic pathways, enabling robust viral replication and cytotoxicity in malignancies such as breast cancer, colorectal cancer, and melanoma. NDV induces intrinsic and extrinsic apoptosis through caspase activation and triggers immunogenic cell death via damage-associated molecular patterns, stimulating potent antitumour immune responses. Additionally, NDV's potential as a vaccine vector expressing tumour-associated antigens offers prospects for prophylactic and therapeutic cancer applications. This review provides a comprehensive analysis of NDV's morphology, classification, and molecular biology, focusing on its viral entry and replication mechanisms in host cells. It explores NDV's interactions with cancer cells, emphasizing its ability to induce cytotoxicity and immune activation. Understanding these mechanisms is critical for optimizing NDV's oncolytic potential and advancing its clinical translation. Future directions include enhancing NDV through genetic engineering, combining it with therapies such as immune checkpoint inhibitors, and developing personalized medicine approaches tailored to tumour genomic profiles. These advancements position NDV as a versatile therapeutic agent in oncolytic virotherapy.
Duplicate marking is a critical preprocessing step in gene sequence analysis, flagging redundant reads that arise from polymerase chain reaction (PCR) amplification and sequencing artifacts. Although Picard MarkDuplicates is widely recognized as the gold-standard tool, its single-threaded implementation and reliance on global sorting result in significant computational and resource overhead, limiting its efficiency on large-scale datasets. Here, we introduce FastDup: a high-performance, scalable solution that follows a speculation-and-test mechanism. FastDup achieves up to 20x throughput speedup and guarantees 100% identical output compared to Picard MarkDuplicates. FastDup is a C++ program available from GitHub (https://github.com/zzhofict/FastDup.git) under the MIT license.
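For intuition, here is a minimal sketch of the core duplicate-marking criterion (Picard-style: reads sharing reference, unclipped 5' position, and strand form a duplicate set, and only the highest-quality member is kept). FastDup's speculation-and-test pipeline and C++ internals are not reproduced here; the toy tuples stand in for real BAM records.

```python
from collections import defaultdict

# (name, ref, unclipped_5prime_pos, strand, base_quality_sum)
reads = [
    ("r1", "chr1", 1000, "+", 900),
    ("r2", "chr1", 1000, "+", 950),   # same key as r1 -> duplicate set
    ("r3", "chr1", 2000, "-", 800),
]

groups = defaultdict(list)
for read in reads:
    groups[read[1:4]].append(read)    # key: (ref, pos, strand)

duplicates = set()
for members in groups.values():
    members.sort(key=lambda r: r[4], reverse=True)  # keep the highest-quality read
    duplicates.update(r[0] for r in members[1:])    # mark the rest

print(duplicates)  # {'r1'} is kept; {'r2'} is marked duplicate
```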
We implemented a dynamic agent-based network model to simulate the spread of mpox in a United States-based MSM population. This model allowed us to implement data-informed dynamic network evolution to simulate realistic disease spreading and behavioral adaptations. We found that behavior change, the reduction in one-time partnerships, and widespread vaccination are effective in preventing the transmission of mpox, and that earlier intervention has a greater effect, even when only a high-risk portion of the population participates. With no intervention, 16% of the population was infected (25th and 75th percentiles of simulations: 15.3%, 16.6%). With vaccination and behavior change in only the 25% of individuals most likely to have a one-time partner, cumulative infections were reduced by 30%, a total reduction of nearly 500 infections. Earlier intervention further reduces cumulative infections; beginning vaccination a year before the outbreak results in only 5.5% of men being infected, averting 950 infections, or nearly 10% of the total population in our model. We also show that sustained partnerships drive the early outbreak, while one-time partnerships drive transmission after the initial weeks. The median effective reproductive number, Rt, at t = 0 days is 1.30 for casual partnerships, 1.00 for main, and 0.60 for one-time. By t = 28, the median Rt for one-time partnerships has more than doubled to 1.48, while it decreased for casual and main partnerships: 0.46 and 0.29, respectively. With the ability to model individuals' behavior, mechanistic networks are particularly well suited to studying sexually transmitted infections, the spread and control of which are often governed by individual-level action. Our results contribute valuable insights into the role of different interventions and relationship types in mpox transmission dynamics.
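A minimal sketch of how a per-partnership-type Rt can be tallied from an agent-based transmission log: Rt is the mean number of secondary infections generated by agents infected around time t, split by the partnership type of each transmitting edge. The toy records below are fabricated for illustration and are not the study's data.

```python
from collections import defaultdict

# (infector, infectee, time_of_transmission, partnership_type)
log = [(1, 2, 3, "main"), (1, 3, 5, "one-time"), (2, 4, 10, "casual"),
       (3, 5, 12, "one-time"), (3, 6, 14, "one-time")]
infection_time = {1: 0, 2: 3, 3: 5, 4: 10, 5: 12, 6: 14}

def rt_by_type(t, window=7):
    """Secondary infections per agent infected in [t, t + window), by type."""
    cohort = [a for a, ti in infection_time.items() if t <= ti < t + window]
    counts = defaultdict(int)
    for infector, _, _, ptype in log:
        if infector in cohort:
            counts[ptype] += 1
    return {p: n / len(cohort) for p, n in counts.items()}

print(rt_by_type(0))  # per-type Rt for the earliest cohort of cases
```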
Drug resistance presents a major challenge in cancer therapy. Single cell profiling offers insights into cellular heterogeneity, yet the application of large-scale foundation models for predicting drug response in single cell data remains underexplored. To address this, we developed scDrugMap, an integrated framework featuring both a Python command-line interface and a web server for drug response prediction. scDrugMap evaluates a wide range of foundation models, including eight single-cell models and two large language models, using a curated dataset of over 326,000 cells in the primary collection and 18,800 cells in the validation set, spanning 36 datasets and diverse tissue and cancer types. We benchmarked model performance under pooled-data and cross-data evaluation settings, employing both layer freezing and Low-Rank Adaptation (LoRA) fine-tuning strategies. In the pooled-data scenario, scFoundation achieved the best performance, with mean F1 scores of 0.971 (layer freezing) and 0.947 (fine-tuning), outperforming the lowest-performing model by over 50%. In the cross-data setting, UCE excelled after fine-tuning (mean F1: 0.774), while scGPT led in zero-shot learning (mean F1: 0.858). Overall, scDrugMap provides the first large-scale benchmark of foundation models for drug response prediction in single-cell data and serves as a user-friendly, flexible platform for advancing drug discovery and translational research.
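A minimal sketch of the layer-freezing strategy mentioned above: the pretrained encoder is frozen and only a small classification head is trained to predict drug response. The encoder stand-in, dimensions, and data are placeholders, not scDrugMap's actual code.

```python
import torch
import torch.nn as nn

encoder = nn.TransformerEncoder(  # stand-in for a pretrained single-cell model
    nn.TransformerEncoderLayer(d_model=128, nhead=4, batch_first=True), num_layers=2)
for p in encoder.parameters():
    p.requires_grad = False       # freeze the foundation model

head = nn.Linear(128, 2)          # responder vs non-responder
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

cells = torch.randn(32, 50, 128)  # toy batch of tokenized cell profiles
labels = torch.randint(0, 2, (32,))
logits = head(encoder(cells).mean(dim=1))  # pooled representation -> head
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()                   # gradients flow only into the head
optimizer.step()
```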
Predicting enzymatic reactions is crucial for applications in biocatalysis, metabolic engineering, and drug discovery, yet it remains a complex and resource-intensive task. Large Language Models (LLMs) have recently demonstrated remarkable success in various scientific domains, e.g., through their ability to generalize knowledge, reason over complex structures, and leverage in-context learning strategies. In this study, we systematically evaluate the capability of LLMs, particularly the Llama-3.1 family (8B and 70B), across three core biochemical tasks: Enzyme Commission number prediction, forward synthesis, and retrosynthesis. We compare single-task and multitask learning strategies, employing parameter-efficient fine-tuning via LoRA adapters. Additionally, we assess performance across different data regimes to explore the models' adaptability in low-data settings. Our results demonstrate that fine-tuned LLMs capture biochemical knowledge, with multitask learning enhancing forward- and retrosynthesis predictions by leveraging shared enzymatic information. We also identify key limitations, for example, challenges with the hierarchical EC classification scheme, highlighting areas for further improvement in LLM-driven biochemical modeling.
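A minimal sketch of parameter-efficient fine-tuning with LoRA adapters via the peft library; the rank, target modules, and the multitask prompt formats in the comments are illustrative assumptions, not the paper's settings.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the low-rank adapters are trained

# A multitask setup would mix prompts such as:
#   "Task: EC prediction. Substrate SMILES: ... -> EC number:"
#   "Task: forward synthesis. Enzyme EC 1.1.1.1 + substrate ... -> product:"
```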
In a continental-scale fish abundance study, a major challenge in deriving an absolute abundance estimate lies in the fact that regional surveys deploy different gear types, each with its unique field of view, producing gear-specific relative abundance data. Thus, data from regional surveys in the study must be converted from the gear-specific relative scale to an absolute scale before being combined to estimate a continental-scale absolute abundance. In this paper, we develop a tool that takes gear-based data as input and produces the required conversion, with associated uncertainty, as output. Methodologically, this tool is operationalized through a Bayesian hierarchical model, which we develop in an inferential context akin to the change-of-support problem often encountered in spatial studies; the actual context here is to reconcile abundance data at various gear-specific scales, some relative and others absolute. We consider data from a small-scale calibration experiment in which 2 to 4 underwater video camera types, as well as an acoustic echosounder, were simultaneously deployed on each of 21 boat trips. While acoustic fish signals are recorded along transects on the absolute scale, they are subject to confounding from acoustically similar species, thus requiring an externally derived correction factor. Conversely, a camera allows visual distinction between species but records data on a gear-specific relative scale. Our statistical modeling framework reflects the relationship among all 5 gear types across the 21 trips, and the resulting model is used to derive calibration formulae that translate relative abundance data to the corrected absolute abundance scale whenever a camera is deployed alone. Cross-validation is conducted using mark-recapture abundance estimates. We also briefly discuss the case where one camera type is deployed alongside the echosounder.
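A minimal sketch of the calibration idea under simplifying assumptions (every gear deployed on every trip, Poisson counts): each camera gear sees counts proportional to an unknown gear-specific catchability times the trip-level absolute density, with the corrected echosounder anchoring the absolute scale. The priors and simulated data are illustrative, not the paper's model.

```python
import numpy as np
import pymc as pm
import pytensor.tensor as pt

trips, gears = 21, 5  # corrected echosounder + 4 camera types
rng = np.random.default_rng(1)
true_density = rng.lognormal(2.0, 0.5, trips)            # absolute scale
true_q = np.array([1.0, 0.3, 0.6, 1.2, 0.9])             # gear 0 = echosounder
counts = rng.poisson(true_q[None, :] * true_density[:, None])

with pm.Model() as calib:
    density = pm.LogNormal("density", mu=2.0, sigma=1.0, shape=trips)
    q_cam = pm.HalfNormal("q_cam", sigma=2.0, shape=gears - 1)
    q = pt.concatenate([pt.ones(1), q_cam])              # echosounder anchors scale
    pm.Poisson("obs", mu=q[None, :] * density[:, None], observed=counts)
    idata = pm.sample(1000, tune=1000, chains=2)

# Posterior q_cam gives camera-to-absolute conversion factors, with uncertainty.
```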
Generating molecules that bind to specific protein targets via diffusion models has shown considerable promise for structure-based drug design and molecule optimization. In particular, diffusion models with binding-interaction guidance enable molecule generation with high affinity by forming favorable interactions within the protein pocket. However, the generated molecules may not form interactions with highly conserved residues, which are important for protein functions and the bioactivities of the ligands. Herein, we developed a new 3D target-aware diffusion model, DiffDecip, which explicitly incorporates protein-ligand binding interactions and the evolutionary conservation of protein residues into both the diffusion and sampling processes, for molecule optimization through scaffold decoration. Performance evaluation revealed that DiffDecip outperforms the baseline model DiffDec on molecule optimization toward higher affinity, forming more non-covalent interactions with highly conserved residues in the protein pocket.
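A minimal, generic sketch of interaction-guided reverse diffusion: each denoising step is nudged by the gradient of a score that rewards contacts with conserved residues. The toy score, weights, and schematic update are assumptions; DiffDecip's actual 3D target-aware architecture is not reproduced here.

```python
import torch

def interaction_score(x, conserved_sites, conservation_weight):
    """Toy score: reward proximity of ligand atoms to conserved residue sites,
    weighted by each site's evolutionary conservation."""
    dists = torch.cdist(x, conserved_sites)      # (atoms, sites)
    return -(conservation_weight * dists.min(dim=0).values).sum()

x = torch.randn(8, 3, requires_grad=True)        # 8 "atoms" in 3D
sites = torch.randn(3, 3)                        # conserved residue positions
w = torch.tensor([0.9, 0.7, 0.8])                # conservation weights

for t in range(50, 0, -1):                       # schematic reverse diffusion
    eps_hat = 0.1 * x.detach()                   # stand-in for the denoiser output
    score = interaction_score(x, sites, w)
    guidance = torch.autograd.grad(score, x)[0]  # push toward conserved contacts
    with torch.no_grad():
        x = x - eps_hat + 0.05 * guidance + 0.01 * torch.randn_like(x)
    x.requires_grad_(True)
```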
This study systematically evaluates 27 frontier Large Language Models on eight diverse biology benchmarks spanning molecular biology, genetics, cloning, virology, and biosecurity. Models from major AI developers released between November 2022 and April 2025 were assessed through ten independent runs per benchmark. The findings reveal dramatic improvements in biological capabilities. Top model performance increased more than 4-fold on the challenging text-only subset of the Virology Capabilities Test over the study period, with the top model now performing twice as well as expert virologists. Several models now match or exceed expert-level performance on other challenging benchmarks, including LAB-Bench CloningScenarios and the biology subsets of GPQA and WMDP. Contrary to expectations, chain-of-thought prompting did not substantially improve performance over zero-shot evaluation, whereas the extended reasoning features in o3-mini and Claude 3.7 Sonnet typically improved performance, as predicted by inference scaling. Benchmarks such as PubMedQA and the MMLU and WMDP biology subsets exhibited performance plateaus well below 100%, suggesting benchmark saturation and errors in the underlying benchmark data. The analysis highlights the need for more sophisticated evaluation methodologies as AI systems continue to advance.