The stock market is extremely difficult to predict in the short term due to high volatility, news-driven shifts, and the non-linear nature of financial time series. This research proposes a novel framework for improving minute-level prediction accuracy using semantic sentiment scores from ten leading large language models (LLMs) combined with minute-interval intraday stock price data. We systematically constructed a time-aligned dataset of AAPL news articles and 1-minute Apple Inc. (AAPL) stock prices for April 4 through May 2, 2025. Sentiment analysis was performed with the DeepSeek-V3, GPT variants, LLaMA, Claude, Gemini, Qwen, and Mistral models through their APIs. Each article received sentiment scores from all ten LLMs, which were scaled to a [0, 1] range and combined with prices and technical indicators such as RSI, ROC, and Bollinger Band Width. Two state-of-the-art architectures, Reformer and Mamba, were trained separately on the dataset using the sentiment scores produced by each LLM as input. Hyperparameters were optimized with Optuna, and models were evaluated over a 3-day evaluation period. Mean squared error (MSE) served as the evaluation metric, and Mamba was not only faster but also more accurate than Reformer across all 10 LLMs tested, performing best with LLaMA 3.3-70B at the lowest error of 0.137. While Reformer could capture broader trends in the data, it tended to over-smooth sudden changes signaled by the LLM sentiment scores. This study highlights the potential of integrating LLM-based semantic analysis with efficient temporal modeling to enhance real-time financial forecasting.
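A minimal sketch of the feature-construction step described above (1-minute prices plus [0, 1]-scaled LLM sentiment and RSI/ROC/Bollinger Band Width); the column names, window lengths, and forward-fill alignment are illustrative assumptions, not the paper's exact pipeline.

```python
import pandas as pd

def rsi(close: pd.Series, window: int = 14) -> pd.Series:
    """Relative Strength Index over `window` minutes."""
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(window).mean()
    loss = (-delta.clip(upper=0)).rolling(window).mean()
    return 100 - 100 / (1 + gain / loss)

def roc(close: pd.Series, window: int = 10) -> pd.Series:
    """Rate of change (percent) over `window` minutes."""
    return close.pct_change(window) * 100

def bollinger_band_width(close: pd.Series, window: int = 20, k: float = 2.0) -> pd.Series:
    """(upper - lower) / middle for k-sigma Bollinger Bands."""
    mid = close.rolling(window).mean()
    std = close.rolling(window).std()
    return (2 * k * std) / mid

def build_features(prices: pd.DataFrame, news: pd.DataFrame, llm_col: str) -> pd.DataFrame:
    """prices: 1-minute AAPL bars indexed by timestamp with a 'close' column.
    news: one row per article with 'timestamp' and a raw sentiment column per LLM."""
    feats = prices.copy()
    feats["rsi"] = rsi(feats["close"])
    feats["roc"] = roc(feats["close"])
    feats["bbw"] = bollinger_band_width(feats["close"])
    # Scale one LLM's sentiment scores to [0, 1] and align them to the minute grid.
    s = news.set_index("timestamp")[llm_col].sort_index()
    s = (s - s.min()) / (s.max() - s.min())
    s = s.groupby(level=0).mean()            # average if several articles share a minute
    feats["sentiment"] = s.reindex(feats.index, method="ffill")
    return feats.dropna()
```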
We provide series expansions for the tempered stable densities and for the price of European-style contracts in the exponential Lévy model driven by the tempered stable process. These formulas recover several popular option pricing models and become particularly simple in specific cases such as the bilateral Gamma process and the one-sided tempered stable process. Compared to traditional Fourier pricing, our method has the advantage of being hyperparameter-free. We also provide a detailed numerical analysis and show that our technique is competitive with state-of-the-art pricing methods.
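For orientation, the display below gives the standard two-sided tempered stable Lévy measure such a model is typically driven by; the notation $(c_\pm,\lambda_\pm,\alpha_\pm)$ is ours and the paper's parameterization may differ. Setting $\alpha_+=\alpha_-=0$ yields the bilateral Gamma process, while $c_-=0$ yields a one-sided tempered stable process.
\[
\nu(dx) \;=\; c_+\, x^{-1-\alpha_+} e^{-\lambda_+ x}\, \mathbf{1}_{\{x>0\}}\, dx \;+\; c_-\, |x|^{-1-\alpha_-} e^{-\lambda_- |x|}\, \mathbf{1}_{\{x<0\}}\, dx, \qquad c_\pm \ge 0,\ \lambda_\pm > 0,\ \alpha_\pm < 2.
\]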
Previous research indicates that zero tillage technology offers a profitable alternative to crop residue burning, with significant potential to reduce agricultural emissions and contribute to improvements in air quality and public health. Yet, empirical evidence on the link between zero tillage adoption and residue burning remains scarce, adding to the difficulties policy makers face in this context. This study addresses this gap by integrating high-resolution satellite imagery with household survey data from India to examine the empirical relationship between zero tillage and residue burning. We compare different methods for constructing burn indicators from remote-sensing data and assess their predictive power against survey-based measures. Our findings reveal a robust negative association between zero tillage and crop residue burning, with reductions in the incidence of burning of 50% or more across both survey data and satellite-derived indicators. By providing insights into optimal geospatial data integration methods, our study also makes a methodological contribution that can inform future research and support evidence-based policy interventions for more sustainable agricultural practices.
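As a concrete illustration of the survey-side estimation, the sketch below relates a binary burn indicator to a zero-tillage dummy with a logit; the variable names, controls, and fixed effects are illustrative assumptions, not the paper's exact specification.

```python
import pandas as pd
import statsmodels.formula.api as smf

def burn_zt_association(df: pd.DataFrame):
    """Logit of a binary burn indicator on a zero-tillage dummy plus controls.
    Column names (burned, zero_tillage, farm_size, irrigated, district) are
    hypothetical, not the survey's actual variable names."""
    res = smf.logit(
        "burned ~ zero_tillage + farm_size + irrigated + C(district)", data=df
    ).fit(disp=False)
    return res.params["zero_tillage"], res.bse["zero_tillage"]

# Usage: coef, se = burn_zt_association(survey_df)
# A negative, significant coefficient corresponds to the reduction in burning reported above.
```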
This study investigates the application of machine learning techniques, specifically Neural Networks, Random Forests, and CatBoost, to option pricing, in comparison with traditional models such as the Black-Scholes and Heston models. Using both synthetically generated data and real market option data, each model is evaluated on its ability to predict option prices. The results show that machine learning models can capture complex, non-linear relationships in option prices and, in several cases, outperform both the Black-Scholes and Heston models. These findings highlight the potential of data-driven methods to improve pricing accuracy and better reflect market dynamics.
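To make the comparison concrete, here is a minimal sketch that prices European calls with the Black-Scholes formula and fits a random forest to synthetic data generated from it; the parameter ranges and model settings are illustrative and not taken from the paper.

```python
import numpy as np
from scipy.stats import norm
from sklearn.ensemble import RandomForestRegressor

def black_scholes_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call option."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

# Synthetic training set: random (S, K, T, r, sigma) draws labelled with BS prices.
rng = np.random.default_rng(0)
n = 20_000
X = np.column_stack([
    rng.uniform(50, 150, n),     # spot S
    rng.uniform(50, 150, n),     # strike K
    rng.uniform(0.05, 2.0, n),   # maturity T (years)
    rng.uniform(0.0, 0.05, n),   # risk-free rate r
    rng.uniform(0.1, 0.6, n),    # volatility sigma
])
y = black_scholes_call(*X.T)

rf = RandomForestRegressor(n_estimators=200, n_jobs=-1, random_state=0).fit(X, y)
test = np.array([[100.0, 105.0, 0.5, 0.02, 0.25]])
print("RF:", rf.predict(test)[0], "BS:", black_scholes_call(*test[0]))
```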
This paper investigates the impact of the adoption of generative AI on financial stability. We conduct laboratory-style experiments using large language models to replicate classic studies on herd behavior in trading decisions. Our results show that AI agents make more rational decisions than humans, relying predominantly on private information over market trends. Increased reliance on AI-powered trading advice could therefore potentially lead to fewer asset price bubbles arising from animal spirits that trade by following the herd. However, exploring variations in the experimental settings reveals that AI agents can be induced to herd optimally when explicitly guided to make profit-maximizing decisions. While optimal herding improves market discipline, this behavior still carries potential implications for financial stability. In other experimental variations, we show that AI agents are not purely algorithmic, but have inherited some elements of human conditioning and bias.
Equity markets have long been regarded as unpredictable, with intraday price movements treated as stochastic noise. This study challenges that view by introducing the Extended Samuelson Model (ESM), a natural science-based framework that captures the dynamic, causal processes underlying market behavior. ESM identifies peaks, troughs, and turning points across multiple timescales and demonstrates temporal compatibility: finer timeframes contain all signals of broader ones while offering sharper directional guidance. Beyond theory, ESM translates into practical trading strategies. During intraday sessions, it reliably anticipates short-term reversals and longer-term trends, even under the influence of breaking news. Its eight market states and six directional signals provide actionable guardrails for traders, enabling consistent profit opportunities. Notably, even during calm periods, ESM can capture 10-point swings in the S&P 500, equivalent to $500 per E-mini futures contract. These findings resonate with the state-based approaches attributed to Renaissance Technologies' Medallion Fund, which delivered extraordinary returns through systematic intraday trading. By bridging normal conditions with crisis dynamics, ESM not only advances the scientific understanding of market evolution but also provides a robust, actionable roadmap for profitable trading.
This paper examines the impact of temperature shocks on European Parliament elections. We combine high-resolution climate data with results from parliamentary elections between 1989 and 2019, aggregated at the NUTS-2 regional level. Exploiting exogenous variation in unusually warm and hot days during the months preceding elections, we identify the effect of short-run temperature shocks on voting behaviour. We find that temperature shocks reduce ideological polarisation and increase vote concentration, as voters consolidate around larger, more moderate parties. This aggregated pattern is explained by a gain in support of liberal and, to a lesser extent, social democratic parties, while right-wing parties lose vote share. Consistent with a salience mechanism, complementary analysis of party manifestos shows greater emphasis on climate-related issues in warmer pre-electoral contexts. Overall, our findings indicate that climate shocks can shift party systems toward the centre and weaken political extremes.
Wiesel and Zhang [2023] established that two probability measures $\mu,\nu$ on $\mathbb{R}^d$ with finite second moments are in convex order (i.e., $\mu \preceq_c \nu$) if and only if $W_2(\nu,\rho)^2-W_2(\mu,\rho)^2 \leq \int |y|^2\nu(dy) - \int |x|^2\mu(dx)$ for all probability measures $\rho$ with finite second moment. Let us call a measure $\rho$ maximizing $W_2(\nu,\rho)^2-W_2(\mu,\rho)^2$ the optimal $\rho$. This paper summarizes key findings by Wiesel and Zhang, develops new algorithms enhancing the search for the optimal $\rho$, and builds on their work by constructing a model-independent arbitrage strategy and developing associated numerical methods via the convex function recovered from the optimal $\rho$ through Brenier's theorem. In addition to examining the link between convex order and arbitrage through the lens of optimal transport, the paper also gives a brief survey of functionally generated portfolios in stochastic portfolio theory and offers a conjecture on the link between convex order and arbitrage between two functionally generated portfolios.
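A minimal numerical check of the criterion for a single candidate $\rho$ using the POT library (empirical measures with uniform weights); this only evaluates the inequality at one trial $\rho$ and does not perform the maximization over $\rho$ that defines the optimal one.

```python
import numpy as np
import ot  # Python Optimal Transport (POT)

def w2_squared(X, Y):
    """Squared 2-Wasserstein distance between two empirical measures (uniform weights)."""
    a = np.full(len(X), 1.0 / len(X))
    b = np.full(len(Y), 1.0 / len(Y))
    M = ot.dist(X, Y)                 # squared Euclidean cost by default
    return ot.emd2(a, b, M)

def convex_order_gap(mu, nu, rho):
    """Left-hand side minus right-hand side of the inequality for a trial rho.
    A strictly positive value certifies that mu and nu are NOT in convex order."""
    lhs = w2_squared(nu, rho) - w2_squared(mu, rho)
    rhs = (nu**2).sum(axis=1).mean() - (mu**2).sum(axis=1).mean()
    return lhs - rhs

rng = np.random.default_rng(0)
mu = rng.normal(0, 1, size=(500, 2))       # mu ~ N(0, I)
nu = rng.normal(0, 2, size=(500, 2))       # nu ~ N(0, 4I): a mean-preserving spread of mu
rho = rng.normal(0, 1.5, size=(500, 2))    # an arbitrary trial rho
print(convex_order_gap(mu, nu, rho))       # expected to be <= 0 up to sampling error
```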
The Santa Fe model is an established econophysics model describing the stochastic dynamics of the limit order book from the viewpoint of the zero-intelligence approach. While its foundation was studied by combining a dimensional analysis and a mean-field theory by E. Smith et al. in Quantitative Finance 2003, their arguments are rather heuristic and lack a solid mathematical foundation; indeed, their mean-field equations were derived with heuristic arguments and their solutions were not explicitly obtained. In this work, we revisit the mean-field theory of the Santa Fe model from the viewpoint of kinetic theory -- a traditional mathematical program in statistical physics. We study the exact master equation for the Santa Fe model and systematically derive the Bogoliubov-Born-Green-Kirkwood-Yvon (BBGKY) hierarchical equation. By applying the mean-field approximation, we derive the mean-field equation for the order-book density profile, parallel to the Boltzmann equation in conventional statistical physics. Furthermore, we obtain explicit, closed-form expressions for the mean-field solutions. Our solutions have several implications: (1) Our scaling formulas are available for both $\mu\to 0$ and $\mu\to \infty$ asymptotics, where $\mu$ is the market-order submission intensity. In particular, the mean-field theory works very well for small $\mu$, while its validity is partially limited for large $\mu$. (2) The ``method of images'' solution, heuristically derived by Bouchaud-Mézard-Potters in Quantitative Finance 2002, is recovered for large $\mu$, providing a mathematical foundation for their heuristic arguments. (3) Finally, we point out an error in E. Smith et al. 2003 in the scaling law for the diffusion constant, due to a misspecification in their dimensional analysis.
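A heavily simplified zero-intelligence simulation in the spirit of the Santa Fe model: buy (sell) limit orders are deposited uniformly below the best ask (above the best bid) at rate $\alpha$ per price, market orders arrive at total rate $\mu$, and each resting order is cancelled at rate $\delta$. The price grid, initial book seeding, and event bookkeeping are our own simplifications, not the paper's kinetic-theory treatment.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200            # number of price ticks
alpha = 1.0        # limit-order arrival rate per price level and side
mu = 10.0          # total market-order arrival rate (mu/2 per side)
delta = 0.2        # cancellation rate per resting order

bids = np.zeros(N, dtype=int)   # resting buy orders per tick
asks = np.zeros(N, dtype=int)   # resting sell orders per tick
bids[: N // 2] = 1              # seed an initial book around the mid price
asks[N // 2:] = 1

def best_bid():                 # highest tick holding a buy order (or -1 if none)
    nz = np.nonzero(bids)[0]
    return nz[-1] if nz.size else -1

def best_ask():                 # lowest tick holding a sell order (or N if none)
    nz = np.nonzero(asks)[0]
    return nz[0] if nz.size else N

for _ in range(100_000):
    bb, ba = best_bid(), best_ask()
    n_buy_sites = ba                  # buy limit orders allowed at ticks [0, ba)
    n_sell_sites = N - bb - 1         # sell limit orders allowed at ticks (bb, N)
    rates = np.array([
        alpha * n_buy_sites,          # 0: new buy limit order
        alpha * n_sell_sites,         # 1: new sell limit order
        mu / 2, mu / 2,               # 2, 3: buy / sell market order
        delta * bids.sum(),           # 4: cancel a resting bid
        delta * asks.sum(),           # 5: cancel a resting ask
    ])
    event = rng.choice(6, p=rates / rates.sum())
    if event == 0 and n_buy_sites > 0:
        bids[rng.integers(0, ba)] += 1
    elif event == 1 and n_sell_sites > 0:
        asks[rng.integers(bb + 1, N)] += 1
    elif event == 2 and ba < N:       # buy market order hits the best ask
        asks[ba] -= 1
    elif event == 3 and bb >= 0:      # sell market order hits the best bid
        bids[bb] -= 1
    elif event == 4 and bids.sum() > 0:
        bids[rng.choice(N, p=bids / bids.sum())] -= 1
    elif event == 5 and asks.sum() > 0:
        asks[rng.choice(N, p=asks / asks.sum())] -= 1

print("final spread (ticks):", best_ask() - best_bid())
```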
Text-to-SQL, the task of translating natural language questions into SQL queries, has long been a central challenge in NLP. While progress has been significant, applying it to the financial domain remains especially difficult due to complex schema, domain-specific terminology, and high stakes of error. Despite this, there is no dedicated large-scale financial dataset to advance research, creating a critical gap. To address this, we introduce a curated financial dataset (FINCH) comprising 292 tables and 75,725 natural language-SQL pairs, enabling both fine-tuning and rigorous evaluation. Building on this resource, we benchmark reasoning models and language models of varying scales, providing a systematic analysis of their strengths and limitations in financial Text-to-SQL tasks. Finally, we propose a finance-oriented evaluation metric (FINCH Score) that captures nuances overlooked by existing measures, offering a more faithful assessment of model performance.
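The FINCH Score itself is not specified in the abstract; as a baseline, execution accuracy against the underlying database is the usual starting point for Text-to-SQL evaluation, sketched below (SQLite is assumed purely for illustration).

```python
import sqlite3

def execution_match(db_path: str, gold_sql: str, pred_sql: str) -> bool:
    """Return True if the predicted query yields the same result set as the gold query."""
    with sqlite3.connect(db_path) as conn:
        try:
            gold = conn.execute(gold_sql).fetchall()
            pred = conn.execute(pred_sql).fetchall()
        except sqlite3.Error:
            return False   # unexecutable predictions count as failures
    # Order-insensitive comparison of the two result multisets.
    return sorted(map(tuple, gold)) == sorted(map(tuple, pred))

# Execution accuracy over a benchmark is then the mean of execution_match over all pairs.
```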
Battery Energy Storage Systems (BESS) are a cornerstone of the energy transition, as their ability to shift electricity across time enables both grid stability and the integration of renewable generation. This paper investigates the profitability of different market bidding strategies for BESS in the Central European wholesale power market, focusing on the day-ahead auction and intraday trading at EPEX Spot. We employ the rolling intrinsic approach as a realistic trading strategy for continuous intraday markets, explicitly incorporating bid--ask spreads to account for liquidity constraints. Our analysis shows that multi-market bidding strategies consistently outperform single-market participation. Furthermore, we demonstrate that maximum cycle limits significantly affect profitability, indicating that more flexible strategies which relax daily cycling constraints while respecting annual limits can unlock additional value.
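A simplified single-shot intrinsic optimization for the day-ahead auction with a daily cycle cap, formulated as a linear program; the battery parameters and hourly prices are illustrative, and this omits the rolling-intrinsic updating and bid-ask spreads modeled in the paper.

```python
import numpy as np
from scipy.optimize import linprog

def day_ahead_arbitrage(prices, power=1.0, energy=2.0, eta=0.95, max_cycles=1.0, soc0=0.0):
    """Maximize arbitrage profit over hourly prices subject to power, energy,
    efficiency, and daily cycle-limit constraints.
    Decision vector x = [charge_1..charge_T, discharge_1..discharge_T] in MWh."""
    T = len(prices)
    p = np.asarray(prices, dtype=float)
    c_obj = np.concatenate([p, -p])          # linprog minimizes: cost of charging - revenue
    L = np.tril(np.ones((T, T)))             # cumulative-sum operator
    A_soc = np.hstack([eta * L, -L / eta])   # state of charge after t steps (relative to soc0)
    A_ub = np.vstack([A_soc, -A_soc,
                      np.concatenate([np.zeros(T), np.ones(T)])[None, :]])
    b_ub = np.concatenate([
        np.full(T, energy - soc0),           # soc <= energy capacity
        np.full(T, soc0),                    # soc >= 0
        [max_cycles * energy],               # discharged energy <= daily cycle cap
    ])
    res = linprog(c_obj, A_ub=A_ub, b_ub=b_ub, bounds=[(0, power)] * (2 * T), method="highs")
    return -res.fun, res.x[:T], res.x[T:]

profit, charge, discharge = day_ahead_arbitrage([30, 25, 20, 60, 90, 80, 40, 35])
print(f"optimal profit: {profit:.1f} EUR")
```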
Dependence among multiple lifetimes is a key factor for pricing and evaluating the risk of joint life insurance products. The dependence structure can be exposed to model uncertainty when available data and information are limited. We address robust pricing and risk evaluation of joint life insurance products against dependence uncertainty among lifetimes. We first show that, for some class of standard contracts, the risk evaluation based on distortion risk measure is monotone with respect to the concordance order of the underlying copula. Based on this monotonicity, we then study the most conservative and anti-conservative risk evaluations for this class of contracts. We prove that the bounds for the mean, Value-at-Risk and Expected shortfall are computed by a combination of linear programs when the uncertainty set is defined by some norm-ball centered around a reference copula. Our numerical analysis reveals that the sensitivity of the risk evaluation against the choice of the copula differs depending on the risk measure and the type of the contract, and our proposed bounds can improve the existing bounds based on the available information.
We explore a link between stochastic volatility (SV) and path-dependent volatility (PDV) models. Using assumed density filtering, we map a given SV model into a corresponding PDV representation. The resulting specification is lightweight, improves in-sample fit, and delivers robust out-of-sample forecasts. We also introduce a calibration procedure for both SV and PDV models that produces standard errors for parameter estimates and supports joint calibration of the SPX/VIX smiles.
This work outlines the modeling steps for developing a tool aimed at supporting policymakers in guiding policies toward more sustainable wheat production. In the agricultural sector, policies affect a highly diverse set of farms, which differ across several dimensions such as size, land composition, local climate, and irrigation availability. To address this significant heterogeneity, we construct an Agent-Based Model (ABM). The model is initialized using a representative survey of Italian farms, which captures their heterogeneity. The ABM is then scaled to include a number of farms comparable to those operating nationwide. To capture broader dynamics, the ABM is integrated with two additional components: a global model of international wheat markets and a tool for assessing the environmental impacts of wheat production. This integrated framework enables us to account for the feedback loop between global prices and local production while evaluating the environmental implications of policy measures.
Domain-specific quantitative reasoning remains a major challenge for large language models (LLMs), especially in fields requiring expert knowledge and complex question answering (QA). In this work, we propose Expert Question Decomposition (EQD), an approach designed to balance the use of domain knowledge with computational efficiency. EQD is built on a two-step fine-tuning framework and guided by a reward function that measures the effectiveness of generated sub-questions in improving QA outcomes. It requires only a few thousand training examples and a single A100 GPU for fine-tuning, with inference time comparable to zero-shot prompting. Beyond its efficiency, EQD outperforms state-of-the-art domain-tuned models and advanced prompting strategies. We evaluate EQD in the financial domain, characterized by specialized knowledge and complex quantitative reasoning, across four benchmark datasets. Our method consistently improves QA performance by 0.6% to 10.5% across different LLMs. Our analysis reveals an important insight: in domain-specific QA, a single supporting question often provides greater benefit than detailed guidance steps.
What role do non-elected bureaucrats play when elections provide imperfect accountability and create incentives for pandering? We develop a model where politicians and bureaucrats interact to implement policy. Both can either be good, sharing the voters' preferences over policies, or bad, intent on enacting policies that favor special interests. Our analysis identifies the conditions under which good bureaucrats choose to support, oppose, or force pandering. When bureaucrats wield significant influence over policy decisions, good politicians lose their incentives to pander, a shift that ultimately benefits voters. An intermediate level of bureaucratic influence over policymaking can be voter-optimal: large enough to prevent pandering but small enough to avoid granting excessive influence to potentially bad bureaucrats.
This paper proposes a ridgeless kernel method for solving infinite-horizon, deterministic, continuous-time models in economic dynamics, formulated as systems of differential-algebraic equations with asymptotic boundary conditions (e.g., transversality). Traditional shooting methods enforce the asymptotic boundary conditions by targeting a known steady state -- which is numerically unstable, hard to tune, and unable to address cases with steady-state multiplicity. Instead, our approach solves the underdetermined problem without imposing the asymptotic boundary condition, using regularization to select the unique solution fulfilling transversality among admissible trajectories. In particular, ridgeless kernel methods recover this path by selecting the minimum norm solution, coinciding with the non-explosive trajectory. We provide theoretical guarantees showing that kernel solutions satisfy asymptotic boundary conditions without imposing them directly, and we establish a consistency result ensuring convergence within the solution concept of differential-algebraic equations. Finally, we illustrate the method in canonical models and demonstrate its ability to handle problems with multiple steady states.
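To illustrate the minimum-norm selection that "ridgeless" refers to, here is a tiny kernel interpolant whose coefficients are obtained with a pseudo-inverse (the zero-ridge limit), which picks the minimum-RKHS-norm function among all interpolants; the toy decaying trajectory and lengthscale are ours and this is not the paper's differential-algebraic solver.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    """Gaussian (RBF) kernel matrix k(x, y) = exp(-||x - y||^2 / (2 l^2))."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * lengthscale**2))

class RidgelessKernelInterpolant:
    """Minimum-norm kernel interpolant: alpha = K^+ y (ridge penalty -> 0)."""
    def fit(self, X, y, lengthscale=1.0):
        self.X, self.l = X, lengthscale
        K = rbf_kernel(X, X, lengthscale)
        self.alpha = np.linalg.pinv(K) @ y   # pseudo-inverse = ridgeless limit
        return self
    def predict(self, Xnew):
        return rbf_kernel(Xnew, self.X, self.l) @ self.alpha

# Toy usage: interpolate a non-explosive, saddle-path-like trajectory from a few collocation points.
t = np.linspace(0.0, 10.0, 12)[:, None]
y = np.exp(-0.5 * t[:, 0])                  # stand-in for a decaying solution path
model = RidgelessKernelInterpolant().fit(t, y, lengthscale=2.0)
print(model.predict(np.array([[2.5], [7.5]])))
```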
In a continuous-time economy, this paper formulates the Epstein-Zin preference for discounted dividends received by an investor as an Epstein-Zin singular control utility. We introduce a backward stochastic differential equation with an aggregator integrated with respect to a singular control, prove its well-posedness, and show that it coincides with the Epstein-Zin singular control utility. We then establish that this formulation is equivalent to a robust dividend policy chosen by the firm's executive under Maenhout's ambiguity-averse preference. In particular, the robust dividend policy takes the form of a threshold strategy on the firm's surplus process, where the threshold level is characterized as the free boundary of a Hamilton-Jacobi-Bellman variational inequality. Therefore, dividend-caring investors can choose firms that match their preferences by examining stocks' dividend policies and financial statements, whereas executives can use dividends to signal their confidence, in the form of ambiguity aversion, in realizing the earnings implied by their financial statements.
This research expands the existing literature on Bitcoin (BTC) price misalignments by incorporating transaction-level data from a peer-to-peer (P2P) exchange (LB). It examines how broader economic and regulatory factors influence cryptocurrency markets and highlights the role of cryptocurrencies in facilitating international capital movements. By constructing shadow exchange rates (SERs) for national currencies against the US dollar based on BTC prices, we calculate discrepancies between these SERs and their official exchange rates (OERs), referred to as BTC premiums. We analyze various factors driving the BTC premiums on LB, including those sourced from the BTC blockchain, mainstream centralized BTC exchanges, and international capital transfer channels. Unlike in centralized markets, our results indicate that the microstructure of the BTC blockchain does not correlate with BTC premiums in the P2P market. Regarding frictions from international capital transfers, we interpret remittance costs as indicators of inefficiencies in traditional capital transfer systems. For constrained currencies subject to severe capital controls and managed exchange rate regimes, increased transaction costs in conventional currency exchange channels almost entirely translate into higher BTC premiums. Additionally, our analysis suggests that BTC premiums can serve as short-term predictors of future exchange rate depreciation for unconstrained currencies.
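The shadow-exchange-rate construction described above reduces to a one-line calculation; the function below follows the abstract's description, with hypothetical numbers used purely for illustration.

```python
def btc_premium(p2p_btc_local: float, btc_usd: float, official_rate: float) -> float:
    """Premium of the BTC-implied shadow exchange rate (SER) over the official rate (OER).
    p2p_btc_local: BTC price in local currency on the P2P exchange
    btc_usd:       BTC price in USD on centralized exchanges
    official_rate: official local-currency-per-USD exchange rate"""
    ser = p2p_btc_local / btc_usd          # shadow exchange rate, local currency per USD
    return ser / official_rate - 1.0

# Hypothetical example: BTC at 170,000,000 local units and 100,000 USD, official rate 1,500 per USD.
print(f"BTC premium: {btc_premium(170_000_000, 100_000, 1_500):.1%}")   # -> 13.3%
```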
The rapid rise of e-commerce has transformed consumer behavior, prompting questions about how online adoption influences offline shopping. We examine whether consumers who adopt a retailer's online shopping channels become more price-sensitive in their subsequent offline purchases with that retailer. Using transaction-level data from a large Brazilian pet supplies retailer operating both online and offline, we compare "adopters" -- customers who began shopping online after a period of offline-only purchasing -- with "non-adopters" who remained offline-only. We estimate a discrete choice logit model with individual-level heterogeneity, based on an algorithm that can handle both high-dimensional fixed effects and price endogeneity. We then apply a staggered difference-in-differences approach to the estimated price elasticities and obtain the Average Treatment Effect on the Treated (ATT). We find that offline price sensitivity increases significantly after online adoption in three out of four product categories, particularly in items with low switching costs, such as pet hygiene. These results underscore the importance of recognizing cross-channel effects in consumer behavior and contribute to the literature on pricing and multichannel retailing by identifying online adoption as a key driver of offline price sensitivity.
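For context, the own-price elasticities to which the staggered difference-in-differences is applied are typically computed from the logit share equation as below; the notation is ours, not the paper's. With utility $u_{ij} = \delta_j - \alpha_i p_j + \varepsilon_{ij}$ and individual choice probabilities $s_{ij}$,
\[
\eta_{jj} \;=\; \frac{\partial s_j}{\partial p_j}\,\frac{p_j}{s_j} \;=\; -\,\frac{p_j}{s_j}\int \alpha_i\, s_{ij}\,\bigl(1 - s_{ij}\bigr)\, dF(\alpha_i),
\]
so individual-level heterogeneity in $\alpha_i$ is exactly what allows elasticities to differ between adopters and non-adopters.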
We build on theoretical results from the mechanism design literature to analyze empirical models of second-degree price discrimination (2PD). We show that for a random-coefficients discrete choice ("BLP") model to be suitable for studying 2PD, it must capture the covariance between two key random effects: (i) the "baseline" willingness to pay (affecting all product versions), and (ii) the perceived differentiation between versions. We then develop an experimental design that, among other features, identifies this covariance under common data constraints in 2PD environments. We implement this experiment in the field in collaboration with an international airline. Estimating the theoretically motivated empirical model on the experimental data, we demonstrate its applicability to 2PD decisions. We also show that test statistics from our design can enable qualitative inference on optimal 2PD policy even before estimating a demand model. Our methodology applies broadly across second-degree price discrimination settings.
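One way to write the random-coefficients specification the abstract refers to (our notation, not the paper's): consumer $i$'s utility for version $v \in \{\text{base},\text{premium}\}$ is
\[
u_{iv} \;=\; \beta_i \;+\; \gamma_i\,\mathbf{1}\{v=\text{premium}\} \;-\; \alpha\, p_v \;+\; \varepsilon_{iv},
\]
where $\beta_i$ is the baseline willingness to pay affecting both versions, $\gamma_i$ is the perceived differentiation between versions, and the key object for second-degree price discrimination is $\mathrm{Cov}(\beta_i,\gamma_i)$.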
With the growth of artificial skills, organizations are increasingly confronting the problem of optimizing skill policy decisions guided by economic principles. This paper addresses the underlying complexity of this challenge by developing an in-silico framework based on Monte Carlo simulations grounded in empirical realism to analyze the economic impact of human and machine skills, individually or jointly deployed, in the execution of tasks presenting varying levels of complexity. Our results provide quantitative support for the established notions that automation tends to be the most economically-effective strategy for tasks characterized by low-to-medium generalization difficulty, while automation may struggle to match the economic utility of human skills in more complex scenarios. Critically, our simulations highlight that, when a high level of generalization is required and the cost of errors is high, combining human and machine skills can be the most effective strategy, but only if genuine augmentation is achieved. In contrast, when failing to realize this synergy, the human-machine policy is severely penalized by the inherent costs of its dual skill structure, causing it to destroy value and become the worst choice from an economic perspective. The takeaway for decision-makers is unambiguous: in complex and critical contexts, simply allocating human and machine skills to a task may be insufficient, and a human-machine skill policy is neither a silver-bullet solution nor a low-risk compromise. Rather, it is a critical opportunity to boost competitiveness that demands a strong organizational commitment to enabling augmentation. Also, our findings show that improving the cost-effectiveness of machine skills over time, while useful, does not replace the fundamental need to focus on achieving augmentation.
We introduce the first version of the AI Productivity Index (APEX), a benchmark for assessing whether frontier AI models can perform knowledge work with high economic value. APEX addresses one of the largest inefficiencies in AI research: outside of coding, benchmarks often fail to test economically relevant capabilities. APEX-v1.0 contains 200 test cases and covers four domains: investment banking, management consulting, law, and primary medical care. It was built in three steps. First, we sourced experts with top-tier experience, e.g., investment bankers from Goldman Sachs. Second, experts created prompts that reflect high-value tasks in their day-to-day work. Third, experts created rubrics for evaluating model responses. We evaluate 23 frontier models on APEX-v1.0 using an LM judge. GPT 5 (Thinking = High) achieves the highest mean score (64.2%), followed by Grok 4 (61.3%) and Gemini 2.5 Flash (Thinking = On) (60.4%). Qwen 3 235B is the best-performing open-source model and the seventh best overall. There is a large gap between the performance of even the best models and human experts, highlighting the need for better measurement of models' ability to produce economically valuable work.
Lying at the interface between Network Science and Machine Learning, node embedding algorithms take a graph as input and encode its structure onto output vectors that represent nodes in an abstract geometric space, enabling various vector-based downstream tasks such as network modelling, data compression, link prediction, and community detection. Two apparently unrelated limitations affect these algorithms. On one hand, it is not clear what the basic operation defining vector spaces, i.e. the vector sum, corresponds to in terms of the original nodes in the network. On the other hand, while the same input network can be represented at multiple levels of resolution by coarse-graining the constituent nodes into arbitrary block-nodes, the relationship between node embeddings obtained at different hierarchical levels is not understood. Here, building on recent results in network renormalization theory, we address these two limitations at once and define a multiscale node embedding method that, upon arbitrary coarse-grainings, ensures statistical consistency of the embedding vector of a block-node with the sum of the embedding vectors of its constituent nodes. We illustrate the power of this approach on two economic networks that can be naturally represented at multiple resolution levels: namely, the international trade between (sets of) countries and the input-output flows among (sets of) industries in the Netherlands. We confirm the statistical consistency between networks retrieved from coarse-grained node vectors and networks retrieved from sums of fine-grained node vectors, a result that cannot be achieved by alternative methods. Several key network properties, including a large number of triangles, are successfully replicated already from embeddings of very low dimensionality, allowing for the generation of faithful replicas of the original networks at arbitrary resolution levels.
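A toy sanity check of the additivity property described above: the embedding of a block-node should match the sum of its constituent nodes' embeddings. The node names and random vectors below are hypothetical, and the paper's notion of consistency is statistical, at the level of networks generated from the vectors, so this is only an illustrative sketch.

```python
import numpy as np

def block_consistency(fine_emb: dict, coarse_emb: dict, partition: dict) -> dict:
    """Cosine similarity between each block-node embedding and the sum of the
    embeddings of its constituent fine-grained nodes.
    fine_emb:   node  -> embedding vector (fine-grained network)
    coarse_emb: block -> embedding vector (coarse-grained network)
    partition:  block -> list of fine-grained nodes it aggregates"""
    sims = {}
    for block, members in partition.items():
        summed = np.sum([fine_emb[n] for n in members], axis=0)
        v = coarse_emb[block]
        sims[block] = float(summed @ v / (np.linalg.norm(summed) * np.linalg.norm(v)))
    return sims

# Hypothetical usage with 8-dimensional embeddings for two countries merged into one block.
rng = np.random.default_rng(1)
fine = {"NLD": rng.normal(size=8), "BEL": rng.normal(size=8)}
coarse = {"BENELUX": fine["NLD"] + fine["BEL"] + 0.05 * rng.normal(size=8)}
print(block_consistency(fine, coarse, {"BENELUX": ["NLD", "BEL"]}))
```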