New articles on Electrical Engineering and Systems Science


[1] 2604.00048

Whittaker-Henderson smoother for long satellite image time series interpolation

The Whittaker smoother is a widely adopted solution for pre-processing satellite image time series (SITS). Yet, two key limitations remain: the smoothing parameter must be tuned individually for each pixel, and the standard formulation assumes homoscedastic noise, imposing uniform smoothing across the temporal dimension. This paper addresses both limitations by casting the Whittaker smoother as a differentiable neural layer in which the smoothing parameter is inferred by a neural network. The framework is further extended to handle heteroscedastic noise through time-varying regularization, allowing the degree of smoothing to adapt locally along the time series. To enable large-scale processing, a sparse, memory-efficient, and fully differentiable implementation is proposed, exploiting the symmetric banded structure of the underlying linear system via Cholesky factorization. GPU benchmarks demonstrate that this implementation substantially outperforms standard dense linear solvers in both speed and memory consumption. The approach is validated on SITS acquired over the French metropolitan territory between 2016 and 2024. Results confirm the feasibility of large-scale heteroscedastic Whittaker smoothing, though reconstruction differences with the homoscedastic baseline remain limited, suggesting that the transformer architecture used for smoothing parameter estimation may lack the temporal acuity needed to capture abrupt noise variations such as single-day cloud contamination.
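As context for the sparse implementation described above, the baseline (homoscedastic) Whittaker smoother amounts to a single symmetric banded linear solve. A minimal sketch using SciPy's banded Cholesky solver (`whittaker_smooth` and its defaults are illustrative, not the paper's differentiable GPU layer; the heteroscedastic variant would replace the identity with a diagonal weight matrix):

```python
import numpy as np
from scipy.linalg import solveh_banded

def whittaker_smooth(y, lam=10.0, d=2):
    """Classic homoscedastic Whittaker smoother: z = argmin ||y - z||^2
    + lam * ||D z||^2, i.e. solve (I + lam * D'D) z = y, where D is the
    d-th order difference operator. The matrix I + lam*D'D is symmetric
    with bandwidth d, so a banded Cholesky solve suffices."""
    n = len(y)
    D = np.diff(np.eye(n), n=d, axis=0)    # d-th order difference operator
    A = np.eye(n) + lam * (D.T @ D)        # symmetric banded system matrix
    ab = np.zeros((d + 1, n))              # pack A into upper banded storage
    for k in range(d + 1):
        ab[d - k, k:] = np.diag(A, k=k)
    return solveh_banded(ab, y)            # banded Cholesky factorize & solve
```

The dense `np.eye`-based construction is for clarity only; a production version would assemble the bands directly, which is what makes the solver memory-efficient at scale.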


[2] 2604.00070

Brain MR Image Synthesis with Multi-contrast Self-attention GAN

Accurate and complete multi-modal Magnetic Resonance Imaging (MRI) is essential for neuro-oncological assessment, as each contrast provides complementary anatomical and pathological information. However, acquiring all modalities (e.g., T1c, T1n, T2, T2f) for every patient is often impractical due to time, cost, and patient discomfort, potentially limiting comprehensive tumour evaluation. We propose 3D-MC-SAGAN (3D Multi-Contrast Self-Attention generative adversarial network), a unified 3D multi-contrast synthesis framework that generates high-fidelity missing modalities from a single T2 input while explicitly preserving tumour characteristics. The model employs a multi-scale 3D encoder-decoder generator with residual connections and a novel Memory-Bounded Hybrid Attention (MBHA) block to capture long-range dependencies efficiently, and is trained with a WGAN-GP critic and an auxiliary contrast-conditioning branch to produce T2f, T1n, and T1c volumes within a single unified network. A frozen 3D U-Net-based segmentation module introduces a segmentation-consistency constraint to preserve lesion morphology. The composite objective integrates adversarial, reconstruction, perceptual, structural similarity, contrast-classification, and segmentation-guided losses to align global realism with tumour-preserving structure. Extensive evaluation on 3D brain MRI datasets demonstrates that 3D-MC-SAGAN achieves state-of-the-art quantitative performance and generates visually coherent, anatomically plausible contrasts with improved distribution-level realism. Moreover, it maintains tumour segmentation accuracy comparable to fully acquired multi-modal inputs, highlighting its potential to reduce acquisition burden while preserving clinically meaningful information.


[3] 2604.00119

Contracting Neural Networks: Sharp LMI Conditions with Applications to Integral Control and Deep Learning

This paper studies contractivity of firing-rate and Hopfield recurrent neural networks. We derive sharp LMI conditions on the synaptic matrices that characterize contractivity of both architectures, for activation functions that are either non-expansive or monotone non-expansive, in both continuous and discrete time. We establish structural relationships among these conditions, including connections to Schur diagonal stability and the recovery of optimal contraction rates for symmetric synaptic matrices. We demonstrate the utility of these results through two applications. First, we develop an LMI-based design procedure for low-gain integral controllers enabling reference tracking in contracting firing rate networks. Second, we provide an exact parameterization of weight matrices that guarantee contraction and use it to improve the expressivity of Implicit Neural Networks, achieving competitive performance on image classification benchmarks with fewer parameters.


[4] 2604.00135

Temperature Control of Digital Glass Forming Processes

Digital Glass Forming (DGF) is a new manufacturing process for low-batch glass fabrication. The work zone temperature in DGF processes must be maintained in the glass's working range to ensure good fabrication. If the temperature is too low, the filament will not wet to the substrate or previously deposited material and, if the temperature is too high, the filament may disengage from the substrate or previously deposited material, or it may partially vaporize. In this work, a real-time temperature control system capable of synchronizing process parameter, thermal camera, and visual camera data for the DGF process is introduced. A process parameter map for a scan velocity of 0.5 mm/s is constructed, as is a data-driven dynamic temperature process model. A digital controller is designed to regulate the work zone temperature. The temperature controller is a closed-loop tracking controller that adjusts the commanded laser power to regulate the measured temperature. Two sets of experiments are conducted to analyze the controller performance. In the first set of experiments, single tracks on a substrate are fabricated with constant laser power and with the closed-loop temperature controller. It is seen that the closed-loop controller is able to extend the process parameter map into regions where using a constant laser power will result in a failed build. In the second set of experiments, walls are fabricated. Using constant laser power results in a failed build (i.e., material vaporization at the corners and the filament prematurely detaching from the substrate) as the temperature process dynamics change with layer and at the corners. The closed-loop controller successfully fabricates the wall without vaporization at the corners or premature filament detachment, as it adjusts the laser power to account for the changing temperature process dynamics.


[5] 2604.00150

Data-Driven Reachability of Nonlinear Lipschitz Systems via Koopman Operator Embeddings

Data-driven safety verification of robotic systems often relies on zonotopic reachability analysis due to its scalability and computational efficiency. However, for nonlinear systems, these methods can become overly conservative, especially over long prediction horizons and under measurement noise. We propose a data-driven reachability framework based on the Koopman operator and zonotopic set representations that lifts the nonlinear system into a finite-dimensional, linear, state-input-dependent model. Reachable sets are then computed in the lifted space and projected back to the original state space to obtain guaranteed over-approximations of the true dynamics. The proposed method reduces conservatism while preserving formal safety guarantees, and we prove that the resulting reachable sets over-approximate the true reachable sets. Numerical simulations and real-world experiments on an autonomous vehicle show that the proposed approach yields substantially tighter reachable set over-approximations than both model-based and linear data-driven methods, particularly over long horizons.
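As background for the zonotopic set representations mentioned above, a zonotope is stored as a center plus a generator matrix, and propagating it through a linear (or lifted, linearized) model needs only a linear map and a Minkowski sum. A minimal sketch (the class and function names are illustrative; the paper's method additionally lifts the dynamics via the Koopman operator and projects the reachable sets back to the original state space):

```python
import numpy as np

class Zonotope:
    """Z = {c + G @ xi : ||xi||_inf <= 1}, with center c and generators G."""
    def __init__(self, c, G):
        self.c, self.G = np.asarray(c, float), np.asarray(G, float)

    def linear_map(self, A):
        # A*Z: map center and every generator through A
        return Zonotope(A @ self.c, A @ self.G)

    def minkowski_sum(self, other):
        # Z1 (+) Z2: add centers, concatenate generators
        return Zonotope(self.c + other.c, np.hstack([self.G, other.G]))

    def interval_hull(self):
        # axis-aligned bounding box: c +/- sum of |generators|
        r = np.abs(self.G).sum(axis=1)
        return self.c - r, self.c + r

def propagate(A, X0, W, steps):
    """Reachable-set recursion X_{k+1} = A X_k (+) W for a linear(ized)
    model with disturbance zonotope W (illustrative sketch)."""
    sets, X = [X0], X0
    for _ in range(steps):
        X = X.linear_map(A).minkowski_sum(W)
        sets.append(X)
    return sets
```

Note the generator count grows with each Minkowski sum; practical implementations bound it via order reduction, which is one source of the conservatism the paper targets.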


[6] 2604.00173

Advanced Capacity Accreditation of Future Energy System Resources with Deep Uncertainties

The electric power sector has seen an increased penetration of renewable energy sources (RESs) that could strain the system reliability due to their inherent uncertainties in availability and controllability. Effective load carrying capability (ELCC) is widely used to quantify the reliability contributions of these RESs. However, existing ELCC methods can over- or under-estimate their contributions and often neglect or simplify other critical factors such as transmission constraints and evolving climate trends, leading to inaccurate capacity credit (CC) allocations and inefficient reliability procurement in capacity markets. To address these limitations, this paper proposes TRACED (TRansmission And Climate Enhanced Delta) -- an advanced capacity accreditation approach that integrates transmission constraints and climate-adjusted system conditions into a Delta ELCC evaluation. Case studies on a modified IEEE-118 bus system with high RES and energy storage penetrations demonstrate that TRACED produces portfolio-consistent CC allocations by capturing resource interactions and avoiding the double-counting of shared reliability benefits inherent in marginal ELCC, which may otherwise lead to under-procurement of reliability resources. Results further demonstrate that transmission congestion and evolving climate trends have mutual impacts on CC allocation, justifying their necessary integration into TRACED.


[7] 2604.00179

Finite-Time Analysis of Projected Two-Time-Scale Stochastic Approximation

We study the finite-time convergence of projected linear two-time-scale stochastic approximation with constant step sizes and Polyak--Ruppert averaging. We establish an explicit mean-square error bound, decomposing it into two interpretable components: an approximation error determined by the constrained subspace, and a statistical error decaying at a sublinear rate, with constants expressed through restricted stability margins and a coupling invertibility condition. These constants cleanly separate the effect of subspace choice (approximation errors) from the effect of the averaging horizon (statistical errors). We illustrate our theoretical results through a number of numerical experiments on both synthetic and reinforcement learning problems.
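To make the setting concrete, a linear two-time-scale iteration with constant step sizes and Polyak--Ruppert averaging of the slow iterate can be sketched as follows (the function, matrices, and noise model are illustrative, not the paper's exact scheme, and the projection step is omitted):

```python
import numpy as np

def two_time_scale_pr(A11, A12, A21, A22, b1, b2, steps=5000,
                      alpha=0.01, beta=0.1, seed=0):
    """Linear two-time-scale SA with constant step sizes:
        x_{k+1} = x_k + alpha * (b1 - A11 x_k - A12 y_k + noise)   (slow)
        y_{k+1} = y_k + beta  * (b2 - A21 x_k - A22 y_k + noise)   (fast)
    with a running Polyak-Ruppert average of the slow iterate."""
    rng = np.random.default_rng(seed)
    n, m = len(b1), len(b2)
    x, y = np.zeros(n), np.zeros(m)
    x_avg = np.zeros(n)
    for k in range(1, steps + 1):
        noise_x = 0.01 * rng.standard_normal(n)
        noise_y = 0.01 * rng.standard_normal(m)
        x = x + alpha * (b1 - A11 @ x - A12 @ y + noise_x)
        y = y + beta * (b2 - A21 @ x - A22 @ y + noise_y)
        x_avg += (x - x_avg) / k   # Polyak-Ruppert running mean
    return x_avg, y
```

With constant steps the raw iterates oscillate around the fixed point; the averaged slow iterate is what the paper's statistical-error term controls.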


[8] 2604.00186

Agentic AI and Occupational Displacement: A Multi-Regional Task Exposure Analysis of Emerging Labor Market Disruption

This paper extends the Acemoglu-Restrepo task exposure framework to address the labor market effects of agentic artificial intelligence systems: autonomous AI agents capable of completing entire occupational workflows rather than discrete tasks. Unlike prior automation technologies that substitute for individual subtasks, agentic AI systems execute end-to-end workflows involving multi-step reasoning, tool invocation, and autonomous decision-making, substantially expanding occupational displacement risk beyond what existing task-level analyses capture. We introduce the Agentic Task Exposure (ATE) score, a composite measure computed algorithmically from O*NET task data using calibrated adoption parameters (not a regression estimate), incorporating AI capability scores, workflow coverage factors, and logistic adoption velocity. Applying the ATE framework across five major US technology regions (Seattle-Tacoma, San Francisco Bay Area, Austin, New York, and Boston) over a 2025-2030 horizon, we find that 93.2% of the 236 analyzed occupations across six information-intensive SOC groups (financial, legal, healthcare, healthcare support, sales, and administrative/clerical) cross the moderate-risk threshold (ATE >= 0.35) in Tier 1 regions by 2030, with credit analysts, judges, and sustainability specialists reaching ATE scores of 0.43-0.47. We simultaneously identify seventeen emerging occupational categories benefiting from reinstatement effects, concentrated in human-AI collaboration, AI governance, and domain-specific AI operations roles. Our findings carry implications for workforce transition policy, regional economic planning, and the temporal dynamics of labor market adjustment.


[9] 2604.00201

Scalable machine learning-based approaches for energy saving in densely deployed Open RAN

Densely deployed base stations are responsible for the majority of the energy consumed in radio access networks (RANs). While these deployments are crucial to deliver the required data rates during busy hours of the day, the network can save energy by switching some base stations to sleep mode while maintaining coverage and quality of service with the remaining ones. Benefiting from the flexibility provided by Open RAN for embedding machine learning (ML) in network operations, in this work we propose Deep Reinforcement Learning (DRL)-based energy saving solutions. First, we propose three different DRL-based methods, in the form of xApps, that control the active/sleep mode of up to 6 radio units (RUs) from the Near-Real-Time RAN Intelligent Controller (RIC). We also propose a more scalable federated DRL-based solution with an aggregator as an rApp in the Non-Real-Time RIC and local agents as xApps. Our simulation results demonstrate the convergence of the proposed methods. We also compare the performance of our federated DRL across three layouts spanning 6-24 RUs and 500-1000 m regions, including a composite multi-region scenario. The results show that our proposed federated TD3 algorithm achieves up to 43.75% faster convergence, more than 50% network energy saving, and 37.4% lower training energy versus centralized baselines, while maintaining quality of service and improving the robustness of the policy.
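The aggregator rApp in the federated setup above can be thought of as performing a weighted average of the local agents' parameters. A minimal FedAvg-style sketch (illustrative only; how the paper's aggregator combines TD3 actor and critic networks may differ):

```python
import numpy as np

def fedavg(local_weights, sample_counts):
    """Weighted federated averaging of per-agent parameter vectors:
    each local model contributes in proportion to its sample count.
    Illustrative stand-in for an rApp-style aggregator."""
    w = np.asarray(sample_counts, float)
    w /= w.sum()                              # normalize contribution weights
    return sum(wi * np.asarray(li, float)     # weighted sum of parameters
               for wi, li in zip(w, local_weights))
```

In an O-RAN deployment, each xApp would send its flattened policy weights upward, receive the aggregate back, and continue local training from it.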


[10] 2604.00215

Agentic AI for Clinical Urgency Mapping and Queue Optimization in High-Volume Outpatient Departments: A Simulation-Based Evaluation

Outpatient departments (OPDs) in Indian public hospitals face severe overcrowding, with daily volumes reaching 200-8,000 patients [aiims2020annual]. The prevailing First-Come-First-Served (FCFS) token system treats all patients equally regardless of clinical urgency, leading to dangerous delays for critical cases. We present an agentic AI framework integrating six components: voice-based multilingual symptom capture (modeled), LLM-powered severity prediction, load-aware physician assignment, adaptive queue optimization with urgency drift detection, a multi-objective orchestrator, and a Patient Memory System for longitudinal context-aware triage. Evaluated through discrete-event simulation of a District Hospital in Jabalpur (Madhya Pradesh) with 368 synthetic patients over 30 runs, the framework achieves 94.2% of critical patients seen within 10 minutes (vs. 30.8% under FCFS), detects ~236 simulated urgency drift events per session (modeled via stochastic deterioration probabilities), identifies ~11.9 additional hidden-critical cases via patient memory, and recomposes the queue urgency distribution from 13/36/158/161 (Critical/High/Medium/Low) to ~25/178/115/50 through continuous reassessment, while maintaining comparable throughput (~40.4 patients/hour).
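The contrast between FCFS and urgency-aware queueing above can be illustrated with a priority queue keyed on (severity, arrival order). A toy sketch (the labels and four-level scale are illustrative placeholders, not the framework's LLM-based severity predictor or drift-detection logic):

```python
import heapq

# hypothetical four-level urgency scale; lower value = served first
SEVERITY = {"critical": 0, "high": 1, "medium": 2, "low": 3}

def triage_order(patients):
    """Severity-first service order: pop patients by (severity, arrival)
    instead of pure arrival order as in FCFS.
    `patients` is a list of (arrival_index, severity_label) tuples."""
    heap = [(SEVERITY[s], t, s) for t, s in patients]
    heapq.heapify(heap)
    order = []
    while heap:
        _, t, s = heapq.heappop(heap)    # most urgent, earliest-arrived first
        order.append((t, s))
    return order
```

Under FCFS the critical patient arriving second would wait behind the low-urgency patient arriving first; the priority queue reverses that, which is the mechanism behind the reported gain in critical patients seen within 10 minutes.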


[11] 2604.00224

Learning Compact Terrain-Context Representations for Feasibility-Aware Offline Reinforcement Learning in UAV Relaying Networks

Offline reinforcement learning (RL) is an attractive tool for unmanned aerial vehicle (UAV) systems, where online exploration is costly and raises safety concerns. In terrain-aware UAV relaying, agents may observe high-dimensional inputs such as terrain and land-cover maps, which describe the propagation environment, but complicate offline learning from fixed datasets. This paper investigates the impact of compact state representations on offline RL for UAV relaying. End-to-end service is jointly constrained by UAV--user access links and a base-station--to--UAV backhaul link, yielding feasibility limits driven by user mobility and independent of UAV control. To distinguish feasibility limits from control-induced sub-optimality, a candidate-set feasibility upper bound (CS-FUB) is introduced, which estimates the maximum achievable user coverage over a restricted set of UAV placements. To address high-dimensional terrain context, map-like observations are compressed into low-dimensional latent representations using a variational autoencoder (VAE) and policies are trained via Conservative Q-Learning (CQL). Simulation results show that training CQL directly on raw high-dimensional terrain-context states leads to slow convergence and large feasibility gaps. In contrast, VAE-encoded representations improve learning stability, enable earlier convergence to feasible relay configurations, and reduce sub-optimality relative to physical limits. Comparisons with autoencoder and linear compression baselines further demonstrate the benefit of structured representation learning for effective offline RL in terrain-aware UAV systems.


[12] 2604.00225

Pupil Design for Computational Wavefront Estimation

Establishing a precise connection between imaged intensity and the incident wavefront is essential for emerging applications in adaptive optics, holography, computational microscopy, and non-line-of-sight imaging. While prior work has shown that breaking symmetries in pupil design enables wavefront recovery from a single intensity measurement, there is little guidance on how to design a pupil that improves wavefront estimation. In this work we introduce a quantitative asymmetry metric to bridge this gap and, through an extensive empirical study and supporting analysis, demonstrate that increasing asymmetry enhances wavefront recoverability. We analyze the trade-offs in pupil design, and the impact on light throughput along with performance in noise. Both large-scale simulations and optical bench experiments are carried out to support our findings.


[13] 2604.00246

Harmonization mitigates diffusion MRI scanner effects in infancy: insights from the HEALthy Brain and Childhood Development (HBCD) study

The HEALthy Brain and Childhood Development (HBCD) Study is an ongoing longitudinal initiative to understand population-level brain maturation; however, large-scale studies must overcome site-related variance while preserving biologically relevant signal. In addition to diffusion-weighted magnetic resonance images, the HBCD dataset offers analysis-ready derivatives for scientists to conduct their analyses, including scalar diffusion tensor imaging (DTI) metrics in a predetermined set of bundles. The purpose of this study is to characterize HBCD-specific site effects in diffusion MRI data, which have not been systematically reported. In this work, we investigate the sensitivity of HBCD bundle metrics to scanner-model-related variance and address these variations with ComBat-GAM harmonization within the current HBCD data release 1.1 across six scanner models. Following ComBat-GAM, we observe no statistically significant differences between the distributions from any scanner model after FDR correction, and Cohen's f effect sizes are reduced across all metrics. Our work underscores the importance of rigorous harmonization efforts in large-scale studies, and we encourage future investigations of HBCD data to control for these effects.
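As a simplified picture of what harmonization does to the bundle metrics above, a location-scale adjustment standardizes each scanner's distribution and restores the pooled statistics. A toy sketch (this is a bare mean/variance alignment, not ComBat-GAM, which additionally preserves covariates such as age via empirical Bayes and generalized additive models):

```python
import numpy as np

def location_scale_harmonize(values, sites):
    """Toy location-scale harmonization: z-score each site's metric
    distribution, then restore the pooled mean and SD so all sites share
    the same first two moments. Simplified stand-in for ComBat-style
    harmonization (no covariate modeling, no empirical Bayes shrinkage)."""
    values, sites = np.asarray(values, float), np.asarray(sites)
    pooled_mu, pooled_sd = values.mean(), values.std()
    out = np.empty_like(values)
    for s in np.unique(sites):
        m = sites == s
        mu, sd = values[m].mean(), values[m].std()
        out[m] = (values[m] - mu) / sd * pooled_sd + pooled_mu
    return out
```

The danger such a naive adjustment illustrates, and ComBat-GAM is designed to avoid, is that removing site means can also remove real biology when covariates like age are unevenly distributed across sites.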


[14] 2604.00251

Evaluation of neuroCombat and deep learning harmonization for multi-site magnetic resonance neuroimaging in youth with prenatal alcohol exposure

In cases of prevalent diseases and disorders, such as Prenatal Alcohol Exposure (PAE), multi-site data collection allows for larger study samples. However, multi-site studies introduce additional variability through heterogeneous acquisition conditions, such as scanners and acquisition protocols, which confound biologically relevant signals. Neuroscientists often apply statistical methods to image-derived metrics, such as volumes of regions of interest, after all image processing to minimize site-related variance. HACA3, a deep learning harmonization method, offers an opportunity to harmonize image signals prior to metric quantification; however, HACA3 has not yet been validated in a pediatric cohort. In this work, we investigate HACA3's ability to remove site-related variance and preserve biologically relevant signal compared to a statistical method, neuroCombat, and pair HACA3 processing with neuroCombat to evaluate the efficacy of combining multiple harmonization methods. The evaluation is conducted in a pediatric (age 7 to 21) population across three unique scanners, with controls and cases of PAE, using downstream MaCRUISE volume metrics. We find that HACA3 qualitatively improves inter-site contrast variations, but statistical methods remove more site-related variance from the MaCRUISE volume metrics under an ANCOVA test, and HACA3 relies on follow-up statistical methods to approach maximal biological preservation in this context.


[15] 2604.00254

Dissipation-assisted stabilization of periodic orbits via actuated exterior impacts in hybrid mechanical systems with symmetry

Impulsive mechanical systems exhibit discontinuous jumps in their state, and when such jumps are triggered by spatial events, the geometry of the impact surface carries information about the controllability of the hybrid dynamics. For mechanical systems defined on principal $G$-bundles, two qualitatively distinct types of impacts arise: interior impacts, associated with events on the shape space, and exterior impacts, associated with events on the fibers. A key distinction is that interior impacts preserve the mechanical connection, whereas exterior impacts generally do not. In this paper, we exploit this distinction by allowing actuation through exterior impacts. We study the pendulum-on-a-cart system, derive controlled reset laws induced by moving-wall impacts, and analyze the resulting periodic motions. Our results show that reset action alone does not provide a convincing stabilizing regime, whereas the addition of dissipation in the continuous flow yields exponentially stable periodic behavior for suitable feedback gains.


[16] 2604.00263

Feature-level Site Leakage Reduction for Cross-Hospital Chest X-ray Transfer via Self-Supervised Learning

Cross-hospital failure in chest X-ray models is often attributed to domain shift, yet most work assumes invariance without measuring it. This paper studies how to measure site leakage directly and how that measurement changes conclusions about transfer methods. We study multi-site self-supervised learning (SSL) and feature-level adversarial site confusion for cross-hospital transfer. We pretrain a ResNet-18 on NIH and CheXpert without pathology labels. We then freeze the encoder and train a linear pneumonia classifier on NIH only, evaluating transfer to RSNA. We quantify site leakage using a post hoc linear probe that predicts acquisition site from frozen backbone features $f$ and projection features $z$. Across 3 random seeds, multi-site SSL improves RSNA AUC from 0.6736 $\pm$ 0.0148 (ImageNet initialization) to 0.7804 $\pm$ 0.0197. Adding adversarial site confusion on $f$ reduces measured leakage but does not reliably improve AUC and increases variance. On $f$, site probe accuracy drops from 0.9890 $\pm$ 0.0021 (SSL-only) to 0.8504 $\pm$ 0.0051 (CanonicalF), where chance is 0.50. On $z$, probe accuracy drops from 0.8912 $\pm$ 0.0092 to 0.7810 $\pm$ 0.0250. These results show that measuring leakage changes how transfer methods should be interpreted: multi-site SSL drives transfer, while adversarial confusion exposes the limits of invariance assumptions.


[17] 2604.00277

Hybrid Energy-Based Models for Physical AI: Provably Stable Identification of Port-Hamiltonian Dynamics

Energy-based models (EBMs) implement inference as gradient descent on a learned Lyapunov function, yielding interpretable, structure-preserving alternatives to black-box neural ODEs and aligning naturally with physical AI. Yet their use in system identification remains limited, and existing architectures lack formal stability guarantees that globally preclude unstable modes. We address this gap by introducing an EBM framework for system identification with stable, dissipative, absorbing invariant dynamics. Unlike classical global Lyapunov stability, absorbing invariance expands the class of stability-preserving architectures, enabling more flexible and expressive EBMs. We extend EBM theory to nonsmooth activations by establishing negative energy dissipation via Clarke derivatives and deriving new conditions for radial unboundedness, exposing a stability-expressivity tradeoff in standard EBMs. To overcome this, we introduce a hybrid architecture with a dynamical visible layer and static hidden layers, prove absorbing invariance under mild assumptions, and show that these guarantees extend to port-Hamiltonian EBMs. Experiments on metric-deformed multi-well and ring systems validate the approach, showcasing how our hybrid EBM architecture combines expressivity with sound and provable safety guarantees by design.


[18] 2604.00283

Data-Driven Reachability Analysis via Diffusion Models with PAC Guarantees

We present a data-driven framework for reachability analysis of nonlinear dynamical systems that requires no explicit model. A denoising diffusion probabilistic model learns the time-evolving state distribution of a dynamical system from trajectory data alone. The predicted reachable set takes the form of a sublevel set of a nonconformity score derived from the reconstruction error, with the threshold calibrated via the Learn Then Test procedure so that the probability of excluding a reachable state is bounded with high probability. Experiments on three nonlinear systems, a forced Duffing oscillator, a planar quadrotor, and a high-dimensional reaction-diffusion system, confirm that the empirical miss rate remains below the Probably Approximately Correct (PAC) bound while scaling to state dimensions beyond the reach of classical grid-based and polynomial methods.
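The threshold calibration described above can be illustrated with a split-conformal-style quantile rule: pick the smallest threshold whose calibration miss rate, with a finite-sample correction, stays below the target level. A minimal sketch (the Learn Then Test procedure in the paper uses a multiple-testing formulation with high-probability guarantees; this simpler rule only conveys the idea):

```python
import numpy as np

def calibrate_threshold(scores, alpha=0.1):
    """Split-conformal-style calibration of a sublevel-set threshold.
    `scores` are nonconformity scores on held-out calibration data; the
    returned tau satisfies an approximately (1 - alpha) coverage target
    via the usual finite-sample (n + 1) correction."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))   # corrected quantile rank
    return np.sort(scores)[min(k, n) - 1]     # k-th smallest score
```

A state would then be declared reachable when its nonconformity score falls below `tau`; the calibration guarantees that the probability of excluding a truly reachable state is controlled.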


[19] 2604.00286

Certified Set Convergence for Piecewise Affine Systems via Neural Lyapunov Functions

Safety-critical control of piecewise affine (PWA) systems under bounded additive disturbances requires guarantees not for individual states but for entire state sets simultaneously: a single control action must steer every state in the set toward a target, even as sets crossing mode boundaries split and evolve under distinct affine dynamics. Certifying such set convergence via neural Lyapunov functions couples the Lipschitz constants of the value function and the policy, yet certified bounds for expressive networks exceed true values by orders of magnitude, creating a certification barrier. We resolve this through a three-stage pipeline that decouples verification from the policy. A value function from Hamilton-Jacobi backward reachability, trained via reinforcement learning, is the Lyapunov candidate. A permutation-invariant Deep Sets controller, distilled via regret minimization, produces a common action. Verification propagates zonotopes through the value network, yielding verified Lyapunov upper bounds over entire sets without bounding the policy Lipschitz constant. On four benchmarks up to dimension six, including systems with per-mode operator norms exceeding unity, the framework certifies set convergence with positive margin on every system. A spectrally constrained local certificate completes the terminal guarantee, and the set-actor is the only tested method to achieve full strict set containment, at constant-time online cost.


[20] 2604.00287

From Net Load Modifiers to Firm Capacity: The Role of Distributed Energy Resources in Resource Adequacy

Distributed energy resources (DERs) such as rooftop solar, battery storage, and demand response offer substantial potential for power system reliability, yet integrating them into resource adequacy (RA) frameworks as firm capacity contributors remains difficult across jurisdictions. Existing analyses often treat these barriers as isolated technical problems at individual stages of the RA participation process, overlooking the cross-stage dependencies that prevent reforms at one stage from producing scalable participation. This paper introduces a four-gate compliance pathway (entry and classification, metering and verification, accreditation, and enforcement), preceded by an upstream forecasting layer, as a unified lens for tracing where DER capacity value is lost at the institutional interfaces between these stages. Using a document-grounded comparative synthesis of tariff provisions, compliance protocols, and regulatory documents across five jurisdictions spanning U.S. capacity markets and European capacity remuneration mechanisms, we show that these barriers persist despite substantial variation in market design and regulatory structure, indicating that the problem is structural rather than jurisdiction-specific. We identify three cross-stage coupling mechanisms that explain why gate-level reforms have repeatedly failed to scale DER participation, and derive coordination principles for end-to-end compliance redesign. The central finding is that compliance architecture, rather than DER technology itself, is the binding constraint on translating DER capability into firm RA contributions.


[21] 2604.00305

Set-Based Value Function Characterization and Neural Approximation of Stabilization Domains for Input-Constrained Discrete-Time Systems

Analyzing nonlinear systems with stabilizable controlled invariant sets (CISs) requires accurate estimation of their domains of stabilization (DOS) together with associated stabilizing controllers. Despite extensive research, estimating DOSs for general nonlinear systems remains challenging due to fundamental theoretical and computational limitations. In this paper, we propose a novel framework for estimating DOSs for controlled input-constrained discrete-time systems. The DOS is characterized via newly introduced value functions defined on metric spaces of compact sets. We establish the fundamental properties of these value functions and derive the associated Bellman-type (Zubov-type) functional equations. Building on this characterization, we develop a physics-informed neural network (NN) framework that learns the value functions by embedding the derived functional equations directly into the training process. The proposed methodology is demonstrated through two numerical examples, illustrating its ability to accurately estimate DOSs and synthesize stabilizing controllers from the learned value functions.


[22] 2604.00309

Nonlinear Moving-Horizon Estimation Using State- and Control-Dependent Models

This paper presents a state- and control-dependent moving-horizon estimation (SCD-MHE) algorithm for nonlinear discrete-time systems. Within this framework, a pseudo-linear representation of the nonlinear dynamics is constructed using state- and control-dependent coefficients, and the solution to the moving-horizon estimation problem is iteratively refined. At each discrete time step, a quadratic program is executed over a sliding window of historical measurements. Moreover, the system matrices are successively updated based on prior iterates to capture nonlinear regimes. In contrast to the extended Kalman filter (EKF) and the unscented Kalman filter (UKF), nonlinearities and bounds are accommodated within a structured optimization framework, thereby circumventing the reliance on local Jacobian matrices. Furthermore, theoretical analysis is presented to establish the convergence of the iterative sequence, and bounded estimation errors are mathematically guaranteed under uniform observability conditions. Finally, comparative numerical experiments utilizing a quadrotor vertical kinematics system demonstrate that the SCD-MHE achieves superior estimation accuracy relative to the EKF, the UKF, and a fully nonlinear moving-horizon estimator, while reducing per-step computational latency by over an order of magnitude.


[23] 2604.00314

Prompt-Guided Prefiltering for VLM Image Compression

The rapid progress of large Vision-Language Models (VLMs) has enabled a wide range of applications, such as image understanding and Visual Question Answering (VQA). Query images are often uploaded to the cloud, where VLMs are typically hosted, hence efficient image compression becomes crucial. However, traditional human-centric codecs are suboptimal in this setting because they preserve many task-irrelevant details. Existing Image Coding for Machines (ICM) methods also fall short, as they assume a fixed set of downstream tasks and cannot adapt to prompt-driven VLMs with an open-ended variety of objectives. We propose a lightweight, plug-and-play, prompt-guided prefiltering module to identify image regions most relevant to the text prompt, and consequently to the downstream task. The module preserves important details while smoothing out less relevant areas to improve compression efficiency. It is codec-agnostic and can be applied before conventional and learned encoders. Experiments on several VQA benchmarks show that our approach achieves a 25-50% average bitrate reduction while maintaining the same task accuracy. Our source code is available at this https URL.
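The prefiltering idea above, keeping detail where the prompt says it matters and smoothing elsewhere, can be sketched as a relevance-weighted blend of the image with a blurred copy (the box blur, function name, and [0, 1] relevance map are illustrative assumptions; the paper's module derives the relevance map from the text prompt):

```python
import numpy as np

def prefilter(image, relevance, blur=5):
    """Blend the original image with a blurred copy using a [0, 1]
    relevance map: relevant regions keep full detail, irrelevant regions
    are smoothed so a downstream codec spends fewer bits on them.
    Illustrative sketch, codec-agnostic by construction."""
    k = np.ones(blur) / blur
    blurred = image.astype(float)
    for ax in (0, 1):  # separable box blur over the two spatial axes
        blurred = np.apply_along_axis(
            lambda r: np.convolve(r, k, mode="same"), ax, blurred)
    r = relevance[..., None] if image.ndim == 3 else relevance
    return r * image + (1 - r) * blurred    # per-pixel detail/smooth blend
```

Because the output is an ordinary image, it can be fed unchanged to JPEG, HEVC, or a learned encoder, which is what makes the module plug-and-play.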


[24] 2604.00329

Phase Relationship between Spinal Motion and Limb Support Determines High-speed Running Performance in a Cheetah Model with Asymmetric Spinal Stiffness

Cheetahs are characterized by large spinal flexion and extension during high-speed running, yet the dynamical role of the phase relationship between spinal motion and limb support remains unclear. We aimed to clarify how this phase relationship affects running performance, focusing on the effect of asymmetric spinal stiffness. Using a simple planar cheetah model with asymmetric torsional spinal stiffness, we numerically searched for periodic bounding solutions over a range of stiffness parameters and compared their ground reaction forces, horizontal velocities, and stability. We obtained both cheetah-like solutions, in which the spine extends after hindlimb liftoff and flexes after forelimb liftoff, and non-cheetah-like solutions, in which the spine flexes after hindlimb liftoff and extends after forelimb liftoff. Under asymmetric spinal stiffness, cheetah-like solutions reduced ground reaction forces while maintaining horizontal velocity more effectively than non-cheetah-like solutions. The phase relationship between spinal motion and stance timing is a key determinant of high-speed running performance. These findings provide a dynamical understanding of cheetah locomotion and suggest design principles for spined legged robots.


[25] 2604.00334

Event-Triggered Adaptive Taylor-Lagrange Control for Safety-Critical Systems

This paper studies safety-critical control for nonlinear systems under sampled-data implementations of the controller. The recently proposed Taylor--Lagrange Control (TLC) method provides rigorous safety guarantees but relies on a fixed discretization-related parameter, which can lead to infeasibility or unsafety in the presence of input constraints and inter-sampling effects. To address these limitations, we propose an adaptive Taylor--Lagrange Control (aTLC) framework with an event-triggered implementation, where the discretization-related parameter defines the discretization time scale and is selected online as state-dependent rather than fixed. This enables the controller to dynamically balance feasibility and safety by adjusting the effective time scale of the Taylor expansion. The resulting controller is implemented as a sequence of Quadratic Programs (QPs) with input constraints. We further introduce a selection rule to choose the discretization-related parameter from a finite candidate set, favoring feasible inputs and improved safety. Simulation results on an adaptive cruise control (ACC) problem demonstrate that the proposed approach improves feasibility, guarantees safety, and achieves smoother control actions compared to TLC while requiring a single automatically tuned parameter.


[26] 2604.00338

Willems' Fundamental Lemma with Large Noisy Fragmented Dataset

Willems' Fundamental Lemma enables parameterizing all trajectories generated by a Linear Time-Invariant (LTI) system directly from data. However, this lemma relies on the assumption of noiseless measurements. In this paper, we provide an approach that enables the applicability of Willems' Fundamental Lemma with a large noisy-input, noisy-output fragmented dataset, without requiring prior knowledge of the noise distribution. We introduce a computationally tractable and lightweight algorithm that, despite processing a large dataset, executes on the order of seconds to estimate the invariants of the underlying system, which are obscured by noise. The simulation results demonstrate the effectiveness of the proposed method.
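
The Hankel-matrix machinery behind Willems' Fundamental Lemma is compact enough to sketch. The following minimal example (illustrative only; it uses a toy first-order LTI system, not the paper's noisy fragmented setting) builds the stacked input-output Hankel matrix and checks that its column span captures every length-L trajectory.

```python
import numpy as np

def hankel(w, L):
    """Depth-L block Hankel matrix of a signal w with shape (T, m):
    column j stacks the window w[j], ..., w[j+L-1]."""
    T = w.shape[0]
    return np.column_stack([w[j:j + L].reshape(-1) for j in range(T - L + 1)])

# Toy LTI system x+ = 0.5 x + u, y = x, driven by a random
# (hence, almost surely, persistently exciting) input.
rng = np.random.default_rng(0)
u = rng.standard_normal(60)
x, ys = 0.0, []
for uk in u:
    ys.append(x)
    x = 0.5 * x + uk
w = np.column_stack([u, np.array(ys)])

L = 4
H = hankel(w, L)
# Fundamental lemma (noiseless case): rank(H) = m*L + n = 4 + 1, and
# every length-L input-output trajectory lies in the column span of H.
```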


[27] 2604.00379

Demand response potential evaluation of a zero carbon hydrogen metallurgy system considering shaft furnace's flexibility

The increasing penetration of intermittent renewable energy sources and the retirement of thermal units have widened the power system flexibility gap. Industrial demand response (DR) driven by real-time pricing is widely regarded as a viable solution. In this paper, we propose a framework to quantify the DR potential of a zero-carbon hydrogen metallurgy system (ZCHMS) considering the shaft furnace's flexibility. First, we model the shaft furnace as a constrained flexible load and validate the model via simulation, achieving a root mean square error of 4.48% of the rated load. Second, we formulate a DR potential evaluation method that determines baseline and DR-based production scheduling schemes by minimizing operating cost subject to production orders. Finally, the numerical results show that compared with the baseline, DR-based ZCHMS reduces operating cost by 6.6%, incentivizing demand-side management in ironmaking and strengthening power-ironmaking synergies.


[28] 2604.00380

Data-Attributed Adaptive Control Barrier Functions: Safety-Certified Training Data Curation via Influence Analysis

Learning-based adaptation of Control Barrier Function (CBF) parameters offers a promising path toward safe autonomous navigation that balances conservatism with performance. Yet the accuracy of the underlying safety predictor is ultimately constrained by training data quality, and no prior work has formally characterized how prediction errors propagate through the adaptive pipeline to degrade closed-loop safety guarantees. We introduce Data-Attributed Adaptive CBF (DA-CBF), a framework that integrates TracIn-based data attribution into adaptive CBF learning. Our theoretical contributions are fourfold: (i) corrected two-sided bounds relating the safety-loss surrogate to the CBF constraint margin; (ii) a safety margin preservation theorem showing that prediction error induces quantifiable margin degradation and, via a smooth parameter selector, yields a genuine closed-loop forward invariance guarantee not conditioned on a fixed trajectory; (iii) a CBF-QP constraint perturbation bound that links prediction accuracy directly to recursive feasibility; and (iv) a principled leave-one-out justification for influence-based data curation under explicit smoothness assumptions. On a DynamicUnicycle2D benchmark, DA-CBF reduces prediction RMSE by 35.6%, expands the certified safe operating set by 39%, and achieves collision-free navigation in a 16-obstacle environment where the uncurated baseline incurs 3 collisions.


[29] 2604.00398

RFSS: A Multi-Standard RF Signal Source Separation Dataset with 3GPP-Standardized Channel and Hardware Impairments

The coexistence of heterogeneous cellular standards (2G-5G) in shared spectrum demands sophisticated RF source separation techniques, yet no public dataset exists for data-driven research on this problem. We present RFSS (RF Signal Source Separation), an open-source dataset of 100,000 multi-source RF signal samples generated with full 3GPP standards compliance. The dataset covers GSM (TS 45.004), UMTS (TS 25.211), LTE (TS 36.211), and 5G NR (TS 38.211), with 2-4 simultaneous sources per sample plus 4,000 single-source reference samples, at 30.72 MHz sample rate. Each sample passes through independent 3GPP TDL multipath fading channels and realistic hardware impairments: carrier frequency offset, I/Q imbalance, phase noise, DC offset, and PA nonlinearity (Rapp model). Two mixing modes are provided: co-channel (all sources at baseband) and adjacent-channel (each source frequency-shifted to its standard-specific carrier). The dataset totals 103 GB in HDF5 format with a 70/15/15 train/validation/test split. We benchmark five methods: FastICA, Frobenius-norm NMF, Conv-TasNet, DPRNN, and a CNN-LSTM baseline, evaluated using permutation-invariant SI-SINR (PI-SI-SINR). Conv-TasNet achieves -21.18 dB PI-SI-SINR on 2-source mixtures versus -34.91 dB for ICA, a 13.7 dB improvement. On co-channel mixtures, Conv-TasNet reaches -12.34 dB versus -28.04 dB for ICA and -16.19 dB for NMF. The dataset and evaluation code are publicly released at submission time.
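
For reference, a permutation-invariant SI-SNR score of the kind used in such benchmarks can be computed as below; this is the standard scale-invariant SNR with a search over source permutations (a generic sketch, not the dataset's exact PI-SI-SINR code).

```python
import numpy as np
from itertools import permutations

def si_snr(est, ref, eps=1e-12):
    """Scale-invariant SNR (dB) of an estimate against a reference."""
    est, ref = est - est.mean(), ref - ref.mean()
    s = (est @ ref) / (ref @ ref + eps) * ref      # projection onto ref
    e = est - s                                    # residual distortion
    return 10.0 * np.log10((s @ s + eps) / (e @ e + eps))

def pi_si_snr(ests, refs):
    """Best average SI-SNR over all estimate-to-reference assignments."""
    K = len(refs)
    return max(
        np.mean([si_snr(ests[p[i]], refs[i]) for i in range(K)])
        for p in permutations(range(K))
    )
```

The factorial permutation search is fine for 2-4 sources; larger source counts would use a Hungarian-style assignment instead.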


[30] 2604.00400

Explainable Functional Relation Discovery for Battery State-of-Health Using Kolmogorov-Arnold Network

Battery health management is heavily dependent on reliable State-of-Health (SoH) estimation to ensure battery safety with maximized energy utilization. Although SoH estimation can effectively track battery degradation, it requires continuous battery data acquisition. In addition, model-based SoH estimation methods rely on accurate battery model knowledge, whereas data-driven approaches often suffer from limited interpretability. In contrast, analytical characterization of SoH will offer a direct and tractable handle on battery performance degradation, while also establishing a foundation for further analytical studies toward effective battery health management. Thus, in this work, we propose a Kolmogorov-Arnold Network (KAN)-based data-driven pipeline to establish a functional relationship for SoH degradation using battery temperature data. Specifically, we learn long-term battery thermal dynamics and battery heat generation via learnable activation functions of our KAN model. We utilize the learned mapping to obtain an explicit functional relationship between SoH degradation and cycle number. The proposed pipeline was validated using real-world data, yielding a closed-form analytical formula of SoH degradation with high accuracy.


[31] 2604.00408

Fundamental Analysis of Scalable Fluid Antenna Systems: Identifiability Limits, Information Theory, and Joint Processing

Unlike fixed-position arrays with static observation entropy, the scalable fluid antenna system (S-FAS) can dynamically adjust its aperture to form different observation spaces with configuration-dependent entropy budgets. This reconfigurability requires an information-theoretic framework beyond traditional algebraic identifiability analysis. This paper establishes an observation entropy framework for S-FAS, which unifies the derivation of identifiability limits, the diagnosis of processing bottlenecks, and system design optimization. For an S-FAS with mutual coupling suppression, we derive a complete capacity hierarchy among compressed, extended, and jointly stacked configurations. The entropy framework reveals that sequential two-stage processing suffers from an information bottleneck that restricts achievable capacity, while the noise entropy ratio can be used to distinguish fundamental performance limits from algorithmic deficiencies. A joint MUSIC algorithm is proposed to approach the theoretical joint capacity bound. Extensive Monte Carlo simulations, validated by both algebraic and information-theoretic criteria, verify the derived capacity hierarchy and identifiability boundaries.


[32] 2604.00415

Dynamic Weight Optimization for Double Linear Policy: A Stochastic Model Predictive Control Approach

The Double Linear Policy (DLP) framework guarantees a Robust Positive Expectation (RPE) under optimized constant-weight designs or admissible prespecified time-varying policies. However, the sequential optimization of these time-varying weights remains an open challenge. To address this gap, we propose a Stochastic Model Predictive Control (SMPC) framework. We formulate weight selection as a receding-horizon optimal control problem that explicitly maximizes risk-adjusted returns while enforcing survivability and predicted positive expectation constraints. Notably, an analytical gradient is derived for the non-convex objective function, enabling efficient optimization via the L-BFGS-B algorithm. Empirical results demonstrate that this dynamic, closed-loop approach improves risk-adjusted performance and drawdown control relative to constant-weight and prescribed time-varying DLP baselines.
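
The receding-horizon weight selection can be illustrated with a toy version: maximize a mean-variance surrogate of terminal double-linear-policy wealth over sampled return scenarios, using SciPy's L-BFGS-B with box bounds. The objective and constants below are illustrative, not the paper's risk-adjusted formulation or constraints.

```python
import numpy as np
from scipy.optimize import minimize

def dlp_wealth(alpha, returns):
    """Double linear policy: a long account compounding (1 + alpha*r)
    and a short account compounding (1 - alpha*r); total terminal wealth."""
    long_acct = short_acct = 0.5
    for r in returns:
        long_acct *= 1.0 + alpha * r
        short_acct *= 1.0 - alpha * r
    return long_acct + short_acct

def choose_weight(scenarios, lam=2.0, alpha_max=0.5):
    """Receding-horizon weight selection: maximize a mean-variance
    surrogate of terminal wealth over sampled return scenarios, via
    L-BFGS-B with box bounds (hypothetical objective and constants)."""
    def neg_obj(a):
        w = np.array([dlp_wealth(a[0], s) for s in scenarios])
        return -(w.mean() - lam * w.var())
    res = minimize(neg_obj, x0=[0.1], method="L-BFGS-B",
                   bounds=[(0.0, alpha_max)])
    return float(res.x[0])
```

In a closed loop, `choose_weight` would be re-run each period on freshly sampled scenarios, applying only the first weight of each horizon.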


[33] 2604.00429

Distributed Safety-Critical Control of Multi-Agent Systems with Time-Varying Communication Topologies

Coordinating multiple autonomous agents to reach a target region while avoiding collisions and maintaining communication connectivity is a core problem in multi-agent systems. In practice, agents have a limited communication range. Thus, network links appear and disappear as agents move, making the topology state-dependent and time-varying. Existing distributed solutions to multi-agent reach-avoid problems typically assume a fixed communication topology, and thus are not applicable in the presence of the discontinuities that arise from time-varying topologies. This paper presents a distributed optimization-based control framework that addresses these challenges through two complementary mechanisms. First, we introduce a truncation function that converts the time-varying communication graph into a smoothly state-dependent one, ensuring that constraints remain continuous as communication links are created or removed. Second, we employ auxiliary mismatch variables with two-time-scale dynamics to decouple globally coupled state-dependent constraints, yielding a singular perturbation system that each agent can solve using only local information and neighbor communication. Through singular perturbation analysis, we prove that the distributed controller guarantees collision avoidance, connectivity preservation, and convergence to the target region. We validate the proposed framework through numerical simulations involving multi-agent navigation with obstacles and time-varying communication topologies.


[34] 2604.00480

Penalty-Free Two-Step Optimization of Higher-Order Ising Problems for Two-Dimensional Line-Controlled RIS

Reconfigurable intelligent surfaces (RISs) are often assumed to allow continuous phase control over all elements, leading to hardware cost that scales with the number of elements. Treating the phase of each element as a discrete variable is essential for improving cost effectiveness toward ubiquitous RIS deployment. However, the resulting discrete optimization problem is inherently difficult to solve. To address this challenge, this letter proposes a two-dimensional line-control method to reduce the degrees of freedom of the phase variables. The formulation yields a fourth-order objective function and is not directly compatible with physical optimizers such as coherent Ising machines and quantum annealers, which are designed for quadratic interactions. Conventional methods for reducing the order of the objective function with additional auxiliary variables increase the number of variables and require additional penalty parameters, limiting scalability. We therefore propose a two-step optimization method that transforms the fourth-order objective into two successive quadratic optimization problems. For a RIS with 5,476 elements, the required number of discrete variables is reduced from 11,100 to 5,476. Experiments using a real coherent Ising machine demonstrated that the proposed approach solved the discrete-phase optimization problem with 5,476 elements, while limiting the beamforming-gain loss to 2 dB compared with the full continuous-control case.


[35] 2604.00490

Incremental stability in $p=1$ and $p=\infty$: classification and synthesis

All Lipschitz dynamics with the weak infinitesimal contraction (WIC) property can be expressed as a Lipschitz nonlinear system in proportional negative feedback -- this statement, a ``structure theorem,'' is true in the $p=1$ and $p=\infty$ norms. Equivalently, a Lipschitz vector field is WIC if and only if it can be written as a scalar decay plus a Lipschitz-bounded residual. We put this theorem to work by using neural networks to approximate Lipschitz functions. This results in a map from unconstrained parameters to the set of WIC vector fields, enabling standard gradient-based training with no projections or penalty terms. Because the induced $1$- and $\infty$-norms of a matrix reduce to row or column sums, Lipschitz certification costs only $O(d^2)$ operations -- the same order as a forward pass and appreciably cheaper than eigenvalue or semidefinite methods for the $2$-norm. Numerical experiments on a planar flow-fitting task and a four-node opinion network demonstrate that the parameterization (re-)constructs contracting dynamics from trajectory data. In a discussion of the expressiveness of non-Euclidean contraction, we prove that the set of $2\times 2$ systems that contract in a weighted $1$- or $\infty$-norm is characterized by an eigenvalue cone, a strict subset of the Hurwitz region that quantifies the cost of moving away from the Euclidean norm.
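
The cheap certification step is easy to make concrete. The sketch below is an illustration under our own assumptions, not the paper's code: it bounds the ∞-norm Lipschitz constant of a dense network by the product of maximum absolute row sums, then builds a vector field of the "scalar decay plus Lipschitz-bounded residual" form by rescaling the residual.

```python
import numpy as np

def lip_inf(weights):
    """O(d^2)-per-layer upper bound on the inf-norm Lipschitz constant of
    a dense network with 1-Lipschitz activations: the product of each
    layer's maximum absolute row sum (the induced inf-norm)."""
    return float(np.prod([np.abs(W).sum(axis=1).max() for W in weights]))

def wic_vector_field(weights, c):
    """Toy parameterization in the spirit of the structure theorem:
    f(x) = -c*x + g(x) with Lip_inf(g) <= c, enforced by rescaling the
    residual network (our own construction, not the paper's)."""
    scale = min(1.0, c / lip_inf(weights))
    def f(x):
        h = x
        for W in weights:
            h = np.tanh(W @ h)          # tanh is 1-Lipschitz
        return -c * x + scale * h       # scalar decay + bounded residual
    return f
```

Because the certificate is just row sums, it can be evaluated (and differentiated through) at every training step at roughly the cost of a forward pass.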


[36] 2604.00496

The QuadSoft: Design, Construction, and Experimental Validation of a Soft and Actuated Quadrotor

This paper presents QuadSoft, a novel fully actuated quadrotor equipped with continuous-curvature, tendon-driven soft robotic arms. The design combines a semi-rigid central frame with flexible arms, enabling controlled structural reconfiguration during flight without altering the propeller layout. Unlike existing soft aerial platforms that rely on discrete bending joints, QuadSoft utilizes a continuum deformation approach to modulate arm curvature, actively adjusting its thrust vector and aerodynamic characteristics. We characterize the geometric mapping between servomotor input and the resulting constant curvature, validating it experimentally. Outdoor flight tests demonstrate stable take-off, hover, directional maneuvers, and landing, confirming that controlled arm bending can generate horizontal displacement while preserving altitude. Measurements of pitch, roll, and curvature angles show that the platform follows intended actuation patterns with minimal attitude deviations. These results demonstrate that QuadSoft preserves the baseline stability of rigid quadrotors while enabling morphology-driven maneuverability, all under the standard PX4 autopilot without retuning. Beyond a proof of concept, this work establishes a distinctive outdoor validation of a tendon-driven continuum morphing quadrotor, opening a new research avenue toward adaptive aerial systems that combine the safety and versatility of soft robotics with the performance of conventional UAVs.


[37] 2604.00520

BLISS: Global Blind Identification of Linear Systems with Sparse Inputs

Linear system identification and sparse dictionary learning can both be seen as structured matrix factorization problems. However, these two problems have historically been studied in isolation by the systems theory and machine learning communities. Although linear system identification enjoys a mature theory when inputs are known, blind linear system identification remains poorly understood beyond restrictive settings. In contrast, complete sparse dictionary learning has recently benefited from strong global identifiability results and scalable nonconvex algorithms. In this work, we bridge these two areas by showing that under a sparse input assumption, fully observed blind system identification becomes a generalization of complete dictionary learning. This connection allows us to develop global identifiability guarantees for blind system identification, by leveraging techniques from the complete dictionary learning literature. We further show empirically that a principled application of the alternating direction method of multipliers can globally recover the ground-truth system from a single trajectory, provided sufficient samples and input sparsity.


[38] 2604.00524

DeePC vs. Koopman MPC for Pasteurization: A Comparative Study

Data-driven predictive control methods can provide the constraint handling and optimization of model predictive control (MPC) without first-principles models. Two such methods differ in how they replace the model: Data-enabled predictive control (DeePC) uses behavioral systems theory to predict directly from input--output trajectories via Hankel matrices, while Koopman-based MPC (KMPC) learns a lifted linear state-space representation from data. Both methods are well studied on their own, but head-to-head comparisons on multivariable process control problems are few. This paper compares them on a pasteurization unit with three manipulated inputs and three measured outputs, using a neural-network-based digital twin as the plant simulator. Both controllers share identical prediction horizons, cost weights, and constraints, so that differences in closed-loop behavior reflect the choice of predictive representation. Results show that both methods achieve feasible constrained control with comparable tracking error, but with a clear trade-off: KMPC tracks more tightly under the chosen cost, while DeePC produces substantially smoother input trajectories. These results help practitioners choose between the two approaches for thermal processing applications.
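
The Koopman side of such a comparison rests on fitting a linear operator in lifted coordinates. Below is a minimal EDMD sketch on a toy system that is exactly linear in the chosen lifted coordinates; the dynamics and lifting are our own illustration, not the paper's pasteurization model.

```python
import numpy as np

def edmd(X, Y, lift):
    """Extended DMD: least-squares fit of a linear operator A acting on
    lifted snapshots, so that lift(Y) ~= A @ lift(X)."""
    Z, Zp = lift(X), lift(Y)
    return np.linalg.lstsq(Z.T, Zp.T, rcond=None)[0].T

# Toy system, exactly linear in z = (x1, x2, x1^2):
#   x1+ = 0.9 x1,   x2+ = 0.5 x2 + 0.2 x1^2.
lift = lambda X: np.vstack([X[0], X[1], X[0] ** 2])
rng = np.random.default_rng(0)
X = rng.standard_normal((2, 50))                     # snapshot states
Y = np.vstack([0.9 * X[0], 0.5 * X[1] + 0.2 * X[0] ** 2])
A = edmd(X, Y, lift)                                 # 3x3 lifted predictor
```

A KMPC controller would use `A` (extended with an input matrix) as the prediction model inside a standard linear MPC problem, which is what makes the comparison with DeePC's Hankel-based predictor meaningful.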


[39] 2604.00540

Sequential Monte Carlo for Network Resilience Assessment and Control

Resilience is emerging as a key requirement for next-generation wireless communication systems, requiring the ability to assess and control rare, path-dependent failure events arising from sequential degradation and delayed recovery. In this work, we develop a sequential Monte Carlo (SMC) framework for resilience assessment and control in networked systems. Resilience failures are formulated as staged, path-dependent events and represented through a reaction-coordinate-based decomposition that captures the progression toward non-recovery. Building on this structure, we propose a multilevel splitting approach with fixed, semantically interpretable levels and a budget-adaptive population control mechanism that dynamically allocates computational effort under a fixed total simulation cost. The framework is further extended to incorporate mitigation policies by leveraging SMC checkpoints for policy evaluation, comparison, and state-contingent selection via simulation-based lookahead. A delay-critical wireless network use case is considered to demonstrate the approach. Numerical results show that the proposed SMC method significantly outperforms standard Monte Carlo in estimating rare non-recovery probabilities and enables effective policy-driven recovery under varying system conditions. The results highlight the potential of SMC as a practical tool for resilience-oriented analysis and control in future communication systems.
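
The multilevel-splitting idea can be sketched on a toy rare event: a negatively drifting random walk exceeding a high level within a horizon. The code below is a generic illustration, not the paper's network-specific reaction-coordinate scheme: particles are checkpointed when they cross each intermediate level, resampled for the next stage, and the rare-event probability is the product of the stage-wise conditional crossing fractions.

```python
import numpy as np

def run_until(rng, s0, t0, level, horizon, drift=-0.3):
    """Advance a random walk from state s0 at time t0 until it crosses
    `level` or runs out of time; return (crossed, state, time)."""
    s, t = s0, t0
    while t < horizon:
        s += drift + rng.standard_normal()
        t += 1
        if s >= level:
            return True, s, t
    return False, s, t

def splitting_estimate(levels, n=4000, horizon=30, seed=1):
    """Fixed-level multilevel splitting: checkpoint survivors at each
    level, resample them for the next stage, and multiply the
    stage-wise conditional crossing fractions."""
    rng = np.random.default_rng(seed)
    checkpoints = [(0.0, 0)]
    p_hat = 1.0
    for level in levels:
        survivors = []
        for _ in range(n):
            s0, t0 = checkpoints[rng.integers(len(checkpoints))]
            ok, s, t = run_until(rng, s0, t0, level, horizon)
            if ok:
                survivors.append((s, t))
        if not survivors:
            return 0.0
        p_hat *= len(survivors) / n
        checkpoints = survivors
    return p_hat
```

The checkpointed states are also the natural hook for policy evaluation: a mitigation policy can be rolled out from each checkpoint instead of the nominal dynamics.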


[40] 2604.00542

Star-Tracker-Constrained Attitude MPC for CubeSats

This paper presents an online linear model predictive control (MPC) framework for slew maneuvers that maintains star-tracker availability during ground-target tracking. The nonlinear rigid-body dynamics and geometric exclusion constraints are analytically linearized about the current state estimate at each control step, yielding a time-varying linear MPC formulation cast as a standard quadratic program (QP). This structure is compatible with established aerospace flight-software practices and offers a computational profile with lower online complexity than comparable nonlinear MPC schemes. The controller incorporates angular-rate, actuator, and star-tracker exclusion constraints over a receding horizon. Performance is assessed in high-fidelity nonlinear model-in-the-loop simulations using NASA's "42" spacecraft dynamics simulator, including a Monte Carlo campaign over varying target geometries and inertia perturbations.


[41] 2604.00561

Beyond Bounded Noise: Stochastic Set-Membership Estimation for Nonlinear Systems

In this paper, we derive a novel procedure for set-membership estimation of dynamical systems affected by stochastic noise with unbounded support. By employing a bound on the sample covariance matrix, we are able to provide a finite-sample uncertainty set containing the true system parameters with high probability. Our approach can be natively applied to a wide class of nonlinear systems affected by sub-Gaussian noise. Through our analysis, we provide conditions under which the proposed uncertainty set converges to the true system parameters and establish an upper bound on the convergence rate. The proposed uncertainty set can be used directly for the synthesis of robust controllers with probabilistic stability and performance guarantees. Concluding numerical examples demonstrate the advantages of the proposed formulation over established approaches.


[42] 2604.00564

Robust IMMPC: An Offset-free MPC for Rejecting Unknown Disturbances

Output regulation is the problem of finding a control input to asymptotically track reference trajectories and reject disturbances. This can be addressed by using the internal model principle to embed a model of the disturbance in the controller. In this work, we present a Model Predictive Control scheme to achieve offset-free control. To do so, we extend Internal Model MPC to general bounded disturbances that need not be generated by the disturbance model. We show recursive feasibility, constraint satisfaction, and provide convergence conditions for the optimal reachable output. The proposed controller is validated on a four-tank system.


[43] 2604.00565

Typical Scenarios Generation Method Considering System-level Characteristics of Power System

This paper proposes a method for generating typical scenarios based on the system-level macroscopic characteristics of the power system and its stability properties. First, considering uncertainties such as renewable energy generation in power-electronics-dominated power systems, multidimensional scaling is used to construct an electrical coordinate system. Based on this, system-level characteristics of the distribution of physical quantities, such as power generation and load, are characterized. Furthermore, a method for generating typical scenarios based on these system-level characteristics and stability properties is proposed. For the obtained joint probability distribution of system-level characteristics, a weighted Mahalanobis distance can be used to predict the stability properties of random scenarios. Finally, the typicality and representativeness of the scenarios generated by the proposed method with respect to stability properties are verified on the CSEE benchmark case, and stability prediction for random scenarios is achieved using a probabilistic testing method.


[44] 2604.00566

Toward Efficient Deployment and Synchronization in Digital Twins-Empowered Networks

Digital twins (DTs) are envisioned as a key enabler of the cyber-physical continuum in future wireless networks. However, efficient deployment and synchronization of DTs in dynamic multi-access edge computing (MEC) environments remains challenging due to time-varying communication and computational resources. This paper investigates the joint optimization of DT deployment and synchronization in dynamic MEC environments. A deep reinforcement learning (DRL) framework is proposed for adaptive DT placement and association to minimize interaction latency between physical and digital entities. To ensure semantic freshness, an update scheduling policy is further designed to minimize the long-term weighted sum of the Age of Changed Information (AoCI) and the update cost. A relative policy iteration algorithm with a threshold-based structure is developed to derive the optimal policy. Simulation results show that the proposed methods achieve lower latency, enhanced information freshness, and reduced system cost compared with benchmark schemes.


[45] 2604.00572

CRLB Minimization for ISAC Systems with Segmented Waveguide-Enabled Pinching Antenna

Pinching-antenna (PA) has recently attracted considerable research attention in wireless systems, realized by attaching small dielectric particles along a waveguide. Building on this concept, the segmented waveguide-enabled pinching-antenna system (SWAN) has been proposed to mitigate the inter-antenna radiation problem in uplink transmissions of conventional PA systems. In this work, SWAN-assisted integrated sensing and communication (ISAC) is investigated, where a base station (BS) equipped with SWAN provides downlink communications for multiple communication users (CUs) and performs sensing for multiple targets. The dual-functional signals transmitted by the BS are radiated by the SWAN, and the echo signals reflected by the targets are captured by the SWAN and relayed to the BS for estimating the locations of the targets. We formulate a Cramér-Rao lower bound (CRLB) minimization problem to evaluate the performance of the ISAC system, where the CRLB of the location estimation is minimized under communication rate constraints. To jointly optimize the beamforming and the PA positions of the SWAN, we develop a Riemannian manifold optimization (RMO) method, where each variable is constrained on its corresponding Riemannian manifold, and a Riemannian product manifold (RPM) is constructed as the solution space. A penalty method combined with Riemannian Broyden-Fletcher-Goldfarb-Shanno (RBFGS) algorithm is applied to obtain a feasible solution. Simulation results show that the proposed SWAN-assisted ISAC system yields superior CRLB performance for target localization compared with existing schemes including the multi-waveguide-enabled pinching-antenna-assisted ISAC systems.


[46] 2604.00583

SAR/ISAR Imaging in 6G Network

Imaging is a crucial sensing function that finds wide applications in environmental reconstruction, autonomous driving, etc. However, the signal processing methods for existing radio imaging techniques, such as millimeter wave (mmWave) imaging, require high-resolution range estimation enabled by Gigahertz-level or even Terahertz-level bandwidth, and cannot be applied in 6G integrated sensing and communication (ISAC) network with Megahertz-level bandwidth. This paper proposes two novel high-resolution radio imaging schemes that can work on the 6G signals with limited bandwidth - bandwidth-independent synthetic aperture radar (BI-SAR), where the movable base station (BS) revolves around the static targets through 360 degrees; as well as bandwidth-independent inverse synthetic aperture radar (BI-ISAR), where the BS is static and the targets revolve about an axis through 360 degrees. Unlike their conventional SAR and ISAR counterparts that rely on range estimation, our proposed imaging schemes solely utilize Doppler information to perform imaging without any range information. The main technical challenge of our schemes lies in the anisotropic scattering functions over different directions, which hinder the coherent synthesis of the backscattered signals from all directions. We design an iterative adaptive approach-based Doppler association (IAA-DA) algorithm to tackle the above issue. Moreover, we also derive the imaging resolution to characterize the reconstruction quality. Real-world experiments are provided to show the feasibility and the effectiveness of our proposed 6G imaging schemes.


[47] 2604.00588

Single-Waveguide Multiple-Pinching-Antenna Systems: OMA versus NOMA

This paper investigates the performance of a pinching-antenna (PA) system with a single waveguide and multiple pinching antennas to serve users distributed across multiple rooms. The performance of the system is evaluated through a comparative analysis under both orthogonal multiple access (OMA) and non-orthogonal multiple access (NOMA) schemes. Specifically, this paper derives closed-form expressions for the outage probability (OP) and ergodic rate (ER) in each scheme. Furthermore, asymptotic analyses are conducted to characterize the system behavior in the high signal-to-noise ratio (SNR) regime. Extensive Monte Carlo simulations are utilized to validate the accuracy of the analytical derivations. The comparative results can be summarized as follows: 1) in the downlink fixed-rate scenario, whether OMA or NOMA achieves better outage performance depends on system parameters, such as the number of users and power allocation coefficients; 2) in the uplink fixed-rate scenario, the outage performance of NOMA is inferior to that of OMA in the high-SNR regime, and the decay rate of the OP for NOMA users depends on the rate thresholds; and 3) for both uplink and downlink adaptive-rate scenarios, the rate performance comparison of the two schemes depends on system parameters in the low-SNR regime, whereas OMA generally outperforms NOMA in the high-SNR regime.
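
The kind of OMA-versus-NOMA outage comparison described above can be sketched by Monte Carlo for a generic two-user downlink with Rayleigh fading. This is illustrative only: the paper's analysis additionally models the pinching-antenna geometry and derives closed forms, and the channel statistics and power split below are our own assumptions.

```python
import numpy as np

def outage_probs(snr_db, r1=1.0, r2=1.0, a1=0.8, a2=0.2, n=200_000, seed=0):
    """Monte Carlo outage probabilities for OMA vs two-user downlink
    NOMA over Rayleigh fading (generic sketch, hypothetical parameters).
    User 1 is the weak user and receives the larger power share a1."""
    rng = np.random.default_rng(seed)
    snr = 10 ** (snr_db / 10)
    g1 = rng.exponential(0.5, n)   # weaker average channel gain
    g2 = rng.exponential(2.0, n)
    # NOMA: user 1 treats user 2's signal as noise; user 2 first decodes
    # user 1's signal (SIC), then its own interference-free signal.
    sinr1 = a1 * snr * g1 / (a2 * snr * g1 + 1)
    sinr12 = a1 * snr * g2 / (a2 * snr * g2 + 1)
    sinr2 = a2 * snr * g2
    out1_noma = np.mean(np.log2(1 + sinr1) < r1)
    out2_noma = np.mean((np.log2(1 + sinr12) < r1) | (np.log2(1 + sinr2) < r2))
    # OMA: each user gets half the time slot at full power.
    out1_oma = np.mean(0.5 * np.log2(1 + snr * g1) < r1)
    out2_oma = np.mean(0.5 * np.log2(1 + snr * g2) < r2)
    return out1_noma, out2_noma, out1_oma, out2_oma
```

Sweeping `snr_db` and the power coefficients reproduces the qualitative message of the abstract: which scheme wins depends on the operating regime and parameters.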


[48] 2604.00595

Toward Robust Semantic Communications: Proactive Importance-Ordered Restructuring for Enhanced Unequal Error Protection

Semantic communications (SemCom) is a promising task-oriented paradigm in which semantic features exhibit non-uniform importance. Consequently, unequal error protection (UEP), which allocates resources based on semantic importance, plays a pivotal role in maximizing system utility. However, most existing schemes adopt passive importance evaluation, which neither proactively reshapes the importance distribution nor explores its impact on UEP performance. In this paper, we propose a novel importance-ordered semantic feature restructuring (ISFR) scheme that proactively enforces a descending importance hierarchy and jointly optimizes multi-dimensional resources to improve system utility. Specifically, modules with decreasing retention probabilities and increasing distortion levels are employed, which drive the model to concentrate key semantics into front-end features and thus strengthen importance differentiation. Moreover, an optimization problem that jointly optimizes channel matching, feature selection, modulation schemes, and power allocation is formulated to minimize the importance-weighted total semantic distortion. To solve this non-convex problem, a hierarchical decoupling strategy is proposed, which decomposes it into four tractable subproblems. This approach leverages the ordered prior to drastically prune the search space for feature selection and modulation, while integrating greedy-based channel matching and convex power allocation. Simulation results demonstrate that the proposed ISFR scheme outperforms traditional uniform importance-based schemes under harsh channel conditions and limited resources, validating the significant robustness improvement enabled by the concentration of key semantic information.
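The greedy channel-matching subproblem admits a very compact sketch: pair the most important features with the strongest subchannels by sorting both. The numbers below are hypothetical, and this is only the intuition behind the greedy step, not the paper's full joint optimization:

```python
import numpy as np

importance = np.array([0.9, 0.5, 0.2, 0.8])    # semantic importance per feature
channel_gain = np.array([1.2, 0.3, 2.0, 0.7])  # quality of each subchannel

# greedy matching: k-th most important feature -> k-th strongest subchannel
feat_order = np.argsort(-importance)
chan_order = np.argsort(-channel_gain)
match = {int(f): int(c) for f, c in zip(feat_order, chan_order)}
```

Here the most important feature (index 0) is assigned the strongest subchannel (index 2), and so on down the two sorted orders.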


[49] 2604.00631

Optimal GNSS Time Tracking for Long-term Stable Time Realisation in Synchronised Atomic Clocks

In this manuscript, we propose a novel optimal Global Navigation Satellite System (GNSS) time tracking algorithm to collectively steer an ensemble of synchronising miniature atomic clocks towards standard GNSS time. The synchronising miniature atomic clocks generate a common synchronised time with good short-term performance, but its accuracy and precision, as measured by the Allan variance, deteriorate in the long run. A supervisor therefore designs and periodically broadcasts the proposed GNSS time tracking control to the ensemble clocks, steering the ensemble average towards the average of the GNSS receivers, which provide GNSS time. The tracking control is constructed using a Kalman filter that estimates the difference between the average of the GNSS receivers and the average of the ensemble clocks from relative clock readings between each GNSS receiver and its adjacent ensemble clock. Under the influence of the periodically received tracking control, the stabilised ensemble clocks achieve better long-term accuracy and precision over long averaging periods. Since the tracking control is designed to influence only the ensemble average, the tracking process does not interfere with the synchronisation process and vice versa. The feedback matrix associated with the tracking control is obtained from an optimisation problem that minimises the steady-state Allan variance. Numerical results demonstrate the efficacy of the proposed algorithm in enhancing long-term performance.
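The core estimation step can be sketched as a scalar Kalman filter tracking a random-walk clock offset from noisy relative readings (the noise variances below are illustrative placeholders, and the paper's filter operates on vector ensemble averages rather than a single clock):

```python
import numpy as np

rng = np.random.default_rng(1)
q, r = 1e-4, 1e-2            # process / measurement noise variances (assumed)
x_true, x_hat, P = 0.0, 0.0, 1.0
for _ in range(500):
    x_true += rng.normal(0.0, np.sqrt(q))     # random-walk clock offset
    z = x_true + rng.normal(0.0, np.sqrt(r))  # relative clock reading
    P += q                                    # predict: covariance grows
    K = P / (P + r)                           # Kalman gain
    x_hat += K * (z - x_hat)                  # update the offset estimate
    P *= (1.0 - K)                            # posterior covariance shrinks
```

The steady-state posterior variance settles near sqrt(q*r), so the estimate tracks the drifting offset far more tightly than any single raw reading.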


[50] 2604.00659

Battery Electric Truck Infrastructure Co-design via Joint Optimization and Agent-based Simulation

As zero-emission zones emerge in European cities, fleet operators are shifting to electric vehicles. To maintain their current operations, a clear understanding of the charging infrastructure required and its relationship to existing power grid limitations is needed. This study presents an optimization framework for jointly designing charging infrastructure and schedules within a logistics distribution network, validated through agent-based simulations. We formulate the problem as a mixed-integer linear program and develop an agent-based model to evaluate various designs and operations under stochastic conditions. Our experiments compare rule-based and optimized strategies in a case study of the Netherlands. Results show that current commercial solutions suffice for middle-mile logistics, with central co-design yielding average cost reductions of 5.2% to 6.4% and an average 20.1% decrease in total installed power. While rule-based control effectively manages charging operations and mitigates delays, optimizing charge scheduling significantly reduces queuing times (99%), charging costs (13.5%), and time spent near capacity (10.9%). Our optimization-simulation framework paves the way for combining optimized infrastructure planning and realistic fleet operations in digital-twin environments.
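A lower bound on the charger count that such a co-design must provide can be sketched with a sweep-line over charging windows: the peak number of simultaneous sessions is the minimum number of chargers needed to avoid queuing. The windows below are invented, and the paper's MILP optimizes far more (schedules, grid limits, costs):

```python
# each truck occupies a charger during its (arrival, departure) window (illustrative)
windows = [(0, 4), (1, 3), (2, 6), (5, 8), (5, 7)]
events = sorted([(a, 1) for a, d in windows] + [(d, -1) for a, d in windows])

cur = peak = 0
for _, delta in events:   # sweep over time: count concurrent charging sessions
    cur += delta
    peak = max(peak, cur)
# peak = minimum chargers needed to serve this schedule with no queuing
```

For these windows three sessions overlap at their worst, so at least three chargers are required; shifting flexible windows apart is exactly how optimized scheduling reduces installed power.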


[51] 2604.00667

Explicit MPC for Parameter Dependent Linear Systems

This paper presents two explicit Model Predictive Control formulations for linear systems parameterized in terms of design variables. Such parameter dependent behavior commonly arises from operating point dependent linearization of nonlinear systems as well as from variations in mechanical, electrical, or thermal properties associated with material selection in the design of the process or system components. In contrast to explicit MPC approaches that treat design parameter variations and dependencies as disturbances, the proposed methods incorporate the parameters directly into the system matrices in an affine manner. However, explicitly incorporating these dependencies significantly increases the complexity of explicit MPC formulations due to resulting nonlinear terms involving decision variables and parameters. We address this complexity by proposing two approximation methods. Both methods are applied to two examples, and their performances are compared with respect to the exact eMPC implementation.


[52] 2604.00673

Analytical Probabilistic Power Flow Approximation Using Invertible Neural Networks

Probabilistic power flow (PPF) is essential for quantifying operational uncertainty in modern distribution systems with high penetration of renewable generation and flexible loads. Conventional PPF methods primarily rely on Monte Carlo (MC) based power flow (PF) simulations or simplified analytical approximations. While MC approaches are computationally intensive and demand substantial data storage, analytical approximations often compromise accuracy. In this paper, we propose a novel analytical PPF framework that eliminates the dependence on MC-based PF simulations and, in principle, enables an approximation of the analytical form of arbitrary voltage distributions. The core idea is to learn an explicit and invertible mapping between stochastic power injections and system voltages using invertible neural networks (INNs). By leveraging the Change of Variable Theorem, the proposed framework facilitates direct approximation of the analytical form of voltage probability distributions without repeated PF computations. Extensive numerical studies demonstrate that the proposed framework achieves state-of-the-art performance both as a high-accuracy PF solver and as an efficient analytical PPF estimator.
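The change-of-variables mechanism at the heart of the framework can be shown with a one-dimensional invertible map (an affine toy map standing in for the learned INN): the density of the output is the input density at the inverse image times the Jacobian of the inverse.

```python
import numpy as np

# invertible map y = f(x) = 2x + 1 applied to X ~ N(0, 1)
a, b = 2.0, 1.0
y = 3.0
x = (y - b) / a                                        # inverse map f^{-1}(y)
std_normal = lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)
p_y = std_normal(x) * abs(1.0 / a)   # p_Y(y) = p_X(f^{-1}(y)) |d f^{-1}/dy|
# y = 2X + 1 is N(1, 4); compare against the known Gaussian density
p_exact = std_normal((y - 1.0) / 2.0) / 2.0
```

An INN generalizes this to high dimensions by making both the inverse map and its log-Jacobian cheap to evaluate, which is what lets the framework output analytical voltage densities without Monte Carlo power-flow runs.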


[53] 2604.00676

DF-3DRME: A Data-Friendly Learning Framework for 3D Radio Map Estimation based on Super-Resolution Technique

High-Resolution three-dimensional (3D) radio maps (RMs) provide rich information about the radio landscape that is essential to a myriad of wireless applications in the future wireless networks. Although deep learning (DL) methods have shown their effectiveness in RM construction, existing approaches require massive high-resolution 3D RM samples in the training dataset, the acquisition of which is labor-intensive and time-consuming in practice. In this paper, our goal is to devise a data-friendly high-resolution 3D RM construction solution via training over a hybrid dataset, wherein the RMs associated with a small fraction of environment maps (EMs) are of high-resolution, while those corresponding to the majority of EMs are of low-resolution. To this end, we propose a Data-Friendly 3D Radio Map Estimator (DF-3DRME), which comprises two processing stages. Specifically, in the first stage, we leverage the abundant low-resolution 3D RM samples to train a neural network, termed the LR-Net, for predicting the low-resolution 3D RM from the input EM, which provides a coarse characterization of the spatial radio propagation. In the second stage, we employ an advanced super-resolution network, termed the SR-Net, to upscale the predicted low-resolution 3D RM to its high-resolution counterpart. Unlike the LR-Net, the SR-Net can be effectively trained with only the limited high-resolution 3D RM samples available in the hybrid dataset. Experimental results demonstrate that the proposed framework achieves compelling reconstruction performance with only 4% of the EMs in the dataset having high-resolution 3D RM labels, which significantly reduces data acquisition overhead and facilitates practical deployment.


[54] 2604.00727

3D User Localization for Planar Arrays in LoS Near- and Far-Fields via Summed Phase Differences

This paper presents a phase-difference-based scheme for three-dimensional (3D) line-of-sight (LoS) user localization using a uniform planar array (UPA), applicable to both near-field and far-field regimes under the exact spherical-wave model. Unlike the previously studied two-dimensional (2D) uniform linear array (ULA) case, the 3D UPA case requires jointly exploiting the two array axes in order to recover the user's range, azimuth, and zenith angle. Adjacent-antenna phase-differences are first estimated from uplink pilots and then summed along the array axes to obtain unwrapped phase-differences between widely separated antenna elements. These summed phase-differences enable the construction of multiple three-equation systems whose solutions yield the user's range, azimuth, and zenith angle. We quantify the number of such equation systems, provide a representative closed-form estimator that uses only three phase-difference sums, and propose an all-data nonlinear least-squares estimator that exploits all available sums. Numerical results show that the least-squares estimator, when initialized by the closed-form estimate, achieves Cramér--Rao bound accuracy. Moreover, unlike state-of-the-art baseline schemes, whose performance depends on well-tuned hyperparameters, the proposed estimators are hyperparameter-free.
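The summing trick can be sketched in one dimension: adjacent-antenna phase differences are individually unambiguous (inside (-pi, pi)), so summing them yields the unwrapped phase difference between widely separated elements. The phase values below are synthetic and the real scheme works on two UPA axes jointly:

```python
import numpy as np

rng = np.random.default_rng(2)
# true per-antenna phases; adjacent differences stay inside (-pi, pi)
phi = np.cumsum(rng.uniform(-1.0, 1.0, size=8))
measured = np.angle(np.exp(1j * phi))          # phases observed modulo 2*pi
# re-wrap adjacent differences, then sum them along the array axis
d = np.angle(np.exp(1j * np.diff(measured)))
summed = float(np.sum(d))   # recovers the unwrapped difference phi[-1] - phi[0]
```

The sum telescopes to the exact end-to-end phase difference even though each individual antenna phase is only known modulo 2*pi, which is what enables the wide effective baselines used by the estimators.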


[55] 2604.00728

Learning Laplacian Forms for Graph Signal Processing via the Deformed Laplacian

Learning the graph Laplacian from observed data is one of the most investigated and fundamental tasks in Graph Signal Processing (GSP). Different variants of the Laplacian, such as the combinatorial, signless or signed Laplacians have been considered depending on the type of features to be extracted from the data. The main contribution of this paper is the introduction of a parametric Laplacian, called the deformed Laplacian, defined as a quadratic matrix polynomial that provides a parametric dictionary for graph signal processing. The deformed Laplacian can be interpreted as the generator of a parametric linear reaction-diffusion dynamics on graphs, capturing the interplay between diffusive coupling and nodal reaction effects. It is a parametric polynomial matrix that enables the design of novel topological operators tailored to both the underlying graph structure and the observed signals. Interestingly, we show that several Laplacian variants proposed in the literature arise as special cases of the deformed Laplacian. We then develop a method to jointly learn the deformed Laplacian and the graph signals from data, showing how its use improves signal representation across a broad class of graphs compared to standard Laplacian forms. Through extensive numerical experiments on both synthetic and real-world datasets, including financial and communication networks, we assess the benefits of the proposed method in terms of graph signal reconstruction error and sparsity of the representation.
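One well-known quadratic matrix polynomial of this flavor is the Bethe-Hessian-style deformed Laplacian (shown here as an illustration; the paper's exact parameterization may differ), which recovers the combinatorial and signless Laplacians as special cases at r = 1 and r = -1:

```python
import numpy as np

# path graph on 4 nodes
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))   # degree matrix
I = np.eye(4)

def deformed_laplacian(r):
    # quadratic matrix polynomial in the scalar parameter r
    return (r**2 - 1.0) * I + D - r * A
```

Sweeping r traces out a parametric dictionary of topological operators between (and beyond) these classical variants, which is the kind of family the paper learns jointly with the graph signals.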


[56] 2604.00774

Neural Vector Lyapunov-Razumikhin Certificates for Delayed Interconnected Systems

Ensuring scalable input-to-state stability (sISS) is critical for the safety and reliability of large-scale interconnected systems, especially in the presence of communication delays. While learning-based controllers can achieve strong empirical performance, their black-box nature makes it difficult to provide formal and scalable stability guarantees. To address this gap, we propose a framework to synthesize and verify neural vector Lyapunov-Razumikhin certificates for discrete-time delayed interconnected systems. Our contributions are three-fold. First, we establish a sufficient condition for discrete-time sISS via vector Lyapunov-Razumikhin functions, which enables certification for large-scale delayed interconnected systems. Second, we develop a scalable synthesis and verification framework that learns the neural certificates and verifies the certificates on reachability-constrained delay domains with scalability analysis. Third, we validate our approach on mixed-autonomy platoons, drone formations, and microgrids against multiple baselines, showing improved verification efficiency with competitive control performance.


[57] 2604.00776

Description and Discussion on DCASE 2026 Challenge Task 4: Spatial Semantic Segmentation of Sound Scenes

This paper presents an overview of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2026 Challenge Task 4, Spatial Semantic Segmentation of Sound Scenes (S5). The S5 task focuses on the joint detection and separation of sound events in complex spatial audio mixtures, contributing to the foundation of immersive communication. First introduced in DCASE 2025, the S5 task continues in DCASE 2026 Task 4 with key changes to better reflect real-world conditions, including allowing mixtures to contain multiple sources of the same class and to contain no target sources. In this paper, we describe the task setting, along with the corresponding updates to the evaluation metrics and dataset. The experimental results of the submitted systems are also reported and analyzed. The official access point for data and code is this https URL.


[58] 2604.00806

Unsupervised End-to-End Array Calibration for Multi-Target Integrated Sensing and Communication

In this work, we consider end-to-end calibration of an integrated sensing and communication (ISAC) base station (BS) under gain-phase and antenna displacement impairments without collecting signals from predefined positions (labeled data). We consider a BS with two impaired uniform linear arrays used for simultaneous multi-target sensing and communication with a user equipment (UE) leveraging orthogonal frequency-division multiplexing signals. The main contribution is the design of a framework that can compensate for the impairments without labeled data and considering coherent receive signals. We harness a differentiable precoder based on the maximum array response in an angular direction at the transmitter and the orthogonal matching pursuit (OMP) algorithm at the sensing receiver. We propose an ISAC loss as a combination of sensing and communication losses that provides a trade-off between the two functionalities. We compare two sensing objective alternatives: (i) maximize the maximum response of the angle-delay map of the targets or (ii) minimize the norm of the residual signal at the output of the OMP algorithm after all estimated targets have been removed. The communication objective maximizes the energy of the received signal at the UE. Additionally, our framework leverages an approximation of the channel gradient that avoids the impractical knowledge of the gradient of the channel. Our results show that the proposed method performs closely to using labeled data and knowledge of the channel gradient in terms of sensing position estimation and communication symbol error rate. When comparing the two sensing losses, minimizing the norm of the OMP residual yields significantly better sensing position estimation with slightly increased complexity.
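The OMP building block and the residual-norm sensing objective can be sketched on a generic sparse-recovery toy problem (random dictionary and target amplitudes are invented; the paper applies OMP to angle-delay maps under hardware impairments):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((40, 100))
A /= np.linalg.norm(A, axis=0)            # unit-norm dictionary atoms
x = np.zeros(100)
x[[7, 42, 91]] = [1.5, -2.0, 1.0]         # three "targets" on the grid
y = A @ x

# orthogonal matching pursuit: greedy atom selection + least-squares refit
support, r = [], y.copy()
for _ in range(3):
    support.append(int(np.argmax(np.abs(A.T @ r))))
    coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    r = y - A[:, support] @ coef
# np.linalg.norm(r): residual after removing all estimated targets, i.e. the
# second sensing objective discussed in the abstract
```

Minimizing this final residual norm penalizes everything the estimated targets fail to explain, which is why it acts as a calibration-friendly, label-free sensing loss.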


[59] 2604.00823

Novel Single Clad Ho-doped Fiber with High Slope Efficiency and Low Ion Pairing

We report the design and the experimental and simulated performance of a 2050 nm band fiber amplifier with high optical-optical slope efficiency and low ion pairing, using a novel high performance single clad Ho-doped fiber from the Naval Research Laboratory (NRL). We measure an optical-optical slope efficiency of 57% using 1 mW input signal power and 1860 nm pumping, which we believe is the highest slope efficiency obtained to date for a single clad, single stage, co-pumped HDFA. A new method for non-destructive measurement of the ion pairing coefficient in Ho-doped fibers is introduced and validated. Using this method, we link our 57% slope efficiency to a low ion pairing coefficient of 4% in the NRL Ho-doped fiber as derived from our experimental data. We present an overview and survey of the ion pairing results for Ho-doped fiber amplifiers and lasers reported so far in the literature.


[60] 2604.00825

Min-Max Grassmannian Optimization for Online Subspace Tracking

This paper discusses robustness guarantees for online tracking of time-varying subspaces from noisy data. Building on recent work in optimization over a Grassmannian manifold, we introduce a new approach for robust subspace tracking by modeling data uncertainty in a Grassmannian ball. The robust subspace tracking problem is cast into a min-max optimization framework, for which we derive a closed-form solution for the worst-case subspace, enabling a geometric robustness adjustment that is both analytically tractable and computationally efficient, unlike iterative convex relaxations. The resulting algorithm, GeRoST (Geometrically Robust Subspace Tracking), is validated on two case studies: tracking a linear time-varying system and online foreground-background separation in video.
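The geometric objects involved, subspaces as points on a Grassmannian and balls around them, rest on the principal-angle metric, which can be computed in a few lines (random subspaces here purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
# two 2-dimensional subspaces of R^5, each given by an orthonormal basis
U, _ = np.linalg.qr(rng.standard_normal((5, 2)))
V, _ = np.linalg.qr(rng.standard_normal((5, 2)))
s = np.clip(np.linalg.svd(U.T @ V, compute_uv=False), -1.0, 1.0)
theta = np.arccos(s)                   # principal angles between the subspaces
dist = float(np.linalg.norm(theta))    # geodesic distance on the Grassmannian
```

A Grassmannian ball of radius delta around a nominal subspace is the set of subspaces with `dist` at most delta; the min-max formulation then optimizes against the worst member of that ball.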


[61] 2604.00826

Bridging RL and MPC for mixed-integer optimal control with application to Formula 1 race strategies

We propose a hybrid reinforcement learning (RL) and model predictive control (MPC) framework for mixed-integer optimal control, where discrete variables enter the cost and dynamics but not the constraints. Existing hierarchical approaches use RL only for the discrete action space, leaving continuous optimization to MPC. Unlike these methods, we train the RL agent on the full hybrid action space, ensuring consistency with the cost of the underlying Markov decision process. During deployment, the RL actor is rolled out over the prediction horizon to parametrize an integer-free nonlinear MPC through the discrete action sequence and provide a continuous warm-start. The learned critic serves as a terminal cost to capture long-term performance. We prove recursive feasibility, and validate the framework on a Formula 1 race strategy problem. The hybrid method achieves near-optimal performance relative to an offline mixed-integer nonlinear program benchmark, outperforming a standalone RL agent. Moreover, the hybrid scheme enables adaptation to unseen disturbances through modular MPC extensions at zero retraining cost.


[62] 2604.00846

Spatial Upper Bound of Radiated Power in Active Antenna Systems

The assessment of unwanted radiated emissions from Active Antenna Systems (AAS) has become a critical issue in adjacent-band coexistence scenarios. In this paper, we establish the existence of a deterministic spatial upper bound on the radiated power of active antenna arrays. We show that the maximum radiated power always occurs in the boresight direction, irrespective of frequency or signal nature (useful signal, nonlinear distortion, or noise), or instantaneous beamforming configuration, thereby defining a conservative spatial upper bound whose angular envelope is solely determined by the elementary radiating building block of the antenna architecture, i.e., the element or sub-array radiation pattern. Starting from a two-element array with third-order nonlinearities, we derive the spatial envelope and extend the result to realistic AAS architectures. The theoretical findings are validated by over-the-air (OTA) measurements performed on a 3.5 GHz Massive Multiple-Input Multiple-Output (MIMO) antenna. The proposed approach offers a simple, robust, and measurement-oriented methodology for coexistence assessments involving beamformed radio systems.
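The envelope idea can be checked numerically for an ideal uniform linear array of isotropic elements (a simplified stand-in for the paper's element/sub-array patterns and nonlinear distortion terms): however the beamforming phases are chosen, the radiated power in any direction never exceeds the fully coherent N^2 bound.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 8                                        # half-wavelength-spaced elements
theta = np.linspace(-np.pi / 2, np.pi / 2, 721)
phases = rng.uniform(0.0, 2.0 * np.pi, N)    # arbitrary beamforming configuration
k = np.arange(N)
steer = np.pi * np.outer(np.sin(theta), k)   # inter-element phase progression
power = np.abs(np.exp(1j * (steer + phases)).sum(axis=1)) ** 2
# by the triangle inequality, power <= N**2 everywhere, with equality only
# when all element contributions add coherently
```

Multiplying this bound by the element (or sub-array) pattern gives a deterministic angular envelope of exactly the kind the paper exploits for coexistence assessment.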


[63] 2604.00863

Optimal Anchor Placement for Wireless Localization in Mixed LOS and NLOS Scenarios

We develop a unified Fisher-information framework for localization in environments with both Line-of-Sight (LOS) and Non-Line-of-Sight (NLOS) paths, focusing on diffraction-dominated NLOS propagation characteristic of Outdoor-to-Indoor (O2I) signal propagation. The model couples anchor geometry with a physically grounded path-loss law that is continuous across the LOS/NLOS boundary and serves as an optimization objective for our optimal anchor placement problem. As the first step, we analyze single-target anchor placement and derive the classical A-, D-, and E-optimality criteria. Under a specific path-loss assumption, these criteria collapse to a polygon-closure condition in the complex plane: A-, D-, and E-optimal designs coincide, yielding necessary and sufficient conditions for optimal placement. Next, we extend the notion of optimal anchor placement with respect to a single target to optimality over a feasible region (multi-target setting) using a general formulation that explicitly includes a realistic path loss model. This is achieved by recasting the anchor placement as a combinatorial anchor-selection problem with provable guarantees. Next, we specify E- and D-optimal objectives over multiple targets in a predefined feasible target region and show that E-optimality straddles A-optimality (within a constant factor), while D-optimality provides looser bounds. These insights yield two practical algorithms, both mixed-integer second-order cone programs (MISOCP) with exact E-optimal and exact D-optimal objectives that produce robust, region-wide designs under mixed LOS/NLOS conditions.
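For intuition on the A-, D-, and E-criteria and the polygon-closure condition, consider a toy range-only Fisher information matrix with three anchors at 120-degree spacing (unit noise; the paper's FIM additionally encodes the mixed LOS/NLOS path-loss law):

```python
import numpy as np

# three range-only anchors spaced 120 degrees apart around the target
angles = np.deg2rad([0.0, 120.0, 240.0])
U = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # unit anchor directions
F = U.T @ U                         # Fisher information matrix (unit noise)
w = np.linalg.eigvalsh(F)
a_crit = float(np.sum(1.0 / w))     # A-optimality: trace of the CRB
d_crit = float(np.prod(w))          # D-optimality: determinant of the FIM
e_crit = float(w.min())             # E-optimality: smallest eigenvalue
```

Because the three unit direction vectors close a polygon (they sum to zero), the FIM is isotropic, F = 1.5 I, and all three criteria are simultaneously optimized, which mirrors the coincidence of A-, D-, and E-optimal designs described in the abstract.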


[64] 2604.00864

DOA Estimation for Low-Altitude Networks: HAD Architectures, Methods, and Challenges

With the rapid expansion of low-altitude economy (LAE) services and the growing demand for integrated sensing and communication (ISAC) in air-ground networks, reliable direction-of-arrival (DOA) estimation has become essential for both directional communication and sensing functions. DOA underpins beam alignment, spatial-reuse scheduling, and ISAC-critical tasks such as airspace situational awareness and multi-target monitoring. Hybrid analog-digital (HAD) architectures have emerged as a practical solution for large-aperture directional operation under stringent radio frequency (RF), analog-to-digital converter (ADC), and size, weight, and power (SWaP) constraints. However, HAD compresses antenna-domain observations through analog combining, fundamentally reshaping the measurement model and introducing new algorithmic and system-level challenges for DOA estimation. This article first reviews the principles and representative architectures of HAD, highlighting their advantages for scalable beam-centric and ISAC-oriented operation in LAE scenarios. We then provide a structured overview of HAD-enabled DOA estimation methodologies, including spatial covariance matrix (SCM) reconstruction, multi-combiner scan-based acquisition, and pilot-aided estimation, along with key design tradeoffs. Finally, we discuss open challenges and outline reliability-driven research directions toward robust, deployable HAD-enabled DOA solutions for practical ISAC-enabled low-altitude environments.


[65] 2604.00900

Soft projections for robust data-driven control

We consider data-based predictive control based on behavioral systems theory. In the linear setting this means that a system is described as a subspace of trajectories, and predictive control can be formulated using a projection onto the intersection of this behavior and a constraint set. Instead of learning the model, or subspace, we focus on determining this projection from data. Motivated by the use of regularization in data-enabled predictive control (DeePC), we introduce the use of soft projections, which approximate the true projector onto the behavior from noisy data. In the simplest case, these are equivalent to known regularized DeePC schemes, but they exhibit a number of benefits. First, we provide a bound on the approximation error consisting of a bias and a variance term that can be traded-off by the regularization weight. The derived bound is independent of the true system order, highlighting the benefit of soft projections compared to low-dimensional subspace estimates. Moreover, soft projections allow for intuitive generalizations, one of which we show has superior performance on a case study. Finally, we provide update formulas for soft projectors enabling the efficient adaptation of the proposed data-driven control methods in the case of streaming data.
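One simple instance of a soft projector is a ridge-regularized least-squares projection onto the column span of a data matrix (an assumption for illustration; the paper's construction targets noisy trajectory data and regularized DeePC more generally). As the regularization weight vanishes, it approaches the exact orthogonal projector:

```python
import numpy as np

rng = np.random.default_rng(6)
Y = rng.standard_normal((10, 3))   # data matrix whose columns span the behavior
lam = 1e-8                         # regularization weight

# soft projector: ridge-regularized least-squares projection onto col(Y)
P_soft = Y @ np.linalg.solve(Y.T @ Y + lam * np.eye(3), Y.T)

Q, _ = np.linalg.qr(Y)
P_hard = Q @ Q.T                   # exact orthogonal projector onto col(Y)
```

Increasing `lam` shrinks the projection and trades variance (sensitivity to noise in Y) against bias, which is the trade-off quantified by the paper's approximation-error bound.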


[66] 2604.00926

Dispatch-Embedded Long-Term Tail Risk Assessment and Mitigation via CVaR for Renewable Power Systems

Renewable energy (RE) generation exhibits pronounced seasonality and variability, and neglecting these features can lead to significant underestimation of long-term risks to the power supply. While long-term dispatch strategies are essential for evaluating and mitigating tail risks, they are often excluded from existing models due to their complexity. This paper proposes a long-term tail risk assessment and mitigation framework for renewable power systems, explicitly embedding dispatch strategies. A representative scenario generation method is designed, combining multi-timescale Copula modeling to capture RE's long-range variability and correlation. Building on these scenarios, an evolution-based risk assessment model is established, where Conditional Value-at-Risk (CVaR) is employed as a robust metric to quantify tail risks. Finally, a controlled evolution-based risk mitigation scheme is introduced to refine long-term dispatch strategies for mitigating tail risks. Case studies on a modified IEEE-39 bus system incorporating real-world data substantiate the efficacy of the proposed method.
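The CVaR tail-risk metric is straightforward to estimate from loss samples: it is the average of losses beyond the Value-at-Risk quantile. A minimal sketch with an exponential loss distribution (chosen only because its VaR and CVaR are known in closed form):

```python
import numpy as np

rng = np.random.default_rng(7)
losses = rng.exponential(1.0, size=100_000)   # stand-in for shortfall samples
alpha = 0.95
var = float(np.quantile(losses, alpha))       # Value-at-Risk at level alpha
cvar = float(losses[losses >= var].mean())    # CVaR: mean loss beyond VaR
# for Exp(1) losses: VaR = -ln(1 - alpha) and, by memorylessness, CVaR = VaR + 1
```

Unlike the VaR quantile alone, CVaR accounts for how severe losses are once the tail is entered, which is why it is the natural robust metric for quantifying rare long-term supply shortfalls.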


[67] 2604.00935

Polynomial Parametric Koopman Operators for Stochastic MPC

This paper develops a parametric Koopman operator framework for Stochastic Model Predictive Control (SMPC), where the Koopman operator is parametrized by Polynomial Chaos Expansions (PCEs). The model is learned from data using the Extended Dynamic Mode Decomposition -- Dictionary Learning (EDMD-DL) method, which preserves the convex least-squares structure for the PCE coefficients of the EDMD matrix. Unlike conventional stochastic Galerkin projection approaches, we derive a condensed deterministic reformulation of the SMPC problem whose dimension scales only with the control horizon and input dimension, and is independent of both the lifted state dimension and the number of retained PCE terms. Our framework therefore enables efficient solution of nonlinear SMPC problems with expectation and second-order moment constraints using standard convex optimization solvers. Numerical examples demonstrate the efficacy of our framework for uncertainty-aware SMPC of nonlinear systems.


[68] 2604.00950

Mean-Field Control of Adherence in Participation-Coupled Vehicle Rebalancing Systems

Human driver participation is a critical source of uncertainty in Mobility-on-Demand (MoD) rebalancing. Drivers follow platform recommendations probabilistically, and their willingness to comply evolves with experienced outcomes. This creates a closed-loop feedback in which stronger recommendations increase participation, participation increases congestion, congestion lowers allocation success, and realized allocations update adherence beliefs. We propose a microscopic stochastic model that couples (i) belief-driven participation, (ii) Poisson demand, (iii) uniform matching, and (iv) Beta--Bernoulli belief updates. Under a large-population closure, we derive a deterministic mean-field recursion for the population adherence state under platform actuation. For i.i.d. Poisson demand and constant recommendation intensity, we prove global well-posedness and invariance of the recursion, establish equilibrium existence, provide uniqueness conditions, and show global convergence in the regime where platform recommendations are no weaker than baseline participation. We then define steady-state adherence and throughput, characterize the induced performance frontier, and show that adherence and throughput cannot, in general, be simultaneously maximized under uniform time-invariant actuation. This yields a throughput-maximization problem with an adherence floor. Exploiting the monotone frontier structure, we show the optimal uniform time-invariant policy is the maximal feasible recommendation intensity and provide an efficient bisection-based algorithm.
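The Beta-Bernoulli belief update that closes the loop is conjugate and therefore trivial to implement: each observed outcome increments one of the two Beta parameters. The outcome sequence below is invented for illustration:

```python
# Beta(a, b) belief over a driver's adherence probability, updated by
# Bernoulli outcomes (1 = recommendation followed, 0 = ignored)
a, b = 1.0, 1.0                        # uniform prior
outcomes = [1, 1, 0, 1, 1, 1, 0, 1]
for y in outcomes:
    a += y                             # successes raise a
    b += 1 - y                         # failures raise b
posterior_mean = a / (a + b)           # updated adherence estimate
```

After six follows and two refusals the posterior mean is 7/10 = 0.7; in the mean-field model, this per-driver recursion is what aggregates into the deterministic population adherence dynamics.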


[69] 2604.00982

VisG AV-HuBERT: Viseme-Guided AV-HuBERT

Audio-Visual Speech Recognition (AVSR) systems nowadays integrate Large Language Model (LLM) decoders with transformer-based encoders, achieving state-of-the-art results. However, the relative contributions of improved language modelling versus enhanced audiovisual encoding remain unclear. We propose Viseme-Guided AV-HuBERT (VisG AV-HuBERT), a multi-task fine-tuning framework that incorporates auxiliary viseme classification to strengthen the model's reliance on visual articulatory features. By extending AV-HuBERT with a lightweight viseme prediction sub-network, this method explicitly guides the encoder to preserve visual speech information. Evaluated on LRS3, VisG AV-HuBERT achieves comparable or improved performance over the baseline AV-HuBERT, with notable gains under heavy noise conditions. WER reduces from 13.59% to 6.60% (51.4% relative improvement) at -10 dB Signal-to-Noise Ratio (SNR) for Speech noise. Deeper analysis reveals substantial reductions in substitution errors across noise types, demonstrating improved speech unit discrimination. Evaluation on LRS2 confirms generalization capability. Our results demonstrate that explicit viseme modelling enhances encoder representations, and provides a foundation for enhancing noise-robust AVSR through encoder-level improvements.


[70] 2604.00992

Tube-Based Safety for Anticipative Tracking in Multi-Agent Systems

A tube-based safety framework is presented for robust anticipative tracking in nonlinear Brunovsky multi-agent systems subject to bounded disturbances. The architecture establishes robust safety certificates for a feedforward-augmented ancillary control policy. By rendering the state-deviation dynamics independent of the agents' internal nonlinearities, the formulation strictly circumvents the restrictive Lipschitz-bound feasibility conditions otherwise required for robust stabilization. Consequently, this structure admits an explicit, closed-form robust positively invariant (RPI) tube radius that systematically attenuates the exponential control barrier function (eCBF) tightening margins, thereby mitigating constraint conservatism while preserving formal forward invariance. Within the distributed model predictive control (MPC) layer, mapping the local tube radii through the communication graph yields a closed-form global formation error bound formulated via the minimum singular value of the augmented Laplacian. Robust inter-agent safety is enforced with minimal communication overhead, requiring only a single scalar broadcast per neighbor at initialization. Numerical simulations confirm the framework's efficacy in safely navigating heterogeneous formations through cluttered environments.


[71] 2604.00995

Robust Multidimensional Chinese Remainder Theorem (MD-CRT) with Non-Diagonal Moduli and Multi-Stage Framework

The Chinese remainder theorem (CRT) provides an efficient way to reconstruct an integer from its remainders modulo several integer moduli, and has been widely applied in signal processing and information theory. Its multidimensional extension (MD-CRT) generalizes this principle to integer vectors and integer matrix moduli, enabling reconstruction in multidimensional signal processing scenarios. However, since matrices are generally non-commutative, the multidimensional extension introduces new theoretical and algorithmic challenges. When all matrix moduli are diagonal, the system is equivalent to applying the one-dimensional CRT independently along each dimension. This work first investigates whether non-diagonal (non-separable) moduli offer fundamental advantages over traditional diagonal ones. We show that under the same determinant constraint, non-diagonal matrices do not increase the dynamic range but yield more balanced and better-conditioned sampling patterns. More importantly, they generate lattices with longer shortest vectors, leading to higher robustness to vector remainder errors, compared to diagonal ones. To further improve the robustness, we develop a multi-stage robust MD-CRT framework that raises the robustness level without reducing the dynamic range. Due to the multidimensional nature and modulo matrix forms, it is not straightforward to extend the existing one-dimensional multi-stage robust CRT. In this paper, we obtain a new condition for matrix moduli, which can be easily checked, such that a multi-stage robust MD-CRT can be implemented. Both theoretical analysis and simulation results demonstrate that the proposed multi-stage robust MD-CRT achieves stronger error tolerance and more reliable reconstruction under erroneous vector remainders than the single-stage robust MD-CRT.
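For readers unfamiliar with the one-dimensional case being generalized, the classical CRT reconstruction can be sketched in a few lines of Python (a standard textbook construction, not the paper's robust MD-CRT; the function name is ours):

```python
from math import prod

def crt_reconstruct(remainders, moduli):
    """Recover x modulo prod(moduli) from the remainders x mod m_i,
    assuming the moduli are pairwise coprime."""
    M = prod(moduli)
    x = 0
    for r, m in zip(remainders, moduli):
        Mi = M // m
        # pow(Mi, -1, m) is the modular inverse of Mi mod m (Python 3.8+)
        x += r * Mi * pow(Mi, -1, m)
    return x % M

# x = 23: remainders mod (3, 5, 7) are (2, 3, 2)
assert crt_reconstruct([2, 3, 2], [3, 5, 7]) == 23
```

The MD-CRT replaces the integer moduli with integer matrix moduli, which is where the non-commutativity issues discussed above arise.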


[72] 2604.01056

A Functional Learning Approach for Team-Optimal Traffic Coordination

In this paper, we develop a kernel-based policy iteration functional learning framework for computing team-optimal strategies in traffic coordination problems. We consider a multi-agent discrete-time linear system with a cost function that combines quadratic regulation terms and nonlinear safety penalties. Building on the Hilbert space formulation of offline receding-horizon policy iteration, we seek approximate solutions within a reproducing kernel Hilbert space, where the policy improvement step is implemented via a discrete Fréchet derivative. We further study the model-free receding-horizon scenario, where the system dynamics are estimated using recursive least squares, followed by updating the policy using rolling online data. The proposed method is tested in signal-free intersection scenarios via both model-based and model-free simulations and validated in SUMO.
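The recursive least squares step used in the model-free scenario follows a standard gain-and-innovation form; a scalar-parameter sketch (illustrative only, with hypothetical names and none of the paper's multi-agent structure):

```python
def rls_update(theta, P, x, y, lam=1.0):
    """One recursive least squares step for the scalar model y ~ theta * x,
    with forgetting factor lam (lam = 1.0 means no forgetting)."""
    k = P * x / (lam + x * P * x)        # gain
    theta = theta + k * (y - theta * x)  # correct the estimate with the innovation
    P = (P - k * x * P) / lam            # shrink the (scalar) covariance
    return theta, P

# Fit y = 2x from three noiseless samples.
theta, P = 0.0, 100.0
for x, y in [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]:
    theta, P = rls_update(theta, P, x, y)
```

With noiseless data the estimate approaches the true slope after only a few updates; the paper applies the same recursion to matrix-valued system dynamics.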


[73] 2604.01060

Data-Model Co-Driven Continuous Channel Map Construction: A Perceptive Foundation for Embodied Intelligent Agents in 6G Networks

Future 6G networks will host massive numbers of embodied intelligent agents, which require real-time channel awareness over continuous space for autonomous decision-making. By pre-obtaining location-specific channel state information (CSI), a channel map can serve as a foundational world model for embodied intelligence to achieve wireless channel perception. However, acquiring CSI via measurements is costly, so in practice only sparse observations are available, leaving agents blind to channel conditions at unvisited locations. Meanwhile, purely model-driven channel maps can provide dense CSI but often yield unsatisfactory accuracy and robustness, while purely data-driven interpolation from sparse measurements is computationally prohibitive for real-time updates. To address these challenges, this paper proposes a data-model co-driven (DMcD) framework that performs a two-stage interpolation toward a space-time continuous channel map. First, a hybrid ray tracing and geometry-based channel model (H-RT/GBSM) is developed to capture dynamic scatterers, providing dense, time-variant channel properties that match measurement statistics as a physically consistent prior. Then, an inductive edge-conditioned graph neural network (InductE-GNN) fuses the prior with sparse measurements to perform real-time spatial interpolation, enabling rapid online adaptation without retraining and ensuring synchronization with the dynamic physical reality. Evaluations with measured datasets show that the proposed DMcD framework significantly outperforms data-only and model-only baselines, providing accurate and queryable channel information for embodied intelligent agents.


[74] 2604.01104

Maximizing Power Flexibility of Hybrid Energy Systems for Capacity Market

Hybrid Energy Systems (HES), integrating generation sources, energy storage, and controllable loads, are well-positioned to provide real-time grid flexibility. However, quantifying this maximum flexibility is challenging due to renewable generation uncertainty and the complexity of power allocation across multiple assets in real time. This paper presents a rule-based framework for characterizing HES flexibility and systematically allocating power among its constituent assets. The flexibility envelope defines the dynamic power boundary within which the HES can inject or absorb power without violating operational constraints. Shaped in real time by capacity bids, available solar generation, and power allocation protocol, it enables reliable and predictable HES participation in regulation markets. Depending on the operational objective, the framework supports both symmetric and asymmetric flexibility cases. Further, the proposed power-allocation rule is benchmarked against an optimal dispatch, providing a performance reference under realistic conditions. Finally, state of charge drift correction control is presented to ensure sustained battery operation and system reliability. This work, therefore, offers a rigorous and practical framework for integrating HES into capacity markets through effective flexibility characterization.


[75] 2604.01115

A Distributed SOS Program For Local Stability Analysis of Polynomial PDEs in the PIE Representation

It has recently been shown that the evolution of a state, described by a Partial Differential Equation (PDE), can be more conveniently represented as the evolution of the state's highest spatial derivative (the "fundamental state"), which lies in $L_2$ and has no boundary conditions (BCs) or continuity constraints. For linear PDEs, this yields a Partial Integral Equation (PIE) parametrized by Partial Integral (PI) operators mapping the fundamental state to the PDE state. In this paper, we show that for polynomial PDEs, the dynamics of the fundamental state can instead be compactly expressed as a distributed polynomial in the fundamental state, parametrized by a new tensor algebra of PI operators acting on the tensor product of the fundamental state. We further define an SOS parametrization of the distributed polynomial and use this to construct a distributed SOS program for testing local stability of polynomial PDEs.


[76] 2604.01120

Diff-VS: Efficient Audio-Aware Diffusion U-Net for Vocals Separation

While diffusion models are best known for their performance in generative tasks, they have also been successfully applied to many other tasks, including audio source separation. However, current generative approaches to music source separation often underperform on standard objective metrics. In this paper, we address this issue by introducing a novel generative vocal separation model based on the Elucidated Diffusion Model (EDM) framework. Our model processes complex short-time Fourier transform spectrograms and employs an improved U-Net architecture based on music-informed design choices. Our approach matches discriminative baselines on objective metrics and achieves perceptual quality comparable to state-of-the-art systems, as assessed by proxy subjective metrics. We hope these results encourage broader exploration of generative methods for music source separation.


[77] 2604.01122

Region-Adaptive Generative Compression with Spatially Varying Diffusion Models

Generative image codecs aim to optimize perceptual quality, producing realistic and detailed reconstructions. However, they often overlook a key property of human vision: our tendency to focus on particular aspects of a visual scene (e.g., salient objects) while giving less importance to other regions. An ideal perceptual codec should be able to exploit this property by allocating more representational capacity to perceptually important areas. To this end, we propose a region-adaptive diffusion-based image codec that supports non-uniform bit allocation within an image. We design a novel spatially varying diffusion model capable of denoising varying amounts of noise per pixel according to arbitrary importance maps. We further identify that these maps can serve as effective priors on the latent representation, and integrate them into our entropy model, improving rate-distortion performance. Built on these contributions, our spatially-adaptive diffusion-based codec outperforms state-of-the-art ROI-controllable baselines in both full-image and ROI-masked perceptual quality.


[78] 2604.01144

Schrödinger Bridges and Density Steering Problems for Gaussian Mixture Models in Discrete Time

In this work, we revisit the discrete-time Schrödinger Bridge (SB) and Density Steering (DS) problems for Gaussian mixture model (GMM) boundary distributions. Building on the existing literature, we construct a set of feasible Markovian policies that transport the initial distribution to the final distribution, and are expressed as mixtures of elementary component-to-component optimal policies. We then study the policy optimization within this feasible set in the context of discrete-time SBs and density-steering problems, respectively. We show that for minimum-effort density-steering problems, the proposed policy achieves the same control cost as existing approaches in the literature. For discrete-time SB problems, the proposed policy yields a cost smaller than or equal to that in the literature, resulting in a less conservative approximation. Finally, we study the continuous-time limit of our proposed discrete-time approach and show that it agrees with recently proposed approximations to the continuous-time SB for GMM boundary distributions. We illustrate this new result through two numerical examples.


[79] 2604.01156

Data-based Low-conservative Nonlinear Safe Control Learning

This paper develops a data-driven safe control framework for nonlinear discrete-time systems with parametric uncertainty and additive disturbances. The proposed approach constructs a data-consistent closed-loop representation that enables controller synthesis and safety certification directly from data. Unlike existing methods that treat unmodeled nonlinearities as global worst-case uncertainties using Lipschitz bounds, the proposed approach embeds nonlinear terms directly into the invariance conditions via a geometry-aware difference-of-convex formulation. This enables facet- and direction-specific convexification, avoiding both nonlinearity cancellation and the excessive conservatism induced by uniform global bounds. We further propose a vertex-dependent controller construction that enforces convexity and contractivity conditions locally on the active facets associated with each vertex, thereby enlarging the class of certifiable invariant sets. For systems subject to additive disturbances, disturbance effects are embedded directly into the verification conditions through optimized, geometry-dependent bounds, rather than via uniform margin inflation, yielding less conservative robust safety guarantees. As a result, the proposed methods can certify substantially larger safe sets, naturally accommodate joint state and input constraints, and provide data-driven safety guarantees. The simulation results show a significant improvement in both nonlinearity tolerance and the size of the certified safe set.


[80] 2604.01167

AdaLoRA-QAT: Adaptive Low-Rank and Quantization-Aware Segmentation

Chest X-ray (CXR) segmentation is an important step in computer-aided diagnosis, yet deploying large foundation models in clinical settings remains challenging due to computational constraints. We propose AdaLoRA-QAT, a two-stage fine-tuning framework that combines adaptive low-rank encoder adaptation with full quantization-aware training. Adaptive rank allocation improves parameter efficiency, while selective mixed-precision INT8 quantization preserves structural fidelity crucial for clinical reliability. Evaluated across large-scale CXR datasets, AdaLoRA-QAT achieves 95.6% Dice, matching full-precision SAM decoder fine-tuning while reducing trainable parameters by 16.6× and yielding 2.24× model compression. A Wilcoxon signed-rank test confirms that quantization does not significantly degrade segmentation accuracy. These results demonstrate that AdaLoRA-QAT effectively balances accuracy, efficiency, and structural trustworthiness, enabling compact and deployable foundation models for medical image segmentation. Code and pretrained models are available at: this https URL


[81] 2604.01173

Safe learning-based control via function-based uncertainty quantification

Uncertainty quantification is essential when deploying learning-based control methods in safety-critical systems. This is commonly realized by constructing uncertainty tubes that enclose the unknown function of interest, e.g., the reward and constraint functions or the underlying dynamics model, with high probability. However, existing approaches for uncertainty quantification typically rely on restrictive assumptions on the unknown function, such as known bounds on functional norms or Lipschitz constants, and struggle with discontinuities. In this paper, we model the unknown function as a random function from which independent and identically distributed realizations can be generated, and construct uncertainty tubes via the scenario approach that hold with high probability and rely solely on the sampled realizations. We integrate these uncertainty tubes into a safe Bayesian optimization algorithm, which we then use to safely tune control parameters on a real Furuta pendulum.


[82] 2604.01188

Learning Neural Network Controllers with Certified Robust Performance via Adversarial Training

Neural network (NN) controllers achieve strong empirical performance on nonlinear dynamical systems, yet deploying them in safety-critical settings requires robustness to disturbances and uncertainty. We present a method for jointly synthesizing NN controllers and dissipativity certificates that formally guarantee robust closed-loop performance using adversarial training, in which we use counterexamples to the robust dissipativity condition to guide training. Verification is done post-training using α,β-CROWN, a branch-and-bound-based method that enables direct analysis of the nonlinear dynamical system. The proposed method uses quadratic constraints (QCs) only to characterize non-parametric uncertainties. The method is tested in numerical experiments on maximizing the volume of the set on which a system is certified to be robustly dissipative. Our method certifies regions up to 78 times larger than the region certified by a linear matrix inequality-based approach that we derive for comparison.


[83] 2604.01198

Polynomial Constraints for Robustness Analysis of Nonlinear Systems

This paper presents a framework for abstracting uncertain or non-polynomial components of dynamical systems using polynomial constraints. This enables the application of polynomial-based analysis tools, such as sum-of-squares programming, to a broader class of non-polynomial systems. A numerical method for constructing these constraints is proposed. The relationship between polynomial constraints and existing integral quadratic constraints (IQCs) is investigated, providing transformations of IQCs into polynomial constraints. The effectiveness of polynomial constraints in characterizing nonlinearities is validated via numerical examples to compute inner estimates of the region of attraction for two systems.


[84] 2604.01211

Making Every Bit Count for $A$-Optimal State Estimation

We study the problem of controlling how a limited communication bandwidth budget is allocated across heterogeneously quantized sensor measurements. The performance criterion is the trace of the error covariance matrix of the linear minimum mean square error (LMMSE) state estimator, i.e., an $A$-optimal design criterion. Minimizing this criterion with a bit budget constraint yields a nonconvex optimization problem. We derive a formula that reduces each evaluation of the gradient to a single Cholesky factorization. This enables efficient optimization by both a projection-free Frank-Wolfe method (with a computable convergence certificate) and an interior point method with L-BFGS Hessian approximation over the problem's continuous relaxation. A largest remainder rounding procedure recovers integer bit allocations with a bound on the quality of the rounded solution. Numerical experiments in IEEE power grid test cases with up to 300 buses compare both solvers and demonstrate that the analytic gradient is the key computational enabler for both methods. Additionally, the heterogeneous bit allocation is compared to standard uniform bit allocation on the 500 bus IEEE power grid test case.
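The largest remainder rounding step mentioned above is the classical Hamilton apportionment rule; a minimal sketch, assuming the fractional allocation sums to the integer bit budget (the helper name is ours):

```python
def largest_remainder_round(x, budget):
    """Round a fractional bit allocation x (summing to budget) to integers
    that preserve the total, via the largest-remainder method."""
    floors = [int(v) for v in x]
    deficit = budget - sum(floors)
    # hand the leftover bits to the entries with the largest fractional parts
    order = sorted(range(len(x)), key=lambda i: x[i] - floors[i], reverse=True)
    for i in order[:deficit]:
        floors[i] += 1
    return floors

alloc = largest_remainder_round([2.6, 3.7, 1.7], 8)  # -> [2, 4, 2]
```

The paper additionally bounds how far such a rounded allocation can fall from the continuous optimum; this sketch only shows the rounding itself.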


[85] 2603.29042

An Empirical Recipe for Universal Phone Recognition

Phone recognition (PR) is a key enabler of multilingual and low-resource speech processing tasks, yet robust performance remains elusive. Highly performant English-focused models do not generalize across languages, while multilingual models underutilize pretrained representations. It also remains unclear how data scale, architecture, and training objective contribute to multilingual PR. We present PhoneticXEUS, a phone recognition model trained on large-scale multilingual data that achieves state-of-the-art performance on both multilingual (17.7% PFER) and accented English speech (10.6% PFER). Through controlled ablations with evaluations across 100+ languages under a unified scheme, we empirically establish our training recipe and quantify the impact of SSL representations, data scale, and loss objectives. In addition, we analyze error patterns across language families, accented speech, and articulatory features. All data and code are released openly.


[86] 2604.00061

Advancing Multi-Robot Networks via MLLM-Driven Sensing, Communication, and Computation: A Comprehensive Survey

Imagine advanced humanoid robots, powered by multimodal large language models (MLLMs), coordinating missions across industries like warehouse logistics, manufacturing, and safety rescue. While individual robots show local autonomy, realistic tasks demand coordination among multiple agents sharing vast streams of sensor data. Communication is indispensable, yet transmitting comprehensive data can overwhelm networks, especially when a system-level orchestrator or cloud-based MLLM fuses multimodal inputs for route planning or anomaly detection. These tasks are often initiated by high-level natural language instructions. This intent serves as a filter for resource optimization: by understanding the goal via MLLMs, the system can selectively activate relevant sensing modalities, dynamically allocate bandwidth, and determine computation placement. Thus, R2X is fundamentally an intent-to-resource orchestration problem where sensing, communication, and computation are jointly optimized to maximize task-level success under resource constraints. This survey examines how integrated design paves the way for multi-robot coordination under MLLM guidance. We review state-of-the-art sensing modalities, communication strategies, and computing approaches, highlighting how reasoning is split between on-device models and powerful edge/cloud servers. We present four end-to-end demonstrations (sense -> communicate -> compute -> act): (i) digital-twin warehouse navigation with predictive link context, (ii) mobility-driven proactive MCS control, (iii) a FollowMe robot with a semantic-sensing switch, and (iv) real-hardware open-vocabulary trash sorting via edge-assisted MLLM grounding. We emphasize system-level metrics -- payload, latency, and success -- to show why R2X orchestration outperforms purely on-device baselines.


[87] 2604.00067

Temporal Memory for Resource-Constrained Agents: Continual Learning via Stochastic Compress-Add-Smooth

An agent that operates sequentially must incorporate new experience without forgetting old experience, under a fixed memory budget. We propose a framework in which memory is not a parameter vector but a stochastic process: a Bridge Diffusion on a replay interval $[0,1]$, whose terminal marginal encodes the present and whose intermediate marginals encode the past. New experience is incorporated via a three-step Compress-Add-Smooth (CAS) recursion. We test the framework on the class of models with marginal probability densities modeled via Gaussian mixtures with a fixed number of components $K$ in $d$ dimensions; temporal complexity is controlled by a fixed number $L$ of piecewise-linear protocol segments whose nodes store Gaussian-mixture states. The entire recursion costs $O(LKd^2)$ flops per day -- no backpropagation, no stored data, no neural networks -- making it viable for controller-light hardware. Forgetting in this framework arises not from parameter interference but from lossy temporal compression: the re-approximation of a finer protocol by a coarser one under a fixed segment budget. We find that the retention half-life scales linearly as $a_{1/2}\approx c\,L$ with a constant $c>1$ that depends on the dynamics but not on the mixture complexity $K$, the dimension $d$, or the geometry of the target family. The constant $c$ admits an information-theoretic interpretation analogous to the Shannon channel capacity. The stochastic process underlying the bridge provides temporally coherent "movie" replay -- compressed narratives of the agent's history, demonstrated visually on an MNIST latent-space illustration. The framework provides a fully analytical "Ising model" of continual learning in which the mechanism, rate, and form of forgetting can be studied with mathematical precision.


[88] 2604.00320

Hierarchical Motion Planning and Control under Unknown Nonlinear Dynamics via Predicted Reachability

Autonomous motion planning under unknown nonlinear dynamics requires learning system properties while navigating toward a target. In this work, we develop a hierarchical planning-control framework that enables online motion synthesis with limited prior system knowledge. The state space is partitioned into polytopes, and the unknown nonlinear system is approximated by a piecewise-affine (PWA) model. The local affine models are identified once the agent enters the corresponding polytopes. To reduce computational complexity, we introduce a non-uniform adaptive state space partition strategy that refines the partition only in task-relevant regions. The resulting PWA system is abstracted into a directed weighted graph, whose edge existence is incrementally verified using reach control theory and predictive reachability conditions. Certified edges are weighted using provable time-to-reach bounds, while uncertain edges are assigned information-theoretic weights to guide exploration. The graph is updated online as new data becomes available, and high-level planning is performed by graph search, while low-level affine feedback controllers are synthesized to execute the plan. Furthermore, the conditions of classical reach control theory are often difficult to satisfy in underactuated settings. We therefore introduce relaxed reachability conditions to extend the framework to such systems. Simulations demonstrate effective exploration-exploitation trade-offs with formal reachability guarantees.


[89] 2604.00382

mmAnomaly: Leveraging Visual Context for Robust Anomaly Detection in the Non-Visual World with mmWave Radar

mmWave radar enables human sensing in non-visual scenarios (e.g., through clothing or certain types of walls) where traditional cameras fail due to occlusion or privacy limitations. However, robust anomaly detection with mmWave remains challenging, as signal reflections are influenced by material properties, clutter, and multipath interference, producing complex, non-Gaussian distortions. Existing methods lack contextual awareness and misclassify benign signal variations as anomalies. We present mmAnomaly, a multi-modal anomaly detection framework that combines mmWave radar with RGBD input to incorporate visual context. Our system extracts semantic cues, such as scene geometry and material properties, using a fast ResNet-based classifier, and uses a conditional latent diffusion model to synthesize the expected mmWave spectrum for the given visual context. A dual-input comparison module then identifies spatial deviations between real and generated spectra to localize anomalies. We evaluate mmAnomaly on two multi-modal datasets across three applications: concealed weapon localization, through-wall intruder localization, and through-wall fall localization. The system achieves up to 94% F1 score and sub-meter localization error, demonstrating robust generalization across clothing, occlusions, and cluttered environments. These results establish mmAnomaly as an accurate and interpretable framework for context-aware anomaly detection in mmWave sensing.


[90] 2604.00388

Gradient-Based Data Valuation Improves Curriculum Learning for Game-Theoretic Motion Planning

We demonstrate that gradient-based data valuation produces curriculum orderings that significantly outperform metadata-based heuristics for training game-theoretic motion planners. Specifically, we apply TracIn gradient-similarity scoring to GameFormer on the nuPlan benchmark and construct a curriculum that weights training scenarios by their estimated contribution to validation loss reduction. Across three random seeds, the TracIn-weighted curriculum achieves a mean planning ADE of $1.704\pm0.029$ m, significantly outperforming the metadata-based interaction-difficulty curriculum ($1.822\pm0.014$ m; paired $t$-test $p=0.021$, Cohen's $d_z=3.88$) while exhibiting lower variance than the uniform baseline ($1.772\pm0.134$ m). Our analysis reveals that TracIn scores and scenario metadata are nearly orthogonal (Spearman $\rho=-0.014$), indicating that gradient-based valuation captures training dynamics invisible to hand-crafted features. We further show that gradient-based curriculum weighting succeeds where hard data selection fails: TracIn-curated 20% subsets degrade performance by $2\times$, whereas full-data curriculum weighting with the same scores yields the best results. These findings establish gradient-based data valuation as a practical tool for improving sample efficiency in game-theoretic planning.


[91] 2604.00391

Behavioral Score Diffusion: Model-Free Trajectory Planning via Kernel-Based Score Estimation from Data

Diffusion-based trajectory optimization has emerged as a powerful planning paradigm, but existing methods require either learned score networks trained on large datasets or analytical dynamics models for score computation. We introduce Behavioral Score Diffusion (BSD), a training-free and model-free trajectory planner that computes the diffusion score function directly from a library of trajectory data via kernel-weighted estimation. At each denoising step, BSD retrieves relevant trajectories using a triple-kernel weighting scheme -- diffusion proximity, state context, and goal relevance -- and computes a Nadaraya-Watson estimate of the denoised trajectory. The diffusion noise schedule naturally controls kernel bandwidths, creating a multi-scale nonparametric regression: broad averaging of global behavioral patterns at high noise, fine-grained local interpolation at low noise. This coarse-to-fine structure handles nonlinear dynamics without linearization or parametric assumptions. Safety is preserved by applying shielded rollout on kernel-estimated state trajectories, identical to existing model-based approaches. We evaluate BSD on four robotic systems of increasing complexity (3D-6D state spaces) in a parking scenario. BSD with fixed bandwidth achieves 98.5% of the model-based baseline's average reward across systems while requiring no dynamics model, using only 1,000 pre-collected trajectories. BSD substantially outperforms nearest-neighbor retrieval (18-63% improvement), confirming that the diffusion denoising mechanism is essential for effective data-driven planning.
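The Nadaraya-Watson estimate at BSD's core is an ordinary kernel-weighted average; a one-dimensional Gaussian-kernel sketch (illustrative only; BSD applies a triple-kernel weighting over whole trajectories):

```python
from math import exp

def nadaraya_watson(query, points, values, bandwidth):
    """Kernel-weighted (Nadaraya-Watson) regression estimate at `query`,
    using a Gaussian kernel over the stored (point, value) pairs."""
    w = [exp(-((query - p) ** 2) / (2 * bandwidth ** 2)) for p in points]
    return sum(wi * v for wi, v in zip(w, values)) / sum(w)

# Symmetric neighbors contribute equally, so the estimate is their mean.
est = nadaraya_watson(1.0, [0.0, 2.0], [1.0, 3.0], 0.5)  # -> 2.0
```

Shrinking the bandwidth moves the estimate from a broad average toward local interpolation, which is the multi-scale behavior the noise schedule exploits.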


[92] 2604.00439

Reachability-Aware Time Scaling for Path Tracking

This paper studies tracking of collision-free waypoint paths produced by an offline planner for a planar double-integrator system with bounded speed and acceleration. Because sampling-based planners must route around obstacles, the resulting waypoint paths can contain sharp turns and high-curvature regions, so one-step reachability under acceleration limits becomes critical even when the path geometry is collision-free. We build on a pure-pursuit-style, reachability-guided quadratic-program (QP) tracker with a one-step acceleration margin. Offline, we evaluate this margin along a spline fitted to the waypoint path and update a scalar speed-scaling profile so that the required one-step acceleration remains below the available bound. Online, the same look-ahead tracking structure is used to track the scaled reference.
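A simplified version of the offline speed-scaling idea caps speed so that lateral acceleration v^2 * kappa stays below the available bound (a stand-in for the paper's one-step acceleration-margin rule; names and parameters are illustrative):

```python
from math import sqrt

def speed_limit_from_curvature(kappa, a_max, v_cap):
    """Largest speed satisfying both the speed bound v_cap and the
    lateral-acceleration constraint v^2 * kappa <= a_max."""
    if kappa <= 0.0:           # straight segment: only the speed bound binds
        return v_cap
    return min(v_cap, sqrt(a_max / kappa))

# Tight turn (kappa = 1 1/m) with a_max = 4 m/s^2 limits speed to 2 m/s.
v = speed_limit_from_curvature(1.0, 4.0, 10.0)  # -> 2.0
```

Evaluating such a limit along the fitted spline and scaling the reference speed accordingly mirrors the offline stage described above, though the paper's margin also accounts for tracking error.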


[93] 2604.00449

Convergence of Byzantine-Resilient Gradient Tracking via Probabilistic Edge Dropout

We study distributed optimization over networks with Byzantine agents that may send arbitrary adversarial messages. We propose Gradient Tracking with Probabilistic Edge Dropout (GT-PD), a stochastic gradient tracking method that preserves the convergence properties of gradient tracking under adversarial communication. GT-PD combines two complementary defense layers: a universal self-centered projection that clips each incoming message to a ball of radius $\tau$ around the receiving agent, and a fully decentralized probabilistic dropout rule driven by a dual-metric trust score in the decision and tracking channels. This design bounds adversarial perturbations while preserving the doubly stochastic mixing structure, a property often lost under robust aggregation in decentralized settings. Under complete Byzantine isolation ($p_b=0$), GT-PD converges linearly to a neighborhood determined solely by stochastic gradient variance. For partial isolation ($p_b>0$), we introduce Gradient Tracking with Probabilistic Edge Dropout and Leaky Integration (GT-PD-L), which uses a leaky integrator to control the accumulation of tracking errors caused by persistent perturbations and achieves linear convergence to a bounded neighborhood determined by the stochastic variance and the clipping-to-leak ratio. We further show that under two-tier dropout with $p_h=1$, isolating Byzantine agents introduces no additional variance into the honest consensus dynamics. Experiments on MNIST under Sign Flip, ALIE, and Inner Product Manipulation attacks show that GT-PD-L outperforms coordinate-wise trimmed mean by up to 4.3 percentage points under stealth attacks.
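The self-centered projection admits a direct implementation: each incoming message is clipped to the Euclidean ball of radius tau around the receiver's own state. A minimal sketch (list-based vectors for brevity; the function name is ours):

```python
def clip_to_ball(msg, center, tau):
    """Project msg onto the Euclidean ball of radius tau around center,
    leaving messages already inside the ball untouched."""
    diff = [m - c for m, c in zip(msg, center)]
    norm = sum(d * d for d in diff) ** 0.5
    if norm <= tau:
        return list(msg)
    scale = tau / norm  # shrink the deviation onto the ball's boundary
    return [c + d * scale for c, d in zip(center, diff)]

clipped = clip_to_ball([3.0, 4.0], [0.0, 0.0], 1.0)  # ~ [0.6, 0.8]
```

Because the clipped message stays within distance tau of the receiver, an adversary's influence per round is bounded regardless of what it sends.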


[94] 2604.00451

CASCADE: Cascaded Scoped Communication for Multi-Agent Re-planning in Disrupted Industrial Environments

Industrial disruption replanning demands multi-agent coordination under strict latency and communication budgets, where disruptions propagate through tightly coupled physical dependencies and rapidly invalidate baseline schedules and commitments. Existing coordination schemes often treat communication as either effectively free (broadcast-style escalation) or fixed in advance (hand-tuned neighborhoods), both of which are brittle once the disruption footprint extends beyond a local region. We present CASCADE, a budgeted replanning mechanism that makes communication scope explicit and auditable rather than fixed or implicit. Each agent maintains an explicit knowledge base, solves role-conditioned local decision problems to revise commitments, and coordinates through lightweight contract primitives whose footprint expands only when local validation indicates that the current scope is insufficient. This design separates a unified agent substrate (Knowledge Base / Decision Manager / Communication Manager) from a scoped interaction layer that controls who is contacted, how far coordination propagates, and when escalation is triggered under explicit budgets. We evaluate CASCADE on disrupted manufacturing and supply-chain settings using unified diagnostics intended to test a mechanism-design claim -- whether explicit scope control yields useful quality-latency-communication trade-offs and improved robustness under uncertainty -- rather than to provide a complete algorithmic ranking.


[95] 2604.00487

Competition and Cooperation of LLM Agents in Games

Large language model (LLM) agents are increasingly deployed in competitive multi-agent settings, raising fundamental questions about whether they converge to equilibria and how their strategic behavior can be characterized. In this paper, we study LLM agent interactions in two standard games: a network resource allocation game and a Cournot competition game. Rather than converging to Nash equilibria, we find that LLM agents tend to cooperate when given multi-round prompts and non-zero-sum context. Chain-of-thought analysis reveals that fairness reasoning is central to this behavior. We propose an analytical framework that captures the dynamics of LLM agent reasoning across rounds and explains these experimental findings.


[96] 2604.00553

Scenario theory for multi-criteria data-driven decision making

The scenario approach provides a powerful data-driven framework for designing solutions under uncertainty with rigorous probabilistic robustness guarantees. Existing theory, however, primarily addresses assessing robustness with respect to a single appropriateness criterion for the solution based on a dataset, whereas many practical applications - including multi-agent decision problems - require the simultaneous consideration of multiple criteria and the assessment of their robustness based on multiple datasets, one per criterion. This paper develops a general scenario theory for multi-criteria data-driven decision making. A central innovation lies in the collective treatment of the risks associated with violations of individual criteria, which yields substantially more accurate robustness certificates than those derived from a naive application of standard results. In turn, this approach enables a sharper quantification of the robustness level with which all criteria are simultaneously satisfied. The proposed framework applies broadly to multi-criteria data-driven decision problems, providing a principled, scalable, and theoretically grounded methodology for design under uncertainty.


[97] 2604.00573

Verifying Well-Posedness of Linear PDEs using Convex Optimization

Ensuring that a PDE model is well-posed is a necessary precursor to any form of analysis, control, or numerical simulation. Although the Lumer-Phillips theorem provides necessary and sufficient conditions for well-posedness of dissipative PDEs, these conditions must hold only on the domain of the PDE -- a proper subspace of $L_{2}$ -- which can make them difficult to verify in practice. In this paper, we show how the Lumer-Phillips conditions for PDEs can be tested more conveniently using the equivalent Partial Integral Equation (PIE) representation. This representation introduces a fundamental state in the Hilbert space $L_{2}$ and provides a bijection between this state space and the PDE domain. Using this bijection, we reformulate the Lumer-Phillips conditions as operator inequalities on $L_{2}$. We show how these inequalities can be tested using convex optimization methods, establishing a least upper bound on the exponential growth rate of solutions. We demonstrate the effectiveness of the proposed approach by verifying well-posedness for several classical examples of parabolic and hyperbolic PDEs.


[98] 2604.00674

Managing the Mismatch: The Role of Flexibility on the Path to a Carbon-Neutral Energy System

A rapid expansion of system flexibility is essential to integrate increasing shares of renewable energy into future energy systems. However, flexibility needs and technology-specific contributions to flexibility remain poorly quantified in energy system modelling. Existing methods are not widely applied, leaving key questions unanswered: which flexibility technologies are critical for climate neutrality, and what are the cost implications of alternative deployment strategies? To address this gap, we apply a correlation-based flexibility metric to a high-resolution, sector-coupled model of the German energy system, covering its transformation towards climate neutrality. For our default scenario, we find that daily flexibility needs increase by a factor of 3.7 between 2025 and 2045, driven primarily by the expansion of solar PV. By 2045, stationary batteries provide 38% of daily flexibility, while flexible electric vehicle charging contributes 30%. Systems with constrained flexibility increase system costs by 6.9%, electricity prices by 14 EUR/MWh and trigger 47% higher hydrogen and e-fuel imports compared to an unconstrained system in 2045. In contrast, scenarios with high shares of flexible electric vehicle charging, vehicle-to-grid, and industrial demand-side management achieve system cost reductions of 3.3%, while also reducing import dependence. Higher flexibility also reduces electricity price ranges, decreases average electricity prices by 3 EUR/MWh, and reduces backup capacity by 22% (22 GW). Overall, our results highlight the decisive role of specific flexibility technologies in achieving cost-efficient and energy-secure climate-neutral energy systems, providing quantitative guidance for policy and investment decisions.


[99] 2604.00688

OmniVoice: Towards Omnilingual Zero-Shot Text-to-Speech with Diffusion Language Models

We present OmniVoice, a massive multilingual zero-shot text-to-speech (TTS) model that scales to over 600 languages. At its core is a novel diffusion language model-style discrete non-autoregressive (NAR) architecture. Unlike conventional discrete NAR models that suffer from performance bottlenecks in complex two-stage (text-to-semantic-to-acoustic) pipelines, OmniVoice directly maps text to multi-codebook acoustic tokens. This simplified approach is facilitated by two key technical innovations: (1) a full-codebook random masking strategy for efficient training, and (2) initialization from a pre-trained LLM to ensure superior intelligibility. By leveraging a 581k-hour multilingual dataset curated entirely from open-source data, OmniVoice achieves the broadest language coverage to date and delivers state-of-the-art performance across Chinese, English, and diverse multilingual benchmarks. Our code and pre-trained models are publicly available at this https URL.


[100] 2604.00748

Optimal Sampling and Actuation Policies of a Markov Source over a Wireless Channel

This paper studies efficient data management and timely information dissemination for real-time monitoring of an $N$-state Markov process, enabling accurate state estimation and reliable actuation decisions. First, we analyze the Age of Incorrect Information (AoII) and derive closed-form expressions for its time average under several scheduling policies, including randomized stationary, change-aware randomized stationary, semantics-aware randomized stationary, and threshold-aware randomized stationary policies. We then formulate and solve constrained optimization problems to minimize the average AoII under a time-averaged sampling action constraint, and compare the resulting optimal sampling and transmission policies to identify the conditions under which each policy is most effective. We further show that directly using reconstructed states for actuation can degrade system performance, especially when the receiver is uncertain about the state estimate or when actuation is costly. To address this issue, we introduce a cost function, termed the Cost of Actions under Uncertainty (CoAU), which determines when the actuator should take correct actions and avoid incorrect ones when the receiver is uncertain about the reconstructed source state. We propose a randomized actuation policy and derive a closed-form expression for the probability of taking no incorrect action. Finally, we formulate an optimization problem to find the optimal randomized actuation policy that maximizes this probability. The results show that the resulting policy substantially reduces incorrect actuator actions.
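The time-averaged AoII under a randomized stationary sampling policy is easy to estimate by simulation. The sketch below assumes an idealized error-free channel and i.i.d. sampling with probability `p_sample` per slot; the function name and these simplifications are illustrative, not the paper's model in full:

```python
import numpy as np

def avg_aoii(P, p_sample, T=100000, seed=0):
    """Monte Carlo estimate of the time-averaged AoII for an N-state
    Markov source with transition matrix P, sampled with probability
    p_sample each slot over an assumed error-free channel."""
    rng = np.random.default_rng(seed)
    N = P.shape[0]
    x, xhat = 0, 0          # true source state, receiver-side estimate
    aoii, total = 0, 0
    for _ in range(T):
        x = rng.choice(N, p=P[x])     # source transition
        if rng.random() < p_sample:
            xhat = x                  # successful sample-and-transmit
        aoii = 0 if x == xhat else aoii + 1   # AoII grows while estimate is wrong
        total += aoii
    return total / T
```

Sweeping `p_sample` under a sampling-rate constraint reproduces the kind of AoII/cost trade-off the constrained optimization in the paper formalizes in closed form.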


[101] 2604.01081

ProOOD: Prototype-Guided Out-of-Distribution 3D Occupancy Prediction

3D semantic occupancy prediction is central to autonomous driving, yet current methods are vulnerable to long-tailed class bias and out-of-distribution (OOD) inputs, often overconfidently assigning anomalies to rare classes. We present ProOOD, a lightweight, plug-and-play method that couples prototype-guided refinement with training-free OOD scoring. ProOOD comprises (i) prototype-guided semantic imputation that fills occluded regions with class-consistent features, (ii) prototype-guided tail mining that strengthens rare-class representations to curb OOD absorption, and (iii) EchoOOD, which fuses local logit coherence with local and global prototype matching to produce reliable voxel-level OOD scores. Extensive experiments on five datasets demonstrate that ProOOD achieves state-of-the-art performance on both in-distribution 3D occupancy prediction and OOD detection. On SemanticKITTI, it surpasses baselines by +3.57% mIoU overall and +24.80% tail-class mIoU; on VAA-KITTI, it improves AuPRCr by +19.34 points, with consistent gains across benchmarks. These improvements yield more calibrated occupancy estimates and more reliable OOD detection in safety-critical urban driving. The source code is publicly available at this https URL.
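EchoOOD fuses logit coherence with local and global prototype matching; the sketch below shows only a simplified prototype-matching component (one minus the best cosine similarity to any class prototype), with the function name and the single-feature interface as illustrative assumptions:

```python
import numpy as np

def prototype_ood_score(feat, prototypes):
    """OOD score for one voxel feature: 1 minus the best cosine
    similarity to any class prototype (higher = more anomalous).
    A simplified stand-in for EchoOOD's fused score."""
    f = feat / np.linalg.norm(feat)
    P = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return 1.0 - np.max(P @ f)
```

Because the score needs only a feature vector and a fixed prototype table, it is training-free in the sense the abstract describes: no extra parameters are fit for OOD detection.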


[102] 2604.01134

VRUD: A Drone Dataset for Complex Vehicle-VRU Interactions within Mixed Traffic

The Operational Design Domain (ODD) of urban-oriented Level 4 (L4) autonomous driving, especially for autonomous robotaxis, confronts formidable challenges in complex urban mixed traffic environments. These challenges stem mainly from the high density of Vulnerable Road Users (VRUs) and their highly uncertain and unpredictable interaction behaviors. However, existing open-source datasets predominantly focus on structured scenarios such as highways or regulated intersections, leaving a critical gap in data representing chaotic, unstructured urban environments. To address this, this paper proposes an efficient, high-precision method for constructing drone-based datasets and establishes the Vehicle-Vulnerable Road User Interaction Dataset (VRUD), as illustrated in Figure 1. Distinct from prior works, VRUD is collected from typical "Urban Villages" in Shenzhen, characterized by loose traffic supervision and extreme occlusion. The dataset comprises 4 hours of 4K/30Hz recording, containing 11,479 VRU trajectories and 1,939 vehicle trajectories. A key characteristic of VRUD is its composition: VRUs account for about 87% of all traffic participants, significantly exceeding the proportions in existing benchmarks. Furthermore, unlike datasets that only provide raw trajectories, we extracted 4,002 multi-agent interaction scenarios based on a novel Vector Time to Collision (VTTC) threshold, supported by standard OpenDRIVE HD maps. This study provides valuable, rare edge-case resources for enhancing the safety performance of ADS in complex, unstructured urban environments. To facilitate further research, we have made the VRUD dataset open-source at: this https URL.


[103] 2604.01141

Looking into a Pixel by Nonlinear Unmixing -- A Generative Approach

Due to the large footprint of pixels in remote sensing imagery, hyperspectral unmixing (HU) has become an important and necessary procedure in hyperspectral image analysis. Traditional HU methods rely on a prior spectral mixing model, especially for nonlinear mixtures, which has largely limited the performance and generalization capacity of the unmixing approach. In this paper, we address the challenging problem of hyperspectral nonlinear unmixing (HNU) without explicit knowledge of the mixing model. Inspired by the principle of generative models, which can generate images from the same distribution as the training images without knowing the exact probability distribution function, we develop an invertible mixing-unmixing process via a bi-directional GAN framework, constrained by both the cycle consistency and the linkage between linear and nonlinear mixtures. The combination of cycle consistency and linear linkage provides powerful constraints without requiring an explicit mixing model. We refer to the proposed approach as the linearly-constrained CycleGAN unmixing net, or LCGU net. Experimental results indicate that the proposed LCGU net exhibits stable and competitive performance across different datasets compared with other state-of-the-art model-based HNU methods.


[104] 2604.01149

Spectral Decomposition of Discrete-Time Controllability Gramian and Its Inverse via System Eigenvalues

This paper develops a closed-form spectral decomposition framework for the Gramian matrices of discrete-time linear dynamical systems. The main results provide explicit decompositions of the discrete-time controllability Gramian and its inverse in terms of the eigenvalues of the dynamics matrix, yielding a mode-resolved representation of these matrices. In contrast to the more common use of aggregate Gramian characteristics, such as eigenvalues, singular values, determinants, and trace-based metrics, the proposed approach describes the internal structure of the Gramian itself through contributions associated with individual modes and their pairwise combinations. The framework is extended further to the solution of the discrete-time Lyapunov difference equation, placing the obtained formulas in a broader context relevant to the analysis and computation of time-varying and nonlinear systems. In addition, the decomposition is generalized to systems whose dynamics matrix has multiple eigenvalues, enabling a closed-form estimation of the effects of resonant interactions between eigenmodes. The proposed results provide a structural tool for the analysis of controllability, observability and stability in discrete-time systems and complement existing Gramian-based methods used in model reduction, estimation, actuator and sensor selection, and energy-aware control. Beyond their theoretical interest, the derived decompositions may support the development of improved computational procedures and more informative performance criteria for a range of discrete-time control problems.
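For a diagonalizable, Schur-stable dynamics matrix the mode-resolved decomposition has a compact closed form: with $A = V\Lambda V^{-1}$ and $G = (V^{-1}B)(V^{-1}B)^{H}$, the infinite-horizon Gramian is $W = VMV^{H}$ with $M_{ij} = G_{ij}/(1-\lambda_i\bar{\lambda}_j)$, so each entry of $M$ is the contribution of one eigenvalue pair. The sketch below checks this against a standard Lyapunov solver; the function name and the infinite-horizon specialization are illustrative assumptions:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def gramian_spectral(A, B):
    """Infinite-horizon controllability Gramian of x+ = A x + B u,
    assembled from pairwise eigenvalue contributions of A
    (A assumed diagonalizable and Schur stable)."""
    lam, V = np.linalg.eig(A)
    G = np.linalg.solve(V, B)              # V^{-1} B
    G = G @ G.conj().T                     # modal input correlation
    M = G / (1.0 - np.outer(lam, lam.conj()))   # mode-pair contributions
    return np.real(V @ M @ V.conj().T)

A = np.array([[0.5, 0.2], [0.0, 0.3]])
B = np.array([[1.0], [0.5]])
W = gramian_spectral(A, B)
W_ref = solve_discrete_lyapunov(A, B @ B.T)   # reference: A W A^T - W = -B B^T
```

The matrix `M` is exactly the mode-resolved internal structure the abstract refers to: diagonal entries are individual-mode energies, off-diagonal entries are pairwise mode interactions.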


[105] 2301.05351

Data-driven Moving Horizon Estimation for Angular Velocity of Space Noncooperative Target in Eddy Current De-tumbling Mission

Angular velocity estimation is critical for eddy current de-tumbling of noncooperative space targets. However, the unknown model of the noncooperative target and the scarcity of observation data challenge model-based estimation methods. In this paper, a Data-driven Moving Horizon Estimation method is proposed to estimate the angular velocity of the noncooperative target with de-tumbling torque. In this method, model-free state estimation of the angular velocity can be achieved using only one historical trajectory that satisfies the rank condition. With local linear approximation, the Willems fundamental lemma is extended to nonlinear autonomous systems, and the rank condition for the historical trajectory data is deduced. Then, a data-driven moving horizon estimation algorithm based on the M-step Lyapunov function is designed, and the time-discounted robust stability of the algorithm is given. To illustrate the effectiveness of the proposed algorithm, experiments and simulations are performed to estimate the angular velocity in eddy current de-tumbling with only de-tumbling torque measurement.
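In the LTI setting, the rank condition behind the Willems fundamental lemma is a property of a block Hankel matrix built from a single recorded trajectory: persistency of excitation of order L requires the depth-L input Hankel matrix to have full row rank. A minimal sketch (function names are illustrative; the paper's nonlinear extension adds local linearization on top of this):

```python
import numpy as np

def block_hankel(w, L):
    """Depth-L block Hankel matrix of a trajectory w of shape (T, m)."""
    T, m = w.shape
    cols = T - L + 1
    H = np.empty((L * m, cols))
    for i in range(L):
        H[i * m:(i + 1) * m, :] = w[i:i + cols].T
    return H

def persistently_exciting(u, L):
    """Rank condition: H_L(u) must have full row rank L * m."""
    H = block_hankel(u, L)
    return np.linalg.matrix_rank(H) == H.shape[0]

rng = np.random.default_rng(0)
u = rng.standard_normal((50, 1))   # one sufficiently rich recorded input
pe = persistently_exciting(u, 5)   # True for generic excitation
```

A constant trajectory fails the check, which is exactly why a single but sufficiently rich historical trajectory is needed.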


[106] 2410.09236

Enhancing Infant Crying Detection with Gradient Boosting for Improved Emotional and Mental Health Diagnostics

Infant crying can serve as a crucial indicator of various physiological and emotional states. This paper introduces a comprehensive approach to detecting infant cries within audio data. We integrate Wav2Vec with traditional audio features and employ Gradient Boosting Machines for cry classification. We validate our approach on a real-world dataset, demonstrating significant performance improvements over existing methods.


[107] 2502.19977

Convergence Guarantees of Model-free Policy Gradient Methods for LQR with Stochastic Data

Policy gradient (PG) methods are the backbone of many reinforcement learning algorithms due to their good performance in policy optimization problems. As a gradient-based approach, PG methods typically rely on knowledge of the system dynamics. If this is not available, trajectory data can be utilized to approximate first-order information. When the data are noisy, gradient estimates become inaccurate, and a study that investigates uncertainty estimation and the analysis of its propagation through the algorithm is currently missing. To address this, our work focuses on the Linear Quadratic Regulator (LQR) problem for systems subject to additive stochastic noise. After briefly summarizing the state of the art for cases with a known model, we focus on scenarios where the system dynamics are unknown, and approximate gradient information is obtained using zeroth-order optimization techniques. We analyze the theoretical properties by computing the error in the estimated gradient and examining how this error affects the convergence of PG algorithms. Additionally, we provide global convergence guarantees for various versions of PG methods, including those employing adaptive step sizes and variance reduction techniques, which help increase the convergence rate and reduce sample complexity. This study contributes to characterizing the robustness of model-free PG methods, aiming to identify their limitations in the presence of stochastic noise and proposing improvements to enhance their applicability.
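The zeroth-order step referred to above is commonly a two-point estimator: perturb along a random unit direction, evaluate the cost twice, and average. The sketch below uses a simple quadratic surrogate in place of the actual LQR cost; the function name and parameters are assumptions for illustration only:

```python
import numpy as np

def zo_grad(f, x, r=1e-3, n_dirs=200, seed=0):
    """Two-point zeroth-order gradient estimate of f at x, averaged
    over n_dirs uniformly random unit directions on the sphere."""
    rng = np.random.default_rng(seed)
    d = x.size
    g = np.zeros(d)
    for _ in range(n_dirs):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)             # uniform direction on the sphere
        g += (f(x + r * u) - f(x - r * u)) / (2 * r) * d * u
    return g / n_dirs

# quadratic surrogate standing in for the LQR cost
Q = np.diag([2.0, 1.0])
f = lambda x: x @ Q @ x
x0 = np.array([1.0, -1.0])
g_est = zo_grad(f, x0, n_dirs=5000)
g_true = 2 * Q @ x0                        # analytic gradient for comparison
```

For a quadratic the two-point difference is exact in the smoothing radius r, so the remaining error is purely direction-sampling variance, which is precisely the quantity whose propagation through PG iterations the paper analyzes.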


[108] 2505.04959

MoRe-3DGSMR: Motion-resolved reconstruction framework for free-breathing pulmonary MRI based on 3D Gaussian representation

This study presents an unsupervised, motion-resolved reconstruction framework for high-resolution, free-breathing pulmonary magnetic resonance imaging (MRI), utilizing a three-dimensional Gaussian representation (3DGS). The proposed method leverages 3DGS to address the challenges of motion-resolved 3D isotropic pulmonary MRI reconstruction by enabling data smoothing between voxels for continuous spatial representation. Pulmonary MRI data acquisition is performed using a golden-angle radial sampling trajectory, with respiratory motion signals extracted from the center of k-space in each radial spoke. Based on the estimated motion signal, the k-space data is sorted into multiple respiratory phases. A 3DGS framework is then applied to reconstruct a reference image volume from the first motion state. Subsequently, a patient-specific convolutional neural network is trained to estimate the deformation vector fields (DVFs), which are used to generate the remaining motion states through spatial transformation of the reference volume. The proposed reconstruction pipeline is evaluated on six datasets from six subjects and benchmarked against three state-of-the-art reconstruction methods. The experimental findings demonstrate that the proposed reconstruction framework effectively reconstructs high-resolution, motion-resolved pulmonary MR images. Compared with existing approaches, it achieves superior image quality, reflected by higher signal-to-noise ratio and contrast-to-noise ratio. The proposed unsupervised 3DGS-based reconstruction method enables accurate motion-resolved pulmonary MRI with isotropic spatial resolution. Its superior performance in image quality metrics over state-of-the-art methods highlights its potential as a robust solution for clinical pulmonary MR imaging.


[109] 2505.19225

Unified Medical Image Tokenizer for Autoregressive Synthesis and Understanding

Autoregressive modeling has driven major advances in multimodal AI, yet its application to medical imaging remains constrained by the absence of a unified image tokenizer that simultaneously preserves fine-grained anatomical structures and rich clinical semantics across heterogeneous modalities. Existing approaches jointly optimize image reconstruction and textual semantic objectives, relying on large-scale image-caption pairs, and are prone to gradient interference. This is ill-suited for the medical domain, where paired data are scarce and abundant unpaired images remain unexploited. This work identifies these issues in building unified medical image tokenizers, and introduces a principled two-stage training framework using visual representation as a bridge to address them. The proposed visual representation alignment stage enables the utilization of large-scale unpaired medical images to ensure reconstruction fidelity and establish foundational semantics, alleviating the interference and better preparing for the second stage, where fine-grained textual semantics are injected using image-text pairs. The resulting tokenizer, MedITok, is trained on over 33 million medical images spanning 9 modalities and 2 million image-text pairs. MedITok achieves state-of-the-art performance on 30+ benchmarks spanning 9 imaging modalities and 4 task families. It further enables autoregressive modeling for diagnostic and generative applications, serving as a scalable component for future multimodal models with unified synthesis and understanding capabilities in the medical domain. Project page: this https URL


[110] 2507.16426

Derivative-Agnostic Inference of Nonlinear Hybrid Systems

This paper addresses the problem of inferring a hybrid automaton from a set of input-output traces of a hybrid system exhibiting discrete mode switching between continuously evolving dynamics. Existing approaches mainly adopt a derivative-based method where (i) the occurrence of mode switching is determined by a drastic variation in derivatives and (ii) the clustering of trace segments relies on signal similarity -- both subject to user-supplied thresholds. We present a derivative-agnostic approach, named Dainarx, to infer nonlinear hybrid systems where the dynamics are captured by nonlinear autoregressive exogenous (NARX) models. Dainarx employs NARX models as a unified, threshold-free representation for both mode-switching detection and trace-segment clustering. We show that Dainarx suffices to learn models that closely approximate a general class of hybrid systems featuring high-order nonlinear dynamics with exogenous inputs, nonlinear guard conditions, and linear resets. Experimental results on a collection of benchmarks indicate that our approach can effectively and efficiently infer nontrivial hybrid automata with high-order dynamics, yielding significantly more accurate approximations than state-of-the-art techniques.
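The core fitting step of any (N)ARX-based pipeline is an ordinary least-squares regression of each output sample on lagged outputs and inputs. The sketch below fits a linear ARX model for brevity (Dainarx uses nonlinear regressors, which would simply add columns to the regressor matrix); the function name and interface are illustrative assumptions:

```python
import numpy as np

def fit_narx(y, u, order=2):
    """Least-squares fit of y[k] = sum_i a_i y[k-i] + b_i u[k-i].
    Returns the coefficient vector and the max absolute residual,
    which can serve as a model-fit (mode-membership) indicator."""
    rows = []
    for k in range(order, len(y)):
        rows.append(np.concatenate([y[k - order:k][::-1],   # y[k-1], ..., y[k-order]
                                    u[k - order:k][::-1]])) # u[k-1], ..., u[k-order]
    Phi = np.array(rows)
    theta, *_ = np.linalg.lstsq(Phi, y[order:], rcond=None)
    resid = y[order:] - Phi @ theta
    return theta, np.max(np.abs(resid))

# demo: recover a known ARX law from one noise-free trajectory
rng = np.random.default_rng(0)
u = rng.standard_normal(200)
y = np.zeros(200)
for k in range(2, 200):
    y[k] = 0.5 * y[k - 1] - 0.2 * y[k - 2] + 0.3 * u[k - 1]
theta, max_resid = fit_narx(y, u, order=2)
```

The residual of such a fit is what makes the approach threshold-free in spirit: a segment belongs to a mode when one model explains it with near-zero residual, rather than when a derivative exceeds a user-chosen bound.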


[111] 2507.16962

Harmonization in Magnetic Resonance Imaging: A Survey of Acquisition, Image-level, and Feature-level Methods

Magnetic resonance imaging (MRI) has greatly advanced neuroscience research and clinical diagnostics. However, imaging data collected across different scanners, acquisition protocols, or imaging sites often exhibit substantial heterogeneity, known as batch effects or site effects. These non-biological sources of variability can obscure true biological signals, reduce reproducibility and statistical power, and severely impair the generalizability of learning-based models across datasets. Image harmonization is grounded in the central hypothesis that site-related biases can be eliminated or mitigated while preserving meaningful biological information, thereby improving data comparability and consistency. This review provides a comprehensive overview of key concepts, methodological advances, publicly available datasets, and evaluation metrics in the field of MRI harmonization. We systematically cover the full imaging pipeline and categorize harmonization approaches into prospective acquisition and reconstruction, retrospective image-level and feature-level methods, and traveling-subject-based techniques. By synthesizing existing methods and evidence, we revisit the central hypothesis of image harmonization and show that, although site invariance can be achieved with current techniques, further evaluation is required to verify the preservation of biological information. To this end, we summarize the remaining challenges and highlight key directions for future research, including the need for standardized validation benchmarks, improved evaluation strategies, and tighter integration of harmonization methods across the imaging pipeline.


[112] 2508.15860

Robust Residual Finite Scalar Quantization for Neural Compression

Finite Scalar Quantization (FSQ) offers simplified training but suffers from residual magnitude decay in multi-stage settings, where subsequent stages receive exponentially weaker signals. We propose Robust Residual Finite Scalar Quantization (RFSQ), addressing this fundamental limitation through two novel conditioning strategies: learnable scaling factors and invertible layer normalization. Our experiments across audio and image modalities demonstrate RFSQ's effectiveness and generalizability. In audio reconstruction at 24 bits/frame, RFSQ-LayerNorm achieves 3.646 DNSMOS, a 3.6% improvement over state-of-the-art RVQ (3.518). On ImageNet, RFSQ achieves 0.102 L1 loss and 0.100 perceptual loss, with LayerNorm providing 9.7% L1 improvement and 17.4% perceptual improvement over unconditioned variants. The LayerNorm strategy consistently outperforms alternatives by maintaining normalized input statistics across stages, effectively preventing exponential magnitude decay that limits naive residual approaches. RFSQ combines FSQ's simplicity with multi-stage quantization's representational power, establishing a new standard for neural compression across diverse modalities.
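The residual magnitude decay the abstract describes, and the effect of per-stage conditioning, can be reproduced in a few lines. The sketch below uses a simple max-based rescaling as a stand-in for RFSQ's learnable scaling factors (not the paper's exact method); grid size and names are illustrative assumptions:

```python
import numpy as np

def fsq(x, levels=5, lim=1.0):
    """Finite scalar quantization: round each coordinate to one of
    `levels` uniform points in [-lim, lim]."""
    step = 2 * lim / (levels - 1)
    half = (levels - 1) // 2
    return np.clip(np.round(x / step), -half, half) * step

def residual_fsq(x, n_stages=3, scale=True):
    """Multi-stage residual FSQ. With scale=False, later residuals fall
    below the grid resolution and stages 2+ contribute nothing; with
    scale=True, each stage's residual is renormalized to the grid range."""
    recon = np.zeros_like(x)
    res = x.copy()
    for _ in range(n_stages):
        s = np.max(np.abs(res)) + 1e-12 if scale else 1.0
        q = s * fsq(res / s)
        recon += q
        res -= q
    return recon

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 1000)
err_scaled = np.mean(np.abs(x - residual_fsq(x, scale=True)))
err_plain = np.mean(np.abs(x - residual_fsq(x, scale=False)))
```

In the unscaled variant every residual after stage one sits inside the dead zone of the fixed grid, which is precisely the exponential-decay failure mode RFSQ's conditioning strategies are designed to prevent.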


[113] 2509.03311

Credible Uncertainty Quantification under Noise and System Model Mismatch

State estimators often provide self-assessed uncertainty metrics, such as covariance matrices, whose credibility is critical for downstream tasks. However, these self-assessments can be misleading due to underlying modeling violations like noise model mismatch (NMM) or system model misspecification (SMM). This letter addresses this problem by developing a unified, multi-metric framework that integrates noncredibility index (NCI), negative log-likelihood (NLL), and energy score (ES) metrics, featuring an empirical location test (ELT) to detect system model bias and a directional probing technique that uses the metrics' asymmetric sensitivities to distinguish NMM from SMM. Monte Carlo simulations reveal that the proposed method achieves excellent diagnosis accuracy (80-100%) and significantly outperforms single-metric diagnosis methods. The effectiveness of the proposed method is further validated on a real-world UWB positioning dataset. This framework provides a practical tool for turning patterns of credibility indicators into actionable diagnoses of model deficiencies.


[114] 2509.19928

Measuring Prosody Diversity in Zero-Shot TTS: A New Metric, Benchmark, and Exploration

Prosody diversity is essential for achieving naturalness and expressiveness in zero-shot text-to-speech (TTS). However, frequently used acoustic metrics capture only partial views of prosodic variation and correlate poorly with human perception, leaving the problem of reliably quantifying prosody diversity underexplored. To bridge this gap, we introduce ProsodyEval, a prosody diversity assessment dataset that provides Prosody Mean Opinion Score (PMOS) alongside conventional acoustic metrics. ProsodyEval comprises 1000 speech samples derived from 7 mainstream TTS systems, with 2000 human ratings. Building on this, we propose the Discretized Speech Weighted Edit Distance (DS-WED), a new objective diversity metric that quantifies prosodic variation via weighted edit distance over semantic tokens. Experiments on ProsodyEval show that DS-WED achieves substantially higher correlation with human judgments than existing acoustic metrics, while remaining highly robust in speech tokenization from HuBERT and WavLM. Leveraging DS-WED, we benchmark state-of-the-art open-source TTS systems on LibriSpeech test-clean and Seed-TTS test-en, and further explorations uncover several factors that influence prosody diversity, including generative modeling paradigms, duration control, and reinforcement learning. Moreover, we find that current large audio language models (LALMs) remain limited in capturing prosodic variations. Audio samples are available at this https URL.
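The backbone of DS-WED is a weighted edit distance over discrete token sequences. The sketch below is the standard dynamic program with scalar per-operation weights standing in for DS-WED's weighting scheme (an assumption of this sketch, not the paper's exact formulation):

```python
def weighted_edit_distance(a, b, w_sub=1.0, w_ins=1.0, w_del=1.0):
    """Weighted edit distance between two token sequences a and b,
    via the classic O(len(a) * len(b)) dynamic program."""
    n, m = len(a), len(b)
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = D[i - 1][0] + w_del          # delete all of a[:i]
    for j in range(1, m + 1):
        D[0][j] = D[0][j - 1] + w_ins          # insert all of b[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0.0 if a[i - 1] == b[j - 1] else w_sub
            D[i][j] = min(D[i - 1][j - 1] + cost,   # match / substitute
                          D[i - 1][j] + w_del,      # delete a[i-1]
                          D[i][j - 1] + w_ins)      # insert b[j-1]
    return D[n][m]
```

Applied to HuBERT or WavLM token sequences of two renditions of the same text, a larger distance indicates greater prosodic divergence between them.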


[115] 2510.22015

Motion Planning with Precedence Specifications via Augmented Graphs of Convex Sets

We present an algorithm for planning trajectories that avoid obstacles and satisfy key-door precedence specifications expressed with a fragment of signal temporal logic. Our method includes a novel exact convex partitioning of the obstacle-free space that encodes connectivity among convex free space sets, key sets, and door sets. We then construct an augmented graph of convex sets that exactly encodes the key-door precedence specifications. By solving a shortest path problem in this augmented graph of convex sets, our pipeline provides an exact solution up to a finite parameterization of the trajectory. To illustrate the effectiveness of our approach, we present a method to generate key-door mazes that provide challenging problem instances, and we perform numerical experiments to evaluate the proposed pipeline. Our pipeline is faster by several orders of magnitude than recent state-of-the-art methods that use general purpose temporal logic tools.


[116] 2510.22514

Robust Multi-Agent Safety via Tube-Based Tightened Exponential Barrier Functions

This paper presents a constructive framework for synthesizing provably safe controllers for nonlinear multi-agent systems subject to bounded disturbances. The methodology applies to systems representable in Brunovsky canonical form, accommodating arbitrary-order dynamics in multi-dimensional spaces. The central contribution is a method of constraint tightening that formally couples robust error feedback with nominal trajectory planning. The key insight is that the design of an ancillary feedback law, which confines state errors to a robust positively invariant (RPI) tube, simultaneously provides the exact information needed to ensure the safety of the nominal plan. Specifically, the geometry of the resulting RPI tube is leveraged via its support function to derive state-dependent safety margins. These margins are then used to systematically tighten the high relative-degree exponential control barrier function (eCBF) constraints imposed on the nominal planner. This integrated synthesis guarantees that any nominal trajectory satisfying the tightened constraints corresponds to a provably safe trajectory for the true, disturbed system. We demonstrate the practical utility of this formal synthesis method by implementing the planner within a distributed Model Predictive Control (MPC) scheme, which optimizes performance while inheriting the robust safety guarantees.


[117] 2511.09609

TempRetinex: Retinex-based Unsupervised Enhancement for Low-light Video Under Diverse Lighting Conditions

The acquisition of paired low-light video sequences remains challenging due to issues associated with poor temporal consistency, varying illumination characteristics and camera parameters. This has driven significant interest in unsupervised low-light enhancement approaches. In this context, we propose TempRetinex, an unsupervised Retinex-based video enhancement framework exploiting inter-frame correlations. We introduce Brightness Consistency Preprocessing (BCP) that explicitly aligns intensity distributions across exposures. BCP is shown to significantly improve model robustness to diverse lighting scenarios. Moreover, we propose a multiscale temporal consistency-aware loss and an occlusion-aware masking technique to enforce similarity between consecutive frames. We further incorporate a Reverse Inference (RI) strategy to refine temporally unstable frames and a Self-Ensemble (SE) mechanism to boost denoising across diverse textures. Experiments demonstrate that TempRetinex achieves state-of-the-art performance in perceptual quality.


[118] 2511.19447

A model of the Unity High Definition Render Pipeline, with applications to flat-panel and head-mounted display characterization

Game engines such as Unity and Unreal Engine have become popular tools for creating perceptual and behavioral experiments in complex, interactive environments. They are often used with flat-panel displays, and also with head-mounted displays. Here I describe and test a mathematical model of luminance and color in Unity's High Definition Render Pipeline (HDRP). I show that the HDRP has several non-obvious features, such as nonlinearities applied to material properties and rendered values, that must be taken into account in order to show well-controlled stimuli. I also show how the HDRP can be configured to display gamma-corrected luminance and color, and I provide software to create the specialized files needed for gamma correction.


[119] 2512.21937

Integrating Low-Altitude SAR Imaging into UAV Data Backhaul

Synthetic aperture radar (SAR) on unmanned aerial vehicles (UAVs) enables high-resolution sensing in low-altitude wireless networks, while requiring reliable uplink data backhaul to ground base stations under dynamic channel conditions. Conventional orthogonal frequency division multiplexing (OFDM)-based SAR systems rely on pilot or deterministic signaling, which occupies only a small fraction of the available time-frequency (TF) resources and limits imaging performance. This paper develops a data-aided OFDM-SAR imaging framework that reuses uplink communication data symbols for sensing, thereby exploiting the dominant TF resources of the UAV backhaul link. However, the randomness of data symbols disrupts the coherent structure required for SAR imaging, especially in highly dynamic channels with strong TF coupling, leading to severe degradation in range-Doppler focusing. To address this issue, we establish a unified TF domain filtering framework to suppress data-induced randomness and recover an equivalent deterministic imaging channel. Within this framework, reciprocal, matched, and Wiener filtering are interpreted under a common formulation, enabling a systematic characterization of their impact on imaging performance. A normalized mean square error (NMSE) metric of a reference point target's profile is further adopted to quantify the joint effects of randomness-induced distortion and noise amplification. Simulation results based on 5G NR parameters show that the proposed data-aided scheme significantly outperforms pilot-only approaches by leveraging uplink data resources, demonstrating that effective TF-domain filtering is essential to ensure high-resolution imaging in dynamic UAV channels.
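The three TF-domain filters named in the abstract can be sketched per TF resource element. The toy below uses the generic reciprocal/matched/Wiener estimator forms on a single OFDM symbol with BPSK data; it is an illustrative reconstruction, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64                                            # TF resource elements in one OFDM symbol
h = rng.normal(size=N) + 1j * rng.normal(size=N)  # unknown TF-domain channel
x = (2 * rng.integers(0, 2, N) - 1).astype(complex)  # random BPSK data symbols
sigma2 = 0.1                                      # noise variance
noise = np.sqrt(sigma2 / 2) * (rng.normal(size=N) + 1j * rng.normal(size=N))
y = h * x + noise                                 # received uplink signal

# Three filters that strip the data randomness to expose a deterministic channel:
h_reciprocal = y / x                 # exact inversion; amplifies noise where |x| is small
h_matched = y * np.conj(x)           # noise-robust but biased by |x|^2
h_wiener = y * np.conj(x) / (np.abs(x) ** 2 + sigma2)  # trades bias vs. noise amplification
```

With constant-modulus data the reciprocal and matched outputs coincide; for higher-order QAM the three filters diverge, which is exactly the trade-off the NMSE metric above is meant to quantify.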


[120] 2601.00226

Let Distortion Guide Restoration (DGR): A physics-informed learning framework for Prostate Diffusion MRI

We present Distortion-Guided Restoration (DGR), a physics-informed hybrid CNN-diffusion framework for acquisition-free correction of severe susceptibility-induced distortions in prostate single-shot EPI diffusion-weighted imaging (DWI). DGR is trained to invert a realistic forward distortion model using large-scale paired distorted and undistorted data synthesized from distortion-free prostate DWI and co-registered T2-weighted images from 410 multi-institutional studies, together with 11 measured B0 field maps from metal-implant cases incorporated into a forward simulator to generate low-b DWI (b = 50 s/mm^2), high-b DWI (b = 1400 s/mm^2), and ADC distortions. The network couples a CNN-based geometric correction module with conditional diffusion refinement under T2-weighted anatomical guidance. On a held-out synthetic validation set (n = 34) using ground-truth simulated distortion fields, DGR achieved higher PSNR and lower NMSE than FSL TOPUP and FUGUE. In 34 real clinical studies with severe distortion, including hip prostheses and marked rectal distension, DGR improved geometric fidelity and increased radiologist-rated image quality and diagnostic confidence. Overall, learning the inverse of a physically simulated forward process provides a practical alternative to acquisition-dependent distortion-correction pipelines for prostate DWI.
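The forward model that DGR learns to invert can be caricatured in 1-D: susceptibility shifts each voxel along the phase-encode axis in proportion to the local B0 offset. The sketch below is a toy (function and parameter names are hypothetical, the displacement field is assumed monotone, and Jacobian intensity modulation is ignored):

```python
import numpy as np

def distort_epi(image, b0_hz, echo_spacing_s):
    """Toy 1-D forward distortion model for single-shot EPI: each voxel is
    displaced along the phase-encode axis (axis 0) by b0_hz * total readout
    time, expressed in voxels. `image` and `b0_hz` are (ny, nx) arrays."""
    ny, nx = image.shape
    shift = b0_hz * echo_spacing_s * ny        # displacement field in voxels
    out = np.zeros_like(image)
    y = np.arange(ny, dtype=float)
    for j in range(nx):
        # sample the undistorted column at the displaced coordinates
        out[:, j] = np.interp(y, y + shift[:, j], image[:, j])
    return out
```

Training on pairs (image, distort_epi(image, ...)) synthesized from measured field maps is, at this cartoon level, how a network can learn the inverse map without acquiring extra correction scans.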


[121] 2601.11453

Implications of Grid-Forming Inverter Parameters on Disturbance Localization and Controllability

The shift from traditional synchronous generator (SG) based power generation to generation driven by power electronic devices introduces new dynamic phenomena and considerations for the control of large-scale power systems. In this paper, two aspects of all-inverter power systems are investigated: greater localization of system disturbance response and greater system controllability. The prevalence of both of these aspects is shown to be related to the lower effective inertia of inverters and has implications for future wide-area control system design. Greater disturbance localization implies the need for feedback measurement placement close to generator nodes to properly reject disturbances in the system, while increased system controllability implies that wide-area control systems should preferentially actuate inverters to most efficiently control the system. This investigation utilizes reduced-order linear time-invariant models of both SGs and inverters that are shown to capture the frequency dynamics of interest in both all-SG and all-inverter systems, allowing for the efficient use of both frequency and time domain analysis methods.
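The inertia-controllability link claimed above can be checked on a toy single-machine swing model: the trace of the controllability Gramian grows as the effective inertia M shrinks. This is an illustrative sketch, not the paper's reduced-order models:

```python
import numpy as np

def gramian_trace(M, D=1.0, K=1.0):
    """Trace of the controllability Gramian of the toy swing model
    theta' = omega,  M*omega' = -D*omega - K*theta + u."""
    A = np.array([[0.0, 1.0], [-K / M, -D / M]])
    B = np.array([[0.0], [1.0 / M]])
    # Solve the Lyapunov equation A W + W A^T = -B B^T via vectorization
    # (column-major vec: vec(AW + WA^T) = (I kron A + A kron I) vec(W))
    n = A.shape[0]
    L = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    w = np.linalg.solve(L, -(B @ B.T).flatten('F'))
    return np.trace(w.reshape(n, n, order='F'))

# Lower effective inertia -> larger Gramian -> the node is easier to actuate
print(gramian_trace(M=0.1), gramian_trace(M=5.0))
```

For this model the trace has the closed form 1/(2DK) + 1/(2DM), making the inverse dependence on inertia explicit.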


[122] 2602.17202

A Novel Near-Field Dictionary Design for Hybrid MIMO with Uniform Planar Arrays

Near-field ultra-massive MIMO (U-MIMO) systems provide enhanced spatial resolution but present challenges for channel estimation, particularly when hybrid architectures are employed. Within this framework, dictionary-based channel estimation schemes are needed to achieve accurate reconstruction from a reduced set of measurements. However, existing near-field dictionaries generally provide full three-dimensional coverage, which is unnecessary when user equipments are primarily located on the ground. In this paper, we propose a novel near-field grid design tailored to this common scenario. Specifically, grid points lie on a reference plane located at an arbitrary height with respect to the U-MIMO system, equipped with a uniform planar array. Furthermore, a channel accuracy metric is used to improve codebook performance, and to highlight the limitations of the traditional far-field angular sampling in the near field. Results show that, as long as user equipments are not far from the reference plane, the proposed grid outperforms state-of-the-art designs in both channel estimation accuracy and spectral efficiency.


[123] 2603.21510

Unregistered Spectral Image Fusion: Unmixing, Adversarial Learning, and Recoverability

This paper addresses the fusion of a pair of spatially unregistered hyperspectral image (HSI) and multispectral image (MSI) covering roughly overlapping regions. HSIs offer high spectral but low spatial resolution, while MSIs provide the opposite. The goal is to integrate their complementary information to enhance both HSI spatial resolution and MSI spectral resolution. While hyperspectral-multispectral fusion (HMF) has been widely studied, the unregistered setting remains challenging. Many existing methods focus solely on MSI super-resolution, leaving HSI unchanged. Supervised deep learning approaches were proposed for HSI super-resolution, but rely on accurate training data, which is often unavailable. Moreover, theoretical analyses largely address the co-registered case, leaving unregistered HMF poorly understood. In this work, an unsupervised framework is proposed to simultaneously super-resolve both MSI and HSI. The method integrates coupled spectral unmixing for MSI super-resolution with latent-space adversarial learning for HSI super-resolution. Theoretical guarantees on the recoverability of the super-resolution MSI and HSI are established under reasonable generative models -- providing, to the best of our knowledge, the first such insights for unregistered HMF. The approach is validated on semi-real and real HSI-MSI pairs across diverse conditions.


[124] 2603.23401

Self-Supervised Graph Neural Networks for Optimal Substation Reconfiguration

Changing the transmission system topology is an efficient and costless lever to reduce congestion or increase exchange capacities. The problem of finding the optimal switch states within substations is called Optimal Substation Reconfiguration (OSR), and may be framed as a Mixed Integer Linear Program (MILP). Current state-of-the-art optimization techniques come with prohibitive computing times, making them impractical for real-time decision-making. Meanwhile, deep learning offers a promising perspective with drastically smaller computing times, at the price of an expensive training phase and the absence of optimality guarantees. In this work, we frame OSR as an Amortized Optimization problem, where a Graph Neural Network (GNN) model -- our data being graphs -- is trained in a self-supervised way to improve the objective function. We apply our approach to the maximization of the exchange capacity between two areas of a small-scale 12-substation system. Once trained, our GNN model improves the exchange capacity by 10.2% on average compared to the all-connected configuration, while a classical MILP solver reaches an average improvement of 15.2% with orders-of-magnitude larger computing times.


[125] 2603.24116

How Open is Open TTS? A Practical Evaluation of Open Source TTS Tools

Open-source text-to-speech (TTS) frameworks have emerged as highly adaptable platforms for developing speech synthesis systems across a wide range of languages. However, their applicability is not uniform -- particularly when the target language is under-resourced or when computational resources are constrained. In this study, we systematically assess the feasibility of building novel TTS models using four widely adopted open-source architectures: FastPitch, VITS, Grad-TTS, and Matcha-TTS. Our evaluation spans multiple dimensions, including qualitative aspects such as ease of installation, dataset preparation, and hardware requirements, as well as quantitative assessments of synthesis quality for Romanian. We employ both objective metrics and subjective listening tests to evaluate intelligibility, speaker similarity, and naturalness of the generated speech. The results reveal significant challenges in toolchain setup, data preprocessing, and computational efficiency, which can hinder adoption in low-resource contexts. By grounding the analysis in reproducible protocols and accessible evaluation criteria, this work aims to inform best practices and promote more inclusive, language-diverse TTS development. All information needed to reproduce this study (i.e. code and data) is available in our git repository: this https URL


[126] 2603.26835

ANVIL: Accelerator-Native Video Interpolation via Codec Motion Vector Priors

Real-time 30-to-60 fps video frame interpolation on mobile neural processing units (NPUs) requires each synthesized frame to be delivered within 33.3 ms. We show that mainstream flow-based video frame interpolation faces three structural deployment barriers on mobile NPUs: spatial sampling operators exceed the frame budget or lack hardware support, iterative flow refinement collapses under 8-bit integer post-training quantization, and memory-bound operators dominate the inference graph. ANVIL addresses these barriers by reusing motion vectors from the H.264/AVC decoder to prealign input frames, removing learned optical flow, spatial sampling, and iterative accumulation from the accelerator graph. The remaining residual is refined by a convolution-dominated network composed almost entirely of compute-bound operators. On a Snapdragon 8 Gen 3 device, ANVIL achieves 12.8 ms 1080p inference at 8-bit integer precision; an open-source Android player sustains 28.4 ms median end-to-end latency over 30-minute continuous playback. Per-operator causal analysis identifies quantized accumulation on recurrent flow states as a key mechanism behind integer quantization failure in iterative methods. The current design targets H.264/AVC playback with decoder-exposed motion vectors.
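The decoder-motion-vector prealignment step can be sketched as a per-block shift. This is a toy integer-pel version; real H.264/AVC motion vectors are quarter-pel, blocks vary in size, and intra blocks carry no vectors:

```python
import numpy as np

def prealign(frame, mvs, block=16):
    """Shift each block of `frame` by its decoder motion vector (dy, dx),
    approximating motion compensation without any learned optical flow.
    `mvs` has shape (H//block, W//block, 2) with integer displacements."""
    h, w = frame.shape[:2]
    out = np.zeros_like(frame)
    for by in range(h // block):
        for bx in range(w // block):
            dy, dx = mvs[by, bx]
            ys, xs = by * block, bx * block
            # source block location, clamped to the frame bounds
            sy = np.clip(ys + dy, 0, h - block)
            sx = np.clip(xs + dx, 0, w - block)
            out[ys:ys + block, xs:xs + block] = frame[sy:sy + block, sx:sx + block]
    return out
```

After this prealignment, only a small residual remains for the convolution-dominated network to refine, which is what keeps the accelerator graph free of sampling operators.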


[127] 2603.27051

Proprioceptive feedback paradigm for safe and resilient motion control

Proprioception is a human sense that provides feedback from muscles and joints about body position and motion. This key capability keeps us upright, moving, and responding quickly to slips or stumbles. In this paper we discuss a proprioception-like feature (machine proprioceptive feedback - MPF) for motion control systems. An unexpected response of one actuator, or one agent in a multi-agent system, is compensated by other actuators/agents through fast feedback loops that react only to the unexpected portion. The paper appropriates the predictor-corrector mechanism of decentralized, multi-agent controllers as "proprioceptive feedback" for centrally controlled ones. It analyzes the nature and degree of impairment that can be managed and offers two options, full-MPF and split-MPF, with different wiring architectures as well as different stability and safety properties. Multi-vehicle interchange lane-swap traffic simulations confirm the analytical results.


[128] 2603.27592

Fundamental Limits of Man-in-the-Middle Attack Detection in Model-Free Reinforcement Learning

We consider the problem of learning-based man-in-the-middle (MITM) attacks in cyber-physical systems (CPS), and extend our previously proposed Bellman Deviation Detection (BDD) framework for model-free reinforcement learning (RL). We refine the standard MDP attack model by allowing the reward function to depend on both the current and subsequent states, thereby capturing reward variations induced by errors in the adversary's transition estimate. We also derive an optimal system-identification strategy for the adversary that minimizes detectable value deviations. Further, we prove that the agent's asymptotic learning time required to secure the system scales linearly with the adversary's learning time, and that this matches the optimal lower bound. Hence, the proposed detection scheme is order-optimal in detection efficiency. Finally, we extend the framework to asynchronous and intermittent attack scenarios, where reliable detection is preserved.
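A minimal Bellman-residual detector in the spirit of the BDD framework might look like the sketch below. This is a generic tabular construction for exposition; the exact statistic and thresholds in the paper may differ:

```python
import numpy as np

def bellman_residuals(Q, transitions, gamma=0.99):
    """One-step Bellman residuals r + gamma * max_a' Q[s', a'] - Q[s, a]
    for a batch of (s, a, r, s') tuples. Under an honest channel these
    residuals concentrate near zero once Q is close to optimal; a MITM
    adversary perturbing rewards or observations biases them."""
    return np.array([r + gamma * Q[s2].max() - Q[s, a]
                     for (s, a, r, s2) in transitions])

def detect(residuals, threshold):
    """Flag an attack when the mean absolute residual exceeds the threshold."""
    return np.mean(np.abs(residuals)) > threshold
```

The order-optimality result above concerns how long the agent must observe such residuals before the detector's error probabilities fall below a target, relative to the adversary's own learning time.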


[129] 2603.28873

Associative Memory System via Threshold Linear Networks

Humans learn and form memories in stochastic environments. Auto-associative memory systems model these processes by storing patterns and later recovering them from corrupted versions. Here, memories are learned by associating each pattern with an attractor in a latent space. After learning, when (possibly corrupted) patterns are presented to the system, latent dynamics facilitate retrieval of the appropriate uncorrupted pattern. In this work, we propose a novel online auto-associative memory system. In contrast to existing works, our system supports sequential memory formation and provides formal guarantees of robust memory retrieval via region-of-attraction analysis. We use a threshold-linear network as latent space dynamics in combination with an encoder, decoder, and controller. We show in simulation that the memory system successfully reconstructs patterns from corrupted inputs.
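Threshold-linear dynamics of the kind used for the latent space can be sketched as follows. For simplicity the toy network below is contractive and has a single attractor; the paper's learned weights instead carve out one attractor per stored pattern, with region-of-attraction guarantees:

```python
import numpy as np

def tln_step(x, W, b, dt=0.1):
    """One Euler step of the threshold-linear dynamics x' = -x + [W x + b]_+."""
    return x + dt * (-x + np.maximum(0.0, W @ x + b))

rng = np.random.default_rng(1)
G = rng.standard_normal((8, 8))
W = 0.5 * G / np.linalg.norm(G, 2)   # spectral norm 0.5 keeps the map contractive
b = rng.standard_normal(8)
x = rng.standard_normal(8)           # a "corrupted" initial state
for _ in range(1000):
    x = tln_step(x, W, b)
# x has settled at the attractor, which satisfies x = [W x + b]_+
```

Retrieval in the full system wraps this latent flow with an encoder (corrupted pattern to latent state) and a decoder (attractor back to the clean pattern).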


[130] 2603.29490

Flatness-based control of a Timoshenko beam

The paper presents an approach to flatness-based control design for hyperbolic multi-input systems, building upon the hyperbolic controller form (HCF). The transformation into HCF yields a simplified system representation that considerably facilitates the design of state feedback controllers for trajectory tracking. The proposed concept is demonstrated for a Timoshenko beam and validated through numerical simulations, demonstrating trajectory tracking and closed-loop stability.


[131] 2505.14222

MatchDance: Mamba-Transformer Architecture with Uniform Tokenization for High-Quality 3D Dance Generation

Music-to-dance generation represents a challenging yet pivotal task at the intersection of choreography, virtual reality, and creative content generation. Despite its significance, existing methods face substantial limitations in achieving choreographic consistency. To address this challenge, we propose MatchDance, a novel framework for music-to-dance generation that constructs a latent representation to enhance choreographic consistency. MatchDance employs a two-stage design: (1) a Kinematic-Dynamic-based Quantization Stage (KDQS), which encodes dance motions into a latent representation by Finite Scalar Quantization (FSQ) with kinematic-dynamic constraints and reconstructs them with high fidelity, and (2) a Hybrid Music-to-Dance Generation Stage (HMDGS), which uses a Mamba-Transformer hybrid architecture to map music into the latent representation, followed by the KDQS decoder to generate 3D dance motions. Additionally, a music-dance retrieval framework and comprehensive metrics are introduced for evaluation. Extensive experiments on the FineDance dataset demonstrate state-of-the-art performance.
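The Finite Scalar Quantization step in the KDQS can be sketched per latent dimension, following the standard FSQ recipe of bounding with tanh and rounding to a fixed grid (the level count here is an arbitrary illustrative choice, and the kinematic-dynamic constraints are omitted):

```python
import numpy as np

def fsq(z, levels=5):
    """Finite Scalar Quantization: squash each latent dimension to (-1, 1)
    with tanh, then round to one of `levels` evenly spaced values.
    The resulting codebook is the implicit product grid over dimensions."""
    half = (levels - 1) / 2.0
    return np.round(np.tanh(z) * half) / half

z = np.array([-2.0, -0.3, 0.0, 0.45, 3.0])
print(fsq(z))  # each entry snaps to {-1, -0.5, 0, 0.5, 1}
```

Unlike a learned VQ codebook, the grid needs no codebook loss and cannot suffer codebook collapse, which is one reason FSQ suits high-fidelity motion reconstruction.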


[132] 2506.02768

Geometric Visual Servo Via Optimal Transport

When developing control laws for robotic systems, the principal consideration when examining their performance is choosing inputs that allow smooth tracking of a reference input. In the context of robotic manipulation, this involves translating an object or end-effector from an initial pose to a target pose. Robotic manipulation control laws frequently use vision systems as an error generator to track features and produce control inputs. However, current control algorithms do not take into account the probabilistic features that are extracted and instead rely on hand-tuned feature extraction methods. Furthermore, the target features can exist in a static pose, thus allowing a combined pose and feature error for control generation. We present a geometric control law for the visual servoing problem for robotic manipulators. The input from the camera constitutes a probability measure on the 3-dimensional Special Euclidean task-space group, where the Wasserstein distance between the current and desired poses is analogous to the geometric geodesic. From this, we develop a controller that allows for both pose and image-based visual servoing by combining classical PD control with gravity compensation with error minimization through the use of geodesic flows on a 3-dimensional Special Euclidean group. We present our results on a set of test cases demonstrating the generalisation ability of our approach to a variety of initial positions.


[133] 2507.17851

Speaker Disentanglement of Speech Pre-trained Model Based on Interpretability

Self-supervised speech models learn representations that capture both content and speaker information. Yet this entanglement creates problems: content tasks suffer from speaker bias, and privacy concerns arise when speaker identity leaks through supposedly anonymized representations. We present two contributions to address these challenges. First, we develop InterpTRQE-SptME (Timbre Residual Quantitative Evaluation Benchmark of Speech pre-training Models Encoding via Interpretability), a benchmark that directly measures residual speaker information in content embeddings using SHAP-based interpretability analysis. Unlike existing indirect metrics, our approach quantifies the exact proportion of speaker information remaining after disentanglement. Second, we propose InterpTF-SptME, which uses these interpretability insights to filter speaker information from embeddings. Testing on VCTK with seven models including HuBERT, WavLM, and ContentVec, we find that SHAP Noise filtering reduces speaker residuals from 18.05% to nearly zero while maintaining recognition accuracy (CTC loss increase under 1%). The method is model-agnostic and requires no retraining.


[134] 2511.16765

RampoNN: A Reachability-Guided System Falsification for Efficient Cyber-Kinetic Vulnerability Detection

Detecting kinetic vulnerabilities in Cyber-Physical Systems (CPS), vulnerabilities in control code that can precipitate hazardous physical consequences, is a critical challenge. This task is complicated by the need to analyze the intricate coupling between complex software behavior and the system's physical dynamics. Furthermore, the periodic execution of control code in CPS applications creates a combinatorial explosion of execution paths that must be analyzed over time, far exceeding the scope of traditional single-run code analysis. This paper introduces RampoNN, a novel framework that systematically identifies kinetic vulnerabilities given the control code, a physical system model, and a Signal Temporal Logic (STL) specification of safe behavior. RampoNN first analyzes the control code to map the control signals that can be generated under various execution branches. It then employs a neural network to abstract the physical system's behavior. To overcome the poor scaling and loose over-approximations of standard neural network reachability, RampoNN uniquely utilizes Deep Bernstein neural networks, which are equipped with customized reachability algorithms that yield orders of magnitude tighter bounds. This high-precision reachability analysis allows RampoNN to rapidly prune large sets of guaranteed-safe behaviors and rank the remaining traces by their potential to violate the specification. The results of this analysis are then used to effectively guide a falsification engine, focusing its search on the most promising system behaviors to find actual vulnerabilities. We evaluated our approach on a PLC-controlled water tank system and a switched PID controller for an automotive engine. The results demonstrate that RampoNN accelerates the discovery of kinetic vulnerabilities by up to 98.27% and offers superior scalability compared to other state-of-the-art methods.


[135] 2512.02079

Robust Geospatial Coordination of Multi-Agent Communications Networks Under Attrition

Coordinating emergency responses in extreme environments, such as wildfires, requires resilient and high-bandwidth communication backbones. While autonomous aerial swarms can establish ad-hoc networks to provide this connectivity, the high risk of individual node attrition in these settings often leads to network fragmentation and mission-critical downtime. To overcome this challenge, we introduce and formalize the problem of Robust Task Networking Under Attrition (RTNUA), which extends connectivity maintenance in multi-robot systems to explicitly address proactive redundancy and attrition recovery. We then introduce Physics-Informed Robust Employment of Multi-Agent Networks ($\Phi$IREMAN), a topological algorithm leveraging physics-inspired potential fields to solve this problem. In our evaluations, $\Phi$IREMAN consistently outperforms baselines, and is able to maintain greater than $99.9\%$ task uptime despite substantial attrition in simulations with up to 100 tasks and 500 drones, demonstrating both effectiveness and scalability.
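The physics-inspired potential fields that $\Phi$IREMAN builds on can be sketched with the classical attractive/repulsive form. This is the textbook artificial-potential-field law, not the paper's actual controller, and all gains are hypothetical:

```python
import numpy as np

def potential_force(pos, others, goal, k_att=1.0, k_rep=1.0, r_safe=2.0):
    """Net force on one agent: spring-like attraction toward its task
    location plus short-range repulsion from neighbors inside r_safe."""
    force = k_att * (goal - pos)                 # attraction to the task
    for q in others:
        d = pos - q
        dist = np.linalg.norm(d)
        if 0 < dist < r_safe:
            # repulsion grows without bound as agents approach collision
            force += k_rep * (1.0 / dist - 1.0 / r_safe) * d / dist**2
    return force
```

Extending such fields with terms that preserve redundant communication links is, at a high level, how a topological algorithm can keep the backbone connected as individual nodes are lost.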


[136] 2601.06690

S-DAPT-2026: A Stage-Aware Synthetic Dataset for Advanced Persistent Threat Detection

The detection of advanced persistent threats (APTs) remains a crucial challenge due to their stealthy, multistage nature and the limited availability of realistic, labeled datasets for systematic evaluation. Synthetic dataset generation has emerged as a practical approach for modeling APT campaigns; however, existing methods often rely on computationally expensive alert correlation mechanisms that limit scalability. Motivated by these limitations, this paper presents a near-realistic synthetic APT dataset and an efficient alert correlation framework. The proposed approach introduces a machine-learning-based correlation module that employs K-Nearest Neighbors (KNN) clustering with a cosine similarity metric to group semantically related alerts within a temporal context. The dataset emulates multistage APT campaigns across campus and organizational network environments and captures a diverse set of fourteen distinct alert types, exceeding the coverage of commonly used synthetic APT datasets. In addition, explicit APT campaign states and alert-to-stage mappings are defined to enable flexible integration of new alert types and support stage-aware analysis. A comprehensive statistical characterization of the dataset is provided to facilitate reproducibility and support APT stage predictions.
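The KNN-with-cosine-similarity correlation module described above admits a compact sketch. This is an illustrative reconstruction from the abstract: the parameter names and the union-find grouping of linked alerts are assumptions:

```python
import numpy as np

def correlate_alerts(features, times, k=2, window=60.0, min_sim=0.8):
    """Link each alert to its k most cosine-similar alerts occurring within
    `window` seconds, then return a group label per alert (the root of its
    connected component). `features` is (n, d); `times` is an (n,) array."""
    X = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = X @ X.T                                # pairwise cosine similarity
    n = len(times)
    parent = list(range(n))
    def find(i):                                 # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(n):
        mask = np.abs(times - times[i]) <= window  # temporal context
        mask[i] = False
        cand = np.where(mask)[0]
        for j in cand[np.argsort(-sim[i, cand])][:k]:  # k nearest by cosine
            if sim[i, j] >= min_sim:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]
```

Because the similarity search is restricted to a temporal window, the cost stays far below the exhaustive correlation mechanisms the abstract contrasts against.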


[137] 2603.05537

Sketch It Out: Exploring Label-Free Structural Cues for Multimodal Gait Recognition

Gait recognition is a non-intrusive biometric technique for security applications, yet existing studies are dominated by silhouette- and parsing-based representations. Silhouettes are sparse and miss internal structural details, limiting discriminability. Parsing enriches silhouettes with part-level structures, but relies heavily on upstream human parsers (e.g., label granularity and boundary precision), leading to unstable performance across datasets and sometimes even inferior results to silhouettes. We revisit gait representations from a structural perspective and describe a design space defined by edge density and supervision form: silhouettes use sparse boundary edges with weak single-label supervision, while parsing uses denser cues with strong semantic priors. In this space, we identify an underexplored paradigm: dense part-level structure without explicit semantic labels, and introduce Sketch as a new visual modality for gait recognition. Sketch extracts high-frequency structural cues (e.g., limb articulations and self-occlusion contours) directly from RGB images via edge-based detectors in a label-free manner. We further show that label-guided parsing and label-free sketch are semantically decoupled and structurally complementary. Based on this, we propose SketchGait, a hierarchically disentangled multi-modal framework with two independent streams for modality-specific learning and a lightweight early-stage fusion branch to capture structural complementarity. Extensive experiments on SUSTech1K and CCPG validate the proposed modality and framework: SketchGait achieves 92.9% Rank-1 on SUSTech1K and 93.1% mean Rank-1 on CCPG.


[138] 2603.11360

Fair-Gate: Fairness-Aware Interpretable Risk Gating for Sex-Fair Voice Biometrics

Voice biometric systems can exhibit sex-related performance gaps even when overall verification accuracy is strong. We attribute these gaps to two practical mechanisms: (i) demographic shortcut learning, where speaker classification training exploits spurious correlations between sex and speaker identity, and (ii) feature entanglement, where sex-linked acoustic variation overlaps with identity cues and cannot be removed without degrading speaker discrimination. We propose Fair-Gate, a fairness-aware and interpretable risk-gating framework that addresses both mechanisms in a single pipeline. Fair-Gate applies risk extrapolation to reduce variation in speaker-classification risk across proxy sex groups, and introduces a local complementary gate that routes intermediate features into an identity branch and a sex branch. The gate provides interpretability by producing an explicit routing mask that can be inspected to understand which features are allocated to identity versus sex-related pathways. Experiments on VoxCeleb1 show that Fair-Gate improves the utility--fairness trade-off, yielding more sex-fair ASV performance under challenging evaluation conditions.


[139] 2603.24324

Large Language Model Guided Incentive Aware Reward Design for Cooperative Multi-Agent Reinforcement Learning

Designing effective auxiliary rewards for cooperative multi-agent systems remains a challenging task. Misaligned incentives risk inducing suboptimal coordination, especially when sparse task feedback fails to provide sufficient grounding. This study introduces an automated reward design framework that leverages large language models to synthesize executable reward programs from environment instrumentation. The procedure constrains candidate programs within a formal validity envelope and evaluates their efficacy by training policies from scratch under a fixed computational budget. Selection across generations depends exclusively on the sparse task return. The framework is evaluated across four distinct Overcooked-AI layouts characterized by varied corridor congestion, handoff dependencies, and structural asymmetries. Iterative search generations consistently yield superior task returns and delivery counts, with the most pronounced gains occurring in environments dominated by interaction bottlenecks. Diagnostic analysis of the synthesized shaping components indicates increased interdependence in action selection and improved signal alignment in coordination-intensive tasks. These results demonstrate that the search for objective-grounded reward programs can mitigate the burden of manual engineering while producing shaping signals compatible with cooperative learning under finite budgets.