New articles on Electrical Engineering and Systems Science


[1] 2605.27454

NL-MambaXCT: Self-Supervised Nested-Learning Mamba for Nomex Honeycomb X-ray CT Defect Classification

X-ray computed tomography (XCT) is widely used for non-destructive testing of Nomex honeycomb structures in aerospace manufacturing, but industrial inspection still relies heavily on manual interpretation and supervised models trained on limited labeled data. This work introduces NL-MambaXCT, a Mamba-based framework that combines self-supervised masked image modelling with a Nested Learning (NL) formulation for automated, label-efficient defect classification from production XCT slices. The backbone is a four-stage 2D encoder with RegNet convolutional blocks in the early stages and Mamba-based sequence mixing with attention in the deeper stages. It is pretrained by masked image modelling on 19,961 unlabeled industrial XCT slices and fine-tuned on 2,000 relabeled Nomex XCT slices split by production order. NL is instantiated through two-timescale parameter dynamics: selected projections maintain slow exponential-moving-average traces alongside fast weights, while a deep-momentum optimizer introduces an additional slow parameter-update trajectory. On the held-out test set, the MIM-pretrained NL-MambaXCT model achieves 96.91% accuracy and 96.8% macro F1, outperforming CNN, attention, and single-timescale Mamba baselines by 3.11 to 10.31 percentage points in accuracy. The results suggest that combining masked self-supervision with NL-style fast/slow learning dynamics is a promising strategy for robust defect classification in Nomex honeycomb XCT inspection.
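
The fast-weight and slow exponential-moving-average idea mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the update rule, so the function name `two_timescale_step`, the learning rate, the decay value, and the toy quadratic loss below are all assumptions chosen for illustration, not the authors' method.

```python
import numpy as np

def two_timescale_step(fast, slow, grad, lr=0.1, decay=0.99):
    """One update: gradient step on the fast weights, EMA update on the slow trace."""
    fast = fast - lr * grad                      # fast timescale: plain gradient step
    slow = decay * slow + (1.0 - decay) * fast   # slow timescale: EMA of the fast weights
    return fast, slow

# Toy quadratic loss 0.5 * ||w - target||^2, whose gradient is (w - target).
target = np.array([1.0, -2.0])
fast = np.zeros(2)
slow = np.zeros(2)
for _ in range(2000):
    fast, slow = two_timescale_step(fast, slow, fast - target)
```

In this toy setting the slow trace lags the fast weights early on and then settles at the same point, which is the usual motivation for keeping a smoothed copy of rapidly changing parameters.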


[2] 2605.27544

Subsystem Structure as an Inferential Resource for Coupled Engineered Systems

Engineered infrastructure systems pose inverse problems in which hidden states, unknown parameters, and subsystem couplings must be inferred from sparse and noisy measurements. These problems are difficult because physical subsystems are heterogeneous, sensing is partial, uncertainty is distributed across subsystem interfaces, and computational cost grows rapidly with system size. We address this challenge with probabilistic compositional inference, a graph-based architecture that represents a coupled system as interacting subsystems, each retaining its own local model, estimator, and uncertainty representation, while coupling is handled through physically meaningful stochastic messages exchanged across subsystem interfaces. This formulation allows mechanistic, learned, and deterministic components to coexist within a single inference framework and propagates calibrated uncertainty without assembling a global augmented state or covariance. We validate the framework in three increasingly demanding settings: a sparse-sensing canonical inverse problem, where interface couplings can also be learned from data; infrastructure-scale power networks, where the method matches centralized joint state-and-parameter inference while reducing computational scaling from approximately cubic to approximately linear; and a multi-physics turbine embedded in a power-grid network, where heterogeneous subsystems compose hierarchically without degrading local inference or collapsing local posteriors into a global estimate. Together, these results show that subsystem structure can be exploited as the organizing principle for uncertainty-aware inverse inference in coupled engineered systems.


[3] 2605.27645

Private & Common Information States in Decentralized Team Equilibrium via Dynamic Programming for POMDPs with Delayed Sharing

Witsenhausen, in his seminal 1971 paper [1], introduced decentralized partially observable Markov decision problems (POMDPs), with multiple agents or controls operating under T-step delayed sharing information patterns. A fundamental problem in [1] is the identification of structural properties of optimal strategies that compress the information patterns into multiple information states. In this paper, we develop such structural properties of optimal strategies and associated dynamic programming (DP) equations, using the concept of decentralized sequential team equilibrium (a generalization of person-by-person optimality from static team theory). Within this framework, each strategy is assigned an individual value function conditioned on its delayed sharing information pattern, while the strategies of all other agents are held fixed. The resulting DP framework yields several new DP equations and characterizations of decentralized team equilibrium. Moreover, these DP equations exhibit fundamental properties analogous to those of centralized DP of POMDPs: the optimization in each agent's DP equations is performed over the agent's action space rather than over strategy spaces; each agent's multiple information states satisfy Markov recursions; and a separation principle holds. The DP equations reveal a structural compression property of optimal strategies: each agent compresses its delayed sharing information pattern into three components: 1) a private posterior distribution conditioned on the agent's delayed sharing information pattern, 2) a centralized posterior distribution conditioned on the common information shared by all agents, and 3) the agent's private information component. This structural result substantially extends Witsenhausen's Assertion 8 in [1].


[4] 2605.27730

DSRDM: Digital Signal Recovery Diffusion Model for Semantic Communications

Diffusion models (DMs) have recently emerged as a promising class of generative models for AI-generated content and have been widely used for image reconstruction, generation, and channel denoising in semantic communication (SemCom) owing to their strong generative capacity. However, most existing SemCom works remain confined to image or text transmission and neglect the digital signals commonly used in wireless systems. In this letter, to address this gap, we propose and investigate a digital signal recovery diffusion model (DSRDM) for SemCom. Specifically, DSRDM encodes digital signals by gradually adding Gaussian signals to images in the forward diffusion process of the DM. After the encoded Gaussian signals embedded in the carrier image are sent to the receiver, the receiver recovers the digital signals by iteratively predicting the added Gaussian signals in the reverse diffusion process. Moreover, to reduce the computational complexity of DSRDM, a signal-adding approach is designed to avoid retraining latency. In particular, we use the latent representation of images, rather than the images themselves, as the carrier for digital signals in DSRDM to reduce inference latency.


[5] 2605.27753

Decoupled Delay-Doppler and Angle Estimation in BD-RIS Sensing via Nested Tucker Decomposition

We study single-target localization in a group-connected beyond-diagonal reconfigurable intelligent surface (BD-RIS)-assisted monostatic network with K element groups. We propose a Nested Tensor Factorization and Estimation (NTFE) algorithm that models the received signal as a 3rd-order nested Tucker tensor, decoupling the delay-Doppler and angle domains. The resulting two-stage procedure estimates the target-bearing tensor factors and then extracts the other physical parameters using subspace and closed-form steps. We also analyze identifiability and uniqueness conditions. Simulations show that NTFE exploits the group-connected BD-RIS structure and outperforms state-of-the-art sensing benchmarks.


[6] 2605.27796

Benchmarking Ultrasound Foundation Models for Fetal Plane Classification

Ultrasound is widely used in obstetric care due to its safety, accessibility, and real-time imaging. However, interpretation remains operator-dependent and susceptible to noise and artifacts. Deep learning models have shown strong performance in addressing these problems, but they typically require large annotated datasets that are difficult to obtain in clinical ultrasound. Foundation models (FMs) offer an alternative, using a large number of ultrasound images to learn transferable representations that can generalize with limited labeled data. This work presents a comprehensive benchmark of ultrasound-specific FMs for fetal plane classification. We evaluated four ultrasound FMs (USFM, MOFO, UltraSAM, FetalCLIP) against two CNN baselines (ResNet50, EfficientNet-V2) and a ViT (DINOv3) pretrained on natural images. We trained all models under two complementary settings: full fine-tuning and linear probing with a frozen encoder. All models were trained using 5-fold patient-level cross-validation on a Spanish fetal ultrasound dataset and tested on both in-domain data and an external African cohort to assess cross-population generalization. We found that FetalCLIP achieved the best results in the linear probing setting (F1 = 0.9261 for in-domain, F1 = 0.9731 for out-of-domain), while USFM performed best in the full fine-tuning setting (F1 = 0.9476 for in-domain, F1 = 0.9515 for out-of-domain). MOFO and UltraSAM degraded most in both settings, underperforming natural image pretrained models in some cases. These findings highlight how the choice of pretrained model strongly affects fetal plane classification performance, since different pretraining objectives lead to different levels of transferability.


[7] 2605.27798

Automatic Attenuation Control for Mitigating Photon-Counting Saturation in SPAD-based Optical Wireless Communications

Single-photon avalanche diodes (SPADs) have emerged as a promising candidate for optical wireless communication (OWC) owing to their ultra-high sensitivity and single-photon detection capability. However, under strong background radiation or high signal power, SPAD-based receivers suffer from photon-counting saturation, which severely degrades communication performance. To address this challenge, this paper introduces an automatic attenuation control (AAC) technique that dynamically optimizes the incident optical intensity to mitigate saturation effects. We develop a comprehensive analytical model for the SPAD-based OWC system, incorporating the influence of dead time and the lack of photon-number resolution. Based on this model, a convex optimization-based AAC algorithm is proposed to maximize the achievable rate in real time. Furthermore, a low-complexity AAC algorithm is devised using a closed-form trigger probability criterion, reducing computational complexity by two orders of magnitude. Numerical results demonstrate that the proposed AAC technique significantly improves both the achievable rate and symbol error rate across a wide range of background conditions, providing an efficient solution to enhance the dynamic range of photon-counting receivers.


[8] 2605.27839

Movable Antenna Enhanced Dual-Functional Radar-Communication: A Symbol-Level Precoding Approach

This letter investigates a symbol-level precoder design for movable antenna (MA)-enhanced dual-functional radar-communication (DFRC) systems. To enhance radar sensing capabilities, we formulate an optimization problem aimed at maximizing the minimum radar signal-to-interference-plus-noise ratio (SINR) across multiple targets in a cluttered environment. Our approach jointly designs the space-time transmitted waveforms, receiving filters, and antenna placement. However, the resulting problem is intractable to solve due to practical waveform constraints and the non-linear mapping from antenna positions to the corresponding channel coefficients. To address these challenges, we develop a bi-level optimization framework by leveraging deep reinforcement learning (DRL). Specifically, the twin delayed deep deterministic policy gradient (TD3) algorithm is employed in the outer layer to optimize antenna placement, while penalty convex-concave procedure (CCP) and majorization-minimization (MM) techniques are incorporated in the inner layer for regularizing waveform design. Simulation results demonstrate that the proposed method significantly improves radar SINR and achieves a superior sensing-communication trade-off compared to benchmark schemes.


[9] 2605.27840

LoSATok: Low-dimensional Semantic-Acoustic Tokenizer for Cross-Domain Audio Understanding and Generation

Audio tokenizers are fundamental to unifying audio understanding and generation. Understanding requires high-level semantics, while generation demands semantic and acoustic details. Existing unified tokenizers jointly encode both in high-dimensional continuous latents, which increases the modeling burden of Diffusion Transformers (DiTs) for generation. We propose LoSATok, a low-dimensional audio tokenizer for cross-domain audio understanding and generation. Motivated by the observation that 1280-dimensional semantic encoder features are compressible, we introduce a Semantic Bottleneck (SemBo) that compresses them into 128 dimensions, regularized by the proposed time-relation loss for temporal feature consistency. We further design a dual-level semantic supervision method that leverages both high- and low-dimensional semantic signals, enabling the tokenizer to jointly capture semantics and acoustic details within a compact latent space. Experiments on speech, music, and general audio show that SemBo preserves strong low-dimensional semantic capacity and LoSATok retains competitive understanding performance compared with several semantic representations, while consistently improving DiT modeling performance on speech, music, and audio generation. These results demonstrate that LoSATok's low-dimensional representations can effectively support audio understanding and generation. Our code is provided at this https URL.


[10] 2605.27964

DRIFT: Driving Risk Inference via Field Transmission for Human-like Autonomous Driving

Risk fields offer spatially structured alternatives to scalar safety metrics. However, hand-crafted static risk field models struggle with occlusion and topology-driven propagation. We present DRIFT, a spatiotemporal risk field governed by an advection-diffusion-reaction partial differential equation (PDE), with an optional telegrapher term. DRIFT draws on three sources: anisotropic Gaussian kernels to capture velocity-induced risk, occlusion-aware latent hazards behind large vehicles, and topology-coupled merge-zone conflict pressure. We further introduce field-centric evaluation metrics to complement the existing Surrogate Safety Measures (SSMs), including Lane-Change Risk Differential, Temporal Anticipation Index, Occlusion Sensitivity Index, and Occlusion Response Latency. Experiments on real-world traffic datasets show that DRIFT reduces occlusion response latency and lowers the near-collision rate under occlusion compared with selected baselines in synthetic scenarios.


[11] 2605.28064

I Hear, Therefore I Trust: A Socio-Technical Investigation of Humans as Synthetic Speech Detectors

Automatic deepfake detection has received considerable research attention, yet the socio-technical environment in which humans actually encounter synthetic speech remains poorly understood. We investigate voice deepfake detection as a perceptual and contextual process, presenting a localization task in which 47 participants marked suspected synthetic segments across authentic, fully synthetic, and partially synthetic utterances under three manipulated trust cues: instructional framing, affective priming, and provenance labeling. Participants provided quality ratings on mechanicalness, expressiveness, intelligibility, clarity, calmness, and confidence of evaluation. Utterance class was the primary determinant of detection accuracy and perceptual quality; trust cues produced no main effects but motivated detection behavior. Fully synthetic speech was detected at below-chance levels. Quality ratings tracked utterance type, indicating implicit discrimination where overt detection failed.


[12] 2605.28180

Tensor Train Decomposition Based Noise Reduction and Enhanced Parameter Estimation for FMCW MIMO Radar Systems

Frequency modulated continuous wave (FMCW) radar is widely used in autonomous driving and industrial inspection due to its high-resolution target location and velocity estimation capability. However, the plethora of connected devices in automotive applications introduces electromagnetic interference and brings challenges to location-aware services, primarily due to the low signal-to-noise ratio (SNR) caused by mixed noise contamination. Conventional matrix-based signal processing methods exhibit performance deterioration when handling higher-order signals under low SNR conditions. To address this challenge, this paper proposes a tensor decomposition-based framework that jointly performs noise reduction and parameter estimation for four-dimensional signals in FMCW multiple-input multiple-output (MIMO) radar systems. Specifically, the framework exploits the inherent low-rank structure and multidimensional correlations of the received signals through tensor train decomposition to effectively separate the noise subspace. A data smoothing processor then reconstructs an augmented signal tensor to resolve rank deficiency caused by coherent signals. Finally, an enhanced rotational subspace algorithm is employed to jointly decouple the distance, velocity, and angle parameters by exploiting the structural fitting to the restored signal. Both simulation and field experiments under real-world noise demonstrate that our proposed framework achieves significant noise reduction while improving target SNR and parameter estimation accuracy. These advancements make the proposed framework a robust solution for high-precision MIMO FMCW radar applications in dynamic, noise-polluted environments.
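
As a rough illustration of how a rank-truncated tensor train decomposition can act as a low-rank filter on a four-dimensional signal, the sketch below implements the generic TT-SVD procedure with NumPy. It is not the framework proposed in the paper: the function names `tt_svd` and `tt_reconstruct`, the single `max_rank` cap, and the toy rank-1 tensor are assumptions made for this example.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Decompose a tensor into tensor-train cores, capping every TT rank at max_rank."""
    dims = tensor.shape
    cores, r_prev = [], 1
    mat = np.asarray(tensor)
    for k in range(len(dims) - 1):
        # Unfold: rows index (previous rank, current mode), columns index the rest.
        mat = mat.reshape(r_prev * dims[k], -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(u[:, :r].reshape(r_prev, dims[k], r))
        mat = s[:r, None] * vt[:r]   # carry the truncated remainder to the next mode
        r_prev = r
    cores.append(mat.reshape(r_prev, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the tensor-train cores back into a full tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out[0, ..., 0]            # drop the boundary ranks of size 1

# Toy rank-1 four-dimensional "signal" built from four hypothetical mode vectors.
a, b, c, d = np.arange(1.0, 7.0), np.arange(1.0, 6.0), np.arange(1.0, 5.0), np.arange(1.0, 4.0)
signal = np.einsum("i,j,k,l->ijkl", a, b, c, d)
approx = tt_reconstruct(tt_svd(signal, max_rank=1))
```

Because the toy tensor has TT rank 1, truncating to rank 1 reproduces it; with additive noise, the discarded singular directions would carry part of the noise energy, which is the intuition behind low-rank denoising.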


[13] 2605.28182

Cross-Predictive Sparse Bayesian Learning with Application to XL-MIMO Channel Estimation

Accurate channel estimation is a key requirement in extremely large-scale multiple-input multiple-output (XL-MIMO) systems. Sparse Bayesian learning (SBL) is a well-established framework for exploiting channel sparsity, but its performance depends on parametric prior assumptions and hyperparameter optimization based on marginal likelihood, which may be sensitive to noise, limited pilot observations, and model mismatch. In this work, we propose cross-predictive SBL (CP-SBL), a data-driven variant of SBL in which the sparsity-inducing weights are learned by minimizing a randomized cross-predictive objective rather than through likelihood maximization. The proposed method preserves the hierarchical Bayesian structure of SBL while replacing parametric prior learning with a predictive consistency criterion derived from random data splitting. Numerical results for near-field XL-MIMO channel estimation show that CP-SBL consistently achieves lower normalized mean squared error than the baseline SBL across a wide range of signal-to-noise ratios, pilot lengths, numbers of antennas, and numbers of propagation paths, with comparable complexity and without requiring manual hyperparameter tuning.


[14] 2605.28191

Spatiotemporal Tracking in Cooperative ISAC Networks: A Stochastic Geometry Framework

We adopt a stochastic-geometry framework to study continuous target tracking in integrated sensing and communication (ISAC) networks, with base-station locations modelled as a Poisson point process. The single-BS analysis shows that the antenna energy-conservation identity forces the mean inter-BS coupling gain to unity, making densification an antenna-irreducible liability for monostatic sensing, while a first-passage-time analysis reveals a target-distance-dependent beamwidth trap. These findings rule out single-BS tracking under densification, motivating a multi-BS cooperative treatment. The static-cluster cooperative mean tracking lifetime is then shown to exhibit a sharp percolation phase transition, with the resulting sensing-capacity ceiling saturating above a critical macro density. Yet the static-cluster idealisation itself misrepresents modern network deployments, where the cooperating cluster is dynamically re-selected as the target drifts; we therefore lift this assumption with a dynamic clustering model that maps the $K$-nearest-neighbour handover onto a 2D Brownian motion with stochastic resetting, and obtain a Bessel-function closed form for the dynamic mean tracking lifetime that dissolves the phase transition under any positive handover rate. With a per-link reliability floor, the dynamic clustering framework preserves classical linear density scaling throughout the realistic 6G regime and delivers an order-of-magnitude capacity lift at small-cell densities. Monte-Carlo simulations corroborate all theoretical predictions.


[15] 2605.28252

Digital-Based Potentiostat and Mesoporous Microelectrode Co-Design for Non-Enzymatic Glucose Detection at 0.3V-VDD and 1.65nW-Power

This paper presents a proof-of-concept ultra-low-voltage, ultra-low-power chronoamperometric electrochemical sensor readout integrated circuit (IC) in 130nm CMOS for non-enzymatic glucose detection, featuring a reconfigurable Digital-Based (DB) Potentiostat. The signal transfer and noise characteristics of the new digital-based architecture are analytically described in the frequency domain for the first time by an equivalent linearized model that is validated by simulations and experiments. Based on experiments, the proposed DB potentiostat enables the detection of a wide electrochemical current range, spanning from 600pA to 650nA, with R^2 = 0.991 linearity, and consumes only 1.65nW (53.5nW) at VDD = 300mV (VDD = 500mV). The proposed DB readout is tested in a proof-of-concept platform for non-enzymatic glucose detection with nanostructured microelectrodes, demonstrating successful non-enzymatic glucose detection at physiological levels at the lowest reported voltage and power, even in the presence of an interferent (ascorbic acid) and under aerobic conditions, thus revealing a strong potential for emerging Point of Care (PoC) diagnostics applications.


[16] 2605.28361

A Lightweight Method for Multiple Signal Direction Estimation with Adaptive Notch Filters

For multi-signal detection and direction-of-arrival (DoA) estimation, conventional Capon beamforming degrades when there are more transmitters than receive antennas minus one. This paper proposes a lightweight method using adaptive notch filters (ANFs) with only two receive antennas for simultaneous DoA of two or more narrowband signals. Cascaded ANF stages form isolated channels per signal, and Capon estimates direction on each. The method has very low computational cost, and in simulation, for transmitters separable in time, frequency, and angle, performance approaches that of an oracle with prior signal knowledge. As with ANFs themselves, multiple components at closely spaced frequencies are poorly resolved, forming the main limitation of the proposed method. Simulations are complemented by an experimental implementation on a low-cost software-defined radio (ADALM-Pluto).


[17] 2605.28399

Information Age-Controllability Trade-offs in Communication-Constrained Networks

We investigate the trade-off between controllability, channel access, and age-related performance in a wireless network of control systems. Controllers share a random-access channel to transmit control inputs to actuators over slotted blocks. We measure reliable control via block controllability, where a block is controllable if it contains a required number of consecutive successful transmissions. In parallel, we capture information freshness via the age of information. To enable efficient allocation of channel resources over time, we introduce adaptive access probabilities at the block level, prioritizing controllers that have not yet achieved controllability. We then derive closed-form expressions for block controllability probability, the peak latency between inter-block consecutive successes, and peak age of information. We further characterize the peak control latency, defined as the time between consecutive controllable blocks. Finally, we optimize access probabilities to jointly balance controllability and age-related metrics. Numerical results illustrate the effectiveness of the proposed adaptive access policies in managing this trade-off in interference-limited wireless control networks.
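
The notion of block controllability in this abstract, a block containing a required number of consecutive successful transmissions, can be made concrete with a small calculation. The sketch below assumes each slot succeeds independently with a fixed probability, which is a simplification: the paper uses adaptive access probabilities on a shared random-access channel and derives its own closed forms. The function name `block_controllability_prob` and its parameters are hypothetical.

```python
def block_controllability_prob(n_slots, k_required, p_success):
    """P(at least k_required consecutive successes among n_slots i.i.d. slots)."""
    # run[j] = probability that no qualifying run has occurred yet and the
    # current streak of successes has length j (0 <= j < k_required).
    run = [1.0] + [0.0] * (k_required - 1)
    done = 0.0
    for _ in range(n_slots):
        nxt = [0.0] * k_required
        nxt[0] = (1.0 - p_success) * sum(run)     # a failure resets the streak
        for j in range(k_required - 1):
            nxt[j + 1] = p_success * run[j]       # a success extends the streak
        done += p_success * run[k_required - 1]   # streak reaches k_required
        run = nxt
    return done

# Example: block of 3 slots, 2 consecutive successes needed, success probability 0.5.
prob = block_controllability_prob(3, 2, 0.5)
```

For this example the qualifying outcomes are success-success-anything and failure-success-success, so the result is 3/8.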


[18] 2605.28403

A Gray-Box Approach for Decentralized Grid-Equivalent Model Identification

We propose a decentralized, frequency-domain identification algorithm that estimates the grid-equivalent model from the perspective of local converters. Since local electric signals in a multi-converter setup are affected by voltage inputs from other converters, considered as the grid, estimating a single apparent impedance yields biased and inaccurate results. To overcome this, we design a framework that decouples the effect of the equivalent impedance (passive) from that of the equivalent voltage (active). The parameters and equivalent grid voltages are then estimated using a constrained least squares and a Kalman filter algorithm, respectively, applied across frequency samples. We then demonstrate the accuracy and performance of our algorithm on an interconnected 5-converter system in grid-forming mode, with minimal voltage excitations and non-ideal operating conditions.


[19] 2605.28431

A Unified Maximum-Likelihood Framework for 3D InISAR Phase Unwrapping with Outlier Rejection

This paper presents a novel mathematical framework for phase unwrapping in three-dimensional interferometric ISAR (3D InISAR) imaging. The approach works on a scatterer-by-scatterer basis and does not rely on any spatial continuity assumptions, making it suitable for sparse point clouds. The formulation is derived from the Mixed-Integer Least Squares (MILS) theory, an optimal maximum-likelihood framework for joint estimation of integer and real unknowns in the presence of Gaussian noise. This provides a unified way to handle generic sensor geometries, multi-baseline, multi-frequency, or hybrid setups. The method also produces a natural a posteriori quality metric for each unwrapped phase, which can be used to build a statistical test to reject outliers. The algorithm is simple to implement and has a computational cost suitable for operational systems. This paper presents the theoretical foundations of the framework and a first validation study on a standard L-shaped dual-frequency setup using Monte Carlo simulations. Results show that the proposed framework enables reliable 3D reconstruction in challenging ambiguity conditions.


[20] 2605.28432

Transformer-Based Heartbeat Monitoring with FMCW Radar Under Random Body Motion

Millimeter-wave Frequency Modulated Continuous Wave (FMCW) radar enables contactless cardiac monitoring, but heartbeat estimation becomes challenging when respiration and random body motion (RBM) distort the radar signal. In this paper, we propose a hybrid framework for 77 GHz FMCW radar that combines model-based signal processing with a Convolutional Neural Network (CNN)-Transformer network. The first block extracts chest displacement and constructs meaningful high-level motion features from raw radar data, while the second block reconstructs a photoplethysmography (PPG)-like signal from the extracted features. In this study, a synchronized PPG signal is used as the ground truth for heartbeat monitoring in supervised training. The method is evaluated following the IEEE AESS Radar Challenge Problem I protocol using the official datasets and figures of merit across three motion scenarios: stationary, deep breathing, and RBM. Results show that the proposed architecture reliably reconstructs the PPG signal in all scenarios, achieving high fidelity in controlled conditions and maintaining robust performance under motion. This enables reliable average heart rate (AHR) and heart rate variability (HRV) estimation even where benchmark methods fail, and leads to the highest total score among the compared approaches.


[21] 2605.28434

Experimental Characterization of a Multifunction X-Band AESA Radar Demonstrator

Modern naval surveillance demands multifunction radar systems capable of operating in cluttered and contested environments. This paper presents the experimental characterization of a compact, X-band Active Electronically Scanned Array (AESA) radar demonstrator. The system was evaluated in a realistic coastal field environment at the Naval Support and Experimentation Centre (CSSN), specifically at its specialized G. Vallauri Institute, which has historical expertise in testing and evaluating the performance of operational sensors as well as those under development. The trials used real maritime targets and an active noise jammer, and assessed three core functions: direction-of-arrival (DoA) estimation, adaptive jammer suppression using MVDR beamforming, and high-resolution Inverse Synthetic Aperture Radar (ISAR) imaging. The results confirm that the demonstrator successfully detects and localizes targets, effectively suppresses high-power interference, and generates clear ISAR images of non-cooperative vessels. These findings validate the multifunction performance of the AESA demonstrator, confirming its suitability for advanced naval surveillance applications.


[22] 2605.28453

A Unified Framework for Unbiased Non-Coherent Over-the-Air Computation

Over-the-Air Computation (OAC) enables efficient data aggregation in large-scale distributed systems by exploiting the superposition property of wireless multiple-access channels. In contrast to most existing studies on OAC assuming exact channel state information, we consider non-coherent OAC (NC-OAC) where the channel phase is unknown at the transmitters. A three-step framework for NC-OAC with a mapping between source data and codewords is proposed: 1) Devices encode their data to non-negative codewords; 2) Devices transmit a sequence of symbols with amplitude proportional to their codewords, such that the receiver can estimate the codeword sum. Estimation of the codeword sum is studied under two scenarios of global channel amplitude knowledge: statistical or instantaneous; 3) The estimated codeword sum is decoded to the desired source data sum at the receiver. With the proposed framework, we first study prior work on NC-OAC and map these to the framework. Next, we define and compare the two most commonly (often implicitly) used mappings for NC-OAC: the Affine and the Augmented Affine mappings. Under the constraint of unbiased estimation, we show that with uniformly distributed data and standard channel assumptions, the Augmented Affine mapping exhibits an order of magnitude lower estimation variance than the Affine mapping with both statistical and instantaneous channel knowledge. This result is validated by extensive simulations. Finally, we propose and analyze a new mapping, which demonstrates superior performance over the previous two affine mappings.


[23] 2605.28478

Towards Autonomous Commissioning of Industrial Drives via Multi-Objective Bayesian Optimization

The commissioning of industrial electric drives still relies heavily on manual tuning of cascaded control loops, requiring expert knowledge and significant time. In this paper, we propose a fully automated approach for tuning the current control loop of industrial drives using Bayesian Optimization (BO) directly on real hardware, without requiring a system model or firmware modifications. The drive is treated as a black-box system, and the controller parameters are iteratively updated through closed-loop experiments. The tuning problem is formulated as a multi-objective optimization task that directly minimizes tracking error, time-weighted error, overshoot, and oscillatory behavior, enabling the identification of Pareto-optimal controller configurations. To address discrete parameters, noisy evaluations, and limited budgets, we adopt a multivariate Tree-structured Parzen Estimator (TPE) as the underlying BO strategy. The proposed method operates under practical industrial constraints, including communication latency and limited evaluation budgets. The experimental validation on a real motor drive system under no-load conditions shows that the method achieves performance comparable to expert tuning within a few minutes and without human intervention. Results show that Gaussian Process (GP)-based BO can yield highly competitive final solutions, but TPE-based BO is better aligned with this setting due to faster convergence, richer Pareto-front approximation, and lower computational overhead.
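
The idea of Pareto-optimal controller configurations in this abstract can be shown with a small dominance filter. This is only the selection step, not the Bayesian optimization loop or the TPE sampler described by the authors; the function name `pareto_front` and the numeric trial values are hypothetical.

```python
def pareto_front(points):
    """Return the points not dominated by any other point (all objectives minimized)."""
    def dominates(a, b):
        # a dominates b if it is no worse in every objective and strictly better in one.
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (tracking error, overshoot) pairs for four controller settings.
trials = [(0.10, 0.30), (0.20, 0.10), (0.15, 0.35), (0.25, 0.25)]
front = pareto_front(trials)
```

Here the third and fourth settings are each beaten on both objectives by another setting, so only the first two remain as trade-off candidates.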


[24] 2605.28480

Audio-Mind: An Auditable Agentic Framework for Audio Understanding

Audio agents extend large audio-language models (LALMs) by decomposing audio questions into tool calls, intermediate evidence, and iterative reasoning steps. However, as LALMs become stronger, the key challenge shifts from enabling tool use to determining when agentic evidence acquisition genuinely benefits audio understanding. We propose Audio-Mind, an auditable and pluggable framework for conditional evidence acquisition in audio understanding. Audio-Mind dynamically combines a strong frontend with planner-guided tool use, preserving frontend judgment when initial evidence is sufficient while acquiring bounded external evidence for questions with unresolved evidence gaps. Experiments on MMAR and MSU-Bench show that Audio-Mind outperforms prior audio-agent baselines, reaching 80.4% accuracy on MMAR and 82.8% accuracy on MSU-Bench. A matched-backbone comparison highlights why this design matters: under strong audio frontends, agentic decomposition can become an orchestration bottleneck when the workflow does not preserve the frontend's holistic audio-grounded judgment. Beyond accuracy, Audio-Mind produces higher-quality, auditable reasoning traces that expose uncertainty, tool evidence, and answer rationales, offering a potential basis for more reliable audio-QA annotation and error analysis.


[25] 2605.28514

Channel Measurements and Characterization with Phase Drift Compensation for Outdoor 330-360 GHz MIMO Communications

In this paper, an outdoor channel measurement campaign at 330-360 GHz employing a 128 × 4 virtual antenna array (VAA)-based multiple-input multiple-output (MIMO) configuration is conducted. The transmitter (Tx) and receiver (Rx) location pairs are classified into line-of-sight (LoS) and obstructed-LoS (OLoS) scenarios to enable a detailed investigation of outdoor terahertz (THz) band channel characteristics. During the measurement process, the stationarity of the outdoor environment is carefully verified, and a linear phase drift (PD) effect is identified. Then, we propose a PD-aware Space-Alternating Generalized Expectation-Maximization (SAGE) algorithm, which significantly improves both delay resolution and channel parameter estimation accuracy. Based on the processed measurement data, we characterize key channel properties, including the power delay profile, path loss, shadow fading, delay spread, angular spread, Rician K-factor, as well as their cumulative distribution functions and correlation characteristics. In addition, near-field effects and MIMO-specific properties, including the spatial non-stationarity and the cluster birth-death property, are analyzed.


[26] 2605.28547

On Unified CRLB Framework from Generic Signals to ISAC Waveforms with Virtual Array Sensing

This paper presents a unified Cramér-Rao lower bound (CRLB) framework for signal-level parameters in integrated sensing and communications (ISAC)-enabled radar systems. Starting from the generic signal model, we analyze the coupling between delay and Doppler in the Fisher information matrix (FIM), which is unsolved and often overlooked in relevant studies. Addressing this issue, we derive the conditions under which the coupling terms can be eliminated and demonstrate that these conditions are typically satisfied for ISAC-enabled waveforms. Afterward, the CRLBs of representative ISAC waveforms are derived within the unified framework, enabling consistent and comparable analysis across the waveforms and avoiding model-dependent discrepancies. Further, the framework is extended to virtual array (VA) sensing systems, where the impact of different multiplexing schemes is analyzed. Simulation results demonstrate the consistency between the CRLBs derived from the proposed framework and those obtained from waveform-specific analyses. The proposed framework shows strong generality, waveform-compatibility, and flexibility, offering a versatile tool for the CRLB analysis of various waveforms, including those lacking existing analytical results.


[27] 2605.28560

Unified Analytical Framework for SPAD Array Receivers with Dead-Time-Induced Blocking Loss and Inter-Symbol Interference in PAM-OWC Systems

Optical wireless communication (OWC) leveraging single-photon avalanche diode (SPAD) arrays offers exceptional sensitivity for photon-starved links. However, the inherent dead time of SPADs critically limits achievable data rates by introducing non-linear photon-counting distortions: blocking loss within a symbol duration and inter-symbol interference (ISI) across symbol durations. This paper proposes a unified analytical framework capturing both distortions across all operational speed regimes for pulse-amplitude modulation (PAM), by establishing comprehensive statistical models for SPAD array receivers. For low and medium-speed systems (symbol duration longer than dead time), we derive exact closed-form expressions for the photon-count probability distribution using renewal theory, explicitly incorporating blocking loss and ISI. For high-speed systems (symbol duration shorter than dead time), we develop a Markov chain model characterizing the steady-state operational states and integrate it with trigger probability to obtain the exact binomial photon-count distribution. Furthermore, we propose low-complexity, near-optimal threshold detection schemes based on these models. This work provides essential theoretical tools for designing and optimizing high-performance SPAD-based OWC systems employing PAM.
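
The blocking effect of dead time can be illustrated on a single SPAD with a toy counter. This is a sketch under an explicit assumption: a non-paralyzable dead-time model (photons arriving during the dead time are simply ignored and do not extend it), which may differ from the model analyzed in the paper. The arrival times and the function name are hypothetical.

```python
def detect_with_dead_time(arrival_times, dead_time):
    """Return the arrival times that one SPAD registers.

    Assumed non-paralyzable model: after each detection the SPAD is blind
    for `dead_time`; arrivals in that window are lost without extending it.
    """
    detections = []
    ready_at = float("-inf")          # the SPAD starts armed
    for t in sorted(arrival_times):
        if t >= ready_at:
            detections.append(t)
            ready_at = t + dead_time
    return detections


# Hypothetical photon arrival times (arbitrary time units).
arrivals = [0.0, 0.3, 1.1, 1.2, 2.5, 3.4]
registered = detect_with_dead_time(arrivals, dead_time=1.0)
blocked = len(arrivals) - len(registered)
```

With a dead time of 1.0, half of the six photons are lost to blocking. If a symbol boundary fell between a detection and the end of its dead window, the loss would carry over into the next symbol, which is the ISI mechanism described above.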


[28] 2605.28569

Actor-Identifier-Critic Reinforcement Learning for Adaptive Model-Free Optimal Control of Nonlinear Systems with Stochastic Packet Dropouts

Packet dropouts in control systems pose a critical challenge, as they can significantly compromise system performance and stability. Under these conditions, classical controllers often struggle to deliver effective control, as they rely on accurate system models, which may not always be available. This paper proposes a novel Actor-Identifier-Critic (AIC) controller to address model-free tracking control of nonlinear systems in the presence of packet dropouts in both the controller-to-actuator and sensor-to-controller channels. Using an identifier to learn the system dynamics, the proposed controller is able to handle packet dropouts in the communication link and facilitate gradient propagation from the critic to the actor within a model-free control framework. The performance of the proposed method is demonstrated on two nonlinear SIMO and MIMO systems and a case study on power system stability subject to stochastic packet dropouts.


[29] 2605.28618

Comprehensive Benchmarking of Long-Form Speech Generation in Diverse Scenarios

Recent advances in speech generation have enabled high-fidelity synthesis, yet systematic evaluation of models under long-context conditions remains largely underexplored. A comprehensive evaluation benchmark for long-form speech is indispensable for two reasons: 1) existing test scenarios are often confined to limited domains, creating a significant gap with the diverse downstream applications; 2) existing metrics overlook critical long-text factors such as consistency and coherence, failing to generalize reliably. To this end, we propose SwanBench-Speech, a comprehensive benchmark that decomposes long-form speech quality into specific, disentangled dimensions. SwanBench-Speech has three key properties. 1) Rich speech scenarios: Focusing on long-form speech generation and dialog generation, SwanBench-Speech covers acoustics, semantics, and expressiveness challenges, and consists of 1,101 samples spanning 17 common speech scenarios; 2) Comprehensive evaluation dimensions: Along the acoustics, semantics, and expressiveness axes, SwanBench-Speech defines an automated evaluation protocol with seven metrics to provide a comprehensive, accurate, and standardized assessment; 3) Valuable insights: Through extensive experiments, we reveal that current models still struggle in highly expressive scenarios and exhibit a notable gap in consistency and hierarchy compared to real recordings.


[30] 2605.28665

On the Solvability of Quasi-Regulator Equations in Non-smooth Output Regulation

Motivated by the prevalence of non-smooth, possibly non-periodic signals in real-world applications, the output regulation of linear systems subject to non-smooth non-periodic exogenous signals has emerged as a challenging problem. A fundamental prerequisite for solving this problem is the existence of solutions to the so-called "quasi-regulator equations". In this paper, we investigate the solvability of these equations. To this end, we reformulate the quasi-regulator equations as differential-algebraic equations and highlight the critical role played by the system's relative degree. We finally propose a "non-smooth non-resonance condition" that, under specific relative degree requirements, provides a necessary and sufficient characterization of the solvability of the quasi-regulator equations.


[31] 2605.28697

Deep Learning Strain Estimation: Is Physics-Based Simulation the Solution?

Speckle tracking echocardiography (STE) is the clinical standard for myocardial strain estimation. Despite good performance on global strain (GLS), its accuracy for regional strain remains limited, even though this biomarker is highly relevant for early diagnosis and the characterization of subtle abnormalities from clinical data. Deep learning is a promising alternative, but its development is constrained by the lack of reliable motion references. Existing solutions rely either on STE-derived labels or on simulations generated by physics-based models, but these synthetic sequences still have limited realism compared with clinical data. In this paper, we propose a novel simulation strategy that incorporates speckle decorrelation measures from real videos and uses an iterative refinement process to improve the motion realism in the simulations. We created an open-source photorealistic dataset of 1,478 videos with reference motion, which was used to train an echocardiographic motion estimation algorithm. The proposed method achieves unmatched performance on global and regional strain, notably reaching a GLS variability of 1.42% in an inter-expert setting compared to 1.78% for the clinical reference.


[32] 2605.00025

MoDAl: Self-Supervised Neural Modality Discovery via Decorrelation for Speech Neuroprosthesis

Speech neuroprosthesis systems decode intended speech from neural activity in the absence of audible output, offering a path to restoring communication for individuals with speech-impairing conditions. Current approaches decode predominantly from motor cortical areas, discarding others -- such as area 44, part of Broca's area -- that may encode complementary linguistic information. We introduce MoDAl (Modality Decorrelation and Alignment), a framework that discovers complementary neural modalities through the interplay of two objectives in a shared projection space. A contrastive loss aligns each of several parallel brain encoders with the text embeddings of a pretrained large language model (LLM), while a decorrelation loss prevents the encoders from coalescing to duplicative representations. We prove that these objectives are in productive tension: Contrastive alignment induces transitive modality coalescence, which decorrelation must counteract for the framework to discover diverse neurolinguistic modalities. On the Brain-to-Text Benchmark '24, MoDAl reduces word error rate (WER) from 26.3% to 21.6% compared to the previous best end-to-end method, with the gain from incorporating previously discarded area 44 signals arising entirely from the decorrelation mechanism. Analysis of the discovered modalities reveals functional specialization: Encoders receiving area 44 input capture structural and syntactic properties (sentence length, grammatical voice, wh-words), consistent with the neurolinguistic understanding of Broca's area.


[33] 2605.27422

Speed-Weighted Adaptive Flocking for Sailing Swarms under Dynamic Environmental Forcing

Collective behavior models, such as aggregation and flocking, usually assume self-propelled robots that can directly execute their desired speed and direction of motion without fundamental constraints. However, autonomous sailing robots violate this assumption. Their motion is shaped by wind-dependent propulsion, restricted headings, and spatially varying wind conditions. In particular, maneuverability is coupled to wind speed: in weak wind, sailboats may turn only slowly or not at all, whereas stronger wind enables faster turns. This introduces transient heterogeneity in speed and maneuverability across the flock. We focus on this fast-slow coordination problem in sailing robot flocks. To study this problem, we introduce SailSwarmSwIM, a reduced-order simulator for autonomous sailing robot swarms that captures wind-dependent speed and maneuverability, no-go zones, tacking behavior, and steady or gusty wind fields. To design our novel flocking technique, we start from the Couzin model and introduce a speed-weighted social interaction rule that accounts for each robot's transient motion constraints. A key result is that increasing the social influence of slower robots improves polarization and reduces close encounters. This effect arises from a balance between attraction to fast neighbors, which helps maintain movement, and cohesion around slow neighbors, which prevents the flock from fragmenting. Together, our simulator, SailSwarmSwIM, and the speed-weighted interaction rule provide a modeling framework for studying adaptive collective behavior in robotic fleets whose motion capabilities are continuously shaped by wind.


[34] 2605.27553

Economic Nonlinear Model Predictive Control for Microgrids with Generator Up and Downtime Constraints

Recent years have seen substantial progress in the development of economic nonlinear model predictive control (NMPC) schemes for multistage optimal power flow (OPF) problems. However, the additional inclusion of discrete decision variables to model generator runtimes and generator startup costs can amount to large-scale mixed-integer nonlinear programs (MINLPs) that are computationally very challenging. This work investigates a practical approach that replaces the nonlinear AC power flow equations by convex quadratic approximations. In combination with the discrete generator dynamics, this leads to a mixed-integer quadratically constrained program (MIQCP), which is of significantly lower complexity and can be solved in reasonable time by off-the-shelf solvers such as CPLEX. We further show that simple terminal constraints are not sufficient to guarantee recursive feasibility of the NMPC scheme if constraints on generator runtime and on the number of generator startup events are present. To address this challenge, we propose the use of additional time-coupled constraints and prove the resulting recursive feasibility property. Based on the assumption of periodic dissipativity of the underlying system, we can prove stability of the proposed controller. To illustrate our results, we present simulations of a realistic 6-bus microgrid under different demand scenarios.


[35] 2605.27628

Intelligence as Managed Autonomy: Failure, Escalation, and Governance for Agentic AI Systems

As autonomous and agentic AI systems scale in robotic and human-machine environments, managing hallucination and persistent but unjustified action remains an open challenge. Rather than attributing these failures solely to model or alignment limitations, this paper explores the architectural vulnerability of unbounded autonomy, that is, the presumption that an agent should continue operating regardless of rising uncertainty. It introduces a theory of managed autonomy that defines intelligent behavior through the formal capacity to detect epistemic drift, suspend reasoning, attempt recovery, and ultimately surrender control when reliability diminishes. We instantiate this theory via the SMARt (Self-Managing Multi-tier Autonomous Reasoning with Regulated/Revoked transitions) model, a four-layer framework featuring Stable, Meta-cognitive, Assisted, and Regulated states. By developing a timed, guarded Petri net formulation, we establish theoretically bounded properties for the system, demonstrating how architecture can formally mandate escalation, constrain invalid outputs, and ensure governance reachability under specified conditions. We further analyze how incorporating domain-specific trigger sets across varied operational settings (e.g., healthcare and robotics) can systematically preserve safety, assuming completeness and soundness criteria are met. Because these triggers are designed to be adaptive, the SMARt model accommodates the safe, controlled expansion of an agent's operational scope over time. We conclude that formalizing failure management within the autonomy lifecycle is a crucial step toward realizing reliable and governed artificial intelligence.


[36] 2605.27682

Inversion of the Multiplicative Matrix Compound Operator

We study the problem of determining a matrix whose $k$th multiplicative compound is a prescribed matrix $M$. The cardinality of the set of matrices whose $k$th multiplicative compound equals $M$ is characterized in terms of $\operatorname{rank}(M)$. On the one hand, if $\operatorname{rank}(M)\le 1$, it is shown that there exist infinitely many such matrices, for which a complete characterization is determined. On the other hand, if $\operatorname{rank}(M)>1$, then there exists a unique matrix, up to an overall sign, whose compound is $M$. An algorithm for finding a matrix whose compound equals $M$ is detailed, and its time complexity is analyzed.
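
For readers unfamiliar with the forward operator being inverted, the following sketch computes the $k$th multiplicative compound of a small integer matrix as the matrix of all $k \times k$ minors, with row and column index sets in lexicographic order (the standard definition). It is not the paper's inversion algorithm. It also shows where a sign ambiguity can come from: scaling $A$ by $c$ scales every $k \times k$ minor by $c^k$, so for even $k$ the matrices $A$ and $-A$ share the same compound.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def compound(a, k):
    """k-th multiplicative compound: all k x k minors, index sets in lexicographic order."""
    row_sets = list(combinations(range(len(a)), k))
    col_sets = list(combinations(range(len(a[0])), k))
    return [
        [det([[a[i][j] for j in cols] for i in rows]) for cols in col_sets]
        for rows in row_sets
    ]


A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
neg_A = [[-x for x in row] for row in A]

C2 = compound(A, 2)          # 3 x 3 matrix of 2 x 2 minors
C2_neg = compound(neg_A, 2)  # identical to C2 because k = 2 is even
```

The first compound is the matrix itself and the $n$th compound of an $n \times n$ matrix is the $1 \times 1$ matrix holding its determinant, which gives two quick sanity checks for the implementation.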


[37] 2605.27755

A Vertical Look at UAV Connectivity in the Wild: Cellular vs. Starlink, 3D Characterization, and Performance Prediction

In this paper, we present an open-source measurement platform designed to characterize the performance of commercial cellular (Verizon, a major US provider) and LEO satellite (Starlink) networks through real-world flight tests in rural environments. We implement a comprehensive multi-layer measurement approach spanning physical layer signal metrics, multi-cell network topology, and end-to-end (E2E) application performance. Through an extensive flight campaign with more than 10 flight tests and 4.5+ hours of flight time, resulting in more than 18K samples, we present the first detailed, open-source dataset analyzing dual cellular and Starlink performance for low-altitude UAV operations. Our cellular-Starlink comparative results, which are collected simultaneously at the same location, demonstrate significant performance differences between the two technologies: the LEO satellite link achieves superior latency performance with 95% of Round-Trip Time (RTT) measurements below 50 ms compared to 80% under 150 ms for cellular, and exceptional downlink capacity with 95% exceeding 25 Mbps versus only 5 Mbps for cellular. Our analysis of cellular network performance demonstrates that while higher altitudes (e.g., 330+ m above sea level) improve signal power by 15-20 dB via line-of-sight (LOS) propagation, they cause a 3-4× increase in handover rates, which is due to excessive multi-cell visibility rather than signal degradation. Furthermore, we observe asymmetric impacts of handovers on RTT performance: 53.5% of handovers improve RTT, but the worst-case degradation (275 ms) is 2× larger than the best-case improvement (137 ms).


[38] 2605.27771

A Preliminary Assessment of Midhaul Links at 140 GHz using Ray-Tracing

The ever-growing demand for mobile data necessitates a transport network architecture that can withstand the 5G-and-beyond multi-Gbps traffic requirements. To cater for such unprecedented demand, studies are being conducted to incorporate terahertz (THz) communications in future mobile networks. In this paper, we consider an urban environment and evaluate the feasibility of THz wireless midhaul links for the transport networks between the Central Units (CU) and Distributed Units (DU) in a disaggregated 5G network architecture with functional splits. Our goal is to study the feasibility of midhaul links at 140 GHz by minimizing the number of required CUs to serve all the DUs. To this end, we define several policies for selecting CU and DU nodes in order to determine the peak data rate that can be supported over each link between a CU and DU. Our numerical results based on ray-tracing suggest that wireless links at 140 GHz with 3GPP option 2 as High Layer Split (HLS) represent a promising technology for midhaul transport networks.


[39] 2605.27781

Day-Ahead Electricity Price Forecasting Using a Multivariate Group Lasso Method

Electricity price signals in modern power systems exhibit complex dependence structures that render forecasting inherently challenging. Our analysis of real-world pricing signals from the California Independent System Operator (CAISO) reveals complex temporal group effects, whereby the influence of explanatory variables on electricity prices persists across consecutive blocks of time due to underlying economic and operational drivers. In response, we propose a multivariate statistical method based on a Group Lasso formulation to forecast the vector of day-ahead electricity prices, by leveraging multi-feature temporal group effects. Our approach is evaluated on two full years of electricity prices from CAISO, demonstrating considerable improvements in point and probabilistic forecast metrics compared to a wide array of statistical and deep learning methods. Theoretical and empirical analyses confirm the effectiveness of the proposed approach in modeling realistic group effects, maintaining both interpretability and low computational complexity. When retrospectively evaluated on test data from a recent international electricity price forecasting challenge, the proposed method ranked in second place, despite having access to significantly less information than competing approaches. Finally, the proposed method is independently validated against two operational electricity price forecasting systems in CAISO, demonstrating competitive predictive performance and practical relevance.
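
The group effect that a Group Lasso penalty exploits is easiest to see in its proximal operator, block soft-thresholding: a whole group of coefficients is either shrunk together or set to zero together. The sketch below shows only this standard building block; the paper's multivariate formulation, its grouping of features over time blocks, and its solver are not given in the abstract, and the numbers are made up.

```python
import math


def group_soft_threshold(block, threshold):
    """Proximal operator of threshold * ||block||_2 (block soft-thresholding)."""
    norm = math.sqrt(sum(v * v for v in block))
    if norm <= threshold:
        # The whole group is dropped at once: this is the group-sparsity effect.
        return [0.0] * len(block)
    scale = 1.0 - threshold / norm
    return [scale * v for v in block]


# Hypothetical coefficient groups, e.g. one feature's weights over consecutive hours.
weak_group = group_soft_threshold([0.3, -0.4], threshold=1.0)    # norm 0.5, below threshold
strong_group = group_soft_threshold([3.0, 4.0], threshold=1.0)   # norm 5.0, scaled by 0.8
```

Because selection happens per group rather than per coefficient, a retained feature keeps its influence across the whole time block, which is what keeps the fitted model interpretable.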


[40] 2605.27795

Geometric Analysis of Variational Quantum Eigensolver

The Variational Quantum Eigensolver (VQE) is a fundamental algorithm in quantum computing, yet a coherent geometric characterization of VQE remains missing due to fragmented analyses across fixed-ansatz and adaptive-circuit formulations. In this paper, we establish a geometric analysis of VQE in terms of optimization landscape, initialization guarantee, and noise robustness. First, we study the optimization landscape via an ansatz-free product-unitary formulation over the unitary group, unifying both paradigms. For the single-unitary case, we establish linear convergence of Riemannian gradient descent (RGD) and prove the strict saddle property. For the product-unitary case, we show the convergence rate deteriorates polynomially with circuit depth, providing a geometric explanation of the barren plateau phenomenon. Second, we prove that small-angle random Pauli-rotation circuits satisfy the required initialization conditions with high probability. Third, we show that RGD retains linear convergence under finite-shot measurements, and that coefficient-adaptive allocation achieves strictly lower statistical error than uniform sampling under a fixed measurement budget.


[41] 2605.27799

GraD-IBD: Graph Representation Learning from Diagnosis Trajectories for Early Detection of Inflammatory Bowel Disease

International Classification of Diseases (ICD) is a globally recognized coding system that records diagnostic events during each patient encounter, providing a standardized data foundation for various clinical tasks. However, the irregular and hierarchical nature of ICD code sequences poses challenges for N-D lattice-based sequential modeling methods, leading to overly complex model designs. In this paper, we propose GraD-IBD, a graph diagnosis model that reformulates longitudinal ICD trajectories as visit-bucketized, temporally directed graphs to detect the risk of inflammatory bowel disease (IBD). A novel context-aware, time-decay message passing mechanism was developed to capture temporal dependencies while reducing model complexity. The experimental results using a real-world clinical dataset demonstrated consistent and robust improvements in IBD detection over state-of-the-art methods, with significant reductions in computational complexity compared to sequential models. These findings highlight the potential of graph representation learning to enable efficient, scalable, and accurate disease risk prediction from longitudinal ICD diagnosis codes.


[42] 2605.27831

Decentralized Parameter-Free Online Learning with Compressed Gossip

We study decentralized online convex optimization when agents communicate over a graph and messages may be compressed. Classical decentralized online methods typically require learning-rate choices that depend on the horizon, comparator scale, or other problem parameters, while compressed communication introduces additional disagreement that must be controlled. We propose DECO-EF (DEcentralized COin-betting with Error Feedback), a decentralized parameter-free online learning algorithm that combines coin-betting predictions with compressed difference-based gossip. Each agent maintains a clean accumulated state and a compressed tracker, and communicates only compressed state differences during gossip steps. The method is parameter-free in the online-learning sense: it does not tune to the horizon, the comparator norm, or the learning rate. We prove expected comparator-adaptive network-regret bounds for DECO-EF under compressed communication. To the best of our knowledge, this gives the first expected sublinear network-regret guarantees for parameter-free decentralized online learning under compressed communication.
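
To show what "parameter-free" means in the coin-betting sense, here is a minimal single-agent, one-dimensional Krichevsky-Trofimov (KT) bettor, the classical construction from the coin-betting literature. It is not DECO-EF: there is no network, gossip, compression, or error feedback, and the loss used to drive it is a made-up example. The sketch assumes subgradients bounded by 1 in absolute value.

```python
def kt_coin_betting(gradient_fn, rounds, initial_wealth=1.0):
    """1-D KT coin-betting learner; assumes |gradient| <= 1. No learning rate is set."""
    sum_coins = 0.0            # running sum of coin outcomes c_i = -g_i
    wealth = initial_wealth    # initial endowment plus accumulated winnings
    predictions = []
    for t in range(1, rounds + 1):
        w = sum_coins * wealth / t   # bet the fraction sum_coins / t of current wealth
        predictions.append(w)
        c = -gradient_fn(w)          # coin outcome observed after predicting
        wealth += c * w
        sum_coins += c
    return predictions, wealth


def abs_loss_subgradient(w, target=1.0):
    """A subgradient of the example loss |w - target| (0 at the kink)."""
    return (w > target) - (w < target)


preds, final_wealth = kt_coin_betting(abs_loss_subgradient, rounds=2000)
average_prediction = sum(preds) / len(preds)
```

The learner starts at 0 and is told neither the horizon nor the comparator's scale (here the minimizer 1.0); the average iterate still drifts toward the minimizer, with the betting fraction playing the role that a tuned step size plays in gradient descent.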


[43] 2605.27930

Optimization of CF-mMIMO Systems for the Coexistence between eMBB+ and mMTC+: From Analytical to GNN-Aided Designs

This paper investigates uplink multiple access for the coexistence of enhanced mobile broadband+ (eMBB+) and massive machine-type communications+ (mMTC+) in terminal-centric cell-free massive MIMO (CF-mMIMO) systems. We propose a non-orthogonal scheme in which low-rate mMTC+ transmissions are spread across the time-frequency grid shared with eMBB+ users, enabling efficient resource reuse. In the presence of imperfect channel state information, we derive closed-form expressions for the achievable rates of both services based solely on statistical channel knowledge. For mMTC+ devices, the analysis also incorporates finite blocklength (FBL) modeling to capture short-packet transmissions. To support heterogeneous service requirements, we formulate a power-control problem that maximizes the minimum energy efficiency of mMTC+ devices subject to quality-of-service constraints on eMBB+ users. The resulting nonconvex problem is solved via sequential fractional programming, accounting for both the Shannon and FBL regimes. To enable real-time operation, we further propose a graph neural network (GNN) with multi-head attention to approximate the model-based solution. Constraint satisfaction during training is enforced via an augmented Lagrangian loss. Numerical results demonstrate effective multiplexing of the two data services and show that the proposed GNN algorithm achieves near-optimal performance with a significantly lower computational complexity.


[44] 2605.28143

Sequential Neural Probabilistic Amplitude Shaping: Learning the Channel's Language

We present the first neural probabilistic amplitude shaping that outperforms existing methods while accounting for all implementation losses, using a block-less, easily implementable sequential autoregressive encoder compatible with arithmetic distribution matching, yielding reduced rate loss and higher achievable information rates.


[45] 2605.28254

Natural Locomotion: Principle and Method

Robotic locomotion can become efficient when mechanisms exploit passive dynamics, compliance, and resonance rather than track prescribed trajectories. This paper formulates natural locomotion as an exchange principle for systems whose motion is mediated by environmental constraints or interactions. A motion is natural when an internal oscillator returns periodically, the body pose drifts, and the mean Propulsion--Oscillator Exchange power (POE power) vanishes over one cycle. The selected family is a Natural Locomotion Manifold (NLM). We develop the conservative realization of this principle for continuous ideal environmental constraints: the constraints do no external work, total mechanical energy is conserved, and zero mean POE power is an internal exchange with the environment-mediated propulsive channel, not external energy input. The method is a closed/open construction. The propulsive channel is first closed to reveal an effective internal oscillator, organized by scalar action-angle structure in one effective degree of freedom or by nonlinear modal sectors in several degrees of freedom. The channel is then reopened, pose is reconstructed, and accepted cycles must preserve internal recurrence and zero mean POE power. We demonstrate the principle on two ideal nonholonomic no-slip systems: a Chaplygin-sleigh / pendulum-driven car and a three-body extension. In the scalar case, POE closure is equivalent to the missing internal return condition, giving a theorem-backed computation of the NLM family. In the multi-degree case, POE closure remains necessary but must be completed by modal identity, internal return, dynamics consistency, same fixed passive architecture, and nonzero displacement. Natural locomotion becomes a design question: which passive architectures support no, one, or several certified NLM families?


[46] 2605.28325

ISAC Privacy: Challenges and Solutions for 6G

Integrated sensing and communication (ISAC) is a promising feature of future communication networks. While spatial sensing can improve network performance and enable external services, it also creates privacy challenges that go beyond the confidentiality of communication content. Future networks using millimeter-wave (mmWave) and sub-terahertz (THz) frequencies may collect or infer detailed information about people, devices, bystanders, passive objects, and environments in a sixth-generation (6G) deployment area. Such sensing can reveal location and environment data, support behavioral profiling such as movement or activity recognition, and, in advanced cases, expose physiological information such as breathing frequency or heart-rate-related data. Thus, the capabilities of spatial sensing must be controlled to satisfy privacy requirements. In this work, we organize privacy-sensitive ISAC data into three sensing levels: location and environment data, behavioral data, and physiological data, and use this classification as the organizing principle throughout the paper. Based on this classification, we discuss internal and external ISAC applications, identify privacy challenges related to consent, transparency, data ownership, profiling, bystander exposure, and sensitive sensing data, review representative solution directions, and outline future research directions for privacy-preserving ISAC.


[47] 2605.28345

Picid: A Modular Evaluation Infrastructure for Reproducible PHM Across Tasks and Domains

Progress in Prognostics and Health Management (PHM) is hindered by the lack of standardized and reusable evaluation practices across tasks, datasets, and application domains. Reported results are often difficult to reproduce and compare, as key protocol choices, such as data splits, preprocessing, label alignment, temporal windowing, and metrics, are often implicit or implemented ad hoc. We introduce Picid, a modular evaluation infrastructure that formalizes the PHM evaluation pipeline as an explicit, executable, and reproducible protocol. Through well-defined abstractions, Picid enforces deterministic, leakage-safe dataset construction while remaining flexible across diverse PHM settings. The framework supports fault detection, diagnostics, and prognostics through a unified interface and can be extended to new datasets and model classes without violating protocol invariants. By standardizing data contracts and evaluation boundaries, Picid also enables fair cross-task comparisons across diagnostics (classification) and prognostics (regression), allowing identical model families to be evaluated consistently across heterogeneous settings. We demonstrate Picid through an empirical evaluation of thirteen models on twelve datasets spanning batteries, bearings, turbofan engines, hydraulics, filtration systems, and buildings. This work establishes a reusable foundation for standardized, fair, and reproducible evaluation in PHM.


[48] 2605.28367

Safety-Critical Adaptive Impedance Control via Nonsmooth Control Barrier Functions under State and Input Constraints

Safe physical interaction is critical for deploying robotic manipulators in human-robot interaction and contact-rich tasks, where uncertainty, external forces, and actuator limitations can compromise both performance and safety. We propose an online adaptive impedance control framework that enforces joint-state safety while achieving compliant interaction under uncertain dynamics. The approach combines a quadratic-program-based safety filter with a novel composed position-velocity non-smooth control barrier function (NCBF), enabling joint position and velocity constraints to be enforced through a unified relative-degree-one barrier. Unknown dynamics are compensated online using an interval type-2 fuzzy logic system, while actuator torque limits are handled through soft constraints with exact penalty recovery of feasible solutions. A disturbance-observer-enhanced safety mechanism improves robustness against modelling errors and external interaction forces. Using composite Lyapunov analysis, we prove forward invariance of the safe set and uniform ultimate boundedness of the impedance-tracking error. Simulations on a 7-DOF manipulator with severe parametric uncertainty and external interaction wrenches demonstrate safe constraint satisfaction and robust impedance tracking.


[49] 2605.28456

Diffusion Large Language Models for Visual Speech Recognition

Existing Visual Speech Recognition (VSR) systems commonly rely on left-to-right autoregressive decoding, which can force premature decisions on visually ambiguous tokens before sufficient context is available. We propose DLLM-VSR, to the best of our knowledge, the first Diffusion Large Language Model (DLLM)-based VSR framework, formulating transcription as iterative masked denoising with flexible-order decoding. With confidence-based unmasking, DLLM-VSR commits high-confidence positions early and uses the committed tokens as bidirectional context to refine ambiguous ones. To adapt DLLMs to VSR, we introduce a two-stage masked-denoising training strategy that separates visual-to-text content alignment from length modeling. We further observe a performance gap with oracle-length decoding, which assumes access to the true transcript length, indicating that reducing target-length uncertainty can improve DLLM-based VSR. To reduce this gap, we develop length-guided candidate decoding, which uses video duration to construct plausible transcript-length hypotheses, decodes under multiple hypotheses, and reranks candidates using length plausibility and decoding confidence. The proposed method achieves a state-of-the-art WER of 19.5% on LRS3 using only its labeled training data.
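
As an illustration of the confidence-based unmasking loop described in this abstract, the following is a minimal sketch and not the authors' implementation. The names `toy_denoiser` and `confidence_unmask`, the `MASK` sentinel, and the equal-share commit schedule are all assumptions made for the example; the stand-in denoiser returns random probabilities, whereas a real DLLM would condition on the video features and on the tokens already committed.

```python
import numpy as np

MASK = -1  # hypothetical sentinel id for a position that is still masked


def toy_denoiser(tokens, vocab_size, rng):
    """Hypothetical stand-in for a DLLM: per-position token probabilities.

    It ignores `tokens`; a real model would use committed tokens as context.
    """
    logits = rng.normal(size=(len(tokens), vocab_size))
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    return probs / probs.sum(axis=1, keepdims=True)


def confidence_unmask(length, vocab_size, steps, seed=0):
    """Iteratively commit the most confident masked positions."""
    rng = np.random.default_rng(seed)
    tokens = np.full(length, MASK)
    for step in range(steps):
        masked = np.flatnonzero(tokens == MASK)
        if masked.size == 0:
            break
        probs = toy_denoiser(tokens, vocab_size, rng)
        best = probs.argmax(axis=1)   # most likely token per position
        conf = probs.max(axis=1)      # its probability, used as confidence
        # Assumed schedule: commit an equal share of what is left each step,
        # so the final step commits every remaining position.
        k = int(np.ceil(masked.size / (steps - step)))
        chosen = masked[np.argsort(-conf[masked])[:k]]
        tokens[chosen] = best[chosen]
    return tokens


decoded = confidence_unmask(length=12, vocab_size=5, steps=4)
```

Positions committed in earlier steps are never revisited in this sketch; only the still-masked positions compete for commitment at each step.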


[50] 2605.28550

Model Predictive Control for Constrained Linear Positive Systems on Graphs

Positive systems describing networks with inherently non-negative states and inputs arise naturally in routing, logistics, and compartmental modelling. We consider problems modelled as positive linear systems in incidence form with linear cost. The addition of capacity constraints on states (storage) and inputs (flows between nodes) significantly increases the problem complexity. Leveraging the analytic structure of the unconstrained problem, an explicit suboptimal admissible controller is constructed. This yields graph-computable performance bounds and a minimum stabilising horizon length for a model predictive controller without terminal conditions. A convex program enables efficient computation of the optimal bound and horizon. These results highlight how system structure enables explicit MPC guarantees that are typically not available.


[51] 2605.28583

SARAD: LLM-Based Safety-Aware Hybrid Reinforcement Learning with Collision Prediction for Autonomous Driving

Ensuring both safety and efficiency in decision-making for autonomous driving systems remains a fundamental challenge. Traditional Deep Reinforcement Learning (DRL) suffers from unsafe random exploration and slow convergence, while Large Language Models (LLMs) demonstrate inherent latency in real-time inference operations. To address these limitations, this paper proposes SARAD, a novel safety-aware hybrid framework that synergizes LLMs and DRL for autonomous driving. SARAD substitutes the random exploration of DRL with Retrieval-Augmented Generation (RAG)-enhanced, LLM-guided decisions sourced from a dynamic expert knowledge repository. An attention discriminator is proposed to integrate the prior knowledge of LLMs into DRL policy optimization. A collision predictor module, fine-tuned with historical collision data, is further designed to improve vehicle safety. Extensive experiments show that SARAD achieves significant performance improvements in the Highway-Env simulator, validating the effectiveness of the proposed model in autonomous driving.


[52] 2605.28654

Integrated Exploration-Aware UAV Route Optimization and Path Planning

Uncrewed aerial vehicles (UAVs) are increasingly used for exploration-driven monitoring in hazardous environments such as disaster zones, contaminated sites, wildfire areas, and damaged infrastructure, where limited flight endurance must be allocated between visiting reported locations and gathering new information. In these settings, prior information regarding hazards is often incomplete, spatially imprecise, and subject to change during execution. For example, initial reports may identify a region where a hazard is likely to exist, but the actual hazard may be displaced, partially observed, or entirely unreported. We present an integrated exploration-aware UAV route optimization and path planning framework for hazard monitoring under uncertain and evolving prior information. The environment is represented as a spatial risk map, where each location has an associated belief of hazardous conditions. Reported hazards are modeled as uncertain regions of interest (ROIs) rather than confirmed target locations, requiring the UAV to inspect reported areas while also using its limited flight endurance to explore informative regions. The proposed method solves a vehicle routing problem over reported ROIs, augments the route with auxiliary pseudo-nodes to improve spatial coverage, allocates the remaining flight distance budget across route segments, and optimizes dynamically feasible B-spline trajectories for local exploration. During execution, UAV measurements update a grid-based belief map, and the remaining trajectory is replanned when new information and the remaining budget justify adaptation. Across 48 scenario configurations, online replanning improves average KL reduction by 15.9% over the offline optimized planner and 48.6% over straight-line traversal.


[53] 2605.28674

Disjunctive Sum of Squares

We introduce the concept of disjunctive sum of squares for certifying nonnegativity of polynomials. Unlike the popular sum of squares approach where nonnegativity is certified by a single algebraic identity, the disjunctive sum of squares approach certifies nonnegativity with multiple algebraic identities which can be found in parallel. Our main result is a disjunctive Positivstellensatz proving that we can keep the degree of each algebraic identity as low as the degree of the polynomial whose nonnegativity is in question. Based on this result, we construct a semidefinite programming based converging hierarchy of lower bounds for the problem of minimizing a polynomial over a compact basic semialgebraic set, where the size of the largest semidefinite constraint is fixed throughout the hierarchy. We further prove a second disjunctive Positivstellensatz which leads to an optimization-free hierarchy for polynomial optimization. We specialize this result to the problem of proving copositivity of matrices. Finally, we describe how the disjunctive sum of squares approach can be combined with a branch-and-bound algorithm and we present numerical experiments on polynomial, copositive, and combinatorial optimization problems.


[54] 2406.06306

Unified Fourier transform on graphs sampled from stochastic block models

Recently, an approach to graph signal processing based on graphons was proposed. Here we show how such a graphon-driven approach to the Fourier transform can be used on graphs sampled from a stochastic block model (SBM). In particular, we show how a Fourier basis can be easily calculated from the block sizes and the block probability matrix. Using perturbation theory, we derive bounds on the sensitivity of the basis with respect to variations in the block sizes. We then consider SBMs constructed from weighted Cayley graphs. When block sizes are equal, a nice Fourier basis can be derived from the representation theory of the underlying group. When block sizes are nearly uniform, we demonstrate that this Fourier basis closely approximates the SBM Fourier basis. For highly non-uniform block sizes, the group-based Fourier basis is no longer applicable, though, as we show, the underlying group still provides partial information about the SBM Fourier basis.


[55] 2502.08236

Hierarchical Coherent Imaging of Composite Anisotropic Moving Targets in ISAC

In Integrated Sensing and Communication (ISAC) networks, distributed devices can cooperate to produce radio images of the surrounding environment by exploiting phase-coherent signal processing. However, existing imaging methods are not well-suited for composite moving targets with multiple independently moving extended parts. This is due to simplistic isotropic scattering models and the lack of methods to compensate for distinct Doppler shifts from each component, which leads to image defocusing. We propose MOSAIC, the first hierarchical imaging method for composite moving targets using distributed User Equipments (UEs) and a single ISAC Base Station (BS). MOSAIC generates high-resolution images of each target part and estimates its velocity vector. Coherent imaging is performed within selected clusters of UEs observing a locally isotropic scattering from each part, while cluster-specific images are combined non-coherently across wide angles to improve the reconstruction. To mitigate Doppler-induced defocusing, Doppler components are pre-compensated before coherent imaging, turning a limitation into an additional means of resolving multiple target parts. This also enables low-complexity velocity estimation by associating Doppler frequencies across UEs. Simulations show over 50% improvement in image quality compared to existing methods, in terms of Wasserstein distance, and dm/s-level velocity estimation accuracy.


[56] 2507.07067

How to Bridge the Sim-to-Real Gap in Digital Twin-Aided Telecommunication Networks

Training effective artificial intelligence models for telecommunications is challenging due to the scarcity of deployment-specific data. Real data collection is expensive, and available datasets often fail to capture the unique operational conditions and contextual variability of the network environment. Digital twinning provides a potential solution to this problem, as simulators tailored to the current network deployment can generate site-specific data to augment the available training datasets. However, there is a need to develop solutions to bridge the inherent simulation-to-reality (sim-to-real) gap between synthetic and real-world data. This paper reviews recent advances on two complementary strategies: 1) the calibration of digital twins (DTs) through real-world measurements, and 2) the use of sim-to-real gap-aware training strategies to robustly handle residual discrepancies between digital twin-generated and real data. For the latter, we evaluate two conceptually distinct methods that model the sim-to-real gap either at the level of the environment via Bayesian learning or at the level of the training loss via prediction-powered inference.


[57] 2509.13745

Theoretical Validation of the Latent Optimally Partitioned-$\ell_2/\ell_1$ Penalty with Application to Angular Power Spectrum Estimation

This paper demonstrates that, in both theory and practice, the latent optimally partitioned (LOP)-$\ell_2/\ell_1$ penalty is effective for exploiting block-sparsity without knowledge of the concrete block structure. More precisely, we first present a novel theoretical result showing that the optimized block partition in the LOP-$\ell_2/\ell_1$ penalty satisfies a condition required for accurate recovery of block-sparse signals. Motivated by this result, we present a new application of the LOP-$\ell_2/\ell_1$ penalty to the estimation of the angular power spectrum, which is block-sparse with an unknown block partition, in MIMO communication systems. Numerical simulations show that the proposed use of block-sparsity with the LOP-$\ell_2/\ell_1$ penalty significantly improves the estimation accuracy of the angular power spectrum.


[58] 2509.19484

linrax: A JAX Compatible, Simplex Method Linear Program Solver

We present linrax, the first simplex-based linear program (LP) solver compatible with the JAX ecosystem. In many control algorithms, LPs are automatically generated and frequently solved, either offline or online in the control loop. This motivates the design of linrax, which is especially suited for compilation into a complex JAX-based pipeline as a subroutine. We discuss the challenges associated with implementing a general-purpose LP solver under strict design requirements from JAX. Notably, we can solve general problems that may include dependent constraints, something not possible with existing JAX-compatible LP solvers that use first-order techniques and may fail to converge. We demonstrate the utility of linrax through several examples, including a robust control synthesis pipeline for a nonlinear vehicle model using automatic differentiation through an LP-based reachable set framework.


[59] 2510.06970

Falsification-driven reinforcement learning for maritime motion planning

Compliance with maritime traffic rules is essential for the safe operation of autonomous vessels, yet training reinforcement learning (RL) agents to adhere to them is challenging. The behavior of RL agents is shaped by the training scenarios they encounter, but creating scenarios that capture the complexity of maritime navigation is non-trivial, and real-world data alone is insufficient. To address this, we propose a falsification-driven RL approach that generates adversarial training scenarios in which the vessel under test violates maritime traffic rules, which are expressed as signal temporal logic specifications. Our experiments on open-sea navigation with two vessels demonstrate that the proposed approach provides more relevant training scenarios and achieves more consistent rule compliance.


[60] 2510.21378

Optimized Power Control for Multi-User Integrated Sensing and Edge AI

This work investigates an integrated sensing and edge artificial intelligence (ISEA) system, where multiple devices first transmit probing signals for target sensing and then offload locally extracted features to the access point (AP) via analog over-the-air computation (AirComp) for collaborative inference. To characterize the relationship between AirComp error and inference performance, two proxies are established: the computation-optimal proxy, which minimizes the aggregation distortion, and the decision-optimal proxy, which maximizes the inter-class separability. Optimal transceiver designs in terms of closed-form power allocation are derived for both time-division multiplexing (TDM) and frequency-division multiplexing (FDM) settings, revealing threshold-based and dual-decomposition structures, respectively. Experimental results validate the theoretical findings.


[61] 2511.17847

Generative MR Multitasking with complex-harmonic cardiac encoding: Bridging the gap between gated imaging and real-time imaging

Purpose: To develop a unified image reconstruction framework that bridges real-time and gated cardiac MRI, including quantitative MRI. Methods: We introduce Generative Multitasking, which learns an implicit neural temporal basis from sequence timings and an interpretable latent space for cardiac and respiratory motion. Cardiac motion is modeled as a complex harmonic, with phase encoding timing and a latent amplitude capturing beat-to-beat functional variability, linking cardiac phase-resolved ("gated-like") and time-resolved ("real-time-like") views. We implemented the framework using a conditional variational autoencoder (CVAE) and evaluated it for free-breathing, non-ECG-gated radial GRE in three settings: steady-state cine imaging, multicontrast T2prep/IR imaging, and dual-flip-angle T1/T2 mapping, compared with conventional Multitasking. Results: Generative Multitasking provided flexible cardiac motion representation, enabling reconstruction of archetypal cardiac phase-resolved cines (like gating) as well as time-resolved series that reveal beat-to-beat variability (like real-time imaging). Conditioning on the previous k-space angle and modifying this term at inference removed eddy-current artifacts without globally smoothing high temporal frequencies. For quantitative mapping, Generative Multitasking reduced intraseptal T1 and T2 coefficients of variation compared with conventional Multitasking (T1: 0.13 vs. 0.31; T2: 0.12 vs. 0.32; p<0.001), indicating higher SNR. Conclusion: Generative Multitasking uses a CVAE with complex harmonic cardiac coordinates to unify gated and real-time CMR within a single free-breathing, non-ECG-gated acquisition. It allows flexible cardiac motion representation, suppresses trajectory-dependent artifacts, and improves T1 and T2 mapping, suggesting a path toward cine, multicontrast, and quantitative imaging without separate gated and real-time scans.


[62] 2512.00309

Distributed Integrated Sensing and Edge AI Exploiting Prior Information

This paper investigates a distributed ISEA system under a Bayesian framework, focusing on incorporating task-relevant priors to maximize inference performance. At the sensing level, an RWB estimator with a GM prior is designed. By weighting class-conditional posterior means with responsibilities, RWB effectively denoises features and outperforms ML at low SNR. At the communication level, two theoretical proxies are introduced: the computation-optimal and decision-optimal proxies. Optimal transceiver designs in terms of closed-form power allocation are derived for both TDM and FDM settings, revealing threshold-based and dual-decomposition structures. Results show that the discriminant-aware allocation yields additional inference gains.


[63] 2512.04418

Enabling Fast Polar SC Decoding with IR-HARQ

To extend the applications of polar codes within next-generation wireless communication systems, it is essential to incorporate support for Incremental Redundancy (IR) Hybrid Automatic Repeat Request (HARQ) schemes. For very high-throughput applications, Successive Cancellation (SC) decoding is particularly appealing for polar codes owing to its high area efficiency. In this paper, we propose modifications to SC decoders that employ special nodes to accelerate decoding. Our modifications enable the use of polar IR-HARQ with SC decoding for high throughput applications. Compared to the unmodified SC IR-HARQ scheme, our proposed approach allows us to achieve a 72% reduction in node traversals with a polar code of length 2048. Simulation results confirm that the proposed special node modifications do not cause any degradation in FER performance.


[64] 2601.06796

Artificial Intelligence Driven Channel Coding and Resource Optimization for Wireless Networks: A Systematic Survey

The ongoing evolution of 5G and its enhanced version, 5G+, has significantly transformed the telecommunications landscape, driving an unprecedented demand for ultra-high-speed data transmission, ultra-low latency, and resilient connectivity. These capabilities are essential for enabling mission-critical applications such as the Internet of Things, autonomous vehicles, and smart city infrastructures. This survey investigates the important role of Artificial Intelligence (AI) in addressing the key challenges faced by 5G/5G+ networks, including interference mitigation, dynamic resource allocation, and maintaining seamless network operation. The study particularly focuses on AI-driven innovations in coding theory, which offer advanced solutions to the limitations of conventional error correction and modulation techniques. By employing deep learning, reinforcement learning, and neural network-based approaches, including convolutional neural networks, recurrent neural networks, and Transformer-based models, this research demonstrates significant advancements in error correction performance, decoding efficiency, and adaptive transmission strategies. Additionally, the integration of AI with emerging technologies, such as massive multiple-input and multiple-output, intelligent reflecting surfaces, and privacy-enhancing mechanisms, is discussed, highlighting their potential to propel the next generation of wireless networks. This survey provides an insightful overview of the transformative impact of AI on modern wireless communication, establishing a foundation for scalable, adaptive, and more efficient network architectures.


[65] 2602.03855

Majorization-Minimization Networks for Inverse Problems: An Application to EEG Imaging

Inverse problems are often ill-posed and require optimization schemes with strong stability and convergence guarantees. While learning-based approaches such as deep unrolling and meta-learning achieve strong empirical performance, they typically lack explicit control over descent and curvature, limiting robustness. We propose a learned Majorization-Minimization (MM) framework for inverse problems within a bilevel optimization setting. Instead of learning a full optimizer, we learn a structured curvature majorant that governs each MM step while preserving classical MM descent guarantees. The majorant is parameterized by a lightweight recurrent neural network and explicitly constrained to satisfy valid MM conditions. For cosine-similarity losses, we derive explicit curvature bounds yielding diagonal majorants. When analytic bounds are unavailable, we rely on efficient Hessian-vector product-based spectral estimation to automatically upper-bound local curvature without forming the Hessian explicitly. Experiments on EEG source imaging demonstrate improved accuracy, stability, and cross-dataset generalization over deep-unrolled and meta-learning baselines.
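
The descent guarantee mentioned in this abstract follows from the standard MM argument, sketched below for a diagonal quadratic majorant. The notation is assumed for illustration and is not taken from the paper: $f$ is the objective, $g(\cdot \mid x_k)$ the surrogate at iterate $x_k$, and $D_k \succ 0$ a diagonal curvature matrix (in the learned setting, the quantity a network would output subject to the majorization constraint).

```latex
\begin{align}
  g(x \mid x_k) &= f(x_k) + \nabla f(x_k)^\top (x - x_k)
                 + \tfrac{1}{2}\,(x - x_k)^\top D_k\,(x - x_k),
  \qquad g(x \mid x_k) \ge f(x) \;\; \forall x, \\
  x_{k+1} &= \arg\min_x \, g(x \mid x_k) = x_k - D_k^{-1} \nabla f(x_k), \\
  f(x_{k+1}) &\le g(x_{k+1} \mid x_k) \le g(x_k \mid x_k) = f(x_k).
\end{align}
```

The last line is the monotone descent property: it holds for any $D_k$ that keeps the surrogate above $f$, which is why constraining the learned curvature to be a valid majorant is enough to retain the guarantee.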


[66] 2603.28318

Integrated sensing and communications in the 3GPP New Radio: sensing limits

Integrated Sensing and Communications (ISAC) is regarded as a key element of the beyond-fifth-generation (5G) and sixth-generation (6G) systems, raising the question of whether current 5G New Radio (NR) signal structures can meet the sensing accuracy requirements specified by the Third Generation Partnership Project (3GPP). This paper addresses this issue by analyzing the fundamental limits of range and velocity estimation through the Cramér-Rao lower bound (CRLB) for a monostatic unmanned aerial vehicle (UAV) sensing use case currently under consideration in the 3GPP standardization process. The study focuses on standardized signals and also evaluates the potential performance gains achievable with reference signals specifically designed for sensing purposes. The compact CRLB expressions derived in this work highlight the fundamental trade-offs between estimation accuracy and system parameters. The results further indicate that information from multiple slots must be exploited in the estimation process to attain the performance targets defined by the 3GPP. As a result, the 5G NR positioning reference signal (PRS), whose patterns may be suboptimal for velocity estimation when using single-slot resources, becomes suitable when multislot estimation is employed. Finally, we propose a two-step iterative range and radial-velocity estimator that attains the CRLB over a significantly wider range of distances than conventional maximum-likelihood (ML) estimators, for which the well-known threshold effect severely limits the distance range over which the accuracy requirements imposed by the 3GPP are satisfied.


[67] 2603.28714

VAANI: Capturing the language landscape for an inclusive digital India

Voice-based technologies have the potential to bridge digital accessibility gaps; however, existing datasets fail to capture the linguistic and regional diversity of Indic languages. We present Project VAANI, a large-scale multimodal dataset designed to represent India's linguistic landscape across 165 districts. Speech data is collected using image-based prompts to elicit spontaneous responses, while images are curated through a separate pipeline covering diverse themes across regions. The dataset undergoes a rigorous multi-stage quality control process, combining automated and manual evaluation to ensure high audio quality and transcription accuracy. We release approximately 289K images, 31,255 hours of speech, and 2,043 hours of transcribed audio spanning 105 languages from 28 states and 3 union territories. Many of these languages are represented at this scale for the first time, making VAANI a foundational resource for inclusive speech technology. The dataset enables the development of robust, multilingual, and multimodal models, and supports research in speech recognition, language understanding, and cross-modal learning for underrepresented languages.


[68] 2604.00774

Neural Vector Lyapunov-Razumikhin Certificates for Delayed Interconnected Systems

Ensuring scalable input-to-state stability (sISS) is critical for the safety and reliability of large-scale interconnected systems, especially in the presence of communication delays. While learning-based controllers can achieve strong empirical performance, their black-box nature makes it difficult to provide formal and scalable stability guarantees. To address this gap, we propose a framework to synthesize and verify neural vector Lyapunov-Razumikhin certificates for discrete-time delayed interconnected systems. Our contributions are three-fold. First, we establish a sufficient condition for discrete-time sISS via vector Lyapunov-Razumikhin functions, which enables certification for large-scale delayed interconnected systems. Second, we develop a scalable synthesis and verification framework that learns the neural certificates and verifies the certificates on reachability-constrained delay domains with scalability analysis. Third, we validate our approach on mixed-autonomy platoons, drone formations, and microgrids against multiple baselines, showing improved verification efficiency with competitive control performance.


[69] 2605.06198

Orthogonal Least Squares with Integrated Information Theoretic Criteria for Joint Number of Targets and DoA Estimation

We address the joint estimation of the number of targets and their directions of arrival (DoAs) using antenna arrays. Target-number estimation can be formulated as a model-order selection problem and solved with the information theoretic criteria (ITC). The ITC minimize an objective function that balances a likelihood term and a complexity penalty. However, direct application of the ITC requires maximum-likelihood DoA estimates for each candidate model order, which is computationally prohibitive because it entails a multidimensional search over all angle combinations. To reduce complexity, many radar processing schemes exploit greedy methods such as orthogonal least squares (OLS). In this paper, we explore three distinct methods to integrate the ITC model-order selection into the OLS estimation procedure for joint target-number and DoA estimation. Specifically, we propose the disjoint rank-based, the joint selection-based, and the hybrid rank-and-selection-based ITC-OLS algorithms. Each algorithm is derived under both the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) frameworks. Numerical simulations show that the proposed hybrid ITC-OLS algorithm consistently outperforms both the other proposed variants and a baseline method from the literature.


[70] 2605.13931

FSD50K-Solo: Automated Curation of Single-Source Sound Events

High-quality training datasets are essential for the performance of neural networks. However, the audio domain still lacks a large-scale, strongly labeled, single-source sound event dataset. The FSD50K dataset, despite being relatively large and open, contains a considerable fraction of multi-source samples where background interference or overlapping events could limit the usefulness of the data. To address this challenge, we introduce a data curation framework designed for large-scale open audio corpora. Our approach leverages a generative diffusion model to synthesize clean single-class events to construct controlled noisy mixtures for supervision. We subsequently employ a pre-trained audio encoder coupled with a discriminative classifier to automatically identify and filter out multi-source samples. Experiments show that our framework achieves strong performance on a human expert-curated test set. Finally, we release FSD50K-Solo, a model-curated subset of FSD50K containing single-source audio samples identified by our method. Beyond FSD50K, our method establishes a scalable paradigm for curating open-source audio corpora.


[71] 2605.23137

STAMBRIDGE: Spectral-Temporal Amplitude-aware Mid-Feature Bridge for EEG Visual Decoding

Electroencephalography (EEG) visual decoding remains challenging due to the modality gap between low-SNR neural signals and highly structured vision--language spaces, making direct cross-modal alignment unstable. To address this, we propose STAMBRIDGE, a versatile two-stage framework that sequentially tackles feature conditioning and cross-modal alignment. First, we introduce a Spectral-Temporal Amplitude-aware Modulation (STAM) to extract well-conditioned EEG representations. By replacing hard frequency masking with amplitude-derived soft channel weighting and multi-scale temporal convolutions, STAM explicitly preserves frequency-aware transients while reducing the risk of time-domain ringing artifacts. Building upon these robust neural features, we further introduce a model-agnostic Mid-Feature Semantic Bridge (MFSB) that constructs a regularized intermediate space through directed cross-modal interactions, enabling staged distillation and more stable semantic alignment. Experiments on the THINGS-EEG benchmark show competitive 200-way zero-shot retrieval performance, with 34.50% Top-1 and 65.95% Top-5 accuracy. In addition, embeddings learned by STAMBRIDGE produce semantically coherent image reconstructions with a diffusion model, demonstrating robust EEG-to-vision semantic alignment. The code is available at: this https URL.


[72] 2605.23560

SafeSABR: Risk-Calibrated Adaptive Bitrate Streaming over Starlink Networks

Starlink, as a representative low Earth orbit (LEO) satellite broadband system, makes high-bitrate video streaming possible in regions where terrestrial broadband is unavailable. However, its access links exhibit rapid throughput fluctuations caused by satellite mobility and handovers. Existing learned adaptive bitrate (ABR) algorithms can achieve high average quality of experience (QoE), yet high-bitrate Starlink streaming exposes severe session-level rebuffering that is not captured by average QoE alone. To address this issue, this paper proposes SafeSABR, a risk-calibrated learned ABR framework for Starlink networks. SafeSABR formulates Starlink ABR as a QoE--severe-risk tradeoff and follows a three-stage design: behavior-cloning pretraining learns a high-QoE ABR prior, risk-calibrated reinforcement learning (RL) fine-tuning reduces severe-tail action tendencies, and a runtime safety auditor uses safe-capacity lower bounds to check policy-requested bitrates before execution. Experiments on real Starlink traces compare SafeSABR with online, prediction-assisted, and learned ABR baselines. Compared with advanced methods, SafeSABR reduces severe-stall sessions from 22.8% to 7.2% and worst-5% session rebuffering from 54.30 s to 22.68 s, with a 1.8% QoE cost. Component analyses further show that risk-calibrated fine-tuning and safe-capacity auditing reduce unsafe bitrate decisions and downstream severe-session rebuffering. These results show that combining risk-calibrated policy learning with decision-aware safe throughput forecasting can move learned ABR toward a safer QoE--severe-risk operating point under volatile Starlink networks.


[73] 2605.25306

Nonlinear-Gain Distributed Zeroth-Order Optimization for Networked Black-Box Control

This letter studies distributed stochastic optimization over a peer-to-peer network when agents can query only zeroth-order function values. We propose ZOOM-PB, a coordinate-sampling distributed zeroth-order method equipped with a fractional-power powerball map. Unlike existing distributed zeroth-order methods that mainly refine gradient estimation or introduce primal--dual tracking, the proposed mechanism acts as a nonlinear feedback gain on the estimated gradient: it amplifies weak signals in flat regions and attenuates large stochastic estimates without adding transmitted states. Under standard smoothness, oracle-variance, and network-connectivity assumptions, ZOOM-PB achieves the leading nonconvex stationarity rate $\mathcal{O}(\sqrt{p/(nT)})$, where $p$ is the decision dimension, $n$ is the number of agents, and $T$ is the iteration horizon. Under the Polyak--Łojasiewicz condition, it further attains the leading objective residual rate $\mathcal{O}(p/(nT))$. Thus the method preserves the known distributed ZO order while changing the finite-time behavior through a local nonlinear control gain. Simulations on black-box learning and sensor-driven UAV source seeking show faster empirical convergence in weak-signal regimes.
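As a rough illustration of the nonlinear gain described above, the sketch below applies an element-wise powerball map sign(g)|g|^gamma to a coordinate-sampled two-point zeroth-order gradient estimate. The function names, the exponent `gamma`, the smoothing radius `delta`, the step size, and the single-agent setting are assumptions made for illustration and are not taken from the letter; the consensus step over the network is omitted.

```python
import math
import random

def powerball(g, gamma=0.5):
    """Element-wise fractional-power map: sign(g_i) * |g_i|**gamma, with 0 < gamma <= 1."""
    return [math.copysign(abs(gi) ** gamma, gi) for gi in g]

def coordinate_zo_gradient(f, x, delta=1e-4, rng=random):
    """Estimate the gradient from two function values along one random coordinate."""
    p = len(x)
    i = rng.randrange(p)
    x_plus = list(x)
    x_plus[i] += delta
    g = [0.0] * p
    # Scaling by p compensates for sampling only one of the p coordinates.
    g[i] = p * (f(x_plus) - f(x)) / delta
    return g

def zo_powerball_step(f, x, step=0.05, gamma=0.5, delta=1e-4, rng=random):
    """One local update: zeroth-order estimate, nonlinear gain, then a descent step."""
    g = powerball(coordinate_zo_gradient(f, x, delta, rng), gamma)
    return [xi - step * gi for xi, gi in zip(x, g)]
```

With `gamma = 0.5`, an estimate of 0.01 is mapped to 0.1 and an estimate of 100 is mapped to 10, which is the amplify-small, attenuate-large behavior the abstract describes.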


[74] 2605.26875

G-iMUSIC: Greedy Iterative MUSIC Algorithms for Multi-Target DoA Estimation

This paper presents novel algorithms for multi-target direction-of-arrival (DoA) estimation in array signal processing. Although the maximum likelihood estimator (MLE) asymptotically attains the Cramér-Rao bound, its exponential complexity motivates practical alternatives, such as greedy or subspace-based methods. In this context, greedy methods such as orthogonal matching pursuit (OMP) and orthogonal least squares (OLS) are sensitive to early selection errors, especially for angularly proximate targets, whereas subspace-based methods such as multiple signal classification (MUSIC) offer angular super-resolution capabilities but degrade under strong inter-target signal correlation. To overcome these limitations, we propose two greedy iterative MUSIC (G-iMUSIC) algorithms, namely OMP-iMUSIC and OLS-iMUSIC, derived from a unified framework that links subspace and greedy estimation. Unlike prior iMUSIC approaches, the proposed methods require only one initial eigenvalue decomposition (EVD) and avoid computing an eigendecomposition at each iteration. They also admit Fast Fourier Transform (FFT)-accelerated implementations for uniform linear arrays (ULAs), enabling low-complexity operation. Monte Carlo simulations demonstrate improved detection and precision over conventional OMP, OLS, and MUSIC, as well as reduced processing time compared to greedy baselines. Finally, we introduce diagnostic metrics that interpret performance across signal correlation and angular proximity regimes, supporting generalization beyond the specific orthogonal frequency-division multiplexing (OFDM) radar scenario considered.


[75] 2305.06426

Planning a Community Approach to Diabetes Care in Low- and Middle-Income Countries Using Optimization

Diabetes is a global health priority, especially in low- and middle-income countries, where over 50% of premature deaths are attributed to high blood glucose. Community Health Worker (CHW) programs can provide affordable and culturally tailored solutions for early detection and management of diabetes. We introduce an optimization framework to determine personalized CHW visits that maximize glycemic control at a community level. Our framework explicitly models the trade-off between screening new patients and providing management visits to individuals who are enrolled in treatment. We account for patients' motivational states, which affect their decisions to enroll or drop out of treatment and, therefore, the effectiveness of the intervention. By estimating patients' health and motivational states, our model builds visit plans accounting for patients' trade-offs when deciding to enroll in treatment, leading to reduced dropout rates and improved resource allocation. We apply our approach to generate CHW visit plans using operational data from urban slums in India. We find that our approach can reduce fasting blood glucose by up to 25% with the same capacity as the best baseline method. Our experiments also demonstrate that our approach performs well with imperfect information.


[76] 2307.06240

DSSE: a drone swarm search environment

The Drone Swarm Search project is an environment, based on PettingZoo, intended for use with multi-agent (or single-agent) reinforcement learning algorithms. In this environment, the agents (drones) have to find the targets (shipwrecked people). The agents do not know the position of the target and do not receive rewards related to their own distance to the target(s). However, the agents receive the probabilities of the target(s) being in a certain cell of the map. The aim of this project is to aid in the study of reinforcement learning algorithms that require dynamic probabilities as inputs. A peer-reviewed paper describing version 2 of this software has been published in JOSS: this https URL.


[77] 2311.01296

Matrix imaging as a tool for high-resolution monitoring of deep volcanic plumbing systems with seismic noise

Volcanic eruptions necessitate precise monitoring of magma pressure and inflation for improved forecasting. Understanding deep magma storage is crucial for hazard assessment, yet imaging these systems is challenging due to complex heterogeneities that disrupt standard seismic migration techniques. Here we map the magmatic and hydrothermal system of the La Soufrière volcano in Guadeloupe by analyzing seismic noise data from a sparse geophone array under a matrix formalism. Seismic noise interferometry provides a reflection matrix containing the signature of echoes from deep heterogeneities. Using wave correlations resistant to disorder, matrix imaging successfully unscrambles wave distortions, revealing La Soufrière's internal structure down to 10 kilometers with 100-meter resolution. This method surpasses the diffraction limit imposed by geophone array aperture, providing crucial data for modeling and high-resolution monitoring. We see matrix imaging as a revolutionary tool for understanding volcanic systems and enhancing observatories' abilities to monitor dynamics and forecast eruptions.


[78] 2501.06491

Improving Requirements Classification with SMOTE-Tomek Preprocessing

This study addresses class imbalance in requirements classification by applying the SMOTE-Tomek preprocessing technique, combined with stratified K-fold cross-validation, to the PROMISE dataset. This dataset comprises 969 categorized requirements, classified into functional and non-functional types. The proposed approach enhances the representation of minority classes while maintaining the integrity of validation folds, leading to a notable improvement in classification accuracy. Logistic regression achieved 76.16\% accuracy, significantly surpassing the baseline of 58.31\%. These results highlight the applicability and efficiency of machine learning models as scalable and interpretable solutions.


[79] 2505.17233

Semantic-Aware Interpretable Multimodal Music Auto-Tagging

Music auto-tagging is essential for organizing and discovering music in extensive digital libraries. While foundation models achieve exceptional performance in this domain, their outputs often lack interpretability, limiting trust and usability for researchers and end-users alike. In this work, we present an interpretable framework for music auto-tagging that leverages groups of musically meaningful multimodal features, derived from signal processing, deep learning, ontology engineering, and natural language processing. To enhance interpretability, we cluster features semantically and employ an expectation maximization algorithm, assigning distinct weights to each group based on its contribution to the tagging process. Our method achieves competitive tagging performance while offering a deeper understanding of the decision-making process, paving the way for more transparent and user-centric music tagging systems.


[80] 2506.08846

Addressing Pitfalls in Auditing Practices of Automatic Speech Recognition Technologies: A Case Study of People with Aphasia

Automatic Speech Recognition (ASR) systems' growing use warrants robust auditing approaches to ensure equitable transcription quality, especially for people with speech disorders like aphasia who disproportionately depend on ASR. While academic and industry audits have revealed performance disparities across user populations, standard auditing practices often overlook nuances that risk masking harm to marginalized groups. We identify three common pitfalls in standard ASR audits: (1) adhering to one method of text standardization, which can mask variance in ASR performance and ignore the standardization preferences of marginalized communities; (2) displaying high-level demographic findings without considering performance disparities by nuanced intersectional subgroups, or conditioning on relevant acoustic properties; and (3) reporting only one gold-standard metric (Word Error Rate), which inadequately quantifies common generative AI errors like hallucinations. We propose a holistic auditing framework addressing these pitfalls, and in a case study of six popular ASR systems, find consistently worse ASR performance for speakers with aphasia relative to a control group. We call on practitioners to implement these robust, community-driven ASR auditing practices better suited for the rapidly changing ASR landscape.
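To make the first and third pitfalls concrete, the sketch below computes Word Error Rate as a word-level edit distance and evaluates the same transcript pair under two text-standardization choices. The two normalizers, the filler-word list, and the example sentences are hypothetical and chosen only for illustration; they are not the standardization schemes or data used in the paper.

```python
import re

def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

def normalize_basic(text):
    """Lowercase only; punctuation and fillers are kept."""
    return text.lower()

def normalize_strict(text):
    """Lowercase, strip punctuation, and drop filler words (hypothetical list)."""
    text = re.sub(r"[^\w\s']", " ", text.lower())
    fillers = {"um", "uh"}
    return " ".join(w for w in text.split() if w not in fillers)

reference = "Um, I want... the, uh, water."
hypothesis = "i want the water"
wer_basic = wer(normalize_basic(reference), normalize_basic(hypothesis))
wer_strict = wer(normalize_strict(reference), normalize_strict(hypothesis))
```

On this toy pair the measured error depends almost entirely on the normalizer, which is why reporting a single standardization can hide or exaggerate differences between speaker groups.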


[81] 2509.14075

RCM Constraint-Consistent Dynamic Control in Surgical Robots

Robotic-assisted minimally invasive surgery (RAMIS) requires accurate enforcement of the remote center of motion (RCM) constraint to ensure safe tool motion through a trocar. Existing virtual RCM controllers are commonly formulated either at the kinematic level or as task-space objectives, which makes torque-level enforcement under trocar motion and physical interaction difficult to formulate consistently. This paper models the RCM as a rheonomic holonomic constraint and incorporates it into a projection-based inverse-dynamics controller with explicit constrained/free-motion torque decomposition. The resulting formulation unifies kinematic RCM enforcement and task-space tracking at the torque level, while preserving a constraint-consistent structure for residual regulation and null-space compliance. The proposed controller is validated in simulation and on a RAMIS training platform against representative projection-based and constrained-dynamics baselines. Across spiral tracking, varying insertion depth, moving trocar conditions, and human interaction, the method achieves lower RCM residuals and smoother torque profiles while maintaining accurate tool-tip tracking. These results support the use of constraint-consistent torque control for reliable virtual RCM enforcement in surgical robotics. The project page is available at this https URL.


[82] 2510.03534

Long-Term Mapping of the Douro River Plume with Multi-Agent Reinforcement Learning

We study the problem of long-term (multiple days) mapping of a river plume using multiple autonomous underwater vehicles (AUVs), focusing on the Douro River as a representative use case. We propose an energy- and communication-efficient multi-agent reinforcement learning approach in which a central coordinator intermittently communicates with the AUVs, collecting measurements and issuing commands. Our approach integrates spatiotemporal Gaussian process regression (GPR) with a multi-head Q-network controller that regulates direction and speed for each AUV. Simulations using the Delft3D ocean model demonstrate that our method consistently outperforms both single- and multi-agent benchmarks, and that increasing the number of agents improves both mean squared error (MSE) and operational endurance. In some instances, doubling the number of AUVs more than doubles endurance while maintaining or improving accuracy, underscoring the benefits of multi-agent coordination. Our learned policies generalize across unseen seasonal regimes over different months and years, demonstrating promise for future developments of data-driven long-term monitoring of dynamic plume environments.


[83] 2510.15541

An Empirical Study on Variance-based MC Dropout Uncertainty-Error Correlation in 2D Brain Tumor Segmentation

Accurate brain tumor segmentation from MRI is vital for diagnosis and treatment planning. Although Monte Carlo (MC) Dropout is widely used to estimate model uncertainty, the effectiveness of variance-based uncertainty (computed as pixel-wise variance across stochastic forward passes) in identifying segmentation errors, particularly near tumor boundaries, remains insufficiently studied. This study empirically examines the relationship between variance-based MC Dropout uncertainty and segmentation error in 2D brain tumor MRI segmentation using a U-Net trained under four augmentation settings: none, horizontal flip, rotation, and scaling. Uncertainty was estimated as the pixel-wise variance across 50 stochastic forward passes and correlated with pixel-wise errors using Pearson and Spearman coefficients. Results show weak global correlations (r ~ 0.30-0.38) and negligible boundary correlations (|r| < 0.05). Although differences across augmentations were statistically significant (p < 0.001), they lacked practical relevance. These findings suggest that variance-based MC Dropout uncertainty provides limited cues for global and boundary error localization, and that the choice of uncertainty representation critically affects the utility of MC Dropout in medical image segmentation. Alternative representations such as predictive entropy or mutual information may better capture segmentation errors, particularly at boundaries.
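The uncertainty-error analysis described above can be sketched in a few lines. The example below assumes each stochastic forward pass has already been flattened into a list of per-pixel foreground probabilities; the function names, the population-variance convention, and the 0.5 decision threshold are assumptions for illustration, and only the Pearson coefficient is shown (the study also reports Spearman).

```python
import math

def pixelwise_variance(passes):
    """Population variance of each pixel's probability across stochastic passes."""
    n = len(passes)
    out = []
    for values in zip(*passes):
        mean = sum(values) / n
        out.append(sum((v - mean) ** 2 for v in values) / n)
    return out

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)

def uncertainty_error_correlation(passes, target, threshold=0.5):
    """Correlate pixel-wise variance with pixel-wise error of the mean prediction."""
    n = len(passes)
    mean_prob = [sum(values) / n for values in zip(*passes)]
    errors = [float((p >= threshold) != bool(t)) for p, t in zip(mean_prob, target)]
    return pearson(pixelwise_variance(passes), errors)
```

A coefficient near zero on boundary pixels would correspond to the negligible boundary correlations the abstract reports.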


[84] 2512.09786

TinyDéjàVu: Smaller RAM and Faster Inference with Neural Networks on MCUs for Sensor Data Streams

Examples of embedded intelligence include a wide variety of tiny neural networks used on-board wireless sensors and actuators, which are expected to continuously perform inference on time-series of the data they sense. In order to meet lifetime and energy consumption requirements when operating on battery, such hardware is exclusively based on microcontrollers with as little memory as possible, e.g., 128 kB of RAM. In this context, optimizing data flows during inference across neural network layers becomes crucial. In this paper, we introduce a new framework, TinyDéjàVu, and novel algorithms we designed to drastically reduce the RAM budget required by inference using various neural network models for sensor data time-series on typical microcontroller hardware. We publish the implementation of TinyDéjàVu as open source, and we perform reproducible benchmarks on common microcontroller hardware (Arm Cortex-M). We show that TinyDéjàVu can save up to 90\% of RAM usage with equal compute latency compared to prior work (StreamiNNC) on overlapping sliding window inputs.


[85] 2512.12649

Bayesian Optimization Parameter Tuning Framework for a Lyapunov Based Path Following Controller

Parameter tuning in real-world experiments is constrained by the limited evaluation budget available on hardware. The path-following controller studied in this paper reflects a typical situation in nonlinear geometric controllers, where multiple gains influence the dynamics through coupled nonlinear terms. Such interdependence makes manual tuning inefficient and unlikely to yield satisfactory performance within a practical number of trials. To address this challenge, we propose a Bayesian optimization (BO) framework that treats the closed-loop system as a black box and selects controller gains using a Gaussian-process surrogate. BO offers model-free exploration, quantified uncertainty, and data-efficient search, making it well suited for tuning tasks where each evaluation is costly. The framework is implemented on Honda's AI-Formula three-wheeled robot and assessed through repeated full-lap experiments on a fixed test track. The results show that BO improves controller performance within 32 trials, including 15 warm-start initial evaluations, indicating that it can efficiently locate high-performing regions of the parameter space under real-world conditions. These findings demonstrate that BO provides a practical, reliable, and data-efficient tuning approach for nonlinear path-following controllers on real robotic platforms.


[86] 2601.01616

Real Time NILM Based Power Monitoring of Identical Induction Motors Representing Cutting Machines in Textile Industry

The textile industry in Bangladesh is one of the most energy-intensive sectors, yet its monitoring practices remain largely outdated, resulting in inefficient power usage and high operational costs. To address this, we propose a real-time Non-Intrusive Load Monitoring (NILM)-based framework tailored for industrial applications, with a focus on identical motor-driven loads representing textile cutting machines. A hardware setup comprising voltage and current sensors, Arduino Mega and ESP8266 was developed to capture aggregate and individual load data, which was stored and processed on cloud platforms. A new dataset was created from three identical induction motors and auxiliary loads, totaling over 180,000 samples, to evaluate the state-of-the-art MATNILM model under challenging industrial conditions. Results indicate that while aggregate energy estimation was reasonably accurate, per-appliance disaggregation faced difficulties, particularly when multiple identical machines operated simultaneously. Despite these challenges, the integrated system demonstrated practical real-time monitoring with remote accessibility through the Blynk application. This work highlights both the potential and limitations of NILM in industrial contexts, offering insights into future improvements such as higher-frequency data collection, larger-scale datasets and advanced deep learning approaches for handling identical loads.


[87] 2601.04505

CircuitLM: A Multi-Agent LLM-Aided Design Framework for Generating Circuit Schematics from Natural Language Prompts

Generating accurate circuit schematics from high-level natural language descriptions remains a persistent challenge in electronic design automation (EDA), as large language models (LLMs) frequently hallucinate components, violate strict physical constraints, and produce non-machine-readable outputs. To address this, we present CircuitLM, a multi-agent pipeline that translates user prompts into structured, visually interpretable $\texttt{CircuitJSON}$ schematics. The framework mitigates hallucination and ensures physical viability by grounding generation in a curated, embedding-powered component knowledge base through five sequential stages: (i) component identification, (ii) canonical pinout retrieval, (iii) chain-of-thought reasoning, (iv) JSON schematic synthesis, and (v) interactive force-directed visualization. We evaluate the system on a dataset of 100 unique circuit-design prompts using five state-of-the-art LLMs. To systematically assess performance, we deploy a rigorous dual-layered evaluation methodology: a deterministic Electrical Rule Checking (ERC) engine categorizes topological faults by strict severity (Critical, Major, Minor, Warning), while an LLM-as-a-judge meta-evaluator identifies complex, context-aware design flaws that bypass standard rule-based checkers. Ultimately, this work demonstrates how targeted retrieval combined with deterministic and semantic verification can bridge natural language to structurally viable, schematic-ready hardware and safe circuit prototyping. Our code and data are publicly available at this https URL.


[88] 2601.09239

DSA-Tokenizer: Disentangled Semantic-Acoustic Tokenization via Flow Matching-based Hierarchical Fusion

Speech tokenizers are a key building block of fully discrete Speech this http URL tokenizers either prioritize semantic encoding, fuse semantic content with acoustic style inseparably, or achieve incomplete semantic-acoustic this http URL achieve better disentanglement, we propose DSA-Tokenizer, which explicitly disentangles speech into discrete semantic and acoustic tokens via distinct optimization this http URL, semantic tokens are supervised by ASR to capture linguistic content, while acoustic tokens focus on mel-spectrogram restoration to encode this http URL further introduce a hierarchical Flow Matching decoder and a joint reconstruction-context inpainting training strategy, allowing the model to support both high-fidelity reconstruction and cross-utterance voice this http URL speed up inference, we distill the DiT decoder to reduce the number of inference sampling steps to 4 and improve synthesis quality with GAN this http URL demonstrate that DSA-Tokenizer provides strong semantic-acoustic disentanglement, reliable controllable voice cloning, and efficient high-fidelity generation with low WER/CER. Moreover, our results suggest that disentangled tokenization provides a more effective interface for downstream large-model speech this http URL samples are available at this https URL.


[89] 2603.13003

From Passive Monitoring to Active Defence: Resilient Control of Manipulators Under Cyberattacks

Cyber-physical robotic systems are vulnerable to false data injection attacks (FDIAs), in which an adversary corrupts sensor signals while evading residual-based passive anomaly detectors such as the chi-squared test. Such stealthy attacks can induce substantial end-effector deviations without triggering alarms. This paper studies the resilience of redundant manipulators to stealthy FDIAs and advances the architecture from passive monitoring to active defence. We formulate a closed-loop model comprising a feedback-linearized manipulator, a steady-state Kalman filter, and a chi-squared-based anomaly detector. Building on this passive monitoring layer, we propose an active control-level defence that attenuates the control input through a monotone function of an anomaly score generated by a novel actuation-projected, measurement-free state predictor. The proposed design provides probabilistic guarantees on nominal actuation loss and preserves closed-loop stability. From the attacker's perspective, we derive a convex QCQP for computing one-step optimal stealthy attacks. Simulations on a 6-DOF planar manipulator show that the proposed defence significantly reduces attack-induced end-effector deviation while preserving nominal task performance in the absence of attacks.
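For readers unfamiliar with the passive layer mentioned above, the following minimal sketch shows a residual-based chi-squared test. It assumes a diagonal innovation covariance so that the statistic r^T S^{-1} r reduces to a sum of squared, variance-normalized residuals; the function names and the two-dimensional example are hypothetical, and the default threshold 5.991 is the 95% quantile of a chi-squared distribution with two degrees of freedom. The paper's active defence and its state predictor are not reproduced here.

```python
def chi_squared_statistic(residual, innovation_variances):
    """Sum of squared residuals, each normalized by its innovation variance."""
    if len(residual) != len(innovation_variances):
        raise ValueError("residual and variances must have the same length")
    return sum(r * r / s for r, s in zip(residual, innovation_variances))

def raises_alarm(residual, innovation_variances, threshold=5.991):
    """Flag an anomaly when the statistic exceeds the chosen threshold."""
    return chi_squared_statistic(residual, innovation_variances) > threshold

# A large residual trips the detector, while a moderate persistent bias stays
# below the threshold; the second case is what makes an attack stealthy.
obvious = raises_alarm([3.0, 0.0], [1.0, 1.0])
stealthy = raises_alarm([1.5, 1.5], [1.0, 1.0])
```

The second case illustrates why a purely passive detector can miss an injected bias that still shifts the state estimate.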


[90] 2604.24352

Data-Driven Adaptive Resource Allocation for Reliable Low-Latency Uplink Communications in Rural Cellular 5G Multi-Connectivity

Reliable low-latency communication is a key requirement for mission-critical and mobile autonomous systems, including teleoperation, autonomous navigation, and real-time uplink-dominant telemetry applications. While commercial 5G networks often provide adequate downlink performance, uplink performance in rural deployments may be constrained by radio-resource limitations and uplink power-control mechanisms. This paper presents a comprehensive experimental evaluation of multi-connectivity strategies over commercial 5G Non-Standalone networks, based on measurement campaigns conducted in urban, suburban, and rural environments. The study analyzes per-packet uplink and downlink latency, packet loss, and radio-layer KPIs across two mobile network operators. The measurements indicate that latency and reliability cannot be inferred solely from coverage indicators such as RSRP. In coverage-constrained scenarios, performance appears to be strongly influenced by uplink power-limited operation and partially correlated impairments across operators. Several multi-connectivity strategies are evaluated, including link aggregation, switching-based policies, and conditional packet duplication. A Primary-Anchored Adaptive Failover (PAAF) framework is introduced to selectively activate redundancy based on radio, latency, and service cost considerations. The results suggest that Partial Duplication (PD) schemes can approach the reliability of multi-connectivity while substantially reducing duplication overhead in the evaluated rural scenario.


[91] 2605.00457

Policy-Driven DRL-Based TXOP Adaptation in NR-U and Wi-Fi Coexistence

The coexistence of NR-U and Wi-Fi in unlicensed spectrum introduces a challenging coexistence management problem, where heterogeneous channel access mechanisms lead to a significant imbalance in spectrum utilization and degraded Wi-Fi performance. To address this challenge, we propose a policy-driven deep reinforcement learning (DRL) framework for adaptive transmission opportunity (TXOP) control, in which the coexistence process is formulated as a Markov decision process (MDP) and a deep Q-network (DQN) learns control policies through online interaction. A key contribution is the introduction of a policy layer via reward design, enabling explicit control of coexistence tradeoffs among fairness, throughput, and utility. Three policies, namely absolute fairness, moderate fairness, and utility-based fairness, are developed to achieve different operating points. Simulation results show that the proposed framework achieves a Jain fairness index above 0.9 under strict fairness control. Compared to absolute fairness, moderate fairness improves aggregate throughput by 68.22%, while the utility-based policy further enhances utility by 177.6%. These results demonstrate that policy-driven control provides a flexible and effective solution for managing tradeoffs in heterogeneous coexistence networks.
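The Jain fairness index cited above has a simple closed form, J = (sum of x_i)^2 / (n * sum of x_i^2), which equals 1 when all shares are equal and 1/n when a single node takes everything. The sketch below is a generic implementation; the function name and the example throughput values are assumptions for illustration and do not reproduce the paper's reward design.

```python
def jain_index(throughputs):
    """Jain fairness index of non-negative per-node throughputs."""
    if not throughputs:
        raise ValueError("at least one throughput value is required")
    total = sum(throughputs)
    sum_sq = sum(x * x for x in throughputs)
    if sum_sq == 0:
        raise ValueError("index is undefined when all throughputs are zero")
    return total * total / (len(throughputs) * sum_sq)

# Hypothetical example: one NR-U node at 30 Mbit/s and one Wi-Fi node at 10 Mbit/s.
unbalanced = jain_index([30.0, 10.0])
balanced = jain_index([20.0, 20.0])
```

A value above 0.9, as reported for the strict fairness policy, therefore indicates nearly equal throughput shares.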


[92] 2605.22661

A Generalized Nash Equilibrium-Seeking Scheme for Trauma Resuscitation

Trauma resuscitation is a clinical process for treating life-threatening physiological disorders in safety-critical environments, driven by the experience of healthcare workers (HCWs). Designing and optimizing quantifiable metrics that accurately capture HCW decisions may augment current resuscitation procedures with the potential to improve patient outcomes. This motivates our socio-technical formulation of trauma resuscitation as a distributed generalized Nash equilibrium (GNE)-seeking game with coupled inequality constraints. This method is optimized over a time-varying communication graph. We introduce novel insights from clinical experience to model HCW behavior. This work facilitates the best possible resuscitation outcome given HCWs' workloads, schedules, competencies, and limited resources.