New articles on Electrical Engineering and Systems Science


[1] 2603.26695

Complementarity-Preserving Generative Theory for Multimodal ECG Synthesis: A Quantum-Inspired Approach

Multimodal deep learning has substantially improved electrocardiogram (ECG) classification by jointly leveraging time, frequency, and time-frequency representations. However, existing generative models typically synthesize these modalities independently, resulting in synthetic ECG data that are visually plausible yet physiologically inconsistent across domains. This work establishes a Complementarity-Preserving Generative Theory (CPGT), which posits that physiologically valid multimodal signal generation requires explicit preservation of cross-domain complementarity rather than loosely coupled modality synthesis. We instantiate CPGT through Q-CFD-GAN, a quantum-inspired generative framework that models multimodal ECG structure within a complex-valued latent space and enforces complementarity-aware constraints regulating mutual information, redundancy, and morphological coherence. Experimental evaluation demonstrates that Q-CFD-GAN reduces latent embedding variance by 82%, decreases classifier-based plausibility error by 26.6%, and restores tri-domain complementarity from 0.56 to 0.91, while achieving the lowest observed morphology deviation (3.8%). These findings show that preserving multimodal information geometry, rather than optimizing modality-specific fidelity alone, is essential for generating synthetic ECG signals that remain physiologically meaningful and suitable for downstream clinical machine-learning applications.


[2] 2603.26697

Physicochemical-Neural Fusion for Semi-Closed-Circuit Respiratory Autonomy in Extreme Environments

This paper introduces Galactic Bioware's Life Support System, a semi-closed-circuit breathing apparatus designed for integration into a positive-pressure firefighting suit and governed by an AI control system. The breathing loop incorporates a soda lime CO2 scrubber, a silica gel dehumidifier, and pure O2 replenishment with finite consumables. One-way exhaust valves maintain positive pressure while creating a semi-closed system in which outward venting gradually depletes the gas inventory. Part I develops the physicochemical foundations from first principles, including state-consistent thermochemistry, stoichiometric capacity limits, adsorption isotherms, and oxygen-management constraints arising from both fire safety and toxicity. Part II introduces an AI control architecture that fuses three sensor tiers: external environmental sensing, internal suit atmosphere sensing (with triple-redundant O2 cells and median voting), and firefighter biometrics. The controller combines receding-horizon model-predictive control (MPC) with a learned metabolic model and a reinforcement learning (RL) policy advisor, with all candidate actuator commands passing through a final control-barrier-function safety filter before reaching the hardware. This architecture is intended to optimize performance under unknown mission duration and exertion profiles. In this paper we introduce an 18-state, 3-control nonlinear state-space formulation using only sensors viable in structural firefighting, with triple-redundant O2 sensing and median voting. Finally, we introduce an MPC framework with a dynamic resource scarcity multiplier, an RL policy advisor for warm-starting, and a final control-barrier-function safety filter through which all actuator commands must pass, demonstrating 18-34% endurance improvement in simulation over PID baselines while maintaining tighter physiological and fire-safety margins.
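
As an illustration of the median voting mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. The function name `vote_o2`, the `max_spread` tolerance, and the example readings are hypothetical. It shows why a median over three redundant cells tolerates one faulty sensor: a single outlier cannot become the middle value.

```python
from statistics import median


def vote_o2(readings, max_spread=0.02):
    """Median-vote three redundant O2 readings (given as fractions, e.g. 0.21).

    Returns the voted value and the indices of cells that deviate from it
    by more than max_spread. max_spread is a hypothetical tolerance chosen
    for this sketch, not a value taken from the paper.
    """
    if len(readings) != 3:
        raise ValueError("expected exactly three readings")
    voted = median(readings)
    suspects = [i for i, r in enumerate(readings) if abs(r - voted) > max_spread]
    return voted, suspects


# One cell (index 2) has failed high; the median ignores it.
voted, suspects = vote_o2([0.21, 0.22, 0.55])
print(voted, suspects)
```

In a real controller the flagged cells would presumably feed a fault-handling path; that part is outside the scope of this sketch.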


[3] 2603.26699

EMPD: An Event-based Multimodal Physiological Dataset for Remote Pulse Wave Detection

Remote photoplethysmography (rPPG) based on traditional frame-based cameras often struggles with motion artifacts and limited temporal resolution. To address these limitations, we introduce EMPD (Event-based Multimodal Physiological Dataset), the first benchmark dataset specifically designed for non-contact physiological sensing via event cameras. The dataset leverages a laser-assisted acquisition system where a high-coherence laser modulates subtle skin vibrations from the radial artery into significant signals detectable by a neuromorphic sensor. The hardware platform integrates a high-resolution event camera to capture micro-motions and intensity transients, an industrial RGB camera to provide traditional rPPG benchmarks, and a clinical-grade pulse oximeter to record ground truth PPG waveforms. EMPD contains 193 valid records collected from 83 subjects, covering a wide heart rate range (40-110 BPM) under both resting and post-exercise conditions. By providing precisely synchronized multimodal data with microsecond-level temporal precision, EMPD serves as a crucial resource for developing robust algorithms in the field of neuromorphic physiological monitoring. The dataset is publicly available at: this https URL


[4] 2603.26704

Deep Learning Multi-Horizon Irradiance Nowcasting: A Comparative Evaluation of Three Methods for Leveraging Sky Images

We investigate three distinct methods of incorporating all-sky imager (ASI) images into deep learning (DL) irradiance nowcasting. The first method relies on a convolutional neural network (CNN) to extract features directly from raw RGB images. The second method uses state-of-the-art algorithms to engineer 2D feature maps informed by domain knowledge, e.g., cloud segmentation, the cloud motion vector, solar position, and cloud base height. These feature maps are then passed to a CNN to extract compound features. The final method relies on aggregating the engineered 2D feature maps into time-series input. Each of the three methods was then used as part of a DL model trained on a high-frequency, 29-day dataset to generate multi-horizon forecasts of global horizontal irradiance up to 15 minutes ahead. The models were then evaluated using root mean squared error and skill score on 7 selected days of data. Aggregated engineered ASI features as model input yielded superior forecasting performance, demonstrating that integration of ASI images into DL nowcasting models is possible without complex spatially ordered DL architectures and inputs, underscoring opportunities for alternative image processing methods as well as the potential for improved spatial DL feature processing methods.
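
The two evaluation metrics named in the abstract can be sketched as follows. This is a hedged illustration: the abstract does not say which reference forecast the skill score uses, so the common RMSE-based form, 1 - RMSE_model / RMSE_reference, is assumed here, and the sample numbers are invented.

```python
import math


def rmse(pred, obs):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def skill_score(pred, ref, obs):
    """RMSE-based forecast skill relative to a reference forecast.

    1.0 means a perfect forecast, 0.0 means no better than the reference,
    and negative values mean worse than the reference. The choice of
    reference (e.g. persistence) is an assumption of this sketch.
    """
    return 1.0 - rmse(pred, obs) / rmse(ref, obs)


# Made-up irradiance values in W/m^2, for illustration only.
obs = [500.0, 520.0, 480.0, 300.0]
persistence = [510.0, 500.0, 520.0, 480.0]
model = [505.0, 515.0, 490.0, 350.0]
print(round(skill_score(model, persistence, obs), 3))
```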


[5] 2603.26706

Evaluating Smartphone GNSS Accuracy for Geofenced 6 GHz Operations

The recently deployed 6 GHz spectrum in the U.S. utilizes distinct power categories, with the latest proposed "Geofenced Variable Power" (GVP) category permitting indoor and outdoor operations without continuous Automated Frequency Coordination (AFC) by relying instead on local databases of exclusion zones. Consequently, the safe operation of GVP devices depends entirely on reliable GNSS localization to respect these geofences. However, GNSS accuracy is highly variable and significantly degrades in environments like urban canyons or indoors. This paper presents the first comprehensive empirical study evaluating GNSS reliability specifically for GVP compliance. Utilizing the SigCap Android application, we document and compare GNSS accuracy across an extensive array of real-world conditions, encompassing urban versus suburban landscapes, varying mobility states (stationary, walking, driving), and indoor versus outdoor settings. The results demonstrate that while device hardware causes variations in GNSS accuracy, the operational environment is the primary driver of error. Indoor settings and dense urban areas consistently degrade localization. Moreover, outdoor positions adjacent to buildings often surprisingly produce significant inaccuracies, even near low-elevation structures. We further analyze the contribution of different GNSS constellations to device positioning and show that satellites from non-U.S.-licensed constellations, although currently used in a substantial portion of location fixes, are not permitted for regulatory geolocation under FCC requirements.


[6] 2603.26709

Neural Aided Adaptive Innovation-Based Invariant Kalman Filter

Autonomous platforms require accurate positioning to complete their tasks. To this end, Kalman filter-based algorithms, such as the extended Kalman filter or invariant Kalman filter, that fuse inertial and external sensor data are applied. To cope with real-world scenarios, adaptive noise estimation methods have been developed primarily for classical Euclidean formulations. However, these methods remain largely unexplored in the tangent Lie space, even though it provides a principled geometric framework with favorable error dynamics on Lie groups. To fill this gap, we combine invariant filtering theory with neural-aided adaptive noise estimation in real-world settings. To this end, we derive a novel theoretical extension of classical innovation-based process noise adaptation formulated directly within the Lie-group framework. We further propose a lightweight neural network that estimates the process noise covariance parameters directly from raw inertial data. Trained entirely in a sim2real framework via domain adaptation, the network captures motion-dependent and sensor-dependent noise characteristics without requiring labeled real-world data. To examine our proposed neural-aided adaptive invariant Kalman filter, we focus on the challenging real-world scenario of autonomous underwater navigation. Experimental results demonstrate superior performance compared to existing methods in terms of position root mean square error. These results validate our sim2real pipeline and further confirm that geometric invariance significantly enhances learning-based adaptation and that adaptive noise estimation in the tangent Lie space offers a powerful mechanism for improving navigation accuracy in nonlinear autonomous platforms.


[7] 2603.26716

FEMBA on the Edge: Physiologically-Aware Pre-Training, Quantization, and Deployment of a Bidirectional Mamba EEG Foundation Model on an Ultra-low Power Microcontroller

Objective: To enable continuous, long-term neuro-monitoring on wearable devices by overcoming the computational bottlenecks of Transformer-based Electroencephalography (EEG) foundation models and the quantization challenges inherent to State-Space Models (SSMs). Methods: We present FEMBA, a bidirectional Mamba architecture pre-trained on over 21,000 hours of EEG. We introduce a novel Physiologically-Aware pre-training objective, consisting of a reconstruction with low-pass filtering, to prioritize neural oscillations over high-frequency artifacts. To address the activation outliers common in SSMs, we employ Quantization-Aware Training (QAT) to compress the model to 2-bit weights. The framework is deployed on a parallel ultra-low-power RISC-V microcontroller (GAP9) using a custom double-buffered memory streaming scheme. Results: The proposed low-pass pre-training improves downstream AUROC on TUAB from 0.863 to 0.893 and AUPR from 0.862 to 0.898 compared to the best contrastive baseline. QAT successfully compresses weights with negligible performance loss, whereas standard post-training quantization degrades accuracy by approximately 30%. The embedded implementation achieves deterministic real-time inference (1.70 s per 5 s window) and reduces the memory footprint by 74% (to $\approx$2 MB), achieving competitive accuracy with up to 27$\times$ fewer FLOPs than Transformer benchmarks. Conclusion: FEMBA demonstrates that Mamba-based foundation models can be effectively quantized and deployed on extreme-edge hardware without sacrificing the representation quality required for robust clinical analysis. Significance: This work establishes the first full-stack framework for deploying large-scale EEG foundation models on ultra-low-power wearables, facilitating continuous, SSM-based monitoring for epilepsy and sleep disorders.


[8] 2603.26721

Stress Classification from ECG Signals Using Vision Transformer

Vision Transformers have shown tremendous success in numerous computer vision applications; however, they have not been exploited for stress assessment using physiological signals such as the electrocardiogram (ECG). To get the maximum benefit from the vision transformer for multilevel stress assessment, in this paper we transform the raw ECG data into 2D spectrograms using the short-time Fourier transform (STFT). These spectrograms are divided into patches that are fed to the transformer encoder. We also perform experiments with a 1D CNN and ResNet-18 (a CNN model). We perform leave-one-subject-out cross-validation (LOSOCV) experiments on the WESAD and Ryerson Multimedia Lab (RML) datasets. One of the biggest challenges of LOSOCV-based experiments is tackling intersubject variability. In this research, we address the issue of intersubject variability and show our success using 2D spectrograms and the attention mechanism of the transformer. Experiments show that the vision transformer handles the effect of intersubject variability much better than CNN-based models and beats all previous state-of-the-art methods by a considerable margin. Moreover, our method is end-to-end, does not require handcrafted features, and can learn robust representations. The proposed method achieved 71.01% and 76.7% accuracy on the RML and WESAD datasets, respectively, for three-class classification and 88.3% for binary classification on WESAD.


[9] 2603.26785

Beyond Benchmarks: A Framework for Post Deployment Validation of CT Lung Nodule Detection AI

Background: Artificial intelligence (AI) assisted lung nodule detection systems are increasingly deployed in clinical settings without site-specific validation. Performance reported under benchmark conditions may not reflect real-world behavior when acquisition parameters differ from training data. Purpose: To propose and demonstrate a physics-guided framework for evaluating the sensitivity of a deployed lung nodule detection model to systematic variation in CT acquisition parameters. Methods: Twenty-one cases from the publicly available LIDC-IDRI dataset were evaluated using a MONAI RetinaNet model pretrained on LUNA16 (fold 0, no fine-tuning). Five imaging conditions were tested: baseline, 25% dose reduction, 50% dose reduction, 3 mm slice thickness, and 5 mm slice thickness. Dose reduction was simulated via image-domain Gaussian noise; slice thickness via moving average along the z-axis. Detection sensitivity was computed at a confidence threshold of 0.5 with a 15 mm matching criterion. Results: Baseline sensitivity was 45.2% (57/126 consensus nodules). Dose reduction produced slight degradation: 41.3% at 25% dose reduction and 42.1% at 50% dose reduction. The 5 mm slice thickness condition produced a marked drop to 26.2%, a 19 percentage point reduction representing a 42% relative decrease from baseline. This finding was consistent across confidence thresholds from 0.1 to 0.9. Per-case analysis revealed heterogeneous performance, including two cases with complete detection failure at baseline. Conclusion: Slice thickness represents a more fundamental constraint on AI detection performance than image noise under the conditions tested. The proposed framework is reproducible, requires no proprietary scanner data, and is designed to serve as the basis for ongoing post-deployment QA in resource-constrained environments.
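
The two image-domain perturbations described in the Methods, and the arithmetic behind the reported drop, can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' code: the noise level `sigma`, the window length `k`, and the stride-1 sliding window are hypothetical choices, and volumes are plain nested lists indexed as `volume[z][row][col]`.

```python
import random


def add_gaussian_noise(volume, sigma, seed=0):
    """Add zero-mean Gaussian noise to every voxel (proxy for dose reduction)."""
    rng = random.Random(seed)
    return [[[v + rng.gauss(0.0, sigma) for v in row] for row in sl] for sl in volume]


def thicken_slices(volume, k):
    """Average k consecutive slices along z (proxy for thicker slices).

    Uses a stride-1 sliding window, so the output has len(volume) - k + 1 slices.
    """
    n_rows, n_cols = len(volume[0]), len(volume[0][0])
    out = []
    for z in range(len(volume) - k + 1):
        window = volume[z:z + k]
        out.append([[sum(sl[r][c] for sl in window) / k
                     for c in range(n_cols)]
                    for r in range(n_rows)])
    return out


# Arithmetic behind the reported figures (values taken from the abstract).
baseline = 57 / 126          # baseline sensitivity, about 45.2%
thick_5mm = 0.262            # reported sensitivity at 5 mm slices
drop_pp = 100 * (baseline - thick_5mm)            # about 19 percentage points
relative_drop = (baseline - thick_5mm) / baseline  # about 0.42
print(round(drop_pp, 1), round(relative_drop, 2))
```

The distinction between the 19-point absolute drop and the 42% relative drop is the one the abstract draws: the second figure divides the first by the baseline sensitivity.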


[10] 2603.26795

HASS: Hierarchical Simulation of Logopenic Aphasic Speech for Scalable PPA Detection

Building a diagnosis model for primary progressive aphasia (PPA) has been challenging due to data scarcity. Collecting clinical data at scale is limited by the high vulnerability of the clinical population and the high cost of expert labeling. To circumvent this, previous studies simulate dysfluent speech to generate training data. However, those approaches are not comprehensive enough to simulate PPA as a holistic, multi-level phenotype, instead relying on isolated dysfluencies. To address this, we propose a novel, clinically grounded simulation framework, Hierarchical Aphasic Speech Simulation (HASS). HASS aims to simulate behaviors of the logopenic variant of PPA (lvPPA) with varying degrees of severity. To this end, semantic, phonological, and temporal deficits of lvPPA are systematically identified by clinical experts and simulated. We demonstrate that our framework enables more accurate and generalizable detection models.


[11] 2603.26820

Toward Actionable Digital Twins for Radiation-Based Imaging and Therapy: Mathematical Formulation, Modular Workflow, and an OpenKBP-Based Dose-Surrogate Prototype

Digital twins for radiation-based imaging and therapy are most useful when they assimilate patient data, quantify predictive uncertainty, and support clinically constrained decisions. This paper presents a modular framework for actionable digital twins in radiation-based imaging and therapy and instantiates its reproducible open-data component using the OpenKBP benchmark. The framework couples PatientData, Model, Solver, Calibration, and Decision modules and formalizes latent-state updating, uncertainty propagation, and chance-constrained action selection. As an initial implementation, we build a GPU-ready PyTorch/MONAI reimplementation of the OpenKBP starter pipeline: an 11-channel, 19.2M-parameter 3D U-Net trained with a masked loss over the feasible region and equipped with Monte Carlo dropout for voxel-wise epistemic uncertainty. To emulate the update loop on a static benchmark, we introduce decoder-only proxy recalibration and illustrate uncertainty-aware virtual-therapy evaluation using DVH-based and biological utilities. A complete three-fraction loop including recalibration, Monte Carlo inference, and spatial optimization executes in 10.3 s. On the 100-patient test set, the model achieved mean dose and DVH scores of 2.65 and 1.82 Gy, respectively, with 0.58 s mean inference time per patient. The OpenKBP case study thus serves as a reproducible test bed for dose prediction, uncertainty propagation, and proxy closed-loop adaptation, while future institutional studies will address longitudinal calibration with delivered-dose logs and repeat imaging.


[12] 2603.26832

External Benchmarking of Lung Ultrasound Models for Pneumothorax-Related Signs: A Manifest-Based Multi-Source Study

Background and Aims: Reproducible external benchmarks for pneumothorax-related lung ultrasound (LUS) AI are scarce, and binary lung-sliding classification may obscure clinically important signs. We therefore developed a manifest-based external benchmark and used it to test both cross-domain generalization and task validity. Methods: We curated 280 clips from 190 publicly accessible LUS source videos and released a reconstruction manifest containing URLs, timestamps, crop coordinates, labels, and probe shape. Labels were normal lung sliding, absent lung sliding, lung point, and lung pulse. A previously published single-site binary classifier was evaluated on this benchmark; challenge-state analysis examined lung point and lung pulse using the predicted probability of absent sliding, P(absent). Results: The single-site comparator achieved ROC-AUC 0.9625 in-domain but 0.7050 on the heterogeneous external benchmark; restricting external evaluation to linear clips still yielded ROC-AUC 0.7212. In challenge-state analysis, mean P(absent) ranked absent (0.504) > lung point (0.313) > normal (0.186) > lung pulse (0.143). Lung pulse differed from absent clips (p=0.000470) but not from normal clips (p=0.813), indicating that the binary model treated pulse as normal-like despite absent sliding. Lung point differed from both absent (p=0.000468) and normal (p=0.000026), supporting its interpretation as an intermediate ambiguity state rather than a clean binary class. Conclusion: A manifest-based, multi-source benchmark can support reproducible external evaluation without redistributing source videos. Binary lung-sliding classification is an incomplete proxy for pneumothorax reasoning because it obscures blind-spot and ambiguity states such as lung pulse and lung point.


[13] 2603.26834

Hybrid Diffusion Model for Breast Ultrasound Image Augmentation

We propose a hybrid diffusion-based augmentation framework to overcome the critical challenge of ultrasound data augmentation in breast ultrasound (BUS) datasets. Unlike conventional diffusion-based augmentations, our approach improves visual fidelity and preserves ultrasound texture by combining text-to-image generation with image-to-image (img2img) refinement, as well as fine-tuning with low-rank adaptation (LoRA) and textual inversion (TI). Our method generated realistic, class-consistent images on an open-source Kaggle breast ultrasound image dataset (BUSI). Compared to the Stable Diffusion v1.5 baseline, incorporating TI and img2img refinement reduced the Fréchet Inception Distance (FID) from 45.97 to 33.29, demonstrating a substantial gain in fidelity while maintaining comparable downstream classification performance. Overall, the proposed framework effectively mitigates the low-fidelity limitations of synthetic ultrasound images and enhances the quality of augmentation for robust diagnostic modeling.


[14] 2603.26835

ANVIL: Accelerator-Native Video Interpolation via Codec Motion Vector Priors

Mobile displays refresh at 90-120 Hz, yet most video is encoded at 24-30 frames per second; real-time frame-rate doubling requires each synthesized frame within 33.3 ms on mobile neural processing units. We show that mainstream flow-based video frame interpolation faces three structural deployment barriers on mobile accelerators: spatial sampling operators exceed the frame budget or lack hardware support, iterative flow refinement collapses under 8-bit post-training quantization, and memory-bound operators dominate the inference graph. ANVIL addresses these barriers by reusing motion vectors already computed by the H.264 decoder to prealign input frames, removing learned optical flow, spatial sampling, and iterative accumulation from the accelerator graph. The remaining residual is refined by a convolution-dominated network whose inference graph is composed almost entirely of compute-bound operators. On a Snapdragon 8 Gen 3 device, ANVIL achieves 12.8 ms 1080p network inference in 8-bit integer precision; an open-source Android player sustains 28.4 ms median end-to-end latency per interpolated frame pair over 54,623 consecutively logged samples during 30-minute continuous playback. Per-operator causal analysis identifies quantized accumulation on recurrent flow states as a key mechanism behind integer quantization failure in iterative methods. The current design targets H.264 playback scenarios with decoder-exposed motion vectors.


[15] 2603.26836

Reliability-Aware Weighted Multi-Scale Spatio-Temporal Maps for Heart Rate Monitoring

Remote photoplethysmography (rPPG) allows for the contactless estimation of physiological signals from facial videos by analyzing subtle skin color changes. However, rPPG signals are extremely susceptible to illumination changes, motion, shadows, and specular reflections, resulting in low-quality signals in unconstrained environments. To overcome these issues, we present a Reliability-Aware Weighted Multi-Scale Spatio-Temporal (WMST) map that models pixel reliability through the suppression of environmental noise. This noise is modeled using different weighting strategies to focus on more physiologically valid areas. Leveraging the WMST map, we develop a self-supervised learning (SSL) contrastive approach based on Swin-Unet, where positive pairs are generated from conventional rPPG signals and temporally expanded WMST maps. Moreover, we introduce a new High-High-High (HHH) wavelet map as a negative example that maintains motion and structural details while filtering out physiological information. Our aim is to estimate heart rate (HR), and experiments on public rPPG benchmarks show that our approach enhances motion and illumination robustness, with lower HR estimation error and higher Pearson correlation than existing SSL-based rPPG methods.


[16] 2603.26840

Dual-branch Graph Domain Adaptation for Cross-scenario Multi-modal Emotion Recognition

Multimodal Emotion Recognition in Conversations (MERC) aims to predict speakers' emotional states in multi-turn dialogues through text, audio, and visual cues. In real-world settings, conversation scenarios differ significantly in speakers, topics, styles, and noise levels. Existing MERC methods generally neglect these cross-scenario variations, limiting their ability to transfer models trained on a source domain to unseen target domains. To address this issue, we propose a Dual-branch Graph Domain Adaptation framework (DGDA) for multimodal emotion recognition under cross-scenario conditions. We first construct an emotion interaction graph to characterize complex emotional dependencies among utterances. A dual-branch encoder, consisting of a hypergraph neural network (HGNN) and a path neural network (PathNN), is then designed to explicitly model multivariate relationships and implicitly capture global dependencies. To enable out-of-domain generalization, a domain adversarial discriminator is introduced to learn invariant representations across domains. Furthermore, a regularization loss is incorporated to suppress the negative influence of noisy labels. To the best of our knowledge, DGDA is the first MERC framework that jointly addresses domain shift and label noise. Theoretical analysis provides tighter generalization bounds, and extensive experiments on IEMOCAP and MELD demonstrate that DGDA consistently outperforms strong baselines and better adapts to cross-scenario conversations. Our code is available at this https URL.


[17] 2603.26843

Identity leakage through accent cues in voice anonymisation

Voice anonymisation is used to conceal voice identity while preserving linguistic content. Even if anonymisation seems strong, non-timbral cues such as accent that remain post-anonymisation can help re-identification and reveal sensitive socio-demographic traits. We report a study of residual accent information involving multiple anonymisation systems. We highlight the role of accent using speaker verification, accent verification, and accent classification with a set of embeddings focusing on timbral, non-timbral, and accent-related information, and show the extent to which related cues facilitate re-identification post-anonymisation. Results show that, while some systems are robust to re-identification attempts using accent cues, others leave residual, speaker-dependent, accent-related cues which can be used to reveal the voice identity. We also highlight accent-dependent variation in anonymisation performance, raising fairness concerns, and show that a system with character-level conditioning can help obfuscate identity-revealing accent cues, reducing accent-identification accuracy by 68% on average and improving overall anonymisation performance by 11% relative.


[18] 2603.26844

Uncertainty-Aware Mapping from 3D Keypoints to Anatomical Landmarks for Markerless Biomechanics

Markerless biomechanics increasingly relies on 3D skeletal keypoints extracted from video, yet downstream biomechanical mappings typically treat these estimates as deterministic, providing no principled mechanism for frame-wise quality control. In this work, we investigate predictive uncertainty as a quantitative measure of confidence for mapping 3D pose keypoints to 3D anatomical landmarks, a critical step preceding inverse kinematics and musculoskeletal analysis. Within a temporal learning framework, we model both uncertainty arising from observation noise and uncertainty related to model limitations. Using synchronized motion capture ground truth on AMASS, we evaluate uncertainty at frame and joint level through error-uncertainty rank correlation, risk-coverage analysis, and catastrophic outlier detection. Across experiments, uncertainty estimates, particularly those associated with model uncertainty, exhibit a strong monotonic association with landmark error (Spearman $\rho \approx 0.63$), enabling selective retention of reliable frames (error reduced to $\approx 16.8$ mm at 10% coverage) and accurate detection of severe failures (ROC-AUC $\approx 0.92$ for errors $>50$ mm). Reliability ranking remains stable under controlled input degradation, including Gaussian noise and simulated missing joints. In contrast, uncertainty attributable to observation noise provides limited additional benefit in this setting, suggesting that dominant failures in keypoint-to-landmark mapping are driven primarily by model uncertainty. Our results establish predictive uncertainty as a practical, frame-wise tool for automatic quality control in markerless biomechanical pipelines.
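
The risk-coverage analysis mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function name `risk_at_coverage` and the toy numbers are hypothetical. The idea is to keep only the fraction of frames with the lowest predicted uncertainty and report the mean error on that retained subset; if uncertainty tracks error, the risk falls as coverage shrinks.

```python
def risk_at_coverage(errors, uncertainties, coverage):
    """Mean error over the `coverage` fraction of frames with lowest uncertainty.

    errors and uncertainties are per-frame sequences of equal length;
    coverage is a fraction in (0, 1]. At least one frame is always kept.
    """
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    # Rank frames from most to least confident.
    order = sorted(range(len(errors)), key=lambda i: uncertainties[i])
    n_keep = max(1, int(round(coverage * len(errors))))
    kept = order[:n_keep]
    return sum(errors[i] for i in kept) / n_keep


# Toy per-frame landmark errors (mm) and uncertainties, for illustration only.
errors = [10.0, 20.0, 30.0, 40.0]
uncertainties = [0.1, 0.2, 0.3, 0.4]
print(risk_at_coverage(errors, uncertainties, 0.5))   # risk on the most confident half
print(risk_at_coverage(errors, uncertainties, 1.0))   # risk with every frame kept
```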


[19] 2603.26902

Gaussian Mixture Model Based Bayesian Learning for Sparse Channel Estimation in Orthogonal Time Frequency Space Modulated Systems

A novel Gaussian mixture model (GMM) aided sparse Bayesian learning (SBL) framework is proposed for channel state information (CSI) estimation in orthogonal time-frequency space (OTFS) modulated systems. The key attribute of the proposed algorithm lies in casting CSI recovery as an SBL inference problem, where posterior distributions are iteratively refined under a hierarchical GMM prior. Using this approach, the sparsity-inducing variances beneficially promote sparsity in the delay-Doppler (DD) domain, while additionally augmenting the capability of SBL to exploit channel statistics more effectively. Moreover, to fully exploit the GMM's ability to approximate arbitrary probability density functions and model complex multipath fading scenarios, the channel statistics are represented using a complex Gaussian mixture. Simultaneously, the method leverages time-domain (TD) pilots without requiring wasteful DD-domain guard intervals, thereby ensuring low pilot overhead and high spectral efficiency. The recovered CSI is subsequently applied in a linear minimum mean square error (MMSE) detector for reliable data detection. To benchmark performance, the Oracle-MMSE and the Bayesian Cramér-Rao lower bound (BCRLB) are also derived. Our simulation results demonstrate significant performance improvement over state-of-the-art sparse estimation methods.


[20] 2603.26956

Optimal Hiding with Partial Information of the Seeker's Route

We consider a hide-and-seek game between a Hider and a Seeker over a finite set of locations. The Hider chooses one location to conceal a stationary treasure, while the Seeker visits the locations sequentially along a route. As the search progresses, the Hider observes a prefix of the Seeker's route. After observing this information, the Hider has the option to relocate the treasure at most once to another unvisited location by paying a switching cost. We study two seeker models. In the first, the Seeker is unaware of the fact that the Hider can relocate. In the second, the Seeker selects its route while accounting for the possibility that the Hider observes its path and relocates. For the restricted case, we define the value-of-information created by the reveal and derive upper bounds in terms of the switching cost using a worst-case evaluation over routes. We also show that seeker awareness reduces the game value, with the difference between the restricted and feedback models bounded by the entry-wise gap between the corresponding payoff matrices. Numerical examples show how this benefit decreases as the switching cost increases and as the reveal occurs later along the route.


[21] 2603.26998

Beyond Freshness and Semantics: A Coupon-Collector Framework for Effective Status Updates

For status update systems operating over unreliable energy-constrained wireless channels, we address Weaver's long-standing Level-C question: do my packets actually improve the plant's behavior? Each fresh sample carries a stochastic expiration time, governed by the plant's instability dynamics, after which the information becomes useless for control. Casting the problem as a coupon-collector variant with expiring coupons, we (i) formulate a two-dimensional average-reward MDP, (ii) prove that the optimal schedule is doubly thresholded in the receiver's freshness timer and the sender's stored lifetime, (iii) derive a closed-form policy for deterministic lifetimes, and (iv) design a Structure-Aware Q-learning algorithm (SAQ) that learns the optimal policy without knowing the channel success probability or lifetime distribution. Simulations validate our theoretical predictions: SAQ matches optimal Value Iteration performance while converging significantly faster than baseline Q-learning, and expiration-aware scheduling achieves up to 50% higher reward than age-based baselines by adapting transmissions to state-dependent urgency, thereby delivering Level-C effectiveness under tight resource constraints.


[22] 2603.26999

A Duality-Based Optimization Formulation of Safe Control Design with State Uncertainties

State estimation uncertainty is prevalent in real-world applications, hindering the application of safety-critical control. Existing methods address this by strengthening a Control Barrier Function (CBF) condition either to handle actuation errors induced by state uncertainty, or to enforce stricter, more conservative sufficient conditions. In this work, we take a more direct approach and formulate a robust safety filter by analyzing the image of the set of all possible states under the CBF dynamics. We first prove that convexifying this image set does not change the set of possible inputs. Then, by leveraging duality, we propose an equivalent and tractable reformulation for cases where this convex hull can be expressed as a polytope or ellipsoid. Simulation results show the approach in this paper to be less conservative than existing alternatives.


[23] 2603.27001

PHONOS: PHOnetic Neutralization for Online Streaming Applications

Speaker anonymization (SA) systems modify timbre while leaving regional or non-native accents intact, which is problematic because accents can narrow the anonymity set. To address this issue, we present PHONOS, a streaming module for real-time SA that neutralizes non-native accents to sound native-like. Our approach pre-generates golden speaker utterances that preserve source timbre and rhythm but replace foreign segmentals with native ones using silence-aware DTW alignment and zero-shot voice conversion. These utterances supervise a causal accent translator that maps non-native content tokens to native equivalents with at most 40 ms look-ahead, trained using joint cross-entropy and CTC losses. Our evaluations show an 81% reduction in non-native accent confidence, with listening-test ratings consistent with this shift, and reduced speaker linkability as accent-neutralized utterances move away from the original speaker in embedding space, while keeping latency under 241 ms on a single GPU.


[24] 2603.27018

On-Device Super Resolution Imaging Using Low-Cost SPAD Array and Embedded Lightweight Deep Learning

This work presents a lightweight super-resolution (LiteSR) neural network for depth and intensity images acquired from a consumer-grade single-photon avalanche diode (SPAD) array with a 48x32 spatial resolution. The proposed framework reconstructs high-resolution (HR) images of size 256x256. Both synthetic and real datasets are used for performance evaluation. Extensive quantitative metrics demonstrate high reconstruction fidelity on synthetic datasets, while experiments on real indoor and outdoor measurements further confirm the robustness of the proposed approach. Moreover, the SPAD sensor is interfaced with an Arduino UNO Q microcontroller, which receives low-resolution (LR) depth and intensity images and feeds them into a compressed, pre-trained deep learning (DL) model, enabling real-time SR video streaming. In addition to the 256x256 setting, a range of target HR resolutions is evaluated to determine the maximum achievable upscaling resolution (512x512) with LiteSR, including scenarios with noise-corrupted LR inputs. The proposed LiteSR-embedded system co-design provides a scalable, cost-effective solution to enhance the spatial resolution of current consumer-grade SPAD arrays to meet HR imaging requirements.


[25] 2603.27020

Multicluster Design and Control of Large-Scale Affine Formations

Conventional affine formation control (AFC) empowers a network of agents with flexible but collective motions - a potential which has not yet been exploited for large-scale swarms. One of the key bottlenecks lies in the design of an interaction graph, characterized by the Laplacian-like stress matrix. Efficient and scalable design solutions often yield suboptimal solutions on various performance metrics, e.g., convergence speed and communication cost, to name a few. The current state-of-the-art algorithms for finding optimal solutions are computationally expensive and therefore not scalable. In this work, we propose a more efficient optimal design for any generic configuration, with the potential to further reduce complexity for a large class of nongeneric rotationally symmetric configurations. Furthermore, we introduce a multicluster control framework that offers an additional scalability improvement, enabling not only collective affine motions as in conventional AFC but also partially independent motions naturally desired for large-scale swarms. The overall design is compatible with a swarm size of several hundred agents with fast formation convergence, as compared to up to only a few dozen agents by existing methods. Experimentally, we benchmark the performance of our algorithm compared with several state-of-the-art solutions and demonstrate the capabilities of our proposed control strategies.


[26] 2603.27024

Data-driven discovery and control of multistable nonlinear systems and hysteresis via structured Neural ODEs

Many engineered physical processes exhibit nonlinear but asymptotically stable dynamics that converge to a finite set of equilibria determined by control inputs. Identifying such systems from data is challenging: stable dynamics provide limited excitation and model discovery is often non-unique. We propose a minimally structured Neural Ordinary Differential Equation (NODE) architecture that enforces trajectory stability and provides a tractable parameterization for multistable systems, by learning a vector field in the form $F(x,u) = f(x)\,(x - g(x,u))$, where $f(x) < 0$ elementwise ensures contraction and $g(x,u)$ determines the multi-attractor locations. Across several nonlinear benchmarks, the proposed structure is efficient on short time horizon training, captures multiple basins of attraction, and enables efficient gradient-based feedback control through the implicit equilibrium map $g$.
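
The structured vector field $F(x,u) = f(x)\,(x - g(x,u))$ can be illustrated with a minimal scalar sketch. The particular `f` and `g` below are hypothetical hand-picked functions, not learned networks and not taken from the paper; they only show how a strictly negative `f` pulls the state toward a fixed point of `g`, and how an `x`-dependent `g` can produce more than one attractor.

```python
import math


def f(x):
    # Hypothetical gain, strictly negative for every x (assumption for illustration).
    return -(1.0 + x * x)


def g(x, u):
    # Hypothetical equilibrium map: equilibria satisfy x = u + tanh(5 x).
    return u + math.tanh(5.0 * x)


def simulate(x0, u, dt=0.01, steps=2000):
    """Forward-Euler integration of dx/dt = f(x) * (x - g(x, u))."""
    x = x0
    for _ in range(steps):
        x += dt * f(x) * (x - g(x, u))
    return x


if __name__ == "__main__":
    # Two initial conditions on opposite sides of the unstable point x = 0.
    for x0 in (0.3, -0.3):
        x_end = simulate(x0, u=0.0)
        print(x0, "->", round(x_end, 4))
```

With `u = 0`, trajectories started on either side of zero settle at different equilibria, which is the kind of multistable behavior the abstract refers to.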


[27] 2603.27051

Proprioceptive feedback paradigm for safe and resilient motion control

Proprioception is a human sense that provides feedback from muscles and joints about body position and motion. This key capability keeps us upright, moving, and responding quickly to slips or stumbles. In this paper we discuss a proprioception-like feature (machine proprioceptive feedback - MPF) for motion control systems. An unexpected response of one actuator, or one agent in a multi-agent system, is compensated by other actuators/agents through fast feedback loops that react only to the unexpected portion. The paper appropriates the predictor-corrector mechanism of decentralized, multi-agent controllers as "proprioceptive feedback" for centrally controlled ones. It analyzes the nature and degree of impairment that can be managed and offers two options, full-MPF and split-MPF, with different wiring architectures as well as different stability and safety properties. Multi-vehicle interchange lane-swap traffic simulations confirm the analytical results.


[28] 2603.27068

A Unified Codebook Design for Curvature-Reconfigurable Apertures: Seamless Near to Far Field Coverage

Beam training for extremely large-scale arrays with curvature-reconfigurable apertures (CuRAs) faces the critical challenge of severe, geometry-dependent angle-range coupling. While most existing designs compartmentalize near field and far field scenarios, we propose a unified, distance-adaptive hierarchical codebook framework for 1-D and 2-D CuRAs that seamlessly bridges both propagation regimes. Under a spherical-wave model, we first characterize the beamforming-gain correlation in a polar angular domain, deriving an angle-dependent angular sampling rule to capture the varying curvature. To achieve full-range coverage, we introduce a direction-dependent effective Rayleigh distance (ERD) as a soft boundary to gate the range sampling. Crucially, by sampling uniformly in the reciprocal-range domain, the proposed codebook provides precise, dense focusing within the ERD and automatically degenerates into sparse, angle-only steering beyond it. This mechanism eliminates the need for hard mode-switching between near- and far-field operations. Simulation results demonstrate that our unified design consistently outperforms representative baselines in spectral efficiency and alignment accuracy, offering a comprehensive solution for full-range CuRA communications.
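
The idea of sampling uniformly in the reciprocal-range domain can be sketched as follows. This is only an illustration under simplifying assumptions: `reciprocal_range_grid`, `r_min`, and `num_samples` are hypothetical names, and the direction-dependent ERD gating described in the abstract is not modeled.

```python
import math


def reciprocal_range_grid(r_min, num_samples):
    """Return ranges whose reciprocals are evenly spaced on [0, 1/r_min].

    The sample at 1/r = 0 corresponds to an infinite range, i.e. a plain
    far-field (angle-only) steering beam. r_min is an assumed closest range.
    """
    if r_min <= 0 or num_samples < 2:
        raise ValueError("need r_min > 0 and num_samples >= 2")
    step = (1.0 / r_min) / (num_samples - 1)
    grid = []
    for i in range(num_samples):
        inv_r = i * step
        grid.append(math.inf if inv_r == 0.0 else 1.0 / inv_r)
    return grid


if __name__ == "__main__":
    print(reciprocal_range_grid(r_min=5.0, num_samples=6))
```

Because the spacing is uniform in 1/r, the resulting range samples are dense close to the array and become sparse with distance, ending in a single far-field sample.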


[29] 2603.27081

A Controllability Perspective on Steering Follow-the-Regularized-Leader Learners in Games

Follow-the-regularized-leader (FTRL) algorithms have become popular in the context of games, providing easy-to-implement methods for each agent, as well as theoretical guarantees that the strategies of all agents will converge to some equilibrium concept (provided that all agents follow the appropriate dynamics). However, with these methods, each agent ignores the coupling in the game, and treats their payoff vectors as exogenously given. In this paper, we take the perspective of one agent (the controller) deciding their mixed strategies in a finite game, while one or more other agents update their mixed strategies according to continuous-time FTRL. Viewing the learners' dynamics as a nonlinear control system evolving on the relative interior of a simplex or product of simplices, we ask when the controller can steer the learners to a target state, using only its own mixed strategy and without modifying the game's payoff structure. For the two-player case we provide a necessary and sufficient criterion for controllability based on the existence of a fully mixed neutralizing controller strategy and a rank condition on the projected payoff map. For multi-learner interactions we give two sufficient controllability conditions, one based on uniform neutralization and one based on a periodic-drift hypothesis together with a Lie-algebra rank condition. We illustrate these results on canonical examples such as Rock-Paper-Scissors and a construction related to Brockett's integrator.
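
A minimal sketch of the setting, assuming an entropic regularizer (so the learner's mixed strategy is the softmax of its accumulated payoffs) and a forward-Euler discretization of the continuous-time dynamics. The function names and the reading of "neutralizing" as "makes the learner's payoff vector constant across actions" are assumptions for illustration, not definitions from the paper.

```python
import math

# Learner's payoffs in Rock-Paper-Scissors: A[i][j] is the learner's payoff
# when the learner plays i and the controller plays j (0=Rock, 1=Paper, 2=Scissors).
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]


def softmax(y):
    m = max(y)
    e = [math.exp(v - m) for v in y]
    s = sum(e)
    return [v / s for v in e]


def run_ftrl(y0, controller, dt=0.01, steps=1000):
    """Integrate dy/dt = A @ controller and return the learner's strategy softmax(y)."""
    y = list(y0)
    payoff = [sum(A[i][j] * controller[j] for j in range(3)) for i in range(3)]
    for _ in range(steps):
        y = [y[i] + dt * payoff[i] for i in range(3)]
    return softmax(y)


if __name__ == "__main__":
    y0 = [0.5, 0.0, -0.5]
    print("start:          ", softmax(y0))
    print("uniform control:", run_ftrl(y0, [1 / 3, 1 / 3, 1 / 3]))
    print("pure Rock:      ", run_ftrl(y0, [1.0, 0.0, 0.0]))
```

Under the uniform controller strategy every learner action earns the same payoff, so the learner's state does not move; a pure-Rock controller instead pushes the learner toward Paper.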


[30] 2603.27118

Quantitative measurements of biological/chemical concentrations using smartphone cameras

This paper presents a smartphone-based imaging system capable of quantifying the concentration of an assortment of biological/chemical assay samples. The main objective is to construct an image database which characterizes the relationship between color information and concentrations of the biological/chemical assay sample. For this aim, a designated optical setup combined with image processing and data analysis techniques was implemented. A series of experiments conducted on selected assays, including fluorescein, RNA Mango, homogenized milk, and yeast, has demonstrated that the proposed system estimates the concentration of fluorescent materials and colloidal mixtures comparably to currently used commercial and laboratory instruments. Furthermore, by utilizing the camera and computational power of smartphones, eventual development can be directed toward extremely compact, inexpensive and portable analysis and diagnostic systems which will allow experiments and tests to be conducted in remote or impoverished areas.


[31] 2603.27161

Time Window-Based Netload Range Cost Curves for Coordinated Transmission and Distribution Planning Under Uncertainty

Mechanisms to coordinate transmission and distribution planning should be regulatory compliant and keep the spheres of DSO and TSO decisions separate, without requiring disclosure of proprietary data or unrealistic, computationally expensive T&D co-simulations. The concept of Netload Range Cost Curves (NRCC) has been recently proposed as a simple non-invasive form of coordinating T&D investments under distribution netload uncertainty. This paper extends the NRCC concept to accommodate the temporal dimension of the T&D planning process. We propose to compute a hierarchy of certified temporal interface products that represent the different levels of flexibility that distribution networks can provide transmission grids with at the planning stage. The first product (P1) maps distribution investment into scenario-robust, per-window service envelopes within which any TSO service call (to modify load within specified bounds) is guaranteed distribution-network-feasible. The second product (P2) adds lexicographic rebound minimization, preserving P1-optimal service capacity while certifying post-service recovery under three governance variants with qualitatively distinct rebound-budget responses. In our numerical results, based on a real distribution feeder, we compare the performance of our proposed time-window-based flexibility products to an atemporal product (P0) that offers a static bound on the aggregate distribution grid netload across all time periods. Our results demonstrate the superiority of our proposed products in properly valuing the benefits of incremental investments in storage to allow for temporal flexibility.


[32] 2603.27177

Path-Following Guidance for Unmanned Aerial Vehicle with Bounded Lateral Acceleration

This paper addresses the three-dimensional path-following guidance problem for unmanned aerial vehicles under explicit actuator constraints. Unlike conventional approaches that assume unbounded control inputs or handle saturation heuristically, the proposed method incorporates bounded lateral acceleration directly into the guidance design. A nonlinear guidance framework is developed employing a nested saturation-based control technique. The proposed guidance strategy guarantees bounded control inputs while ensuring exponential convergence of cross-track errors to zero. The formulation is applicable to general smooth paths and is systematically extended from planar to three-dimensional scenarios using a path-tangent coordinate framework. Rigorous stability analysis based on Lyapunov theory establishes convergence and feasibility properties of the closed-loop system. Numerical simulations on representative paths, including straight-line, circular, and sinusoidal paths, demonstrate that the proposed method achieves superior tracking performance, reduced control effort, and robustness against disturbances compared to existing guidance laws. The simplicity of the design and its compatibility with practical actuator limits make it suitable for real-world UAV applications.


[33] 2603.27192

Switch-DFT: Adaptive Waveform and MIMO Switching for Energy-Efficient Base Stations

Energy efficiency has emerged as a critical challenge in modern base stations (BSs), as the power amplifier (PA) consumes a substantial portion of the total power due to its limited efficiency. We investigate waveform and mode adaptation to enhance the energy efficiency of BSs. We propose Switch-DFT, an adaptive switching framework that selects between cyclic prefix orthogonal frequency division multiplexing (CP-OFDM) and discrete Fourier transform-spread-OFDM (DFT-s-OFDM) waveforms, as well as between single-input multiple-output (SIMO) and multiple-input multiple-output (MIMO) modes. Switch-DFT improves efficiency by reducing PA backoff with DFT-s-OFDM and achieves the target rate at lower power by leveraging higher MIMO throughput. This results in superior energy efficiency over a wide range of spectral efficiencies compared with static configurations.


[34] 2603.27263

DeepBayesFlow: A Bayesian Structured Variational Framework for Generalizable Prostate Segmentation via Expressive Posteriors and SDE-Girsanov Uncertainty Modeling

Automatic prostate MRI segmentation faces persistent challenges due to inter-patient anatomical variability, blurred tissue boundaries, and distribution shifts arising from diverse imaging protocols. To address these issues, we propose DeepBayesFlow, a novel Bayesian segmentation framework designed to enhance both robustness and generalization across clinical domains. DeepBayesFlow introduces three key innovations: a learnable NF-Posterior module based on normalizing flows that models complex, data-adaptive latent distributions; an NCVI inference mechanism that removes conjugacy constraints to enable flexible posterior learning in high-dimensional settings; and an SDE-Girsanov module that refines latent representations via time-continuous diffusion and formal measure transformation, injecting temporal coherence and physically grounded uncertainty into the inference process. Together, these components allow DeepBayesFlow to capture domain-invariant structural priors while dynamically adapting to domain-specific variations, achieving accurate and interpretable segmentation across heterogeneous prostate MRI datasets.


[35] 2603.27286

Irrational pursuit-evasion differential games: A cumulative prospect theory approach

This paper considers for the first time pursuit-evasion (PE) differential games with irrational perceptions of both pursuer and evader on probabilistic characteristics of environmental uncertainty. First, the irrational perceptions of risk aversion and probability sensitivity are modeled and incorporated within a Bayesian PE differential game framework using a Cumulative Prospect Theory (CPT) approach. Second, several sufficient conditions of capturability are established in terms of system dynamics and irrational parameters. Finally, the existence of CPT-Nash equilibria is rigorously analyzed by invoking Brouwer's fixed-point theorem. The new results reveal that irrational behaviors benefit the pursuer in some cases and the evader in others. Certain captures that are unachievable under rational behaviors can be achieved under irrational ones. By bridging irrational behavioral theory with game-theoretic control, this framework establishes a rigorous theoretical foundation for practical control engineering within complex human-machine systems.


[36] 2603.27328

Quaternion-based Unscented Kalman Filter for Robust Wrench Estimation of Human-UAV Physical Interaction

This paper introduces an advanced Quaternion-based Unscented Kalman Filter (QUKF) for real-time, robust estimation of system states and external wrenches in assistive aerial payload transportation systems that engage in direct physical interaction. Unlike conventional filtering techniques, the proposed approach employs a unit-quaternion representation to inherently avoid singularities and ensure globally consistent, drift-free estimation of the platform's pose and interaction wrenches. A rigorous quaternion-based dynamic model is formulated to capture coupled translational and rotational dynamics under interaction forces. Building on this model, a comprehensive QUKF framework is established for state prediction, measurement updates, and external wrench estimation. The proposed formulation fully preserves the nonlinear characteristics of rotational motion, enabling more accurate and numerically stable estimation during physical interaction compared to linearized filtering schemes. Extensive simulations validate the effectiveness of the QUKF, showing significant improvements over the Extended Kalman Filter (EKF). Specifically, the QUKF achieved a 79.41% reduction in Root Mean Squared Error (RMSE) for torque estimation, with average RMSE improvements of 79% and 56% for position and angular rates, respectively. These findings demonstrate enhanced robustness to measurement noise and modeling uncertainties, providing a reliable foundation for safe, stable, and responsive human-UAV physical interaction in cooperative payload transportation tasks.


[37] 2603.27337

Learning swarm behaviour from a flock of homing pigeons using inverse optimal control

In this work, Global Positioning System (GPS) data from a flock of homing pigeons are analysed. The flocking behaviour of the considered homing pigeons is formulated as a swarm optimal trajectory tracking control problem. The swarm problem in this work is modeled with the idea that one or two pigeons at the forefront lead the flock. Each follower pigeon is assumed to follow a leader pigeon immediately ahead of it, instead of directly following the leaders at the forefront of the flock. The trajectory of each follower pigeon is assumed to be a solution of an optimal trajectory tracking control problem. An optimal control problem framework is created for each follower pigeon. An important aspect of an optimal control problem is the cost function. A minimum principle based method for multiple flight data is proposed, which can help in learning the unknown weights of the cost function of the optimal trajectory tracking control problem for each follower pigeon from flight trajectory information obtained from GPS data.


[38] 2603.27342

SHroom: A Python Framework for Ambisonics Room Acoustics Simulation and Binaural Rendering

We present \textbf{shroom} (Spherical Harmonics ROOM), an open-source Python library for room acoustics simulation using Ambisonics, available at this https URL and installable via \texttt{pip install pyshroom}. \textbf{shroom} projects image-source contributions onto a Spherical Harmonics (SH) basis, yielding a composable pipeline for binaural decoding, spherical array simulation, and real-time head rotation. Benchmarked against \texttt{pyroomacoustics} with an $N=30$ reference, \textbf{shroom} with Magnitude Least Squares (MagLS) achieves perceptual transparency (2.02~dB Log Spectral Distance (LSD) at $N=5$, within the 1--2~dB Just Noticeable Difference (JND)) while its fixed-once decode amortises over multiple sources ($K=1$-to-$8$: slowdown narrows from $7\times$ to $3.1\times$). For dynamic head rotation, \textbf{shroom} applies a Wigner-D multiply at $<1$~ms/frame, making it the only architecturally viable real-time choice.


[39] 2603.27357

Guided Lensless Polarization Imaging

Polarization imaging captures the polarization state of light, revealing information invisible to the human eye yet valuable in domains such as biomedical diagnostics, autonomous driving, and remote sensing. However, conventional polarization cameras are often expensive, bulky, or both, limiting their practical use. Lensless imaging offers a compact, low-cost alternative by replacing the lens with a simple optical element like a diffuser and performing computational reconstruction, but existing lensless polarization systems suffer from limited reconstruction quality. To overcome these limitations, we introduce an RGB-guided lensless polarization imaging system that combines a compact polarization-RGB sensor with an auxiliary, widely available conventional RGB camera providing structural guidance. We reconstruct multi-angle polarization images for each RGB color channel through a two-stage pipeline: a physics-based inversion recovers an initial polarization image, followed by a Transformer-based fusion network that refines this reconstruction using the RGB guidance image from the conventional RGB camera. Our two-stage method significantly improves reconstruction quality and fidelity over lensless-only baselines, generalizes across datasets and imaging conditions, and achieves high-quality real-world results on our physical prototype lensless camera without any fine-tuning.


[40] 2603.27374

Safe Adaptive-Sampling Control via Robust M-Step Hold Model Predictive Control

In adaptive-sampling control, the control frequency can be adjusted during task execution. Ensuring that these on-the-fly changes do not jeopardize the safety of the system being controlled requires careful attention. We introduce robust M-step hold model predictive control (MPC) to address this. This MPC formulation provides robust constraint satisfaction for an uncertain discrete-time system model with a fixed sampling time subject to an adaptable multi-step input hold (referred to as M-step hold). We show how to ensure recursive feasibility of the MPC utilizing M-step hold extensions of robust invariant sets, and demonstrate how to use our framework to enable safe adaptive-sampling control via the online selection of M. We evaluate the utility of the robust M-step hold MPC formulation in a cruise control example.
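
As an illustration of what an M-step hold does to the dynamics, assume (the abstract does not specify the model class) a linear discrete-time model with additive disturbance $w_k$ and an input $\bar{u}$ held constant over $M$ consecutive steps. The symbols $A$, $B$, $w_k$, and $\bar{u}$ are notation introduced here for the sketch.

```latex
x_{k+1} = A x_k + B u_k + w_k,
\qquad u_k = u_{k+1} = \dots = u_{k+M-1} = \bar{u}
\;\;\Longrightarrow\;\;
x_{k+M} = A^{M} x_k
        + \Bigl(\sum_{i=0}^{M-1} A^{i}\Bigr) B\,\bar{u}
        + \sum_{i=0}^{M-1} A^{M-1-i}\, w_{k+i}.
```

Under this assumption, a larger $M$ means fewer control updates but a longer accumulation of disturbance terms between updates, which is the kind of effect a robust formulation would need to bound when $M$ is selected online.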


[41] 2603.27427

Dissipativity-Based Distributed Control and Communication Topology Co-Design for Nonlinear DC Microgrids

This paper presents a dissipativity-based distributed droop-free control and communication topology co-design framework for voltage regulation and current sharing in nonlinear DC microgrids (MGs), where ZIP loads and voltage source converter (VSC) input saturation constitute the primary nonlinear challenges. The constant power load (CPL) component of ZIP loads introduces a destabilizing nonlinearity through its negative incremental impedance characteristic, while VSC input saturation imposes hard amplitude constraints on the voltage command signals applied to each distributed generator (DG), collectively making the control design significantly more challenging. The DC MG is modeled as a networked system of DGs, transmission lines, and ZIP loads coupled through a static interconnection matrix. Each DG is equipped with a local PI-based controller and a distributed consensus-based global controller, from which a nonlinear networked error dynamics model is derived. The CPL nonlinearity and the VSC saturation are each characterized via sector-boundedness, where the latter is handled through a dead-zone decomposition. Both nonlinearities are simultaneously absorbed into the dissipativity analysis using the S-procedure and Young's inequality, certifying an input feedforward output feedback passivity (IF-OFP) property for each DG subsystem. Controller gains, passivity indices, and the communication topology are co-designed by solving locally and globally formulated Linear Matrix Inequality (LMI) problems. Necessary feasibility conditions are identified and embedded into the local LMI problems, enabling a one-shot co-design algorithm that avoids iterative procedures. Simulation results validate the effectiveness of the proposed framework under multiple operating scenarios, demonstrating robust performance superior to conventional control approaches.


[42] 2603.27446

Communication-Induced Bifurcation and Collective Dynamics in Power Packet Networks: A Thermodynamic Approach to Information-Constrained Energy Grids

This paper investigates the nonlinear dynamics and phase transitions in power packet networks connected by routers, conceptualized as macroscopic information-ratchets. In the emerging paradigm of cyber-physical energy systems, the interplay between stochastic energy fluctuations and the thermodynamic cost of control information defines fundamental operational limits. We first formulate the dynamics of a single router using a Langevin framework, incorporating an exponential cost function for information acquisition. Our analysis reveals a discontinuous (first-order) phase transition, where the system adopts a strategic abandonment of regulation as noise intensity exceeds a critical threshold $D_c$. This transition represents a fundamental information-barrier inherent to autonomous energy management. We then extend this model to network configurations, where multiple routers are linked through diffusive coupling and share energy between them. We demonstrate that the network topology and coupling strength significantly extend the bifurcation points, with collective resilience against local fluctuations. These results provide a rigorous mathematical basis for the design of future complex communication-energy networks, including internal energy management for autonomous robots, and suggest that the stability of such systems is governed by the synergistic balance between physical energy flow and the thermodynamics of information exchange.


[43] 2603.27471

Driving Condition-Aware Multi-Agent Integrated Power and Thermal Management for Hybrid Electric Vehicles

Effective co-optimization of energy management strategy (EMS) and thermal management (TM) is crucial for optimizing fuel efficiency in hybrid electric vehicles (HEVs). Driving conditions significantly influence the performance of both EMS and TM in HEVs. This study presents a novel driving condition-aware integrated thermal and energy management (ITEM) framework. In this context, after analyzing and segmenting driving data into micro-trips, two primary features (average speed and maximum acceleration) are measured. Using the K-means approach, the micro-trips are clustered into three main groups. Finally, a deep neural network is employed to develop a real-time driving recognition model. An ITEM is then developed based on multi-agent deep reinforcement learning (DRL), leveraging the proposed real-time driving recognition model. The primary objectives are to improve the fuel economy and reduce TM power consumption while maintaining a pleasant cabin temperature for passengers. Our simulation results illustrate the effectiveness of the suggested framework and the positive impact of recognizing driving conditions on ITEM, improving fuel economy by 16.14% and reducing TM power consumption by 8.22% compared to the benchmark strategy.
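
The feature-extraction step (segmenting into micro-trips, then measuring average speed and maximum acceleration) can be sketched as below. The segmentation rule used here, a micro-trip being a maximal run of nonzero speed between stops, is a common convention assumed for illustration; the abstract does not state the rule, and the function names and 1-second sampling interval are hypothetical.

```python
def split_micro_trips(speeds):
    """Split a speed trace into micro-trips (maximal runs of speed > 0)."""
    trips, current = [], []
    for v in speeds:
        if v > 0:
            current.append(v)
        elif current:
            # A stop ends the current micro-trip.
            trips.append(current)
            current = []
    if current:
        trips.append(current)
    return trips


def trip_features(trip, dt=1.0):
    """Return (average speed, maximum acceleration) for one micro-trip."""
    avg_speed = sum(trip) / len(trip)
    accels = [(b - a) / dt for a, b in zip(trip, trip[1:])]
    max_accel = max(accels) if accels else 0.0
    return avg_speed, max_accel


if __name__ == "__main__":
    trace = [0, 2, 4, 6, 0, 0, 5, 5, 0]  # made-up speed samples
    for trip in split_micro_trips(trace):
        print(trip, trip_features(trip))
```

The resulting two-dimensional feature vectors are the kind of input that a clustering step such as K-means would then group into driving-condition classes.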


[44] 2603.27495

Jutted BMOCZ for Non-Coherent OFDM

In this work, we propose a zero constellation for binary modulation on conjugate-reciprocal zeros (BMOCZ), called jutted BMOCZ (J-BMOCZ), and study its application to non-coherent orthogonal frequency division multiplexing (OFDM). With J-BMOCZ, we introduce asymmetry to the zero constellation for Huffman BMOCZ, which removes ambiguity at the receiver under a uniform rotation of the zeros. The asymmetry is controlled by the magnitude of "jutted" zeros and enables the receiver to estimate zero rotation using a simple cross-correlation. The proposed method, however, leads to a natural trade-off between asymmetry and zero stability. Accordingly, we introduce a reliability metric to measure the stability of a polynomial's zeros under an additive perturbation of the coefficients, and we apply the metric to optimize the J-BMOCZ zero constellation parameters. We then combine the advantages of J-BMOCZ and Huffman BMOCZ to design a hybrid waveform for OFDM with BMOCZ (OFDM-BMOCZ). The pilot-free waveform enables blind synchronization/detection and has a fixed peak-to-average power ratio that is independent of the message. Finally, we assess the proposed scheme through simulation and demonstrate non-coherent OFDM-BMOCZ using low-cost software-defined radios.


[45] 2603.27540

Energy-Efficient Velocity Profile Optimization for Movable Antenna-Enabled Sensing Systems

Movable antennas (MAs) enable the reconfiguration of array geometry within a bounded region to exploit sub-wavelength spatial degrees of freedom in wireless communication and sensing systems. However, most prior research has predominantly focused on the communication and sensing performance, overlooking the mechanical power consumption inherent in antenna movement. To bridge this gap, this paper investigates a velocity profile optimization framework for MA-assisted direction-of-arrival (DoA) estimation, explicitly balancing sensing accuracy with mechanical energy consumption of MAs. We first establish a Newtonian-based mechanical energy model, and formulate a functional optimization problem for sensing energy efficiency (EE) maximization. By applying the calculus of variations, this formulation is transformed into an infinite-dimensional problem defined by the Euler-Lagrange equation. To solve it, we propose a spectral discretization framework based on the Galerkin method, which expands the velocity profile over a sinusoidal basis. In the regime where energy consumption is dominated by linear damping, we prove that the optimal velocity profile follows a closed-form sinusoidal shape. For more general scenarios involving strong nonlinear aerodynamic drag, we leverage the Markov-Lukács theorem to transform the kinematic constraints into strictly convex sum-of-squares (SOS) conditions. Consequently, the infinite-dimensional problem is reformulated as a tractable finite-dimensional nonlinear algebraic system, which is solved by a two-layer algorithm combining Dinkelbach's method with successive convex approximation (SCA). Numerical results demonstrate that our optimized velocity profile significantly outperforms baselines in terms of EE across various system configurations. Insights into the optimized velocity profiles and practical design guidelines are also provided.
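
The outer loop of Dinkelbach's method, which turns a ratio maximization into a sequence of subtractive problems, can be sketched on a toy example. Everything here is a stand-in chosen for illustration: the candidate grid, the numerator `num` (a benefit proxy) and denominator `den` (an energy proxy) are hypothetical, and the inner problem is solved by brute force rather than by successive convex approximation as in the paper.

```python
def dinkelbach(candidates, num, den, tol=1e-9, max_iter=100):
    """Maximize num(x) / den(x) over a finite candidate list, with den(x) > 0."""
    best = candidates[0]
    lam = num(best) / den(best)
    for _ in range(max_iter):
        # Inner problem: maximize the parametric objective num(x) - lam * den(x).
        best = max(candidates, key=lambda x: num(x) - lam * den(x))
        gap = num(best) - lam * den(best)
        if gap < tol:
            break  # lam equals the optimal ratio (up to tol)
        lam = num(best) / den(best)
    return best, lam


if __name__ == "__main__":
    grid = [i / 10 for i in range(31)]          # candidate values 0.0 .. 3.0
    best, ratio = dinkelbach(grid, lambda v: v, lambda v: 1.0 + v * v)
    print(best, ratio)
```

For this toy ratio $v / (1 + v^2)$ the maximizer is $v = 1$ with value $0.5$, which the iteration reaches after a few updates of the ratio estimate.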


[46] 2603.27544

Stacked Intelligent Metasurfaces for Near-Field Multi-User Covert Communications

Reconfigurable intelligent surfaces have emerged as a cutting-edge technology for next-generation wireless communications that are capable of reconfiguring the wireless environment using a large number of cost-effective reflecting elements. However, a significant body of prior studies has focused on single-layer surfaces that lack the capability of significantly mitigating inter-user interference. Moreover, previous studies mostly consider far-field operation and neglect working in the near-field region. In this paper, we propose a stacked intelligent metasurfaces (SIM)-assisted near-field multi-user multiple-input-single-output covert communication system. More specifically, we have a multi-antenna base station (BS) that is assisted with a SIM to serve multiple single-antenna users in the presence of multiple single-antenna wardens. We aim at optimizing the beamfocusing vectors at the BS and SIM phase shift matrices to maximize the sum covert rate under a maximum transmit power budget constraint, quality-of-service (QoS) constraints for all users, and a covertness constraint. Since the formulated problem is highly non-convex due to the coupling between the variables, we adopt alternating optimization to tackle it, where we divide the problem into a beamfocusing sub-problem and a SIM phase shift sub-problem, which are solved alternately until convergence. We leverage successive convex approximation (SCA) to solve the two sub-problems. Additionally, we formulate the SIM phase shift sub-problem using the widely adopted projected gradient ascent (PGA) method for comparison purposes. The conducted simulations reveal that the SCA-based algorithm outperforms the existing PGA-based algorithm as well as other benchmarks in terms of the achieved sum covert rate, demonstrating its consistent performance and robustness under various system parameter configurations.


[47] 2603.27560

Velocity-Free Horizontal Position Control of Quadrotor Aircraft via Nonlinear Negative Imaginary Systems Theory

This paper presents a velocity-free position control strategy for quadrotor unmanned aerial vehicles based on nonlinear negative imaginary (NNI) systems theory. Unlike conventional position control schemes that require velocity measurements or estimation, the proposed approach achieves asymptotic stability using only position feedback. We establish that the quadrotor horizontal position subsystem, when augmented with proportional feedback, exhibits the NNI property with respect to appropriately defined horizontal thrust inputs. A strictly negative imaginary integral resonant controller is then designed for the outer loop, and robust asymptotic stability is guaranteed through satisfaction of explicit sector-bound conditions relating controller and plant parameters. The theoretical framework accommodates model uncertainties and external disturbances while eliminating the need for velocity sensors. Simulation results validate the theoretical predictions and demonstrate effective position tracking performance.


[48] 2603.27576

MPC-Based Trajectory Tracking for a Quadrotor UAV with Uniform Semi-Global Asymptotic Stability Guarantees

This paper proposes a model predictive trajectory tracking approach for quadrotors subject to input constraints. Our proposed approach relies on a hierarchical control strategy with an outer-loop feedback generating the required thrust and desired attitude and an inner-loop feedback regulating the actual attitude to the desired one. For the outer-loop translational dynamics, the generation of the virtual control input is formulated as a constrained model predictive control problem with time-varying input constraints and a control strategy, endowed with uniform global asymptotic stability guarantees, is proposed. For the inner-loop rotational dynamics, a hybrid geometric controller is adopted, achieving semi-global exponential tracking of the desired attitude. Finally, we prove that the overall cascaded system is semi-globally asymptotically stable. Simulation results illustrate the effectiveness of the proposed approach.


[49] 2603.27580

Structure-Preserving Learning of Nonholonomic Dynamics

Data-driven modeling is playing an increasing role in robotics and control, yet standard learning methods typically ignore the geometric structure of nonholonomic systems. As a consequence, the learned dynamics may violate the nonholonomic constraints and produce physically inconsistent motions. In this paper, we introduce a structure-preserving Gaussian process (GP) framework for learning nonholonomic dynamics. Our main ingredient is a nonholonomic matrix-valued kernel that incorporates the constraint distribution directly into the GP prior. This construction ensures that the learned vector field satisfies the nonholonomic constraints for all inputs. We show that the proposed kernel is positive semidefinite, characterize its associated reproducing kernel Hilbert space as a space of admissible vector fields, and prove that the resulting estimator admits a coordinate representation adapted to the constraint distribution. We also establish the consistency of the learned model. Numerical simulations on a vertical rolling disk illustrate the effectiveness of the proposed approach.


[50] 2603.27581

Centrality-Based Security Allocation in Networked Control Systems

This paper addresses the security allocation problem within networked control systems, which consist of multiple interconnected control systems under the influence of two opposing agents: a defender and a malicious adversary. The adversary aims to maximize the worst-case attack impact on system performance while remaining undetected by launching stealthy data injection attacks on one or several interconnected control systems. Conversely, the defender's objective is to allocate security resources to detect and mitigate these worst-case attacks. A novel centrality-based approach is proposed to guide the allocation of security resources to the most connected or influential subsystems within the network. The methodology involves comparing the worst-case attack impact for both the optimal and centrality-based security allocation solutions. The results demonstrate that the centrality measure approach enables significantly faster allocation of security resources with acceptable levels of performance loss compared to the optimal solution, making it suitable for large-scale networks. The proposed method is validated through numerical examples using Erdos-Renyi graphs.
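
A minimal Python sketch of the allocation idea described above, under stated assumptions: degree is used as the centrality measure (the abstract does not say which measure the paper adopts), the attack-impact computation is omitted entirely, and the function names `erdos_renyi` and `allocate_by_degree` are hypothetical.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=0):
    """Undirected Erdos-Renyi graph G(n, p) as an adjacency dict of sets."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def allocate_by_degree(adj, budget):
    """Pick the `budget` subsystems with the highest degree.

    Degree centrality is an assumption made for illustration; ties are
    broken by the smaller node index so the result is deterministic.
    """
    ranked = sorted(adj, key=lambda node: (-len(adj[node]), node))
    return ranked[:budget]


network = erdos_renyi(20, 0.3, seed=1)
protected = allocate_by_degree(network, budget=3)
```

Ranking by a centrality score costs only a sort over the nodes, which is the kind of saving over a combinatorial search that makes such heuristics attractive for large networks.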


[51] 2603.27592

Secure Reinforcement Learning: On Model-Free Detection of Man in the Middle Attacks

We consider the problem of learning-based man-in-the-middle (MITM) attacks in cyber-physical systems (CPS), and extend our previously proposed Bellman Deviation Detection (BDD) framework for model-free reinforcement learning (RL). We refine the standard MDP attack model by allowing the reward function to depend on both the current and subsequent states, thereby capturing reward variations induced by errors in the adversary's transition estimate. We also derive an optimal system-identification strategy for the adversary that minimizes detectable value deviations. Further, we prove that the agent's asymptotic learning time required to secure the system scales linearly with the adversary's learning time, and that this matches the optimal lower bound. Hence, the proposed detection scheme is order-optimal in detection efficiency. Finally, we extend the framework to asynchronous and intermittent attack scenarios, where reliable detection is preserved.


[52] 2603.27604

Time-varying System Identification of Bedform Dynamics Using Modal Decomposition

Measuring sediment transport in riverbeds has long been a challenging research problem in geomorphology and river engineering. Traditional approaches rely on direct measurements using sediment samplers. Although such measurements are often considered ground truth, they are intrusive, labor-intensive, and prone to large variability. As an alternative, sediment flux can be inferred indirectly from the kinematics of migrating bedforms and temporal changes in bathymetry. While such approaches are helpful, bedform dynamics are nonlinear and multiscale, making it difficult to determine the contributions of different scales to the overall sediment flux. Fourier decomposition has been applied to examine bedform scaling, but it treats spatial and temporal variability separately. In this work, we introduce Dynamic Mode Decomposition (DMD) as a data-driven framework for analyzing riverbed evolution. By incorporating this representation into the Exner equation, we establish a link between modal dynamics and net sediment flux. This formulation provides a surrogate measure for scale-dependent sediment transport, enabling new insights into multiscale bedform-driven sediment flux in fluvial channels.


[53] 2603.27615

Adaptive differentiating filter: case study of PID feedback control

This paper presents an adaptive causal discrete-time filter for derivative estimation, exemplified by its use in estimating relative velocity in a mechatronic application. The filter is based on a constrained least squares estimator with window adaptation. It demonstrates low sensitivity to low-amplitude measurement noise, while preserving a wide bandwidth for large-amplitude changes in the process signal. Favorable performance properties of the filter are discussed and demonstrated in a practical case study of a PID feedback controller and compared experimentally to a standard linear low-pass filter-based differentiator and a robust sliding-mode-based homogeneous differentiator.


[54] 2603.27677

Safety-Constrained Optimal Control for Unknown System Dynamics

In this paper, we present a framework for solving continuous optimal control problems when the true system dynamics are approximated through an imperfect model. We derive a control strategy by applying Pontryagin's Minimum Principle to the model-based Hamiltonian functional, which includes an additional penalty term that captures the deviation between the model and the true system. We then derive conditions under which this model-based strategy coincides with the optimal control strategy for the true system under mild convexity assumptions. We demonstrate the framework on a real robotic testbed for the cruise control application with safety distance constraints.


[55] 2603.27708

A Nonlinear Incremental Approach for Replay Attack Detection

Replay attacks comprise replaying previously recorded sensor measurements and injecting malicious signals into a physical plant, causing great damage to cyber-physical systems. Replay attack detection has been widely studied for linear systems, whereas limited research has been reported for nonlinear cases. In this paper, the replay attack is studied in the context of a nonlinear plant controlled by an observer-based output feedback controller. We first analyze replay attack detection using an innovation-based detector and reveal that this detector alone may fail to detect such attacks. Consequently, we turn to a watermark-based design framework to improve the detection. In the proposed framework, the effects of the watermark on attack detection and closed-loop system performance loss are quantified by two indices, which exploit the incremental gains of nonlinear systems. To balance the detection performance and control system performance loss, an explicit optimization problem is formulated. Moreover, to achieve a better balance, we generalize the proposed watermark design framework to co-design the watermark, controller and observer. Numerical simulations are presented to validate the proposed frameworks.


[56] 2603.27722

Tertiary-Mode STAR-RIS for Secure NOMA: Integrating Transmission, Reflection, and Jamming

This paper investigates the physical layer security of a non-orthogonal multiple access (NOMA) system assisted by a tertiary-mode simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS), which can perform transmission, reflection, and jamming simultaneously. The system comprises a base station (BS) serving two users located on opposite sides of the STAR-RIS, assuming perfect channel state information (CSI) at the transmitter. To enhance secrecy performance, a subset of STAR-RIS elements is adaptively configured for jamming. A penalty-based alternating optimization algorithm is developed to jointly optimize the BS's active beamforming and the STAR-RIS's passive beamforming and mode selection. Simulation results demonstrate that the proposed design substantially improves the achievable sum rate and secrecy performance compared to conventional RIS-assisted and no-RIS benchmarks, highlighting the potential of tertiary-mode STAR-RIS for secure and efficient next-generation wireless communications.


[57] 2603.27726

Wideband Near-Field Sensing in ISAC: Unified Algorithm Design and Decoupled Effect Analysis

To advance integrated sensing and communications (ISAC) in sixth-generation (6G) extremely large-scale multiple-input multiple-output (XL-MIMO) networks, a low-complexity compressed sensing (CS)-based dictionary design is proposed for wideband near-field (WB-NF) target localization. Currently, the massive signal dimensions in the WB-NF regime impose severe computational burdens and high spatial-frequency coherence on conventional grid-based algorithms. Furthermore, a unified framework exploiting both wideband (WB) and near-field (NF) effects is lacking, and the analytical conditions for simplifying this model into decoupled approximations remain uncharacterized. To address these challenges, the proposed algorithm mathematically decouples the mutual coherence function and introduces a novel angle-distance sampling grid with customized distance adjustments, drastically reducing dictionary dimensions while ensuring low coherence. To isolate the individual WB and NF impacts, two coherence-based metrics are formulated to establish the effective boundaries of the narrowband near-field (NB-NF) and wideband far-field (WB-FF) regions, where respective multiple signal classification (MUSIC) algorithms are utilized. Simulations demonstrate that the CS-based method achieves robust performance across the entire regime, and the established boundaries provide crucial theoretical guidelines for WB and NF effect decoupling.


[58] 2603.27733

Extremum-Based Joint Compression and Detection for Distributed Sensing

We study joint compression and detection in distributed sensing systems motivated by emerging applications such as IoT-based localization. Two spatially separated sensors observe noisy signals and can exchange only a $k$-bit message over a reliable one-way low-rate link. One sensor compresses its observation into a $k$-bit description to help the other decide whether their observations share a common underlying signal or are statistically independent. We propose a simple extremum-based strategy, in which the encoder sends the index of its largest sample and the decoder performs a scalar threshold test. We derive exact nonasymptotic false-alarm and misdetection probabilities and validate the analysis with representative simulations.
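
The extremum-based strategy above can be sketched in a few lines of Python. This is an illustrative reading, not the paper's exact scheme: it assumes the decoder applies its threshold to its own sample at the received index, and the sample values and threshold are made up.

```python
def encode(x):
    """Encoder: index of the largest sample in its observation."""
    return max(range(len(x)), key=x.__getitem__)


def bits_needed(n):
    """Number of bits required to index n samples."""
    return max(1, (n - 1).bit_length())


def decode(y, index, threshold):
    """Decoder: scalar threshold test on its own sample at the received index.

    Returns True for "common underlying signal", False for "independent".
    Testing y[index] is an assumption made for illustration.
    """
    return y[index] > threshold


x = [0.1, 2.5, -0.3, 0.7]           # encoder-side observation (made up)
y_common = [0.0, 2.2, 0.1, -0.4]    # decoder observation sharing the peak
y_indep = [0.3, -0.2, 1.5, 0.1]     # decoder observation with unrelated peak

message = encode(x)
decision_common = decode(y_common, message, threshold=1.0)
decision_indep = decode(y_indep, message, threshold=1.0)
```

With four samples the message fits in two bits, matching the $k$-bit link constraint when the observation length is at most $2^k$.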


[59] 2603.27763

GSW: Generalized "Self-Wiener" Denoising

We revisit the recently proposed "self-Wiener" (SW) filtering method for robust deconvolution, and generalize it to the classical denoising problem. The resulting estimator, termed generalized SW (GSW) filtering, retains the nonlinear shrinkage structure of SW but introduces a tunable threshold parameter. This tunability enables GSW to flexibly adapt to varying signal-to-noise ratio (SNR) regimes by balancing noise suppression and signal preservation. We derive closed-form expressions for its mean-square error (MSE) performance in both low- and high-SNR regimes, and demonstrate that GSW closely approximates the oracle MMSE at high SNR while maintaining strong robustness at low SNR. Simulation results validate the analytical findings, showing that GSW consistently achieves favorable denoising performance across a wide range of SNRs. Its analytical tractability, parameter flexibility, and close connection to the optimal Wiener filter structure make it a promising tool for practical applications including compressive sensing, sparse signal recovery, and domain-specific shrinkage in wavelet, Fourier, and potentially learned orthonormal representations.


[60] 2603.27816

Impact of Inverter-Based Resources on the Protection of the Electrical Grid

In recent years, the contribution of renewable energy resources to the electrical grid has increased drastically; the most common of these are photovoltaic solar panels and wind turbines. These resources rely on inverters to interface with the grid, which do not inherently exhibit the same fault characteristics as synchronous generators. Consequently, they can strain grid reliability and security, cause an increased number of blackouts, and, in some cases, allow relatively minor faults to turn into cascading failures. Solar and wind energy provide benefits and can support grid stability; however, several challenges and gaps in understanding must be explored and addressed before this can be realized. This paper provides a comprehensive literature review of grid codes, modeling techniques, and tools, as well as current methods for responding to various faults. It also presents an overview of the industry's state as it relates to grid fault response in the presence of inverter-based resources.


[61] 2603.27831

A Sensitivity Analysis of Flexibility from GPU-Heavy Data Centers

The rapid growth of GPU-heavy data centers has significantly increased electricity demand and created challenges for grid stability. Our paper investigates the extent to which an energy-aware job scheduling algorithm can provide flexibility in GPU-heavy data centers. Compared with the traditional first-in first-out (FIFO) baseline, we show that more efficient job scheduling not only increases profit, but also brings latent power flexibility during peak-price periods. This flexibility is achieved by shifting toward lower-energy jobs, that is, preferentially executing jobs with lower GPU utilization and smaller node requirements, when the electricity price is high. We demonstrate that data centers with lower queue length and higher variance in job characteristics, such as job GPU utilization and job size, offer the greatest flexibility potential. Finally, we show that data center flexibility is highly price sensitive: a 7% demand reduction is achieved with a small incentive, but unrealistically high prices are required to achieve a 33% reduction.


[62] 2603.27837

Estimation of Regions of Attraction for Nonlinear Systems via Coordinate-Transformed TS Models

This paper presents a novel method for estimating larger regions of attraction (ROAs) for continuous-time nonlinear systems modeled via the Takagi-Sugeno (TS) framework. While classical approaches rely on a single TS representation derived from the original nonlinear system to compute an ROA using Lyapunov-based analysis, the proposed method enhances this process through a systematic coordinate transformation strategy. Specifically, we construct multiple TS models, each obtained from the original nonlinear system under a distinct linear coordinate transformation. Each transformed system yields a local ROA estimate, and the overall ROA is taken as the union of these individual estimates. This strategy leverages the variability introduced by the transformations to reduce conservatism and expand the certified stable region. Numerical examples demonstrate that this approach consistently provides larger ROAs compared to conventional single-model TS-based techniques, highlighting its effectiveness and potential for improved nonlinear stability analysis.


[63] 2603.27882

iBEAMS: A Unified Framework for Secure and Energy-Efficient ISAC-MIMO Systems leveraging Bayesian Enhanced learning, and Adaptive Game-Theoretic Multi-Layer Strategies

Next generation ISAC networks operating in the mmWave and THz bands must provide physical layer secrecy against potential eavesdroppers (mobile and static) while coordinating distributed hybrid edge nodes under stringent power and QoS constraints. However, these requirements are rarely addressed in a unified manner in existing ISAC physical layer security designs. This paper proposes iBEAMS, a hierarchical Stackelberg--GNE--Bayesian framework for secure and energy efficient ISAC with distributed hybrid nodes. The proposed architecture integrates: (i) a Stackelberg leader at the ISAC base station that jointly optimizes total transmit power, power splitting among confidential data, artificial noise, and sensing, and broadcasts incentive prices to shape follower utilities; (ii) a Generalized Nash Equilibrium Game in which hybrid nodes select transmit powers and transmission versus jamming roles under coupled interference constraints and base-station-imposed leakage penalties; and (iii) a Bayesian cooperative refinement layer that forms geometry-aware jamming coalitions aligned with the posterior distribution of the eavesdropper's Angle of Arrival. Simulations over carrier frequencies from 28 GHz to 3 THz demonstrate hierarchical convergence of both base station and hybrid node decisions with stable cooperative friendly jamming. iBEAMS attains approximately 4.4--4.7 bps/Hz average secrecy rate, achieves about $2\times$ higher Secrecy Energy Efficiency (SEE), and delivers 30--70% higher SEE than a Stackelberg-decision-based baseline, while maintaining zero outage at 28 GHz. Moreover, the posterior-aligned jamming remains sharply directive and resilient under mobile eavesdroppers and increasing adversary density, indicating that iBEAMS can simultaneously act against static and mobile adversaries while coordinating hybrid edge nodes under limited power and QoS constraints.


[64] 2603.27893

MPC as a Copilot: A Predictive Filter Framework with Safety and Stability Guarantees

Ensuring both safety and stability remains a fundamental challenge in learning-based control, where goal-oriented policies often neglect system constraints and closed-loop state convergence. To address this limitation, this paper introduces the Predictive Safety--Stability Filter (PS2F), a unified predictive filter framework that guarantees constraint satisfaction and asymptotic stability within a single architecture. The PS2F framework comprises two cascaded optimal control problems: a nominal model predictive control (MPC) layer that serves solely as a copilot, implicitly defining a Lyapunov function and generating safety- and stability-certified predicted trajectories, and a secondary filtering layer that adjusts the external command to remain within a provably safe and stable region. This cascaded structure enables PS2F to inherit the theoretical guarantees of nominal MPC while accommodating goal-oriented external commands. Rigorous analysis establishes recursive feasibility and asymptotic stability of the closed-loop system without introducing additional conservatism beyond that associated with the nominal MPC. Furthermore, a time-varying parameterisation allows PS2F to transition smoothly between safety-prioritised and stability-oriented operation modes, providing a principled mechanism for balancing exploration and exploitation. The effectiveness of the proposed framework is demonstrated through comparative numerical experiments.


[65] 2603.27902

On the Computation of Backward Reachable Sets for Max-Plus Linear Systems with Disturbances

This paper investigates one-step backward reachability for uncertain max-plus linear systems with additive disturbances. Given a target set, the problem is to compute the set of states from which there exists an admissible control input such that, for all admissible disturbances, the successor state remains in the target set. This problem is closely related to safety analysis and is challenging due to the high computational complexity of existing approaches. To address this issue, we develop a computational framework based on tropical polyhedra. We assume that the target set, the control set, and the disturbance set are all represented as tropical polyhedra, and study the structural properties of the associated backward operators. In particular, we show that these operators preserve the tropical-polyhedral structure, which enables the constructive computation of reachable sets within the same framework. The proposed approach provides an effective geometric and algebraic tool for reachability analysis of uncertain max-plus linear systems. Illustrative examples are included to demonstrate the proposed method.
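
For readers unfamiliar with the algebra, the Python sketch below shows the max-plus operations behind a one-step successor of the standard form x(k+1) = A ⊗ x(k) ⊕ B ⊗ u(k), where ⊕ is max and ⊗ is ordinary addition. It covers only the forward map under that assumed form: the disturbance term and the tropical-polyhedral backward operators studied in the paper are not modeled, and the matrices are made-up values.

```python
NEG_INF = float("-inf")  # the max-plus "zero" element (epsilon)


def maxplus_matvec(A, x):
    """Max-plus product A (x) x: entry i is max_j (A[i][j] + x[j])."""
    return [max(a + xj for a, xj in zip(row, x)) for row in A]


def maxplus_add(x, y):
    """Max-plus sum x (+) y: entrywise maximum."""
    return [max(a, b) for a, b in zip(x, y)]


def successor(A, B, x, u):
    """One step of x+ = A (x) x (+) B (x) u (no disturbance, an assumption)."""
    return maxplus_add(maxplus_matvec(A, x), maxplus_matvec(B, u))


A = [[2.0, NEG_INF],
     [3.0, 1.0]]
B = [[0.0],
     [NEG_INF]]
x_next = successor(A, B, x=[0.0, 4.0], u=[6.0])
```

Here A ⊗ x = [2, 5] and B ⊗ u = [6, -inf], so the successor state is [6, 5]; a backward reachable set collects the states x for which some admissible u keeps this successor inside the target set.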


[66] 2603.27909

Data is All You Need: Markov Chain Car-Following (MC-CF) Model

Car-following behavior is fundamental to traffic flow theory, yet traditional models often fail to capture the stochasticity of naturalistic driving. This paper introduces a new car-following modeling category called the empirical probabilistic paradigm, which bypasses conventional parametric assumptions. Within this paradigm, we propose the Markov Chain Car-Following (MC-CF) model, which represents state transitions as a Markov process and predicts behavior by randomly sampling accelerations from empirical distributions within discretized state bins. Evaluation of the MC-CF model trained on the Waymo Open Motion Dataset (WOMD) demonstrates that its variants significantly outperform physics-based models including IDM, Gipps, FVDM, and SIDM in both one-step and open-loop trajectory prediction accuracy. Statistical analysis of transition probabilities confirms that the model-generated trajectories are indistinguishable from real-world behavior, successfully reproducing the probabilistic structure of naturalistic driving across all interaction types. Zero-shot generalization on the Naturalistic Phoenix (PHX) dataset further confirms the model's robustness. Finally, microscopic ring road simulations validate the framework's scalability. By incrementally integrating unconstrained free-flow trajectories and high-speed freeway data (TGSIM) alongside a conservative inference strategy, the model drastically reduces collisions, achieving zero crashes in multiple equilibrium and shockwave scenarios, while successfully reproducing naturalistic and stochastic shockwave propagation. Overall, the proposed MC-CF model provides a robust, scalable, and calibration-free foundation for high-fidelity stochastic traffic modeling, uniquely suited for the data-rich future of intelligent transportation.
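
The core mechanism described above, sampling accelerations from empirical distributions within discretized state bins, can be sketched in Python as follows. This is a simplified illustration rather than the MC-CF model itself: the state variables (gap, speed, relative speed), the bin widths, the zero-acceleration fallback for unseen bins, and the class name `EmpiricalCarFollowing` are all assumptions.

```python
import random
from collections import defaultdict


class EmpiricalCarFollowing:
    """Samples accelerations from per-bin empirical data (illustrative only)."""

    def __init__(self, bin_widths, seed=0):
        self.bin_widths = bin_widths        # one width per state variable
        self.samples = defaultdict(list)    # bin index -> observed accelerations
        self.rng = random.Random(seed)

    def _bin(self, state):
        # Floor division maps each continuous variable to an integer bin index.
        return tuple(int(s // w) for s, w in zip(state, self.bin_widths))

    def fit(self, states, accelerations):
        for state, acc in zip(states, accelerations):
            self.samples[self._bin(state)].append(acc)

    def sample_acceleration(self, state):
        bucket = self.samples.get(self._bin(state))
        if not bucket:
            return 0.0  # hypothetical fallback for bins with no data
        return self.rng.choice(bucket)


# Made-up observations: (gap [m], speed [m/s], relative speed [m/s]).
model = EmpiricalCarFollowing(bin_widths=(5.0, 2.0, 1.0))
model.fit(
    states=[(12.0, 9.0, -0.5), (13.0, 9.5, -0.2), (40.0, 20.0, 0.3)],
    accelerations=[-1.0, -0.8, 0.4],
)
a_next = model.sample_acceleration((11.0, 8.5, -0.9))
```

Because predictions are drawn from stored observations rather than a fitted formula, no parameter calibration is involved; the quality of the output depends on how densely the data cover each bin.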


[67] 2603.27934

Collision Avoidance Control for a Two-wheeled Vehicle under Stochastic Vibration using an Almost Sure Control Barrier Function

In recent years, many control problems for autonomous mobile robots have been studied. In particular, the robots are required to be safe; that is, they need to be controlled to avoid colliding with people or objects while traveling. In addition, since safety should be ensured even under irregular disturbances, the control for safety is required to be effective for stochastic systems. In this study, we design an almost sure safety-critical control law, which ensures safety with probability one, for a two-wheeled vehicle based on the stochastic control barrier function approach. In the procedure, we also consider a system model using the relative distance measured by a 2D LiDAR. The validity of the proposed control scheme is confirmed by experiments on a collision avoidance problem for a two-wheeled vehicle under vibration.


[68] 2603.27943

Stochastic Safety-critical Control Compensating Safety Probability for Marine Vessel Tracking

A marine vessel is a nonlinear system subject to irregular disturbances such as wind and waves, which cause tracking errors between the nominal and actual trajectories. In this study, a nonlinear vessel maneuvering model that includes a tracking controller is formulated and then controlled using a linear approximation around the nominal trajectory. The resulting stochastic linearized system is analyzed using a stochastic zeroing control barrier function (ZCBF). A stochastic safety compensator is designed to ensure probabilistic safety, and its effectiveness is verified through numerical simulations.


[69] 2603.27961

Radar Cross Section Characterization of Quantized Reconfigurable Intelligent Surfaces

We present a radar sensing framework based on a low-complexity, quantized reconfigurable intelligent surface (RIS) that enables programmable manipulation of electromagnetic wavefronts for enhanced detection in non-specular and shadowed regions. We develop closed-form expressions for the scattered field and radar cross section (RCS) of phase-quantized RIS apertures based on aperture field theory, accurately capturing the effects of quantized phase, periodicity, and grating lobes on radar detection performance. The theory enables us to analyze the RIS's RCS along both the forward and backward paths from the radar to the target. The theory is benchmarked against full-wave electromagnetic simulations incorporating realistic unit-cell amplitude and phase responses. To validate practical feasibility, a $[16\times10]$ 1-bit RIS operating at 5.5 GHz is fabricated and experimentally characterized inside an anechoic chamber. Measurements of steering angles, beam-squint errors, and peak-to-specular ratios of the RCS patterns exhibit strong agreement with analytical and simulated results. Further experiments demonstrate that the RIS can redirect the beam in a non-specular direction and recover micro-Doppler signatures that remain undetectable with a conventional radar deployment.


[70] 2603.27998

BiFormer3D: Grid-Free Time-Domain Reconstruction of Head-Related Impulse Responses with a Spatially Encoded Transformer

Individualized head-related impulse responses (HRIRs) enable binaural rendering, but dense per-listener measurements are costly. We address HRIR spatial up-sampling from sparse per-listener measurements: given a few measured HRIRs for a listener, predict HRIRs at unmeasured target directions. Prior learning methods often work in the frequency domain, rely on minimum-phase assumptions or separate timing models, and use a fixed direction grid, which can degrade temporal fidelity and spatial continuity. We propose BiFormer3D, a time-domain, grid-free binaural Transformer for reconstructing HRIRs at arbitrary directions from sparse inputs. It uses sinusoidal spatial features, a Conv1D refinement module, and auxiliary interaural time difference (ITD) and interaural level difference (ILD) heads. On SONICOM, it improves normalized mean squared error (NMSE), cosine distance, and ITD/ILD errors over prior methods; ablations validate modules and show minimum-phase pre-processing is unnecessary.


[71] 2603.28011

Learning Certified Neural Network Controllers Using Contraction and Interval Analysis

We present a novel framework that jointly trains a neural network controller and a neural Riemannian metric with rigorous closed-loop contraction guarantees using formal bound propagation. Directly bounding the symmetric Riemannian contraction linear matrix inequality causes unnecessary overconservativeness due to poor dependency management. Instead, we analyze an asymmetric matrix function $G$, where $2^n$ GPU-parallelized corner checks of its interval hull verify that an entire interval subset $X$ is a contraction region in a single shot. This eliminates the sample complexity problems encountered with previous Lipschitz-based guarantees. Additionally, for control-affine systems under a Killing field assumption, our method produces an explicit tracking controller capable of exponentially stabilizing any dynamically feasible trajectory using just two forward inferences of the learned policy. Using JAX and $\texttt{immrax}$ for linear bound propagation, we apply this approach to a full 10-state quadrotor model. In under 10 minutes of post-JIT training, we simultaneously learn a control policy $\pi$, a neural contraction metric $\Theta$, and a verified 10-dimensional contraction region $X$.


[72] 2603.28016

Input-to-state stabilization of linear systems under data-rate constraints

We study feedback stabilization of continuous-time linear systems under finite data-rate constraints in the presence of unknown disturbances. A communication and control strategy based on sampled and quantized state measurements is proposed, where the quantization range is dynamically adjusted using reachable-set propagation and disturbance estimates derived from quantization parameters. The strategy alternates between stabilizing and searching stages to handle escapes from the quantization range and employs an additional quantization symbol to ensure robustness near the equilibrium. It guarantees input-to-state stability (ISS), improving upon existing results that yield only practical ISS or lack explicit data-rate conditions. Simulation results illustrate the effectiveness of the strategy.


[73] 2603.28018

Low-Latency Edge LLM Handover via Joint KV Cache Transfer and Token Prefill

Edge deployment of large language models (LLMs) can reduce latency for interactive services, but mobility introduces service interruptions when a user equipment (UE) hands over between base stations (BSs). To promptly resume decoding, the target-side edge server must recover the UE context state, which can be provisioned either by token forwarding followed by prefill computation or by direct key-value (KV) cache transmission over backhaul. This paper proposes a unified handover (HO) design that jointly selects the prefill length and schedules backhaul KV cache delivery to minimize the worst-user LLM HO delay for multiple UEs. The resulting scheme admits a tractable step-wise solution with explicit feasibility conditions and a constructive rate-scheduling policy. Simulations show that the proposed method consistently outperforms baselines across a wide range of backhaul capacities, prefill speeds, and context sizes, providing practical guidelines for mobility-aware Edge LLM token streaming.


[74] 2603.28081

Transformer-Based Prognostics: Enhancing Network Availability by Improved Monitoring of Optical Fiber Amplifiers

We enhance optical network availability and reliability through a lightweight transformer model that predicts optical fiber amplifier lifetime from condition-based monitoring data, enabling real-time, edge-level predictive maintenance and advancing deployable AI for autonomous network operation.


[75] 2603.28083

Deep Learning Based Site-Specific Channel Inference Using Satellite Images

Site-specific channel inference plays a critical role in the design and evaluation of next-generation wireless communication systems by considering the surrounding propagation environment. However, traditional methods are unscalable, while existing AI-based approaches using satellite images are confined to predicting large-scale fading parameters, lacking the capacity to reconstruct the complete channel impulse response (CIR). To address this limitation, we propose a deep learning-based site-specific channel inference framework using satellite images to predict structured Tapped Delay Line (TDL) parameters. We first establish a joint channel-satellite dataset based on measurements. Then, a novel deep learning network is developed to reconstruct the channel parameters. Specifically, a cross-attention-fused dual-branch pipeline extracts macroscopic and microscopic environmental features, while a recurrent tracking module captures the long-term dynamic evolution of multipath components. Experimental results demonstrate that the proposed method achieves high-quality reconstruction of the CIR in unseen scenarios, with a Power Delay Profile (PDP) Average Cosine Similarity exceeding 0.96. This work provides a pathway toward site-specific channel inference for future dynamic wireless networks.
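
To make the reported metric concrete, the sketch below computes a cosine similarity between two power delay profiles derived from tapped-delay-line coefficients. It assumes the PDP is the per-tap power and that the "average" is taken over test samples; the abstract does not give the exact definition, and the tap values are made up for illustration.

```python
import math

def pdp_from_taps(taps):
    """Power delay profile: per-tap power |h_k|^2 of a tapped-delay-line CIR."""
    return [abs(h) ** 2 for h in taps]

def cosine_similarity(p, q):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm

# Hypothetical measured and predicted complex tap gains for one link.
measured = [1.0 + 0.0j, 0.0 + 0.5j, 0.25 + 0.0j]
predicted = [0.9 + 0.1j, 0.1 + 0.45j, 0.2 - 0.1j]

score = cosine_similarity(pdp_from_taps(measured), pdp_from_taps(predicted))
```

Averaging `score` over all links in a test set would give a single figure comparable in kind to the one quoted above.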


[76] 2603.28121

Joint Time-Phase Synchronization for Distributed Sensing Networks via Feature-Level Hyper-Plane Regression

Achieving coherent integration in distributed Internet of Things (IoT) sensing networks requires precise synchronization to jointly compensate clock offsets and radio-frequency (RF) phase errors. Conventional two-step protocols suffer from time-phase coupling, where residual timing offsets degrade phase coherence. This paper proposes a generalized hyper-plane regression (GHR) framework for joint calibration by transforming coupled spatiotemporal phase evolution into a unified regression model, enabling effective parameter decoupling. To support resource-constrained IoT edge nodes, a feature-level distributed architecture is developed. By adopting a linear frequency-modulated (LFM) waveform, the model order is reduced, yielding linear computational complexity. In addition, a unidirectional feature transmission mechanism eliminates the communication overhead of bidirectional timestamp exchange, making the approach suitable for resource-constrained IoT networks. Simulation results demonstrate reliable picosecond-level synchronization accuracy under severe noise across kilometer-scale distributed IoT sensing networks.


[77] 2603.28176

Weighted Sum-Rate Maximization for RIS-UAV-assisted Space-Air-Ground Integrated Network with RSMA

In this paper, a rate-splitting multiple access (RSMA) based joint optimization framework for the space-air-ground integrated network (SAGIN) is proposed, where the satellite and base stations employ uniform planar array (UPA) antennas for signal transmission, and unmanned aerial vehicles (UAVs) relay the satellite signals. Earth stations (ESs) and user equipments (UEs) receive signals from satellite and base stations (BSs), respectively, resulting in mutual interference. We first model the channels and signals in this scenario and analyse the interference at BSs and UEs. Then, we formulate a joint optimization problem aimed at maximizing the weighted sum-rate, involving beamforming, RIS-UAV deployment and phase shifts, and rate splitting. However, this problem is highly non-convex. To tackle this challenge, we apply a block coordinate descent (BCD) approach to decompose the problem and employ the weighted minimum mean square error (WMMSE) method to transform the non-convex objective function. For the rate-splitting sub-problem, a greedy algorithm is proposed and a successive convex approximation (SCA) algorithm is used for beamforming. Besides, the alternating direction method of multipliers (ADMM) algorithm is employed for the RIS phase-shift problem with unit-modulus constraints, and an exhaustive search method is adopted for the complex UAV positioning and orientation. Simulation results validate that the proposed algorithm achieves superior performance in terms of user weighted sum-rate.


[78] 2603.28192

Analysis and Design of Reset Control Systems via Base Linear Scaled Graphs

In this letter, we prove that under mild conditions, the scaled graph of a reset control system is bounded by the scaled graph of its underlying base linear system, i.e., the system without resets. Building on this new insight, we establish that the negative feedback interconnection of a linear time-invariant plant and a reset controller is stable, if the scaled graphs of the underlying base linear components are strictly separated. This result simplifies reset system analysis, as stability conditions reduce to verifying properties of linear time-invariant systems. We exploit this result to develop a systematic approach for reset control system design. Our framework also accommodates reset systems with time-regularization, which were not addressed in the context of scaled graphs before.


[79] 2603.28195

Toward Multi-Satellite Cooperative Transmission: A Joint Framework for CSI Acquisition, Feedback, and Phase Synchronization

The stringent link budget, caused by long propagation distances and payload constraints, poses a fundamental bottleneck for single-satellite transmission. Although LEO mega-constellations make multi-satellite cooperative transmission (MSCT), such as distributed precoding (DP), increasingly feasible, its cooperative gains critically rely on stringent time-frequency-phase synchronization (TFP-Sync), which is difficult to maintain under rapid channel variation and feedback latency. To address this issue, this paper proposes a joint CSI acquisition, feedback, and phase-level synchronization (JCAFPS) framework for MSCT. Specifically, to enable reliable, overhead-efficient CSI acquisition, we design a beam-domain adjustable phase-shift tracking reference signal (TRS) transmission scheme, along with criteria for the TRS and CSI-feedback periods. Then, exploiting deterministic orbital motion and dominant LoS propagation, we establish a polynomial model for the temporal evolution of delay and Doppler shift, and derive an OFDM-based multi-satellite signal model under non-ideal synchronization. The analysis reveals that, unlike the single-satellite case, the composite multi-satellite channel exhibits nonlinear time-frequency-varying phase behavior, necessitating symbol- and subcarrier-wise phase precompensation for coherent transmission. Based on these results, we develop a practical closed-loop realization integrating single-TRS-based channel parameter estimation, multi-TRS-based channel prediction, predictive CSI feedback, and user-specific TFP precompensation. Numerical results demonstrate that the proposed framework achieves accurate CSI acquisition and precise TFP-Sync, enabling DP-based dual-satellite cooperative transmission to approach the theoretical 6 dB power gain over single-satellite transmission, while remaining robust under extended prediction durations and enlarged TRS periods.
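
The 6 dB figure follows from coherent combining: two equal-amplitude signals that add in phase double the received amplitude and therefore quadruple the power. The sketch below works this out and shows how a residual phase error erodes the gain. It assumes equal received amplitudes and that each satellite transmits at the single-satellite power; the function name is hypothetical.

```python
import cmath
import math

def coherent_gain_db(phase_error_rad):
    """Power gain of two equal-amplitude signals summed with a residual
    phase error, relative to one signal alone: |1 + e^{j*phi}|^2 in dB."""
    combined = 1.0 + cmath.exp(1j * phase_error_rad)
    return 10.0 * math.log10(abs(combined) ** 2)

ideal = coherent_gain_db(0.0)             # perfectly aligned: 10*log10(4)
degraded = coherent_gain_db(math.pi / 2)  # 90 degrees off: 10*log10(2)
```

Under these assumptions the ideal case gives about 6.02 dB, and a 90-degree phase error leaves about 3.01 dB, the same as incoherent power addition.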


[80] 2603.28217

An Optimal Battery-Free Approach for Emission Reduction by Storing Solar Surplus in Building Thermal Mass

Decarbonization in buildings calls for advanced control strategies that coordinate on-site renewables, grid electricity, and thermal demand. Literature approaches typically rely on demand side management strategies or on active energy storage, like batteries. However, the first solution often neglects carbon-aware objectives, and could lead to grid overload issues, while batteries entail environmental, end-of-life, and cost concerns. To overcome these limitations, we propose an optimal, carbon-aware optimization strategy that exploits the building's thermal mass as passive storage, avoiding dedicated batteries. Specifically, when a surplus of renewable energy is available, our strategy computes the optimal share of surplus to store by temporarily adjusting the indoor temperature setpoint within comfort bounds. Thus, by explicitly accounting for forecasts of building energy consumption, solar production, and time-varying grid carbon intensity, our strategy enables emissions-aware load shifting while maintaining comfort. We evaluate the approach by simulating three TRNSYS models of the same system with different thermal mass. In all cases, the results show consistent reductions in grid electricity consumption with respect to a baseline that does not leverage surplus renewable generation. These findings highlight the potential of thermal-mass-based control for building decarbonization.


[81] 2603.28262

Spectral Segmented Linear Regression for Coarse Carrier Frequency Offset Estimation in Optical LEO Satellite Communications

Carrier frequency offset estimation (CFOE) is a critical stage in modern coherent optical communication systems. Although conventional all-digital techniques perform reliably in typical fiber-optic communication links, CFOE often becomes a major bottleneck in low-symbol-rate scenarios with large carrier CFOs (approaching the signal bandwidth) and severe additive noise levels. These conditions are particularly prevalent in links between optical ground stations (OGSs) and low Earth orbit (LEO) satellites, where Doppler-induced frequency shifts of several gigahertz and atmospheric attenuation significantly degrade CFOE performance and can render traditional methods ineffective. In this paper, we propose a robust non-data-aided (NDA) scheme designed for wide-range CFOE. Such a coarse CFOE (C-CFOE) algorithm partially compensates for the CFO, enabling the operation of a subsequent fine CFOE algorithm. By applying low-complexity operations to the spectrum of the received signal, we recast the frequency estimation task as a segmented linear regression (SLR) problem. Numerical simulations in stress-test scenarios involving large CFOs, low SNR, and low symbol rates show that the proposed approach achieves good estimation accuracy and robust convergence. Experimental offline validation further confirms the practical feasibility of the method.
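
For readers unfamiliar with segmented linear regression, the sketch below fits two straight-line segments to a sequence by trying every breakpoint and keeping the split with the smallest total squared error. It is a generic illustration only: the abstract does not describe how the spectrum is pre-processed or how the breakpoint maps to a frequency offset, so the data and function names here are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, squared error)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse

def two_segment_fit(xs, ys, min_points=2):
    """Try every breakpoint; keep the split with the smallest total error.
    Returns (breakpoint index, left (slope, intercept), right (slope, intercept))."""
    best = None
    for k in range(min_points, len(xs) - min_points + 1):
        left = fit_line(xs[:k], ys[:k])
        right = fit_line(xs[k:], ys[k:])
        total = left[2] + right[2]
        if best is None or total < best[0]:
            best = (total, k, left[:2], right[:2])
    return best[1], best[2], best[3]

# Toy data (assumed): a flat segment followed by a ramp of slope 2.
xs = list(range(10))
ys = [1.0] * 5 + [2.0 * (x - 4) for x in range(5, 10)]

breakpoint_index, left_line, right_line = two_segment_fit(xs, ys)
```

The exhaustive breakpoint scan costs a line fit per candidate split; it is chosen here for clarity, not efficiency.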


[82] 2603.28264

Clustered Movable Pinching Antennas: Realizing Beamforming Gains and Target Diversity in ISAC Systems with Look-Angle-Dependent RCS

We investigate a novel integrated sensing and communication (ISAC) system enabled by pinching antennas (PAs), which are dynamically activated along a dielectric waveguide. Unlike prior designs, the PAs are organized into multiple clusters of movable antennas. The movement of the antennas within each cluster enables transmit beamforming, while the spatial separation of different clusters allows the system to illuminate the target from multiple angular perspectives.


[83] 2603.28280

Multimodal-NF: A Wireless Dataset for Near-Field Low-Altitude Sensing and Communications

Environment-aware 6G wireless networks demand the deep integration of multimodal and wireless data. However, most existing datasets are confined to 2D terrestrial far-field scenarios, lacking the 3D spatial context and near-field characteristics crucial for low-altitude extremely large-scale multiple-input multiple-output (XL-MIMO) systems. To bridge this gap, this letter introduces Multimodal-NF, a large-scale dataset and specialized generation framework. Operating in the upper midband, it synchronizes high-fidelity near-field channel state information (CSI) and precise wireless labels (e.g., Top-5 beam indices, LoS/NLoS) with comprehensive sensory modalities (RGB images, LiDAR point clouds, and GPS). Crucially, these multimodal priors provide spatial semantics that help reduce the near-field search space and thereby lower the overhead of wireless sensing and communication tasks. Finally, we validate the dataset through representative case studies, demonstrating its utility and effectiveness. The open-source generator and dataset are available at this https URL.


[84] 2603.28283

Distributed User Scheduling in Multi-Cell MIMO O-RAN with QoS Constraints

Distributed scheduling is essential for open radio access networks (O-RAN) employing advanced physical-layer techniques such as multi-user MIMO (MU-MIMO), carrier aggregation (CA), and joint transmission (JT). This work investigates the multi-component-carrier (multi-CC) resource block group (RBG) scheduling in MU-MIMO O-RAN with both JT and non-JT users. We formulate a scheduling optimization problem to maximize throughput subject to user-specific quality of service (QoS) requirements while ensuring the consistent allocations across cooperating O-RAN radio units (O-RUs) that JT requires. The strong variable coupling, non-convexity, and combinatorial complexity make the problem highly challenging. To tackle this, we extend the eigen-based zero-forcing transceiver design to JT users and leverage massive MIMO asymptotic properties to derive a tractable, separable rate approximation. Building on this, we develop two solutions: a centralized block coordinate descent benchmark and a distributed scheduler aligned with the O-RAN architecture. The proposed distributed scheme achieves near-centralized performance with only one round of lightweight coordination among cells, significantly reducing complexity and delay. Extensive simulations validate that our distributed scheduler achieves high scalability, fast convergence, and better QoS satisfaction rate in large-scale MU-MIMO networks.


[85] 2603.28286

Competitor-aware Race Management for Electric Endurance Racing

Electric endurance racing is characterized by severe energy constraints and strong aerodynamic interactions. Determining race-winning policies therefore becomes a fundamentally multi-agent, game-theoretic problem. These policies must jointly govern low-level driver inputs as well as high-level strategic decisions, including energy management and charging. This paper proposes a bi-level framework for competitor-aware race management that combines game-theoretic optimal control with reinforcement learning. At the lower level, a multi-agent game-theoretic optimal control problem is solved to capture aerodynamic effects and asymmetric collision-avoidance constraints inspired by motorsport rules. Using this single-lap problem as the environment, reinforcement learning agents are trained to allocate battery energy and schedule pit stops over an entire race. The framework is demonstrated in a two-agent, 45-lap simulated race. The results show that effective exploitation of aerodynamic interactions is decisive for race outcome, with strategies that prioritize finishing position differing fundamentally from single-agent, minimum-time approaches.


[86] 2603.28305

Toward Distributed User Scheduling and Coordinated Beamforming in Multi-Cell mmWave Networks: A Sensing-Assisted Framework

Providing guaranteed quality of service for cell-edge users remains a longstanding challenge in wireless networks. While coordinated interference management was proposed decades ago, its potential has been limited by computational complexity and backhaul resource constraints. Distributed user scheduling and coordinated beamforming (D-USCB) offers a scalable solution but faces practical challenges in acquiring inter-cell channel state information (CSI), as base stations (BSs) are often restricted to signal strength measurements, and high-dimensional CSI exchange incurs substantial overhead. Inspired by integrated sensing and communication (ISAC), this paper proposes a sensing-assisted D-USCB (SD-USCB) framework to maximize the network throughput of multi-cell mmWave networks. Firstly, the framework leverages channel knowledge maps (CKMs) that map user locations to CSI estimates, where user locations are proactively sensed via ISAC echoes. Secondly, we employ a signal-to-average-leakage-plus-interference-plus-noise ratio (SALINR) metric for distributed ISAC beamforming optimization, in which BSs simultaneously communicate with users and sense their locations. These two components jointly enable distributed coordinated transmission with only user location information exchanged among BSs, thereby substantially reducing backhaul overhead. In addition, we devise efficient distributed user scheduling and ISAC beamforming algorithms to jointly optimize communication and sensing performance. Extensive numerical results demonstrate significant improvements in network throughput, validating the efficacy of the proposed framework.


[87] 2603.28318

Integrated sensing and communications in the 3GPP New Radio: sensing limits

Integrated Sensing and Communications (ISAC) is regarded as a key element of the beyond-fifth-generation (5G) and sixth-generation (6G) systems, raising the question of whether current 5G New Radio (NR) signal structures can meet the sensing accuracy requirements specified by the Third Generation Partnership Project (3GPP). This paper addresses this issue by analyzing the fundamental limits of range and velocity estimation through the Cramér-Rao lower bound (CRLB) for a monostatic unmanned aerial vehicle (UAV) sensing use case currently under consideration in the 3GPP standardization process. The study focuses on standardized signals and also evaluates the potential performance gains achievable with reference signals specifically designed for sensing purposes. The compact CRLB expressions derived in this work highlight the fundamental trade-offs between estimation accuracy and system parameters. The results further indicate that information from multiple slots must be exploited in the estimation process to attain the performance targets defined by the 3GPP. As a result, the 5G NR positioning reference signal (PRS), whose patterns may be suboptimal for velocity estimation when using single-slot resources, becomes suitable when multislot estimation is employed. Finally, we propose a two-step iterative range and radial-velocity estimator that attains the CRLB over a significantly wider range of distances than conventional maximum-likelihood (ML) estimators, for which the well-known threshold effect severely limits the distance range over which the accuracy requirements imposed by the 3GPP are satisfied.


[88] 2603.28323

Data Center Chiller Plant Optimization via Mixed-Integer Nonlinear Differentiable Predictive Control

We present a computationally tractable framework for real-time predictive control of multi-chiller plants that involve both discrete and continuous control decisions coupled through nonlinear dynamics, resulting in a mixed-integer optimal control problem. To address this challenge, we extend Differentiable Predictive Control (DPC) -- a self-supervised, model-based learning methodology for approximately solving parametric optimal control problems -- to accommodate mixed-integer control policies. We benchmark the proposed framework against a state-of-the-art Model Predictive Control (MPC) solver and a fast heuristic Rule-Based Controller (RBC). Simulation results demonstrate that our approach achieves significant energy savings over the RBC while maintaining orders-of-magnitude faster computation times than MPC, offering a scalable and practical alternative to conventional combinatorial mixed-integer control formulations.


[89] 2603.28440

A System-View Optimal Additional Active Power Control of Wind Turbines for Grid Frequency Support

Additional active power control (AAPC) of wind turbines (WTs) is essential to improve the transient frequency stability of low-inertia power systems. Most of the existing research has focused on imitating the frequency response of the synchronous generator (SG), known as virtual inertia control (VIC), but are such control laws optimal for the power systems? Inspired by this question, this paper proposes an optimal AAPC of WTs to maximize the frequency nadir following a major power deficit. By decoupling the WT response and the frequency dynamics, the optimal frequency trajectory is solved based on the trajectory model, and its universality is strictly proven. Then the optimal AAPC of WTs is constructed reversely based on the average system frequency (ASF) model with the optimal frequency trajectory as the desired control result. The proposed method can significantly improve the system frequency nadir. Meanwhile, its event insensitivity allows it to be deployed through on-line rolling updates under a hypothetical disturbance, avoiding the heavy post-event computational burden. Finally, simulation results in a two-machine power system and the IEEE 39 bus power system verify the effectiveness of the optimal AAPC of WTs.


[90] 2603.28450

An Accurate and Fast Start-up Scheme for Power System Real-time Emergency Control

With the development of PMUs in power systems, response-based real-time emergency control has become a promising way to prevent power outages when power systems are subjected to large disturbances. The first step in emergency control is to start up accurately and quickly when needed. To this end, this paper proposes a well-qualified start-up scheme for power system real-time emergency control. Three key technologies are proposed to ensure the effectiveness of the scheme: an instability index, a Critical Machines (CMs) identification algorithm, and a two-layer Single Machine Infinite Bus (SMIB) equivalence framework. The concave-convex area based instability index, which is used to identify the transient instability of the system, shows good accuracy and high reliability. The CMs identification algorithm can track the changes of CMs and form the proper SMIB system at each moment. The new two-layer SMIB equivalence framework, compared with conventional ones, can significantly reduce the communication burden and improve the computation efficiency. The simulations in two test power systems show that the scheme can identify the transient instability accurately and quickly, so that the system is restored to stability after the emergency control. Besides, the proposed method is robust to measurement errors, which enhances its practicality.


[91] 2603.28489

Video Generation Models as World Models: Efficient Paradigms, Architectures and Algorithms

The rapid evolution of video generation has enabled models to simulate complex physical dynamics and long-horizon causalities, positioning them as potential world simulators. However, a critical gap still remains between the theoretical capacity for world simulation and the heavy computational costs of spatiotemporal modeling. To address this, we comprehensively and systematically review video generation frameworks and techniques that consider efficiency as a crucial requirement for practical world modeling. We introduce a novel taxonomy in three dimensions: efficient modeling paradigms, efficient network architectures, and efficient inference algorithms. We further show that bridging this efficiency gap directly empowers interactive applications such as autonomous driving, embodied AI, and game simulation. Finally, we identify emerging research frontiers in efficient video-based world modeling, arguing that efficiency is a fundamental prerequisite for evolving video generators into general-purpose, real-time, and robust world simulators.


[92] 2603.28498

MRI-to-CT synthesis using drifting models

Accurate MRI-to-CT synthesis could enable MR-only pelvic workflows by providing CT-like images with bone details while avoiding additional ionizing radiation. In this work, we investigate recently proposed drifting models for synthesizing pelvis CT images from MRI and benchmark them against convolutional neural networks (UNet, VAE), a generative adversarial network (WGAN-GP), a physics-inspired probabilistic model (PPFM), and diffusion-based methods (FastDDPM, DDIM, DDPM). Experiments are performed on two complementary datasets: Gold Atlas Male Pelvis and the SynthRAD2023 pelvis subset. Image fidelity and structural consistency are evaluated with SSIM, PSNR, and RMSE, complemented by qualitative assessment of anatomically critical regions such as cortical bone and pelvic soft-tissue interfaces. Across both datasets, the proposed drifting model achieves high SSIM and PSNR and low RMSE, surpassing strong diffusion baselines and conventional CNN-, VAE-, GAN-, and PPFM-based methods. Visual inspection shows sharper cortical bone edges, improved depiction of sacral and femoral head geometry, and reduced artifacts or over-smoothing, particularly at bone-air-soft tissue boundaries. Moreover, the drifting model attains these gains with one-step inference and inference times on the order of milliseconds, yielding a more favorable accuracy-efficiency trade-off than iterative diffusion sampling while remaining competitive in image quality. These findings suggest that drifting models are a promising direction for fast, high-quality pelvic synthetic CT generation from MRI and warrant further investigation for downstream applications such as MRI-only radiotherapy planning and PET/MR attenuation correction.


[93] 2603.28529

Intelligent Radio Resource Slicing for 6G In-Body Subnetworks

6G In-body Subnetworks (IBSs) represent a key enabler for supporting standalone eXtended Reality (XR) applications. IBSs are expected to operate as an underlay to existing cellular networks, giving rise to coexistence challenges when sharing radio resources with other cellular users, such as enhanced Mobile Broadband (eMBB) users. This resource allocation problem is highly dynamic and inherently non-convex due to heterogeneous service demands and fluctuating channel conditions. In this paper, we propose an intelligent radio resource slicing strategy based on the Soft Actor-Critic (SAC) deep reinforcement learning algorithm. The proposed SAC-based slicing method addresses the coexistence challenge between IBSs and eMBB users by optimizing a refined reward function that explicitly incorporates XR cross-modal delay alignment to ensure immersive experience while preserving eMBB service guarantees. Extensive system-level simulations are performed under realistic network conditions and the results demonstrate that the proposed method can enhance user experience by 12-85% under different network densities compared to baseline methods while maintaining the target data rate for eMBB users.


[94] 2603.28540

Measuring Cross-Jurisdictional Transfer of Medical Device Risk Concepts with Explainable AI

Medical device regulators in the United States (FDA), China (NMPA), and Europe (EU MDR) all use the language of risk, but classify devices through structurally different mechanisms. Whether these apparently shared concepts carry transferable classificatory signal across jurisdictions remains unclear. We test this by reframing explainable AI as an empirical probe of cross-jurisdictional regulatory overlap. Using 141,942 device records, we derive seven EU MDR risk factors, including implantability, invasiveness, and duration of use, and evaluate their contribution across a three-by-three transfer matrix. Under a symmetric extraction pipeline designed to remove jurisdiction-specific advantages, factor contribution is negligible in all jurisdictions, indicating that clean cross-jurisdictional signal is at most marginal. Under jurisdiction-specific pipelines, a modest gain appears only in the EU MDR-to-NMPA direction, but sensitivity analyses show that this effect is weak, context-dependent, and partly confounded by extraction and representation choices. Reverse-direction probes show strong asymmetry: FDA-derived factors do not transfer meaningfully in any direction, and NMPA-derived factors do not carry signal back to EU MDR. Zero-shot transfer further fails on EU MDR Class I, consistent with a mismatch between residual and positional class definitions. Overall, cross-jurisdictional transfer is sparse, asymmetric, and weak. Shared regulatory vocabulary does not, under this operationalisation, translate into strong portable classification logic. The findings challenge a common assumption in cross-jurisdictional regulatory AI and show how explainable AI can be used to measure, rather than assume, regulatory overlap.


[95] 2603.28559

Joint Energy Efficiency Optimization for Uplink Multiuser Movable Antenna-Based Wireless Systems Assisted by Movable-Element RIS

This paper investigates energy efficiency (EE) optimization for an uplink multiuser system assisted by a movable-element reconfigurable intelligent surface (ME-RIS) and a base station equipped with movable antennas (MA-BS). We jointly optimize the uplink postcoder vectors, user transmit powers, RIS phase shift, and the positions of both the BS antennas and RIS elements to maximize the system EE. The resulting non-convex fractional problem is solved using an alternating optimization (AO) framework, where subproblems are handled via Dinkelbach's method combined with successive convex approximation (SCA). Simulation results show that the proposed scheme achieves significant EE gains over fixed-antenna BS and fixed-element RIS benchmarks.
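
Dinkelbach's method turns a ratio maximization into a sequence of subtractive problems: for a current ratio estimate lambda, maximize numerator minus lambda times denominator, then update lambda to the ratio achieved. The sketch below applies it to a single-user toy problem, maximizing log2(1 + g p) / (p + P_c) over the transmit power p. The scalar model, its closed-form inner solution, and all parameter values are assumptions for illustration; the paper's subproblems are multi-variable and additionally use SCA.

```python
import math

def rate(p, gain):
    """Achievable rate in bit/s/Hz for transmit power p and channel gain."""
    return math.log2(1.0 + gain * p)

def dinkelbach_power(gain, p_circuit, p_max, tol=1e-9, max_iter=100):
    """Maximise rate(p) / (p + p_circuit) over 0 <= p <= p_max.
    Returns (power, energy efficiency)."""
    lam = 0.0
    p = p_max
    for _ in range(max_iter):
        # Inner problem: maximise rate(p) - lam * (p + p_circuit).
        # Setting the derivative to zero gives a water-filling-like solution.
        if lam > 0.0:
            p = 1.0 / (lam * math.log(2.0)) - 1.0 / gain
            p = min(max(p, 0.0), p_max)
        else:
            p = p_max
        residual = rate(p, gain) - lam * (p + p_circuit)
        if abs(residual) < tol:
            break  # lam now equals the ratio at p, i.e. the optimum
        lam = rate(p, gain) / (p + p_circuit)
    return p, lam

# Hypothetical parameters: channel gain 10, circuit power 1, power budget 5.
p_opt, ee_opt = dinkelbach_power(gain=10.0, p_circuit=1.0, p_max=5.0)
```

The returned efficiency can be checked against a brute-force scan of the power range, which is practical here only because the toy problem is one-dimensional.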


[96] 2603.28608

Fault-Tolerant MPC Control for Trajectory Tracking

An MPC controller uses a model of the dynamical system to plan an optimal control strategy for a finite horizon, which makes its performance intrinsically tied to the quality of the model. When faults occur, the compromised model will degrade the performance of the MPC, with the impact depending on the designed cost function. In this paper, we aim to devise a strategy that combines active fault identification with driving the system towards the desired trajectory. The explored approaches make use of an exact formulation of the problem in terms of set-based propagation resorting to Constrained Convex Generators (CCGs) and a suboptimal version that resorts to the singular value decomposition (SVD) to achieve the active fault isolation in order to adapt the model at runtime.


[97] 2603.28711

Learning a dynamic four-chamber shape model of the human heart for 95,695 UK Biobank participants

The human heart is a sophisticated system composed of four cardiac chambers with distinct shapes, which function in a coordinated manner. Existing shape models of the heart mainly focus on the ventricular chambers and they are derived from relatively small datasets. Here, we present a spatio-temporal (3D+t) statistical shape model of all four cardiac chambers, learnt from a large population of nearly 100,000 participants from the UK Biobank. A deep learning-based pipeline is developed to reconstruct 3D+t four-chamber meshes from the cardiac magnetic resonance images of the UK Biobank imaging population. Based on the reconstructed meshes, a 3D+t statistical shape model is learnt to characterise the shape variations and motion patterns of the four cardiac chambers. We reveal the associations of the four-chamber shape model with demographics, anthropometrics, cardiovascular risk factors, and cardiac diseases. Compared to conventional image-derived phenotypes, we validate that the four-chamber shape-derived phenotypes significantly enhance the performance in downstream tasks, including cardiovascular disease classification and heart age prediction. Furthermore, we demonstrate the effectiveness of shape-derived phenotypes in novel applications such as heart shape retrieval and heart re-identification from longitudinal data. To facilitate future research, we will release the learning-based mesh reconstruction pipeline, the four-chamber cardiac shape model, and return all derived four-chamber meshes to the UK Biobank.


[98] 2603.28714

VAANI: Capturing the language landscape for an inclusive digital India

Project VAANI is an initiative to create an India-representative multi-modal dataset that comprehensively maps India's linguistic diversity, starting with 165 districts across the country in its first two phases. Speech data is collected through a carefully structured process that uses image-based prompts to encourage spontaneous responses. Images are captured through a separate process that encompasses a broad range of topics, gathered from both within and across districts. The collected data undergoes a rigorous multi-stage quality evaluation, including both automated and manual checks, to ensure the highest possible standards in audio quality and transcription accuracy. Following this thorough validation, we have open-sourced around 289K images, approximately 31,270 hours of audio recordings, and around 2,067 hours of transcribed speech, encompassing 112 languages from 165 districts across 31 States and Union territories. Notably, a significant number of these languages are being represented for the first time in a dataset of this scale, making the VAANI project a groundbreaking effort in preserving and promoting linguistic inclusivity. This data can be instrumental in building inclusive speech models for India, and in advancing research and development across speech, image, and multimodal applications.


[99] 2603.28717

Can Hierarchical Cross-Modal Fusion Predict Human Perception of AI Dubbed Content?

Evaluating AI-generated dubbed content is inherently multi-dimensional, shaped by synchronization, intelligibility, speaker consistency, emotional alignment, and semantic context. Human Mean Opinion Scores (MOS) remain the gold standard but are costly and impractical at scale. We present a hierarchical multimodal architecture for perceptually meaningful dubbing evaluation, integrating complementary cues from audio, video, and text. The model captures fine-grained features such as speaker identity, prosody, and content from audio, facial expressions and scene-level cues from video, and semantic context from text, which are progressively fused through intra- and inter-modal layers. Lightweight LoRA adapters enable parameter-efficient fine-tuning across modalities. To overcome limited subjective labels, we derive proxy MOS by aggregating objective metrics with weights optimized via active learning. The proposed architecture was trained on 12k Hindi-English bidirectional dubbed clips, followed by fine-tuning with human MOS. Our approach achieves strong perceptual alignment (PCC > 0.75), providing a scalable solution for automatic evaluation of AI-dubbed content.


[100] 2603.28719

Alertness Optimization for Shift Workers Using a Physiology-based Mathematical Model

Sleep is vital for maintaining cognitive function, facilitating metabolic waste removal, and supporting memory consolidation. However, modern societal demands, particularly shift work, often disrupt natural sleep patterns. This can induce excessive sleepiness among shift workers in critical sectors such as healthcare and transportation and increase the risk of accidents. The primary contributors to this issue are misalignments between circadian rhythms and enforced sleep-wake schedules. Regulating circadian rhythms that are tied to alertness can be regarded as a control problem with control inputs in the form of light and sleep schedules. In this paper, we address the problem of optimizing light and sleep schedules to improve the alertness and cognitive performance of shift workers. A key tool in our approach is a mathematical model that relates the control input variables (sleep and lighting schedules) to the dynamics of the circadian clock and sleep. In the sleep and circadian modeling literature, the newer physiology-based model shows better accuracy in predicting the alertness of shift workers than the phenomenology-based model, but the dynamics of the physiology-based model involve differential equations with different time scales, which pose challenges in optimization. To overcome this challenge, we propose a hybrid version of the PR model by applying singular perturbation techniques to reduce the system to a non-stiff, differentiable hybrid system. This reformulation facilitates the application of the calculus of variations and the gradient descent method to find the optimal light and sleep schedules that maximize the subjective alertness of shift workers. Our approach is validated through numerical simulations, and the simulation results demonstrate improved alertness compared to other existing schedules.


[101] 2603.28723

Acoustic-to-articulatory Inversion of the Complete Vocal Tract from RT-MRI with Various Audio Embeddings and Dataset Sizes

Acoustic-to-articulatory inversion strongly depends on the type of data used. While most previous studies rely on EMA, which is limited by the number of sensors and restricted to accessible articulators, we propose an approach aiming at a complete inversion of the vocal tract, from the glottis to the lips. To this end, we used approximately 3.5 hours of RT-MRI data from a single speaker. The innovation of our approach lies in the use of articulator contours automatically extracted from MRI images, rather than relying on the raw images themselves. By focusing on these contours, the model prioritizes the essential geometric dynamics of the vocal tract while discarding redundant pixel-level information. These contours, alongside denoised audio, were then processed using a Bi-LSTM architecture. Two experiments were conducted: (1) an analysis of the impact of the audio embedding, for which three types of embeddings were evaluated as input to the model (MFCCs, LCCs, and HuBERT), and (2) a study of the influence of the dataset size, which we varied from 10 minutes to 3.5 hours. Evaluation was performed on the test data using RMSE, median error, and Tract Variables, to which we added one further measurement: the larynx height. The average RMSE obtained is 1.48\,mm, which is below the pixel size (1.62\,mm). These results confirm the feasibility of a complete vocal-tract inversion using RT-MRI data.


[102] 2603.28736

Deterministic Modeling of Dynamic ISAC Channels in RF Digital Twin Environments

This paper introduces a methodology to calibrate Radio-Frequency Digital Twins (RF-DTs) for Integrated Sensing and Communication (ISAC) in dynamic wireless environments. The approach leverages high-resolution ray tracing in combination with wideband channel sounding to ensure consistency between simulated and measured propagation. The methodology is validated in urban scenarios featuring both mono-static and bi-static configurations, as well as moving user platforms and vehicles. Results show that the calibrated RF-DT reproduces key propagation effects, including multipath evolution, dynamic scatterers, and Doppler-induced signatures, with close agreement to measurements. These findings confirm that accurate geometry, material modeling, antenna patterns, and diffuse scattering are essential for realistic high-frequency ISAC simulation. By bridging the gap between simulation and measurement, the proposed calibration framework provides a scalable tool for developing and evaluating ISAC algorithms in complex, time-varying environments envisioned for 6G.


[103] 2603.28737

ParaSpeechCLAP: A Dual-Encoder Speech-Text Model for Rich Stylistic Language-Audio Pretraining

We introduce ParaSpeechCLAP, a dual-encoder contrastive model that maps speech and text style captions into a common embedding space, supporting a wide range of intrinsic (speaker-level) and situational (utterance-level) descriptors (such as pitch, texture and emotion) far beyond the narrow set handled by existing models. We train specialized ParaSpeechCLAP-Intrinsic and ParaSpeechCLAP-Situational models alongside a unified ParaSpeechCLAP-Combined model, finding that specialization yields stronger performance on individual style dimensions while the unified model excels on compositional evaluation. We further show that ParaSpeechCLAP-Intrinsic benefits from an additional classification loss and class-balanced training. We demonstrate our models' performance on style caption retrieval, speech attribute classification and as an inference-time reward model that improves style-prompted TTS without additional training. ParaSpeechCLAP outperforms baselines on most metrics across all three applications. Our models and code are released at this https URL .


[104] 2603.28749

Spatial Degrees of Freedom and Channel Strength for Antenna Systems

The number of spatial degrees of freedom (NDoF) and channel strength in antenna systems are examined within a geometric framework. Starting from a correlation-operator representation of the channel between transmitter and receiver regions, we analyze the associated eigenspectrum and relate the NDoF to its spectral transition (corner). We compare the spectrum-based effective NDoF and effective rank metrics, clarifying their behavior for both idealized and realistic eigenvalue distributions. In parallel, we develop geometry-based asymptotic estimates in terms of mutual shadow (view) measures and coupling strength. Specifically, we show that while the projected length or area predicts the number of usable modes in two- and three-dimensional settings, the coupling strength determines the average eigenvalue level. Canonical configurations of parallel lines and regions are used to derive closed-form asymptotic expressions for the effective NDoF, revealing significant deviations from the spectral corner in closely spaced configurations. The results illustrate that these are physically grounded. The proposed theory and techniques are computationally efficient and form a toolbox for estimating the modal richness in near-field channels, with implications for array design, inverse problems, and high-capacity communication systems.


[105] 2603.28754

Sparse State-Space Realizations of Linear Controllers

This paper provides a novel approach for finding sparse state-space realizations of linear systems (e.g., controllers). Sparse controllers are commonly used in distributed control, where a controller is synthesized with some sparsity penalty. Here, motivated by a modeling problem in sensorimotor neuroscience, we study a complementary question: given a linear time-invariant system (e.g., controller) in transfer function form and a desired sparsity pattern, can we find a suitably sparse state-space realization for the transfer function? This problem is highly nonconvex, but we propose an exact method to solve it. We show that the problem reduces to finding an appropriate similarity transform from the modal realization, which in turn reduces to solving a system of multivariate polynomial equations. Finally, we leverage tools from algebraic geometry (namely, the Gröbner basis) to solve this problem exactly. We provide algorithms to find real- and complex-valued sparse realizations and demonstrate their efficacy on several examples.


[106] 2603.28758

$\mathcal{L}_1$-Certified Distributionally Robust Planning for Safety-Constrained Adaptive Control

Safe operation of autonomous systems requires robustness to both model uncertainty and uncertainty in the environment. We propose a hierarchical framework for stochastic nonlinear systems that integrates distributionally robust model predictive control (DR-MPC) with $\mathcal{L}_1$-adaptive control. The key idea is to use the $\mathcal{L}_1$ adaptive controller's online distributional certificates that bound the Wasserstein distance between nominal and true state distributions, thereby certifying the ambiguity sets used for planning without requiring distribution samples. Environment uncertainty is captured via data-driven ambiguity sets constructed from finite samples. These are incorporated into a DR-MPC planner enforcing distributionally robust chance constraints over a receding horizon. Using Wasserstein duality, the resulting problem admits tractable reformulations and a sample-based implementation. We show theoretically and via numerical experimentation that our framework ensures certifiable safety in the presence of simultaneous system and environment uncertainties.


[107] 2603.26711

Surface-Constrained Offline Warping with Contact-Aware Online Pose Projection for Safe Robotic Trajectory Execution

Robotic manipulation tasks that require repeated tool motion along curved surfaces frequently arise in surface finishing, inspection, and guided interaction. In practice, nominal motion primitives are often designed independently of the deployment surface and later reused across varying geometries. Directly tiling such primitives onto nonplanar surfaces introduces geometric inconsistencies, leading to interpenetration, orientation discontinuities, and cumulative drift over repeated cycles. We present a two-stage framework that separates geometric embedding from execution-level regulation. An offline surface-constrained warping operator embeds a nominal periodic primitive onto curved surfaces through asymmetric diffeomorphic deformation of dual-track waypoints and axis-consistent orientation completion, producing a surface-adapted reference trajectory. An online contact-aware projection operator then enforces bounded deviation relative to this reference using FSR-driven disturbance adaptation and a conic orientation safety constraint. Experiments across multiple analytic surface families and real-robot validation on a sinusoidal surface demonstrate improved geometric continuity, fewer large orientation jumps, and robust contact maintenance compared with direct tiling. These results show that decoupling offline geometric remapping from lightweight online projection enables stable and repeatable surface-embedded trajectory execution under sensor-lite feedback.


[108] 2603.26713

Boundary-aware Prototype-driven Adversarial Alignment for Cross-Corpus EEG Emotion Recognition

Electroencephalography (EEG)-based emotion recognition suffers from severe performance degradation when models are transferred across heterogeneous datasets due to physiological variability, experimental paradigm differences, and device inconsistencies. Existing domain adversarial methods primarily enforce global marginal alignment and often overlook class-conditional mismatch and decision boundary distortion, limiting cross-corpus generalization. In this work, we propose a unified Prototype-driven Adversarial Alignment (PAA) framework for cross-corpus EEG emotion recognition. The framework is progressively instantiated in three configurations: PAA-L, which performs prototype-guided local class-conditional alignment; PAA-C, which further incorporates contrastive semantic regularization to enhance intra-class compactness and inter-class separability; and PAA-M, the full boundary-aware configuration that integrates dual relation-aware classifiers within a three-stage adversarial optimization scheme to explicitly refine controversial samples near decision boundaries. By combining prototype-guided subdomain alignment, contrastive discriminative enhancement, and boundary-aware aggregation within a coherent adversarial architecture, the proposed framework reformulates emotion recognition as a relation-driven representation learning problem, reducing sensitivity to label noise and improving cross-domain stability. Extensive experiments on SEED, SEED-IV, and SEED-V demonstrate state-of-the-art performance under four cross-corpus evaluation protocols, with average improvements of 6.72\%, 5.59\%, 6.69\%, and 4.83\%, respectively. Furthermore, the proposed framework generalizes effectively to clinical depression identification scenarios, validating its robustness in real-world heterogeneous settings. The source code is available at this https URL.


[109] 2603.26763

A Near-Raw Talking-Head Video Dataset for Various Computer Vision Tasks

Talking-head videos constitute a predominant content type in real-time communication, yet publicly available datasets for video processing research in this domain remain scarce and limited in signal fidelity. In this paper, we open-source a near-raw dataset of 847 talking-head recordings (approximately 212 minutes), each 15\,s in duration, captured from 805 participants using 446 unique consumer webcam devices in their natural environments. All recordings are stored using the FFV1 lossless codec, preserving the camera-native signal -- uncompressed (24.4\%) or MJPEG-encoded (75.6\%) -- without additional lossy processing. Each recording is annotated with a Mean Opinion Score (MOS) and ten perceptual quality tokens that jointly explain 64.4\% of the MOS variance. From this corpus, we curate a stratified benchmarking subset of 120 clips in three content conditions: original, background blur, and background replacement. Codec efficiency evaluation across four datasets and four codecs, namely H.264, H.265, H.266, and AV1, yields VMAF BD-rate savings up to $-71.3\%$ (H.266) relative to H.264, with significant encoder$\times$dataset ($\eta_p^2 = .112$) and encoder$\times$content condition ($\eta_p^2 = .149$) interactions, demonstrating that both content type and background processing affect compression efficiency. The dataset offers 5$\times$ the scale of the largest prior talking-head webcam dataset (847 vs.\ 160 clips) with lossless signal fidelity, establishing a resource for training and benchmarking video compression and enhancement models in real-time communication.


[110] 2603.26802

Deep Learning Aided Vision System for Planetary Rovers

This study presents a vision system for planetary rovers, combining real-time perception with offline terrain reconstruction. The real-time module integrates CLAHE-enhanced stereo imagery, YOLOv11n-based object detection, and a neural network to estimate object distances. The offline module uses the Depth Anything V2 metric monocular depth estimation model to generate depth maps from captured images, which are fused into dense point clouds using Open3D. Real-world distance estimates from the real-time pipeline provide reliable metric context alongside the qualitative reconstructions. Evaluation on Chandrayaan-3 NavCam stereo imagery, benchmarked against a CAHV-based utility, shows that the neural network achieves a median depth error of 2.26 cm within a 1 to 10 meter range. The object detection model maintains a balanced precision-recall tradeoff on grayscale lunar scenes. This architecture offers a scalable, compute-efficient vision solution for autonomous planetary exploration.


[111] 2603.26810

Unblur-SLAM: Dense Neural SLAM for Blurry Inputs

We propose Unblur-SLAM, a novel RGB SLAM pipeline for sharp 3D reconstruction from blurred image inputs. In contrast to previous work, our approach is able to handle different types of blur and demonstrates state-of-the-art performance in the presence of both motion blur and defocus blur. Moreover, we adapt the computational effort to the amount of blur in the input image. As a first stage, our method uses a feed-forward image deblurring model for which we propose a suitable training scheme that can improve both tracking and mapping modules. Frames that are successfully deblurred by the feed-forward network obtain refined poses and depth through local-global multi-view optimization and loop closure. Frames that fail the first-stage deblurring are directly modeled through the global 3DGS representation and an additional blur network that models multiple blurred sub-frames and simulates the blur formation process in 3D space, thereby learning sharp details and refined sub-frame poses. Experiments on several real-world datasets demonstrate consistent improvements in both pose estimation and sharp reconstruction of geometry and texture.


[112] 2603.26856

AFSS: Artifact-Focused Self-Synthesis for Mitigating Bias in Audio Deepfake Detection

The rapid advancement of generative models has enabled highly realistic audio deepfakes, yet current detectors suffer from a critical bias problem, leading to poor generalization across unseen datasets. This paper proposes Artifact-Focused Self-Synthesis (AFSS), a method designed to mitigate this bias by generating pseudo-fake samples from real audio via two mechanisms: self-conversion and self-reconstruction. The core insight of AFSS lies in enforcing same-speaker constraints, ensuring that real and pseudo-fake samples share identical speaker identity and semantic content. This forces the detector to focus exclusively on generation artifacts rather than irrelevant confounding factors. Furthermore, we introduce a learnable reweighting loss to dynamically emphasize synthetic samples during training. Extensive experiments across 7 datasets demonstrate that AFSS achieves state-of-the-art performance with an average EER of 5.45\%, including a significant reduction to 1.23\% on WaveFake and 2.70\% on In-the-Wild, all while eliminating the dependency on pre-collected fake datasets. Our code is publicly available at this https URL.


[113] 2603.26859

Beyond Textual Knowledge-Leveraging Multimodal Knowledge Bases for Enhancing Vision-and-Language Navigation

Vision-and-Language Navigation (VLN) requires an agent to navigate through complex unseen environments based on natural language instructions. However, existing methods often struggle to effectively capture key semantic cues and accurately align them with visual observations. To address this limitation, we propose Beyond Textual Knowledge (BTK), a VLN framework that synergistically integrates environment-specific textual knowledge with generative image knowledge bases. BTK employs Qwen3-4B to extract goal-related phrases and utilizes Flux-Schnell to construct two large-scale image knowledge bases: R2R-GP and REVERIE-GP. Additionally, we leverage BLIP-2 to construct a large-scale textual knowledge base derived from panoramic views, providing environment-specific semantic cues. These multimodal knowledge bases are effectively integrated via the Goal-Aware Augmentor and Knowledge Augmentor, significantly enhancing semantic grounding and cross-modal alignment. Extensive experiments on the R2R dataset with 7,189 trajectories and the REVERIE dataset with 21,702 instructions demonstrate that BTK significantly outperforms existing baselines. On the test unseen splits of R2R and REVERIE, SR increased by 5% and 2.07% respectively, and SPL increased by 4% and 3.69% respectively. The source code is available at this https URL.


[114] 2603.26939

Multilingual Stutter Event Detection for English, German, and Mandarin Speech

This paper presents a multi-label stuttering detection system trained on multi-corpus, multilingual data in English, German, and Mandarin. By leveraging annotated stuttering data from three languages and four corpora, the model captures language-independent characteristics of stuttering, enabling robust detection across linguistic contexts. Experimental results demonstrate that multilingual training achieves performance comparable to, and in some cases exceeding, that of previous systems. These findings suggest that stuttering exhibits cross-linguistic consistency, which supports the development of language-agnostic detection systems. Our work demonstrates the feasibility and advantages of using multilingual data to improve generalizability and reliability in automated stuttering detection.


[115] 2603.26963

On the Optimal Number of Grids for Differentially Private Non-Interactive $K$-Means Clustering

Differentially private $K$-means clustering enables releasing cluster centers derived from a dataset while protecting the privacy of the individuals. Non-interactive clustering techniques based on privatized histograms are attractive because the released data synopsis can be reused for other downstream tasks without additional privacy loss. The choice of the number of grids for discretizing the data points is crucial, as it directly controls the quantization bias and the amount of noise injected to preserve privacy. The widely adopted strategy selects a grid size that is independent of the number of clusters and also relies on empirical tuning. In this work, we revisit this choice and propose a refined grid-size selection rule derived by minimizing an upper bound on the expected deviation in the $K$-means objective function, leading to a more principled discretization strategy for non-interactive private clustering. Compared to prior work, our grid resolution differs both in its dependence on the number of clusters and in its scaling with dataset size and privacy budget. Extensive numerical results show that the proposed strategy yields more accurate clustering than state-of-the-art techniques, even under tight privacy budgets.
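To make the grid-count trade-off concrete, the sketch below builds a one-dimensional privatized histogram: points are binned on a uniform grid and independent Laplace noise is added to each cell count. It is a generic illustration, not the paper's selection rule. The function names, the one-dimensional setting, and the add/remove-one-record neighbouring relation (count sensitivity 1, hence noise scale 1/epsilon) are all assumptions.

```python
import random


def grid_counts(points, low, high, num_bins):
    """Count how many points fall into each of num_bins equal-width cells on [low, high]."""
    counts = [0] * num_bins
    width = (high - low) / num_bins
    for x in points:
        idx = int((x - low) / width)
        idx = min(max(idx, 0), num_bins - 1)  # keep boundary points inside the grid
        counts[idx] += 1
    return counts


def laplace(scale, rng):
    """Laplace(0, scale) sample: difference of two i.i.d. exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(counts, epsilon, rng):
    """Add Laplace(1/epsilon) noise per cell (assumes add/remove-one neighbouring datasets)."""
    return [c + laplace(1.0 / epsilon, rng) for c in counts]


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.random() for _ in range(1000)]
    for bins in (4, 64):
        noisy = privatize(grid_counts(data, 0.0, 1.0, bins), epsilon=0.5, rng=rng)
        print(bins, [round(v, 1) for v in noisy[:4]])
```

With few cells each count is large relative to the noise but points are coarsely quantized; with many cells the quantization is fine but every cell carries its own noise term, which is the tension a grid-size rule has to balance.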


[116] 2603.26988

Rhythmic segment analysis: Conceptualizing, visualizing, and measuring rhythmic data

This paper develops a framework for conceptualizing, visualizing, and measuring regularities in rhythmic data. I propose to think about rhythmic data in terms of interval segments: fixed-length groups of consecutive intervals, which can be decomposed into a duration and a pattern (the ratios between the intervals). This simple conceptual framework unifies three rhythmic visualization methods and yields a fourth: the pattern-duration plot. When paired with a cluster transition network, it intuitively reveals regularities in both synthetic and real-world rhythmic data. Moreover, the framework generalizes two common measures of rhythmic structure: rhythm ratios and the normalized pairwise variability index (nPVI). In particular, nPVI can be reconstructed as the average distance from isochrony, and I propose a more general measure of anisochrony to replace it. Finally, the novel concept of quantality may shed light on wider debates regarding small-integer-ratio rhythms.
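The quantities named above can be sketched in a few lines. The nPVI function uses the standard definition (100 times the mean of |d_k - d_{k+1}| divided by the pair mean). The segment decomposition is an assumed reading of the abstract in which the duration is the sum of the intervals and the pattern is each interval's share of that sum; the sliding-window segmentation and all function names are likewise illustrative, not taken from the paper.

```python
def interval_segments(intervals, n):
    """All length-n windows of consecutive intervals (sliding by one; an assumption)."""
    return [intervals[i:i + n] for i in range(len(intervals) - n + 1)]


def decompose(segment):
    """Split a segment into (duration, pattern): total length and each interval's share."""
    total = sum(segment)
    return total, [x / total for x in segment]


def npvi(intervals):
    """Normalized pairwise variability index of a sequence of positive intervals."""
    if len(intervals) < 2:
        raise ValueError("nPVI needs at least two intervals")
    terms = [abs(a - b) / ((a + b) / 2.0) for a, b in zip(intervals, intervals[1:])]
    return 100.0 * sum(terms) / len(terms)


def npvi_from_patterns(intervals):
    """Same value, computed from length-2 patterns as distance from the isochronous (0.5, 0.5)."""
    dists = [abs(decompose(seg)[1][0] - 0.5) for seg in interval_segments(intervals, 2)]
    return 400.0 * sum(dists) / len(dists)


if __name__ == "__main__":
    rhythm = [0.5, 0.25, 0.25, 0.5, 0.5]
    print(npvi(rhythm), npvi_from_patterns(rhythm))
```

The second function relies on the identity |a - b| / ((a + b) / 2) = 4 |a / (a + b) - 1/2|, which is one way to read nPVI as an average distance from isochrony over length-2 segments.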


[117] 2603.27025

Fixed-wing UAV relay optimization for coverage hole recovery

Unmanned aerial vehicles (UAVs) fill coverage holes as wireless relays during emergency situations. Fixed-wing UAVs offer longer flight duration and larger coverage in such situations than rotary-wing counterparts. Maximizing the effectiveness of fixed-wing UAV relay systems requires careful tuning of system and flight parameters. This process is challenging because factors including flight trajectory, timeshare, and user scheduling are not easily optimized. In this paper, we propose an optimization for UAV-based wireless relaying networks based on a setup which is applicable to arbitrary spatial user positions. In the setup, a fixed-wing UAV flies over a circular trajectory and relays data from ground users in a coverage hole to a distant base station (BS). Our optimization iteratively maximizes the average achievable spectral efficiency (SE) for the UAV trajectory, user scheduling, and relay timeshare. The simulation results show that our optimization is effective for varying user distributions and that it performs especially well on distributions with a high standard deviation.


[118] 2603.27159

Online Learning of Kalman Filtering: From Output to State Estimation

In this paper, we study the problem of learning Kalman filtering with an unknown system model in partially observed linear dynamical systems. We propose a unified algorithmic framework based on online optimization that can be used to solve both the output estimation and state estimation scenarios. By exploring the properties of the estimation error cost functions, such as conditional strong convexity, we show that our algorithm achieves a $\log T$-regret in the horizon length $T$ for the output estimation scenario. More importantly, we tackle the more challenging scenario of learning Kalman filtering for state estimation, which is an open problem in the literature. We first characterize a fundamental limitation of the problem, demonstrating that no algorithm can achieve sublinear regret in $T$. By further introducing a random query scheme into our algorithm, we show that a $\sqrt{T}$-regret is achievable when the algorithm is given limited query access to more informative measurements of the system state in practice. Our algorithm and regret bounds capture the trade-off between the number of queries and the achieved regret, and shed light on online learning problems with limited observations. We validate the performance of our algorithms using numerical examples.


[119] 2603.27237

Can pre-trained Deep Learning models predict groove ratings?

This study explores the extent to which deep learning models can predict groove and its related perceptual dimensions directly from audio signals. We critically examine the effectiveness of seven state-of-the-art deep learning models in predicting groove ratings and responses to groove-related queries through the extraction of audio embeddings. Additionally, we compare these predictions with traditional handcrafted audio features. To better understand the underlying mechanics, we extend this methodology to analyze predictions based on source-separated instruments, thereby isolating the contributions of individual musical elements. Our analysis reveals a clear separation of groove characteristics driven by the underlying musical style of the tracks (funk, pop, and rock). These findings indicate that deep audio representations can successfully encode complex, style-dependent groove components that traditional features often miss. Ultimately, this work highlights the capacity of advanced deep learning models to capture the multifaceted concept of groove, demonstrating the strong potential of representation learning to advance predictive Music Information Retrieval methodologies.


[120] 2603.27261

MD-RWKV-UNet: Scale-Aware Anatomical Encoding with Cross-Stage Fusion for Multi-Organ Segmentation

Multi-organ segmentation in medical imaging remains challenging due to large anatomical variability, complex inter-organ dependencies, and diverse organ scales and shapes. Conventional encoder-decoder architectures often struggle to capture both fine-grained local details and long-range context, which are crucial for accurate delineation - especially for small or deformable organs. To address these limitations, we propose MD-RWKV-UNet, a dynamic encoder network that enables scale-aware representation and spatially adaptive context modeling. At its core is the MD-RWKV block, a dual-path module that integrates deformable spatial shifts with the Receptance Weighted Key Value mechanism, allowing the receptive field to adapt dynamically to local structural cues. We further incorporate Selective Kernel Attention to enable adaptive selection of convolutional kernels with varying receptive fields, enhancing multi-scale interaction and improving robustness to organ size and shape variation. In parallel, a cross-stage dual-attention fusion strategy aggregates multi-level features across the encoder, preserving low-level structure while enhancing semantic consistency. Unlike methods that stack static convolutions or rely heavily on global attention, our approach provides a lightweight yet expressive solution for dynamic organ modeling. Experiments on Synapse and ACDC demonstrate state-of-the-art performance, particularly in boundary precision and small-organ segmentation.


[121] 2603.27273

Robust Global-Local Behavior Arbitration via Continuous Command Fusion Under LiDAR Errors

Modular autonomous driving systems must coordinate global progress objectives with local safety-driven reactions under imperfect sensing and strict real-time constraints. This paper presents a ROS2-native arbitration module that continuously fuses the outputs of two unchanged and interpretable controllers: a global reference-tracking controller based on Pure Pursuit and a reactive LiDAR-based Gap Follow controller. At each control step, both controllers propose Ackermann commands, and a PPO-trained policy predicts a continuous gate from a compact feature observation to produce a single fused drive command, augmented with practical safety checks. For comparison under identical ROS topic inputs and control rate, we implement a lightweight sampling-based predictive baseline. Robustness is evaluated using a ROS2 impairment protocol that injects LiDAR noise, delay, and dropout, and additionally sweeps forward-cone false short-range outliers. In a repeatable close-proximity passing scenario, we report safe success and failure rates together with per-step end-to-end controller runtime as sensing stress increases. The study is intended as a command-level robustness evaluation in a modular ROS2 setting, not as a replacement for planning-level interaction reasoning.
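
As a rough illustration of the continuous gating described above, the sketch below blends two Ackermann commands with a scalar gate in [0, 1]. The convex-combination form, the `AckermannCmd` type, the `fuse` function, and the speed clamp used as a stand-in safety check are all assumptions made for illustration; the abstract does not state how the gate is applied or which safety checks are used.

```python
from dataclasses import dataclass


@dataclass
class AckermannCmd:
    steering: float  # steering angle in radians
    speed: float     # forward speed in m/s


def fuse(global_cmd, local_cmd, gate, max_speed=5.0):
    """Blend two controller proposals with a gate in [0, 1] (assumed convex mix).

    gate = 1 follows the global tracker, gate = 0 follows the reactive controller.
    """
    g = min(1.0, max(0.0, gate))  # keep the gate inside [0, 1]
    steering = g * global_cmd.steering + (1.0 - g) * local_cmd.steering
    speed = g * global_cmd.speed + (1.0 - g) * local_cmd.speed
    # Hypothetical safety check: never command reverse or exceed max_speed.
    speed = min(max(speed, 0.0), max_speed)
    return AckermannCmd(steering, speed)


pure_pursuit = AckermannCmd(steering=0.2, speed=4.0)
gap_follow = AckermannCmd(steering=-0.2, speed=2.0)
fused = fuse(pure_pursuit, gap_follow, gate=0.5)
```

In a learned arbiter, `gate` would come from the trained policy at every control step; here it is passed in by hand.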


[122] 2603.27305

Reconfiguring room-scale magnetoquasistatic wireless power transfer with hierarchical resonators

Magnetoquasistatic wireless power transfer can deliver substantial power to mobile devices over near-field links. Room-scale implementations, such as quasistatic cavity resonators, extend this capability over large enclosed volumes, but their efficiency drops sharply for centimeter-scale or misoriented receivers because the magnetic field is spatially broad and weakly coupled to small coils. Here, we introduce hierarchical resonators that act as selectively activated relays within a room-scale quasistatic cavity resonator, capturing the ambient magnetic field and re-emitting it to concentrate flux at a target receiver. This architecture reconfigures the wireless power environment on demand and enables localized energy delivery to miniature devices. Experimentally, the hierarchical link improves power transfer efficiency by more than two orders of magnitude relative to direct room-scale transfer and delivers up to 500 mW of DC power to a 15 mm receiver. We further demonstrate selective multi-relay operation and field reorientation for furniture-embedded charging scenarios. These results establish a scalable route to reconfigurable wireless power delivery for miniature and batteryless devices in room-scale environments.


[123] 2603.27306

GUIDE: Guided Updates for In-context Decision Evolution in LLM-Driven Spacecraft Operations

Large language models (LLMs) have been proposed as supervisory agents for spacecraft operations, but existing approaches rely on static prompting and do not improve across repeated executions. We introduce GUIDE, a non-parametric policy improvement framework that enables cross-episode adaptation without weight updates by evolving a structured, state-conditioned playbook of natural-language decision rules. A lightweight acting model performs real-time control, while offline reflection updates the playbook from prior trajectories. Evaluated on an adversarial orbital interception task in the Kerbal Space Program Differential Games environment, GUIDE's evolution consistently outperforms static baselines. Results indicate that context evolution in LLM agents functions as policy search over structured decision rules in real-time closed-loop spacecraft interaction.


[124] 2603.27382

Dynamic Constrained Stabilization on the $n$-sphere

We consider the constrained stabilization problem of second-order systems evolving on the $n$-sphere. We propose a control strategy with a constraint proximity-based dynamic damping mechanism that ensures safe and almost global asymptotic stabilization of the target point in the presence of star-shaped constraints on the $n$-sphere. It is also shown that the proposed approach can be used to deal with the constrained rigid-body attitude stabilization. The effectiveness of the proposed approach is demonstrated through simulation results on the 2-sphere in the presence of star-shaped constraint sets.


[125] 2603.27442

Interpretable Physics Extraction from Data for Linear Dynamical Systems using Lie Generator Networks

When the system is linear, why should learning be nonlinear? Linear dynamical systems, the analytical backbone of control theory, signal processing and circuit analysis, have exact closed-form solutions via the state transition matrix. Yet when system parameters must be inferred from data, recent neural approaches offer flexibility at the cost of physical guarantees: Neural ODEs provide flexible trajectory approximation but may violate physical invariants, while energy preserving architectures do not natively represent dissipation essential to real-world systems. We introduce Lie Generator Networks (LGN), which learn a structured generator A and compute trajectories directly via matrix exponentiation. This shift from integration to exponentiation preserves structure by construction. By parameterizing A = S - D (skew-symmetric minus positive diagonal), stability and dissipation emerge from the underlying architecture and are not introduced during training via the loss function. LGN provides a unified framework for linear conservative, dissipative, and time-varying systems. On a 100-dimensional stable RLC ladder, standard derivative-based least-squares system identification can yield unstable eigenvalues. The unconstrained LGN yields stable but physically incorrect spectra, whereas LGN-SD recovers all 100 eigenvalues with over two orders of magnitude lower mean eigenvalue error than unconstrained alternatives. Critically, these eigenvalues reveal poles, natural frequencies, and damping ratios which are interpretable physics that black-box networks do not provide.
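
To make the structural argument concrete: with A = S - D, the squared state norm obeys d/dt ||x||^2 = x^T (A + A^T) x = -2 x^T D x <= 0, so the trajectory x(t) = exp(A t) x(0) cannot grow. The sketch below checks this on a 2-state example in plain Python. The helper names (`expm`, `generator`, `propagate`) and the scaling-and-squaring Taylor routine are illustrative choices, not the LGN implementation, and no learning is performed here.

```python
import math


def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(a, squarings=10, terms=12):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series."""
    n = len(a)
    scale = 2.0 ** squarings
    scaled = [[v / scale for v in row] for row in a]
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms + 1):
        # term holds scaled^k / k! after this update
        term = [[v / k for v in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)  # undo the scaling: exp(A) = exp(A/2^s)^(2^s)
    return result


def generator(omega, damping):
    """Build A = S - D for two states: S skew-symmetric, D non-negative diagonal."""
    d1, d2 = damping
    return [[-d1, omega], [-omega, -d2]]


def propagate(a, x0, t):
    """Return x(t) = exp(A t) x0 without numerical time integration."""
    phi = expm([[v * t for v in row] for row in a])
    return [sum(phi[i][j] * x0[j] for j in range(len(x0))) for i in range(len(x0))]


def norm(x):
    return math.sqrt(sum(v * v for v in x))


damped = propagate(generator(2.0, (0.5, 0.5)), [1.0, 0.0], 1.0)
lossless = propagate(generator(2.0, (0.0, 0.0)), [1.0, 0.0], 1.0)
```

With equal damping 0.5 on both states the norm decays as exp(-0.5 t), and with zero damping the state simply rotates on the unit circle, which is the conservative special case.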


[126] 2603.27523

Field-Assisted Molecular Communication: Girsanov-Based Channel Modeling and Dynamic Waveform Optimization

Analytical modeling of field-assisted molecular communication under dynamic electric fields is fundamentally challenging due to the coupling between stochastic transport and complex boundary geometries, which renders conventional partial differential equation (PDE) approaches intractable. In this work, we introduce a stochastic framework based on the Cameron-Martin-Girsanov theorem to address this challenge. By leveraging a change-of-measure technique, we derive analytically tractable channel impulse response (CIR) expressions for both fully-absorbing and passive spherical receivers, where the latter serves as an exact mathematical baseline to validate our framework. Building upon these models, we establish a dynamic waveform design framework for system optimization. Under a maximum a posteriori decision-feedback equalizer (MAP-DFE) framework, we show that the first-slot received probability serves as the primary determinant of the bit error probability (BEP), while inter-symbol interference manifests as higher-order corrections. Exploiting the monotonic response of the fully-absorbing architecture and using the limitations of the passive model to justify this strategic focus, we reformulate BEP minimization into a distance-based optimization problem. We propose a unified, low-complexity Maximize Received Probability (MRP) algorithm, encompassing the Maximize Hitting Probability (MHP) and Maximize Sensing Probability (MSP) methods, to dynamically enhance desired signals and suppress inter-symbol interference. Numerical results validate the accuracy of the proposed modeling approach and demonstrate near-optimal detection performance.


[127] 2603.27548

Control Forward-Backward Consistency: Quantifying the Accuracy of Koopman Control Family Models

This paper extends the forward-backward consistency index, originally introduced in Koopman modeling of systems without input, to the setting of control systems, providing a closed-form computable measure of accuracy for data-driven models associated with the Koopman Control Family (KCF). Building on a forward-backward regression perspective, we introduce the control forward-backward consistency matrix and demonstrate that it possesses several favorable properties. Our main result establishes that the relative root-mean-square error of KCF function predictors is strictly bounded by the square root of the control consistency index, defined as the maximum eigenvalue of the consistency matrix. This provides a sharp, closed-form computable error bound for finite-dimensional KCF models. We further specialize this bound to the widely used lifted linear and bilinear models. We also discuss how the control consistency index can be incorporated into optimization-based modeling and illustrate the methodology via simulations.


[128] 2603.27575

Decentralized MARL for Coarse Correlated Equilibrium in Aggregative Markov Games

This paper studies the problem of decentralized learning of Coarse Correlated Equilibrium (CCE) in aggregative Markov games (AMGs), where each agent's instantaneous reward depends only on its own action and an aggregate quantity. Existing CCE learning algorithms for general Markov games are not designed to leverage the aggregative structure, and research on decentralized CCE learning for AMGs remains limited. We propose an adaptive stage-based V-learning algorithm that exploits the aggregative structure under a fully decentralized information setting. Based on the two-timescale idea, the algorithm partitions learning into stages and adjusts stage lengths based on the variability of aggregate signals, while using no-regret updates within each stage. We prove the algorithm achieves an $\epsilon$-approximate CCE in $O(S A_{\max} T^5 / \epsilon^2)$ episodes, avoiding the curse of multiagents which commonly arises in MARL. Numerical results verify the theoretical findings, and the decentralized, model-free design enables easy extension to large-scale multi-agent scenarios.


[129] 2603.27583

LLM-Enabled Low-Altitude UAV Natural Language Navigation via Signal Temporal Logic Specification Translation and Repair

Natural language (NL) navigation for low-altitude unmanned aerial vehicles (UAVs) offers an intelligent and convenient solution for low-altitude aerial services by enabling an intuitive interface for non-expert operators. However, deploying this capability in urban environments necessitates the precise grounding of underspecified instructions into safety-critical, dynamically feasible motion plans subject to spatiotemporal constraints. To address this challenge, we propose a unified framework that translates NL instructions into Signal Temporal Logic (STL) specifications and subsequently synthesizes trajectories via mixed-integer linear programming (MILP). Specifically, to generate executable STL formulas from free-form NL, we develop a reasoning-enhanced large language model (LLM) leveraging chain-of-thought (CoT) supervision and group-relative policy optimization (GRPO), which ensures high syntactic validity and semantic consistency. Furthermore, to resolve infeasibilities induced by stringent logical or spatial requirements, we introduce a specification repair mechanism. This module combines MILP-based diagnosis with LLM-guided semantic reasoning to selectively relax task constraints while strictly enforcing safety guarantees. Extensive simulations and real-world flight experiments demonstrate that the proposed closed-loop framework significantly improves NL-to-STL translation robustness, enabling safe, interpretable, and adaptable UAV navigation in complex scenarios.


[130] 2603.27711

Low-loss phononic integrated circuits based on a silicon nitride-lithium niobate platform

Microwave-frequency acoustic waves in solids have emerged as a versatile platform for both classical and quantum applications. While phononic integrated devices and circuits are being developed on various material platforms, an ideal phononic integrated circuit (PnIC) platform should simultaneously support low-loss waveguide structures, high-quality-factor resonators, high-performance modulators, and efficient electromechanical transducers. Here, we establish a low-loss gigahertz-frequency PnIC platform based on patterned thin-film silicon nitride (SiN) on lithium niobate (LN) substrate. We develop low-loss PnIC building blocks including waveguides, directional couplers, and high-quality-factor (high-Q) ring resonators. As an application, we demonstrate a 1-GHz phononic oscillator based on a ring resonator, reaching a low phase noise of -159.0 dBc/Hz at a 100-kHz offset frequency. Our low-loss PnICs could meet the requirements in microwave acoustics, quantum phononics, and integrated hybrid systems combining phonons, photons, superconducting qubits, and solid-state defects.


[131] 2603.27798

Towards Emotion Recognition with 3D Pointclouds Obtained from Facial Expression Images

Facial Emotion Recognition is a critical research area within Affective Computing due to its wide-ranging applications in Human Computer Interaction, mental health assessment and fatigue monitoring. Current FER methods predominantly rely on Deep Learning techniques trained on 2D image data, which pose significant privacy concerns and are unsuitable for continuous, real-time monitoring. As an alternative, we propose High-Frequency Wireless Sensing (HFWS) as an enabler of continuous, privacy-aware FER, through the generation of detailed 3D facial pointclouds via on-person sensors embedded in wearables. We present arguments supporting the privacy advantages of HFWS over traditional 2D imaging, particularly under increasingly stringent data protection regulations. A major barrier to adopting HFWS for FER is the scarcity of labeled 3D FER datasets. Towards addressing this issue, we introduce a FLAME-based method to generate 3D facial pointclouds from existing public 2D datasets. Using this approach, we create AffectNet3D, a 3D version of the AffectNet database. To evaluate the quality and usability of the generated data, we design a pointcloud refinement pipeline focused on isolating the facial region, and train the popular PointNet++ model on the refined pointclouds. Fine-tuning the model on a small subset of the unseen 3D FER dataset BU-3DFE yields a classification accuracy exceeding 70%, comparable to oracle-level performance. To further investigate the potential of HFWS-based FER for continuous monitoring, we simulate wearable sensing conditions by masking portions of the generated pointclouds. Experimental results show that models trained on AffectNet3D and fine-tuned with just 25% of BU-3DFE outperform those trained solely on BU-3DFE. These findings highlight the viability of our pipeline and support the feasibility of continuous, privacy-aware FER via wearable HFWS systems.


[132] 2603.27803

Distributed Online Submodular Maximization under Communication Delays: A Simultaneous Decision-Making Approach

We provide a distributed online algorithm for multi-agent submodular maximization under communication delays. We are motivated by future distributed information-gathering tasks in unknown and dynamic environments, where utility functions naturally exhibit the diminishing-returns property, i.e., submodularity. Existing approaches for online submodular maximization either rely on sequential multi-hop communication, resulting in prohibitive delays and restrictive connectivity assumptions, or restrict each agent's coordination to its one-hop neighborhood only, thereby limiting the coordination performance. To address this issue, we provide the Distributed Online Greedy (DOG) algorithm, which integrates tools from adversarial bandit learning with delayed feedback to enable simultaneous decision-making across arbitrary network topologies. We characterize the approximation performance of DOG against an optimal solution, capturing the suboptimality cost due to decentralization as a function of the network structure. Our analyses further reveal a trade-off between coordination performance and convergence time, determined by the magnitude of communication delays. Through this trade-off, DOG spans the spectrum between the state-of-the-art fully centralized online coordination approach [1] and the fully decentralized one-hop coordination approach [2].


[133] 2603.27833

Optimal Switching in Networked Control Systems: Finite Horizon

In this work, we first prove that the separation principle holds for switched LQR problems under i.i.d. zero-mean disturbances with a symmetric distribution. We then solve the dynamic programming problem and show that the optimal switching policy is a symmetric threshold rule on the accumulated disturbance since the most recent update, while the optimal controller is a discounted linear feedback law independent of the switching policy.


[134] 2603.27853

Fronthaul Network Planning for Hierarchical and Radio-Stripes-Enabled CF-mMIMO in O-RAN

The deployment of ultra-dense networks (UDNs), particularly cell-free massive MIMO (CF-mMIMO), is mainly hindered by costly and capacity-limited fronthaul links. This work proposes a two-tiered optimization framework for cost-effective hybrid fronthaul planning, comprising a Near-Optimal Fronthaul Association and Configuration (NOFAC) algorithm in the first tier and an Integer Linear Program (ILP) in the second, integrating fiber optics, millimeter-wave (mmWave), and free-space optics (FSO) technologies. The proposed framework accommodates various functional split (FS) options (7.2x and 8), decentralized processing levels, and network configurations. We introduce the hierarchical scheme (HS) as a resilient, cost-effective fronthaul solution for CF-mMIMO and compare its performance with radio-stripes (RS)-enabled CF-mMIMO, validating both across diverse dense topologies within the open radio access network (O-RAN) architecture. Results show that the proposed framework achieves better cost-efficiency and higher capacity compared to traditional benchmark schemes such as an all-fiber fronthaul network. Our key findings reveal fiber dominance in highly decentralized deployments, mmWave suitability in moderately centralized scenarios, and a complementary role for FSO in bridging deployment gaps. Additionally, FS7.2x consistently outperforms FS8, offering greater capacity at lower cost, affirming its role as the preferred O-RAN functional split. Most importantly, our study underscores the importance of effective hybrid fronthaul planning for UDNs in minimizing infrastructural redundancy and ensuring scalability to meet current and future traffic demands.


[135] 2603.27912

Safety Guardrails in the Sky: Realizing Control Barrier Functions on the VISTA F-16 Jet

The advancement of autonomous systems -- from legged robots to self-driving vehicles and aircraft -- necessitates executing increasingly high-performance and dynamic motions without ever putting the system or its environment in harm's way. In this paper, we introduce Guardrails -- a novel runtime assurance mechanism that guarantees dynamic safety for autonomous systems, allowing them to safely evolve on the edge of their operational domains. Rooted in the theory of control barrier functions, Guardrails offers a control strategy that carefully blends commands from a human or AI operator with safe control actions to guarantee safe behavior. To demonstrate its capabilities, we implemented Guardrails on an F-16 fighter jet and conducted flight tests where Guardrails supervised a human pilot to enforce g-limits, altitude bounds, geofence constraints, and combinations thereof. Throughout extensive flight testing, Guardrails successfully ensured safety, keeping the pilot in control when safe to do so and minimally modifying unsafe pilot inputs otherwise.
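
For readers unfamiliar with control barrier functions, here is a minimal textbook-style sketch of the idea for a single altitude floor. It assumes a deliberately simplified model in which the commanded climb rate `u` directly sets the altitude derivative, with barrier h = altitude - alt_min and the standard condition dh/dt >= -alpha * h. The function names, the gain `alpha`, and the numbers are hypothetical; this is not the Guardrails implementation or an aircraft model.

```python
def cbf_filter(u_des, altitude, alt_min, alpha=0.5):
    """Return the command closest to u_des that satisfies dh/dt >= -alpha * h.

    With h = altitude - alt_min and dh/dt = u (assumed single-integrator model),
    the constraint reduces to a lower bound on the climb rate.
    """
    h = altitude - alt_min
    return max(u_des, -alpha * h)


def simulate(u_des, altitude, alt_min, dt=0.1, steps=600, alpha=0.5):
    """Forward-Euler rollout of the filtered command; returns the altitude trace."""
    trace = [altitude]
    for _ in range(steps):
        u = cbf_filter(u_des, altitude, alt_min, alpha)
        altitude += dt * u
        trace.append(altitude)
    return trace


# A sustained 100 m/s descent command starting 500 m above the floor.
trace = simulate(u_des=-100.0, altitude=1000.0, alt_min=500.0)
```

Far from the floor the descent command passes through unchanged; close to it the filter tapers the descent so that the altitude approaches the floor without crossing it. Safe commands such as a climb are never modified.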


[136] 2603.27939

Adaptive Multi-Dimensional Coordinated Comprehensive Routing Scheme for IoV

The characteristics of high-speed node movement and dynamic topology changes pose great challenges to the design of internet of vehicles (IoV) routing protocols. Existing schemes suffer from common problems such as insufficient adaptability and lack of global consideration, making it difficult to achieve a globally optimal balance between routing reliability, real-time performance and transmission efficiency. This paper proposes an adaptive multi-dimensional coordinated comprehensive routing scheme for IoV environments. First, a complete IoV system model including network topology, communication links, hierarchical congestion and transmission delay is constructed, the routing problem is abstracted into a single-objective optimization model with multiple constraints, and a single-hop link comprehensive routing metric integrating link reliability, node local load, network global congestion and link stability is defined. Second, an intelligent transmission switching mechanism is designed: candidate nodes are screened through dual criteria of connectivity and progressiveness, a dual decision-making of primary and backup paths and a threshold switching strategy are introduced to avoid link interruption and congestion, and an adaptive update function is constructed to dynamically adjust weight coefficients and switching thresholds to adapt to changes in network status. Simulation results show that the proposed scheme adapts effectively to the highly dynamic topology and network congestion characteristics of IoV, performs well on key indicators such as the number of routing interruptions, packet delivery rate and end-to-end delay, and significantly outperforms traditional routing schemes in overall performance.


[137] 2603.27976

Physics-informed line-of-sight learning for scalable deterministic channel modeling

Deterministic channel modeling maps a physical environment to its site-specific electromagnetic response. Ray tracing produces complete multi-dimensional channel information but remains prohibitively expensive for area-wide deployment. We identify line-of-sight (LoS) region determination as the dominant bottleneck. To address this, we propose D$^2$LoS, a physics-informed neural network that reformulates dense pixel-level LoS prediction into sparse vertex-level visibility classification and projection point regression, avoiding the spectral bias at sharp boundaries. A geometric post-processing step enforces hard physical constraints, yielding exact piecewise-linear boundaries. Because LoS computation depends only on building geometry, cross-band channel information is obtained by updating material parameters without retraining. We also construct RayVerse-100, a ray-level dataset spanning 100 urban scenarios with per-ray complex gain, angle, delay, and geometric trajectory. Evaluated against rigorous ray tracing ground truth, D$^2$LoS achieves 3.28 dB mean absolute error in received power, 4.65$^\circ$ angular spread error, and 20.64 ns delay spread error, while accelerating visibility computation by over 25$\times$.


[138] 2603.28057

Physics-Embedded Feature Learning for AI in Medical Imaging

Deep learning (DL) models have achieved strong performance in intelligent healthcare settings, yet most existing approaches operate as black boxes and ignore the physical processes that govern tumor growth, limiting interpretability, robustness, and clinical trust. To address this limitation, we propose PhysNet, a physics-embedded DL framework that integrates tumor growth dynamics directly into the feature learning process of a convolutional neural network (CNN). Unlike conventional physics-informed methods that impose physical constraints only at the output level, PhysNet embeds a reaction-diffusion model of tumor growth within intermediate feature representations of a ResNet backbone. The architecture jointly performs multi-class tumor classification while learning a latent tumor density field, its temporal evolution, and biologically meaningful physical parameters, including tumor diffusion and growth rates, through end-to-end training. This design is necessary because purely data-driven models, even when highly accurate or ensemble-based, cannot guarantee physically consistent predictions or provide insight into tumor behavior. Experimental results on a large brain MRI dataset demonstrate that PhysNet outperforms multiple state-of-the-art DL baselines, including MobileNetV2, VGG16, VGG19, and ensemble models, achieving superior classification accuracy and F1-score. In addition to improved performance, PhysNet produces interpretable latent representations and learned bio-physical parameters that align with established medical knowledge, highlighting physics-embedded representation learning as a practical pathway toward more trustworthy and clinically meaningful medical AI systems.


[139] 2603.28243

Cost-Matching Model Predictive Control for Efficient Reinforcement Learning in Humanoid Locomotion

In this paper, we propose a cost-matching approach for optimal humanoid locomotion within a Model Predictive Control (MPC)-based Reinforcement Learning (RL) framework. A parameterized MPC formulation with centroidal dynamics is trained to approximate the action-value function obtained from high-fidelity closed-loop data. Specifically, the MPC cost-to-go is evaluated along recorded state-action trajectories, and the parameters are updated to minimize the discrepancy between MPC-predicted values and measured returns. This formulation enables efficient gradient-based learning while avoiding the computational burden of repeatedly solving the MPC problem during training. The proposed method is validated in simulation using a commercial humanoid platform. Results demonstrate improved locomotion performance and robustness to model mismatch and external disturbances compared with manually tuned baselines.


[140] 2603.28252

Secret Key Rate Analysis of RIS-Assisted THz MIMO CV-QKD Systems under Localized and Global Eavesdropping

A multiple-input multiple-output (MIMO) system operating at terahertz (THz) frequencies and consisting of a transmitter, Alice, that encodes secret keys using Gaussian-modulated coherent states, which are communicated to a legitimate receiver, Bob, under the assistance of a reconfigurable intelligent surface (RIS) is considered in this paper. The composite wireless channel comprising the direct Alice-to-Bob signal propagation path and the RIS-enabled reflected one is modeled as a passive linear Gaussian quantum channel, allowing for a unitary dilation that preserves the canonical commutation relations. The security of the considered RIS-empowered MIMO system is analyzed under collective Gaussian entangling attacks, according to which an eavesdropper, Eve, is assumed to have access to environmental modes associated with specific propagation segments. We also study, as a benchmark, the case where Eve has access to the purification of the overall channel. The legitimate receiver, Bob, is designed to deploy homodyne detection and reverse reconciliation for key extraction. Novel expressions for the achievable secret key rate (SKR) of the system are derived for both the considered eavesdropping scenarios. Furthermore, an optimization framework is developed to determine the optimal RIS phase configuration matrix that maximizes the SKR performance. The resulting optimization problem is efficiently solved using particle swarm optimization. Numerical results are presented to demonstrate the system's performance with respect to various free parameters. It is showcased that the considered RIS plays a crucial role in enhancing the SKR of the system as well as in extending the secure communication range. This establishes RIS-assisted THz MIMO CV-QKD as a promising solution for next generation secure wireless networks.


[141] 2603.28310

Compact Continuous-Variable Quantum Key Distribution System Employing Monolithically Integrated Silicon Photonic Transceiver

We demonstrate the first CV-QKD system featuring a custom-designed monolithic silicon photonic dual-polarisation transceiver. Leveraging PS-64-QAM, we achieved 1.9 Mbit/s secret key rate across 25 km of standard single-mode fibre, highlighting the potential of electronic-photonic integration for practical QKD.


[142] 2603.28369

Age of Incorrect Information for Generic Discrete-Time Markov Sources

This work introduces a framework for analyzing the Age of Incorrect Information (AoII) in a real-time monitoring system with a generic discrete-time Markov source. We study a noisy communication system employing a hybrid automatic repeat request (HARQ) protocol, subject to a transmission rate constraint. The optimization problem is formulated as a constrained Markov decision process (CMDP), and it is shown that there exists an optimal policy that is a randomized mixture of two stationary policies. To overcome the intractability of computing the optimal stationary policies, we develop a multiple-threshold policy class where thresholds depend on the source, the receiver, and the packet count. By establishing a Markov renewal structure induced by threshold policies, we derive closed-form expressions for the long-term average AoII and transmission rate. The proposed policy is constructed via a relative value iteration algorithm that leverages the threshold structure to skip computations, combined with a bisection search to satisfy the rate constraint. To accommodate scenarios requiring lower computational complexity, we adapt the same technique to produce a simpler single-threshold policy that trades optimality for efficiency. Numerical experiments show that both threshold-based policies outperform periodic scheduling, with the multiple-threshold approach matching the performance of the globally optimal policy.
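
The abstract does not restate the AoII metric itself. As a reminder, the sketch below assumes the commonly used linear-penalty form, in which the age grows by one in every slot where the monitor's estimate differs from the source state and resets to zero as soon as they agree. The function name and the toy sequences are illustrative only; no HARQ, channel noise, or threshold policy is modeled.

```python
def aoii_trace(source, estimate):
    """Per-slot AoII under the assumed linear penalty.

    source and estimate are equal-length sequences of states; the age counts
    consecutive slots in which the estimate has been wrong.
    """
    age = 0
    trace = []
    for s, e in zip(source, estimate):
        age = age + 1 if s != e else 0
        trace.append(age)
    return trace


source = [0, 1, 1, 0, 0]
estimate = [0, 0, 0, 0, 1]
ages = aoii_trace(source, estimate)
average_aoii = sum(ages) / len(ages)
```

Note that the age resets in the fourth slot because the source returns to the state the monitor already holds, even though no update was delivered. This is what distinguishes AoII from the plain age of information.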


[143] 2603.28390

SVH-BD: Synthetic Vegetation Hyperspectral Benchmark Dataset for Emulation of Remote Sensing Images

This dataset provides a large collection of 10,915 synthetic hyperspectral image cubes paired with pixel-level vegetation trait maps, designed to support research in radiative transfer emulation, vegetation trait retrieval, and uncertainty quantification. Each hyperspectral cube contains 211 bands spanning 400--2500 nm at 10 nm resolution and a fixed spatial layout of $64 \times 64$ pixels, offering continuous simulated surface reflectance spectra suitable for emulator development and machine-learning tasks requiring high spectral detail. Vegetation traits were derived by inverting Sentinel-2 Level-2A surface reflectance using a PROSAIL-based lookup-table approach, followed by forward PROSAIL simulations to generate hyperspectral reflectance under physically consistent canopy and illumination conditions. The dataset covers four ecologically diverse regions -- East Africa, Northern France, Eastern India, and Southern Spain -- and includes 5th and 95th percentile uncertainty maps as well as Sentinel-2 scene classification layers. This resource enables benchmarking of inversion methods, development of fast radiative transfer emulators, and studies of spectral--biophysical relationships under controlled yet realistic environmental variability.
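
The band count follows from the stated range and spacing: (2500 - 400) / 10 + 1 = 211. The short sketch below reproduces that arithmetic, assuming band centers at 400, 410, ..., 2500 nm and a band-first axis order for each cube. Both are assumptions for illustration; the abstract gives neither the exact band centers nor the on-disk layout.

```python
def band_centers(start_nm=400, stop_nm=2500, step_nm=10):
    """Assumed band centers in nm, inclusive of both endpoints."""
    return list(range(start_nm, stop_nm + 1, step_nm))


bands = band_centers()
# Hypothetical (bands, rows, cols) layout for a single cube.
cube_shape = (len(bands), 64, 64)
values_per_cube = cube_shape[0] * cube_shape[1] * cube_shape[2]
```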


[144] 2603.28562

Coalition Formation with Limited Information Sharing for Local Energy Management

Distributed energy systems with prosumers require new methods for coordinating energy exchange among agents. Coalitional control provides a framework in which agents form groups to cooperatively reduce costs; however, existing bottom-up coalition-formation methods typically require full information sharing, raising privacy concerns and imposing significant computational overhead. In this work, we propose a limited information coalition-formation algorithm that requires only limited aggregate information exchange among agents. By constructing an upper bound on the value of candidate coalitions, we eliminate the need to solve optimisation problems for each potential merge, significantly reducing computational complexity while limiting information exchange. We prove that the proposed method guarantees cost no greater than that of decentralised operation. Coalition strategies are optimised using a distributed approach based on the Alternating Direction Method of Multipliers (ADMM), further limiting information sharing within coalitions. We embed the framework within a model predictive control scheme and evaluate it on real-world data, demonstrating improved economic performance over decentralised control with substantially lower computational cost than full-information approaches.


[145] 2603.28563

Learning Where to Look: UCB-Driven Controlled Sensing for Quickest Change Detection

We study the multichannel quickest change detection problem with bandit feedback and controlled sensing, in which an agent sequentially selects one of the data streams to observe at each time-step and aims to detect an unknown change as quickly as possible while controlling false alarms. Assuming known pre- and post-change distributions and allowing an arbitrary subset of streams to be affected by the change, we propose two novel and computationally efficient detection procedures inspired by the Upper Confidence Bound (UCB) multi-armed bandit algorithm. Our methods adaptively concentrate sensing on the most informative streams while preserving false-alarm guarantees. We show that both procedures achieve first-order asymptotic optimality in detection delay under standard false-alarm constraints. We also extend the UCB-driven controlled sensing approach to the setting where the pre- and post-change distributions are unknown, except for a mean-shift in at least one of the channels at the change-point. This setting is particularly relevant to the problem of learning in piecewise stationary environments. Finally, extensive simulations on synthetic benchmarks show that our methods consistently outperform existing state-of-the-art approaches while offering substantial computational savings.


[146] 2603.28625

Dynamic Lookahead Distance via Reinforcement Learning-Based Pure Pursuit for Autonomous Racing

Pure Pursuit (PP) is a widely used path-tracking algorithm in autonomous vehicles due to its simplicity and real-time performance. However, its effectiveness is sensitive to the choice of lookahead distance: shorter values improve cornering but can cause instability on straights, while longer values improve smoothness but reduce accuracy in curves. We propose a hybrid control framework that integrates Proximal Policy Optimization (PPO) with the classical Pure Pursuit controller to adjust the lookahead distance dynamically during racing. The PPO agent maps vehicle speed and multi-horizon curvature features to an online lookahead command. It is trained using Stable-Baselines3 in the F1TENTH Gym simulator with a KL penalty and learning-rate decay for stability, then deployed in a ROS2 environment to guide the controller. Experiments in simulation compare the proposed method against both fixed-lookahead Pure Pursuit and an adaptive Pure Pursuit baseline. Additional real-car experiments compare the learned controller against a fixed-lookahead Pure Pursuit controller. Results show that the learned policy improves lap-time performance and repeated lap completion on unseen tracks, while also transferring zero-shot to hardware. The learned controller adapts the lookahead by increasing it on straights and reducing it in curves, demonstrating effectiveness in augmenting a classical controller by online adaptation of a single interpretable parameter. On unseen tracks, the proposed method achieved 33.16 s on Montreal and 46.05 s on Yas Marina, while tolerating more aggressive speed-profile scaling than the baselines and achieving the best lap times among the tested settings. Initial real-car experiments further support sim-to-real transfer on a 1:10-scale autonomous racing platform.
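
To make the role of the lookahead distance concrete, the sketch below shows the textbook Pure Pursuit steering law for a kinematic bicycle model, together with a hand-written lookahead schedule. The schedule `heuristic_lookahead` and all of its gains are hypothetical stand-ins for illustration; it is not the learned PPO policy described in the abstract, only a simple rule with the same qualitative behaviour (longer lookahead on straights, shorter in curves).

```python
import math

def pure_pursuit_steering(target_x, target_y, wheelbase):
    """Steering angle (rad) toward a lookahead point given in the vehicle frame
    (x forward, y to the left), using the kinematic bicycle model."""
    lookahead = math.hypot(target_x, target_y)
    if lookahead == 0.0:
        return 0.0
    # Curvature of the circular arc through the rear axle and the target point.
    curvature = 2.0 * target_y / lookahead ** 2
    return math.atan(wheelbase * curvature)

def heuristic_lookahead(speed, path_curvature,
                        ld_min=0.5, ld_max=3.0, k_speed=0.3, k_curv=2.0):
    """Hypothetical schedule: grow the lookahead with speed, shrink it with
    path curvature, and clamp the result to [ld_min, ld_max] metres."""
    raw = ld_min + k_speed * speed - k_curv * abs(path_curvature)
    return max(ld_min, min(ld_max, raw))

# Same speed, straight segment versus tight corner.
ld_straight = heuristic_lookahead(speed=8.0, path_curvature=0.0)
ld_corner = heuristic_lookahead(speed=8.0, path_curvature=1.0)
steer = pure_pursuit_steering(target_x=1.0, target_y=1.0, wheelbase=0.33)
print(ld_straight, ld_corner, steer)
```

Because the lookahead distance appears squared in the denominator of the curvature term, a short lookahead produces sharp steering corrections and a long one produces gentle ones, which is the trade-off the abstract describes.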


[147] 2603.28747

Constrained Optimization on Matrix Lie Groups via Interior-Point Method

This paper proposes an interior-point framework for constrained optimization problems whose decision variables evolve on matrix Lie groups. The proposed method, termed the Matrix Lie Group Interior-Point Method (MLG-IPM), operates directly on the group structure using a minimal Lie algebra parametrization, avoiding redundant matrix representations and eliminating explicit dependence on Riemannian metrics. A primal-dual formulation is developed in which the Newton system is constructed through sensitivity and curvature matrices. Also, multiplicative updates are performed via the exponential map, ensuring intrinsic feasibility with respect to the group structure while maintaining strict positivity of slack and dual variables through a barrier strategy. A local analysis establishes quadratic convergence under standard regularity assumptions and characterizes the behavior under inexact Newton steps. Statistical comparisons against Riemannian Interior-Point Methods, specifically for optimization problems defined over the Special Orthogonal Group SO(n) and Special Linear Group SL(n), demonstrate that the proposed approach achieves higher success rates, fewer iterations, and superior numerical accuracy. Furthermore, its robustness under perturbations suggests that this method serves as a consistent and reliable alternative for structured manifold optimization.
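
The idea of a multiplicative update through the exponential map can be illustrated on SO(3). The sketch below is not the MLG-IPM algorithm: it only shows how a step expressed in minimal Lie algebra coordinates (a 3-vector) is mapped to a rotation via Rodrigues' formula and composed with the current iterate, so that the result stays on the group. Right-multiplication and the helper names are assumptions made for illustration.

```python
import math

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix in so(3)."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def exp_so3(w):
    """Rodrigues' formula: exp(hat(w)) = I + a*K + b*K^2 with K = hat(w)."""
    theta = math.sqrt(sum(c * c for c in w))
    K = hat(w)
    K2 = matmul(K, K)
    if theta < 1e-8:
        a, b = 1.0, 0.5  # leading Taylor coefficients near the identity
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
    return [[IDENTITY[i][j] + a * K[i][j] + b * K2[i][j] for j in range(3)]
            for i in range(3)]

def multiplicative_update(R, step):
    """Compose the iterate R with exp(hat(step)); right-multiplication is an
    assumption here, not a detail taken from the paper."""
    return matmul(R, exp_so3(step))

# A quarter turn about the z-axis, starting from the identity.
R_next = multiplicative_update(IDENTITY, (0.0, 0.0, math.pi / 2))
print(R_next)
```

Since every update is a product of rotations, the iterate remains orthogonal up to floating-point error without any projection step, which is what intrinsic feasibility with respect to the group structure means in this setting.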


[148] 1906.05284

Image-Adaptive GAN based Reconstruction

In recent years, there has been a significant improvement in the quality of samples produced by (deep) generative models such as variational auto-encoders and generative adversarial networks. However, the representation capabilities of these methods still do not capture the full distribution for complex classes of images, such as human faces. This deficiency has been clearly observed in previous works that use pre-trained generative models to solve imaging inverse problems. In this paper, we propose mitigating the limited representation capabilities of generators by making them image-adaptive and by enforcing compliance of the restoration with the observations via back-projections. We empirically demonstrate the advantages of our proposed approach for image super-resolution and compressed sensing.
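
A back-projection step of the kind mentioned above can be sketched on a toy problem. One common form, assumed here, is x_bp = x_hat + A^+ (y - A x_hat), where A is the observation operator and A^+ its pseudo-inverse. The 1-D pair-averaging operator, the noiseless observations, and the function names below are all hypothetical choices for illustration; the paper's actual operators and generator are not reproduced.

```python
def downsample(x):
    """Toy observation operator A: average each adjacent pair of samples
    (the input length is assumed to be even)."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x), 2)]

def pinv_apply(r):
    """Pseudo-inverse A^+ of the pair-averaging operator: since A A^T = I/2,
    A^+ = 2 A^T, which copies each low-resolution value to both samples."""
    out = []
    for v in r:
        out.extend([v, v])
    return out

def back_project(x_hat, y):
    """Correct an estimate x_hat so that it agrees with the observations y:
    x_bp = x_hat + A^+ (y - A x_hat)."""
    residual = [yi - ai for yi, ai in zip(y, downsample(x_hat))]
    correction = pinv_apply(residual)
    return [xi + ci for xi, ci in zip(x_hat, correction)]

x_hat = [1.0, 3.0, 2.0, 2.0]  # stand-in for a generator output
y = [3.0, 1.0]                # noiseless low-resolution observations
x_bp = back_project(x_hat, y)
print(x_bp, downsample(x_bp))
```

After the correction, applying the observation operator to `x_bp` reproduces `y`, while the detail that the operator cannot see (here, the difference within each pair) is kept from the original estimate.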


[149] 2411.09363

When Mamba Meets xLSTM: An Efficient and Precise Method with the xLSTM-VMUNet Model for Skin lesion Segmentation

Automatic melanoma segmentation is essential for early skin cancer detection, yet challenges arise from the heterogeneity of melanoma, as well as interfering factors like blurred boundaries, low contrast, and imaging artifacts. While numerous algorithms have been developed to address these issues, previous approaches have often overlooked the need to jointly capture spatial and sequential features within dermatological images. This limitation hampers segmentation accuracy, especially in cases with indistinct borders or structurally similar lesions. Additionally, previous models lacked both a global receptive field and high computational efficiency. In this work, we present the xLSTM-VMUNet model, which jointly captures spatial and sequential features within dermatological images. xLSTM-VMUNet not only specializes in extracting spatial features from images, focusing on the structural characteristics of skin lesions, but also enhances contextual understanding, allowing more effective handling of complex medical image structures. Experimental results demonstrate that xLSTM-VMUNet outperforms VMUNet by 4.85% on DSC and 6.41% on IoU on the ISIC2017 dataset, and by 1.25% on DSC and 2.07% on IoU on the ISIC2018 dataset, with faster convergence and consistently high segmentation performance. Our code is available at this https URL.


[150] 2411.19765

Secure Filtering against Spatio-Temporal False Data Attacks under Asynchronous Sampling

This paper addresses the secure state estimation problem for continuous linear time-invariant systems with non-periodic and asynchronous sampled measurements, where the sensors need to transmit not only measurements but also sampling time-stamps to the fusion center. This measurement and communication setup is well-suited for operating large-scale control systems and, at the same time, introduces new vulnerabilities that can be exploited by adversaries through (i) manipulation of measurements, (ii) manipulation of time-stamps, (iii) elimination of measurements, (iv) generation of completely new false measurements, or a combination of these attacks. To mitigate these attacks, we propose a decentralized estimation algorithm in which each sensor maintains its local state estimate asynchronously based on its measurements. The local states are synchronized through time prediction and fused after time-stamp alignment. In the absence of attacks, state estimates are proven to recover the optimal Kalman estimates by solving a weighted least square problem. In the presence of attacks, solving this weighted least square problem with the aid of $\ell_1$ regularization provides secure state estimates with uniformly bounded error under an observability redundancy assumption. The effectiveness of the proposed algorithm is demonstrated using a benchmark example of the IEEE 14-bus system.


[151] 2412.04620

A CAV-based perimeter-free regional traffic control strategy utilizing existing parking infrastructure

This paper proposes a novel perimeter-free regional traffic management strategy for networks under a connected and autonomous vehicle (CAV) environment. The proposed strategy requires a subset of CAVs to temporarily wait at nearby parking facilities when the network is congested. After a designated holding time, these CAVs are allowed to re-enter the network. Doing so helps reduce congestion and improve overall operational efficiency. Unlike traditional perimeter control approaches, the proposed strategy leverages existing parking infrastructure to temporarily hold vehicles in a way that partially avoids local queue accumulation issues. Further, holding the vehicles with the longest remaining travel distances creates a self-reinforcing mechanism which helps reduce congestion more quickly than perimeter metering control. Simulation results show that the proposed strategy not only reduces travel time for vehicles that are not held, but can also reduce travel times for some of the held vehicles as well. Importantly, its performance has been demonstrated under various configurations of parking locations and capacities and CAV penetration rates.


[152] 2501.08667

TimeFlow: Temporal Conditioning for Longitudinal Brain MRI Registration and Aging Analysis

Longitudinal brain analysis is essential for understanding healthy aging and identifying pathological deviations. Longitudinal registration of sequential brain MRI underpins such analyses. However, existing methods are limited by reliance on densely sampled time series, a trade-off between accuracy and temporal smoothness, and an inability to prospectively forecast future brain states. To overcome these challenges, we introduce TimeFlow, a learning-based framework for longitudinal brain MRI registration. TimeFlow uses a U-Net backbone with temporal conditioning to model neuroanatomy as a continuous function of age. Given only two scans from an individual, TimeFlow estimates accurate and temporally coherent deformation fields, enabling non-linear extrapolation to predict future brain states. This is achieved by our proposed inter-/extra-polation consistency constraints applied to both the deformation fields and deformed images. Remarkably, these constraints preserve temporal consistency and continuity without requiring explicit smoothness regularizers or densely sampled sequential data. Extensive experiments demonstrate that TimeFlow outperforms state-of-the-art methods in terms of both future timepoint forecasting and registration accuracy. Moreover, TimeFlow supports novel biological brain aging analyses by differentiating neurodegenerative trajectories from normal aging without requiring segmentation, thereby eliminating the need for labor-intensive annotations and mitigating segmentation inconsistency. TimeFlow offers an accurate, data-efficient, and annotation-free framework for longitudinal analysis of brain aging and chronic diseases, capable of forecasting brain changes beyond the observed study period.


[153] 2503.08915

Reconstruct Anything Model: a lightweight general model for computational imaging

Most existing learning-based methods for solving imaging inverse problems can be roughly divided into two classes: iterative algorithms, such as plug-and-play and diffusion methods leveraging pretrained denoisers, and unrolled architectures that are trained end-to-end for specific imaging problems. Iterative methods in the first class are computationally costly and often yield suboptimal reconstruction performance, whereas unrolled architectures are generally problem-specific and require expensive training. In this work, we propose a novel non-iterative, lightweight architecture that incorporates knowledge about the forward operator (acquisition physics and noise parameters) without relying on unrolling. Our model is trained to solve a wide range of inverse problems, such as deblurring, magnetic resonance imaging, computed tomography, inpainting, and super-resolution, and handles arbitrary image sizes and channels, such as grayscale, complex, and color data. The proposed model can be easily adapted to unseen inverse problems or datasets with a few fine-tuning steps (up to a few images) in a self-supervised way, without ground-truth references. Throughout a series of experiments, we demonstrate state-of-the-art performance from medical imaging to low-photon imaging and microscopy. Our code is available at this https URL.


[154] 2503.13479

EAGLE: Contextual Point Cloud Generation via Adaptive Continuous Normalizing Flow with Self-Attention

As 3D point clouds become the prevailing shape representation in computer vision, generating high-quality point clouds remains a challenging problem. Flow-based models have shown strong potential due to exact likelihood estimation and invertible mappings. However, existing flow-based methods for point clouds typically rely on point-wise feature extractors, which limits their ability to model long-range dependencies and global structural relationships among points. Inspired by the wide adoption of Transformers, we explore the complementary roles of self-attention mechanisms, CNNs, and flow-based models. To this end, we propose EAGLE, a probabilistic generative model that integrates self-attention mechanisms with adaptive continuous normalizing flows. The self-attention module explicitly models pairwise dependencies among points, enabling effective capture of global contextual information. In addition, we introduce an adaptive bias correction mechanism within flow-based models, which dynamically adjusts to different input contexts and alleviates bias-drift issues. Extensive experiments on the ShapeNet and ModelNet datasets demonstrate the effectiveness of our proposed method.


[155] 2504.15540

Explicit Ensemble Mean Clock Synchronization for Optimal Atomic Time Scale Generation

This paper presents a novel theoretical framework, called explicit ensemble mean (EEM) synchronization. This framework unifies time scale generation, clock synchronization, and oscillator frequency regulation within the systems and control theory paradigm. By exploiting the observable canonical decomposition of a standard atomic ensemble clock model, the system is decomposed into two complementary components: the observable part, which represents the synchronization error, and the unobservable part, which captures the synchronization destination. Within this structure, we mathematically prove that standard Kalman filtering, which is widely used in current time scale generation, not only performs observable state estimation, but also significant unobservable state estimation, and it can be interpreted as a special case of the proposed framework that optimizes long-term frequency stability in terms of the Allan variance. Furthermore, applying state feedback control based on Kalman filtering to each component achieves optimal time scale generation, clock synchronization, and oscillator frequency regulation in a unified manner. The proposed framework provides a foundation for developing explainable timing systems.


[156] 2505.07240

Continuous-Time Control Synthesis for Multiple Quadrotors under Signal Temporal Logic Specifications

Continuous-time control of multiple quadrotors in constrained environments under signal temporal logic (STL) specifications is challenging due to their nonlinear dynamics, safety constraints, and the requirement to ensure continuous-time satisfaction of the specifications. A two-stage framework is proposed to address this challenge. First, based on geometric control, a Lyapunov-based analysis of the rotational tracking dynamics is performed to facilitate multidimensional gain design. In addition, tracking-error bounds for subsequent STL robustness analysis are derived. Second, using the tracking-error bounds, a mixed-integer convex programming (MICP)-based planning framework with a backward-recursive scheme is developed. The framework is used to generate reference trajectories that satisfy multi-agent STL tasks while meeting the trajectory requirements imposed by geometric control. Numerical simulations demonstrate that, compared with uniform gains, the optimized multidimensional gains yield less conservative time-varying bounds, mitigate oscillations, and improve transient performance, while the proposed framework ensures the satisfaction of multi-agent STL tasks in constrained environments with provable tracking guarantees.


[157] 2506.01399

Captivity-Escape Games as a Means for Safety in Online Motion Generation

This paper presents a method that addresses the conservatism, computational effort, and limited numerical accuracy of existing frameworks and methods that ensure safety in online model-based motion generation, commonly referred to as fast and safe tracking. Computational limitations restrict online motion planning to low-fidelity models. However, planning with low-fidelity models compromises safety, as the dynamic feasibility of resulting references is not ensured. This potentially leads to unavoidable tracking errors that may cause safety-critical constraint violations. Existing frameworks mitigate this safety risk by augmenting safety-critical constraints in motion planning by a safety margin that prevents constraint violations under worst-case tracking errors. However, the methods employed in these frameworks determine the safety margin based on a heuristically selected performance of the model used for planning, which likely results in overly conservative references. Furthermore, these methods are computationally intensive, and the state-of-the-art method is limited in numerical accuracy. We adopt a different perspective and address these limitations with a method that mitigates conservatism in existing frameworks by adapting the performance of the model used for planning to a given safety margin. Our method achieves numerical accuracy and requires significantly less computation time than existing methods by leveraging a captivity-escape game, which is a novel zero-sum differential game formulated in this paper. We demonstrate our method using a numerical example and compare it to the state of the art.


[158] 2506.08861

Distributed component-level modeling and control of energy dynamics in electric power systems

The widespread deployment of power electronic technologies is transforming modern power systems into fast, nonlinear, and heterogeneous networks. Conventional modeling and control approaches, rooted in quasi-static analysis and centralized architectures, are inadequate for these converter-dominated systems operating on fast timescales with diverse and proprietary component models. This paper adopts and extends a previously introduced energy space modeling framework grounded in energy conservation principles to address these challenges. We generalize the notion of a port interaction variable, which encodes energy exchange between interconnected components in a unified manner. A multilayered distributed control architecture is proposed in which dynamics of each component are lifted to a linear energy space through well-defined mappings. Distributed control with provable convergence guarantees is derived in energy space using only local states and minimal neighbor information communicated through port interactions. The framework is validated using two examples: voltage regulation in an inverter-controlled RLC circuit and frequency regulation of a synchronous generator. The energy-based controllers show improved transient and steady-state performance with reduced control effort compared to conventional methods.


[159] 2506.17337

Can Generalist Vision Language Models (VLMs) Rival Specialist Medical VLMs? Benchmarking and Strategic Insights

Vision Language Models (VLMs) have shown promise in automating image diagnosis and interpretation in clinical settings. However, developing specialist medical VLMs requires substantial computational resources and carefully curated datasets, and it remains unclear under which conditions generalist and specialist medical VLMs each perform best. This study highlights the complementary strengths of specialist medical and generalist VLMs. Specialists remain valuable in modality-aligned use cases, but we find that efficiently fine-tuned generalist VLMs can achieve comparable or even superior performance in most tasks, particularly when transferring to unseen or rare OOD medical modalities. These results suggest that generalist VLMs, rather than being constrained by their lack of specialist medical pretraining, may offer a scalable and cost-effective pathway for advancing clinical AI development.


[160] 2506.20882

Resilience Through Escalation: A Graph-Based PACE Architecture for Satellite Threat Response

Modern satellite systems face increasing operational risks from jamming, cyberattacks, and electromagnetic disruptions in contested space environments. Traditional redundancy strategies often fall short against such dynamic and multi-vector threats. This paper introduces a resilience-by-design framework grounded in the PACE (Primary, Alternate, Contingency, Emergency) methodology, originally developed for tactical communications in military operations, and adapts it to satellite systems through a layered state transition model informed by threat scoring frameworks such as CVSS, DREAD, and NASA's risk matrix. We define a dynamic resilience index to quantify system adaptability and implement three PACE variants (static, adaptive, and epsilon-greedy reward-optimized) to evaluate resilience under diverse disruption scenarios. Results show that lightweight, decision-aware fallback mechanisms can substantially improve survivability and operational continuity for next-generation space assets.


[161] 2506.21798

Adaptive Multipath-Based SLAM for Distributed MIMO Systems

Localizing users and mapping the environment using radio signals is a key task in emerging applications such as low-latency communications and safety-critical navigation. Recently introduced multipath-based SLAM methods can jointly localize a mobile agent and map reflective surfaces in radio frequency (RF) environments. Most existing methods assume that map features and their corresponding RF propagation paths are statistically independent. This assumption neglects inherent dependencies that arise when a single reflective surface contributes to multiple propagation paths or when an agent communicates with multiple base stations. Existing approaches that aim to fuse information across propagation paths are further limited by their inability to perform ray tracing in RF environments with nonconvex geometries. In this paper, we propose a Bayesian multipath-based SLAM method for distributed MIMO systems that addresses these limitations. We exploit amplitude statistics to establish adaptive, time-varying detection probabilities. Based on the resulting 'soft' ray-tracing strategy, the proposed method can fuse information across propagation paths in RF environments with nonconvex geometries. A Bayesian estimation framework for the joint estimation of map features and agent state is developed by applying the message passing rules of the sum-product algorithm to a factor graph representation of the proposed statistical model. We further introduce a new initialization procedure for reflective surfaces that enables the introduction of new surface states even when measurements arise solely from double-bounce paths. The proposed method is validated using both synthetic and real RF measurements obtained in challenging scenarios with nonconvex geometries and OLoS conditions. The results demonstrate that it provides accurate localization and mapping performance and approaches the posterior CRLBs.


[162] 2507.18493

Global Observer Design for a Class of Linear Observed Systems on Groups

Linear observed systems on groups encode the geometry of a variety of practical state estimation problems. In this paper, we propose an observer framework for a class of linear observed systems by restricting a bi-invariant system on a Lie group to its normal subgroup. This structural property enables a system embedding of the original system into a linear time-varying system. An observer is constructed by first designing a Kalman-like observer for the embedded system and then reconstructing the group-valued state via optimization. Under an extrinsic observability rank condition, global exponential stability (GES) is achieved provided that one global optimum of the reconstruction optimization is found, reflecting the topological difficulties inherent to the non-Euclidean state space. Semi-global stability is guaranteed when input biases are jointly estimated. The theory is applied to the GES observer design for two-frame systems, capable of modeling a family of navigation problems. Simulations are provided to illustrate the implementation details.


[163] 2507.22513

PINN and GNN-based RF Map Construction for Wireless Communication Systems

Radio frequency (RF) map is a promising technique for capturing the characteristics of multipath signal propagation, offering critical support for channel modeling, coverage analysis, and beamforming in wireless communication networks. This paper proposes a novel RF map construction method based on a combination of physics-informed neural network (PINN) and graph neural network (GNN). The PINN incorporates physical constraints derived from electromagnetic propagation laws to guide the learning process, while the GNN models spatial correlations among receiver locations. By parameterizing multipath signals into received power, delay, and angle of arrival (AoA), and integrating both physical priors and spatial dependencies, the proposed method achieves accurate prediction of multipath parameters. Experimental results demonstrate that the method enables high-precision RF map construction under sparse sampling conditions and delivers robust performance in both indoor and complex outdoor environments, outperforming baseline methods in terms of generalization and accuracy.


[164] 2509.19315

Advancing Few-Shot Pediatric Arrhythmia Classification with a Novel Contrastive Loss and Multimodal Learning

Arrhythmias are a major cause of sudden cardiac death in children, making automated rhythm classification from electrocardiograms (ECGs) clinically important. However, pediatric arrhythmia analysis remains challenging because of age-dependent waveform variability, limited data availability, and a pronounced long-tailed class distribution that hinders recognition of rare but clinically important rhythms. To address these issues, we propose a multimodal end-to-end framework that integrates surface ECG and intracardiac electrogram (IEGM) signals for pediatric arrhythmia classification. The model combines dual-branch feature encoders, attention-based cross-modal fusion, and a lightweight Transformer classifier to learn complementary electrophysiological representations. We further introduce an Adaptive Global Class-Aware Contrastive Loss (AGCACL), which incorporates prototype-based alignment, class-frequency reweighting, and globally informed hard-class modulation to improve intra-class compactness and inter-class separability under class imbalance. We evaluate the proposed method on the pediatric subset of the Leipzig Heart Center ECG-Database and establish a reproducible preprocessing pipeline including rhythm-segment construction, denoising, and label grouping. The proposed approach achieves 96.22% Top-1 accuracy and improves macro precision, macro recall, macro F1 score, and macro F2 score by 4.48, 1.17, 6.98, and 7.34 percentage points, respectively, over the strongest baseline. These results indicate improved minority-sensitive classification performance on the current benchmark. However, further validation under subject-independent and multicenter settings is still required before clinical translation.


[165] 2510.00180

DiffAU: Diffusion-Based Ambisonics Upscaling

Spatial audio enhances immersion by reproducing 3D sound fields, with Ambisonics offering a scalable format for this purpose. While first-order Ambisonics (FOA) notably facilitates hardware-efficient acquisition and storage of sound fields as compared to high-order Ambisonics (HOA), its low spatial resolution limits realism, highlighting the need for Ambisonics upscaling (AU) as an approach for increasing the order of Ambisonics signals. In this work, we propose DiffAU, a cascaded AU method that leverages recent developments in diffusion models, combined with a novel adaptation to spatial audio, to generate third-order Ambisonics from FOA. By learning data distributions, DiffAU provides a principled approach that rapidly and reliably reproduces HOA in various settings. Experiments in anechoic conditions with multiple speakers show strong objective and perceptual performance.


[166] 2510.01818

Joint Optimization of Speaker and Spoof Detectors for Spoofing-Robust Automatic Speaker Verification

Spoofing-robust speaker verification (SASV) combines the tasks of speaker and spoof detection to authenticate speakers under adversarial settings. Many SASV systems rely on fusion of speaker and spoof cues at embedding, score or decision levels, based on independently trained subsystems. In this study, we respect similar modularity of the two subsystems, by integrating their outputs using trainable back-end classifiers. In particular, we explore various approaches for directly optimizing the back-end for the recently-proposed SASV performance metric (a-DCF) as a training objective. Our experiments on the ASVspoof 5 dataset demonstrate two important findings: (i) nonlinear score fusion consistently improves a-DCF over linear fusion, and (ii) the combination of weighted cosine scoring for speaker detection with SSL-AASIST for spoof detection achieves state-of-the-art performance, reducing min a-DCF to 0.196 and SPF-EER to 7.6%. These contributions highlight the importance of modular design, calibrated integration, and task-aligned optimization for advancing robust and interpretable SASV systems.


[167] 2510.17775

Sample Complexity Analysis of Multi-Target Detection via Markovian and Hard-Core Multi-Reference Alignment

Motivated by single-particle cryo-electron microscopy, we study the sample complexity of the multi-target detection (MTD) problem, in which an unknown signal appears multiple times at unknown locations within a long, noisy observation. We propose a patching scheme that reduces MTD to a non-i.i.d. multi-reference alignment (MRA) model. In the one-dimensional setting, the latent group elements form a Markov chain, and we show that the convergence rate of any estimator matches that of the corresponding i.i.d. MRA model, up to a logarithmic factor in the number of patches. Moreover, for estimators based on empirical averaging, such as the method of moments, the convergence rates are identical in both settings. We further establish an analogous result in two dimensions, where the latent structure arises from an exponentially mixing random field generated by a hard-core placement model. As a consequence, if the signal in the corresponding i.i.d. MRA model is determined by moments up to order $n_{\min}$, then in the low-SNR regime the number of patches required to estimate the signal in the MTD model scales as $\sigma^{2n_{\min}}$, where $\sigma^2$ denotes the noise variance.


[168] 2510.19608

Optimal Kron-based Reduction of Networks (Opti-KRON) for Three-phase Distribution Feeders

This paper presents a novel structure-preserving, Kron-based reduction framework for unbalanced distribution feeders. The method aggregates electrically similar nodes within a mixed-integer optimization (MIP) problem to produce reduced networks that optimally reproduce the voltage profiles of the original full network. To overcome computational bottlenecks of MIP formulations, we propose an exhaustive-search formulation to identify optimal aggregation decisions while enforcing voltage margin limits. The proposed exhaustive network reduction algorithm is parallelizable on GPUs, which enables scalable network reduction. The resulting reduced networks approximate the full system's voltage profiles with low errors and are suitable for steady-state analysis and optimal power flow studies. The framework is validated on two real utility distribution feeders with 5,991 and 8,381 nodes. The reduced models achieve up to 90% and 80% network reduction, respectively, while the maximum voltage-magnitude error remains below 0.003 p.u. Furthermore, on a 1000-node version of the network, the GPU-accelerated reduction algorithm runs up to 15x faster than its CPU-based counterpart.
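For readers unfamiliar with the underlying operation, the following is a minimal sketch of a classical single-node Kron elimination step on a nodal admittance matrix, assuming the eliminated node has zero current injection. It is not the Opti-KRON aggregation or search procedure described above; the function name `kron_eliminate` and the three-node example are hypothetical.

```python
def kron_eliminate(Y, k):
    """Eliminate node k from admittance matrix Y (list of lists).

    Assumes node k has zero current injection, so that
    Y'[i][j] = Y[i][j] - Y[i][k] * Y[k][j] / Y[k][k].
    Entries may be real or complex numbers.
    """
    pivot = Y[k][k]
    if pivot == 0:
        raise ValueError("cannot eliminate a node with zero self-admittance")
    keep = [i for i in range(len(Y)) if i != k]
    return [[Y[i][j] - Y[i][k] * Y[k][j] / pivot for j in keep] for i in keep]


# Three-node chain with two series branches of admittance 2 each.
Y_full = [[2, -2, 0],
          [-2, 4, -2],
          [0, -2, 2]]
Y_reduced = kron_eliminate(Y_full, 1)  # [[1.0, -1.0], [-1.0, 1.0]]
```

Eliminating the middle node leaves a single equivalent branch of admittance 1, which is the series combination of the two original branches.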


[169] 2510.23226

Inertia Partitioning Modular Robust Control Framework for Reconfigurable Multibody Systems

A novel modular modeling and control framework based on Lagrangian mechanics is proposed for multibody systems, motivated by the challenges of modular control of systems with closed kinematic chains and by the need for a modeling framework that remains locally updatable under reconfiguration of body-level geometric and inertial properties. In the framework, modularity is defined with respect to the degrees of freedom of the multibody system, represented in the model by the minimal generalized coordinates, and the inertial properties of each body are partitioned with respect to how they are reflected in the kinetic energy of the system through the motion induced by each degree of freedom. By expressing body contributions through body-fixed-frame Jacobians and spatial inertia matrices, the dynamic model remains locally updatable under changes in geometric and inertial parameters, which is advantageous for reconfigurable multibody systems. For multibody systems in which a mapping between the auxiliary and minimal generalized coordinates is available, the approach accommodates closed kinematic chains in a minimal-coordinate ordinary-differential-equation form without explicit constraint-force calculation or differential-algebraic-equation formulation. Based on the resulting modular equations of motion, a robust model-based controller is designed for trajectory tracking, and practical boundedness of the tracking error is analyzed under bounded uncertainty and external disturbance. The proposed framework is implemented in simulation on a three-degree-of-freedom series-parallel manipulator, where uncertainties and disturbances are introduced to assess robustness. The results are consistent with the expected stability and tracking performance, indicating the potential of the framework for trajectory-tracking control of reconfigurable multibody systems with closed kinematic chains.


[170] 2511.08504

Low Overhead and Scalable Time-Frequency Pilots Design for MIMO OTFS Channel Estimation

Orthogonal Time Frequency Space (OTFS) modulation has recently garnered attention for its robustness in high-mobility wireless communication environments. In OTFS, the data symbols are mapped to the Doppler-Delay (DD) domain. In this paper, we address low-overhead, scalable pilot-aided estimation of channel state information (CSI) for MIMO OTFS systems. Existing channel estimation techniques either require non-overlapping DD domain pilots with guard regions across multiple antennas, thus sacrificing significant communication rate as the number of transmit antennas increases, or allow pilots to overlap between antennas and rely on high-complexity methods to mitigate pilot pollution. We propose a novel pilot placement approach that embeds pilots within the time-frequency (TF) frame of each OTFS burst, along with a new use of TF and DD guard bins to preserve waveform orthogonality on the TF pilot bins and data integrity in the DD domain, respectively. The proposed pilot placement enables low-complexity coarse estimation of the channel parameters. Moreover, the pilot orthogonality allows the construction of a virtual array (VA), enabling the formulation of a sparse signal recovery (SSR) problem in which the coarse estimates are used to build a low-dimensional dictionary matrix. The SSR solution then yields high-resolution estimates of the channel parameters. Simulation results show that the proposed approach achieves good performance with very low overhead and is robust to pilot pollution. Importantly, the required overhead is independent of the number of transmit antennas, ensuring scalability to large MIMO arrays. The proposed approach accounts for practical rectangular transmit pulse shaping and receiver matched filtering, as well as fractional Doppler effects.


[171] 2511.09588

Diffusion-Based Quality Control of Medical Image Segmentations across Organs

Medical image segmentation using deep learning (DL) has enabled the development of automated analysis pipelines for large-scale population studies. However, state-of-the-art DL methods are prone to hallucinations, which can result in anatomically implausible segmentations. With manual correction impractical at scale, automated quality control (QC) techniques have to address the challenge. While promising, existing QC methods are organ-specific, limiting their generalizability and usability beyond their original intended task. To overcome this limitation, we propose no-new Quality Control (nnQC), a robust QC framework based on a diffusion-generative paradigm that self-adapts to any input organ dataset. Central to nnQC is a novel Team of Experts (ToE) architecture, where two specialized experts independently encode 3D spatial awareness, represented by the relative spatial position of an axial slice, and anatomical information derived from visual features of the original image. A weighted conditional module dynamically combines the pair of independent embeddings, or opinions, to condition the sampling mechanism within a diffusion process, enabling the generation of a spatially aware pseudo-ground truth for predicting QC scores. Within its framework, nnQC integrates fingerprint adaptation to ensure adaptability across organs, datasets, and imaging modalities. We evaluated nnQC on seven organs using twelve publicly available datasets. Our results demonstrate that nnQC consistently outperforms state-of-the-art methods across all experiments, including cases where segmentation masks are highly degraded or completely missing, confirming its versatility and effectiveness across different organs.


[172] 2511.12051

Near-Real-Time InSAR Phase Estimation for Large-Scale Surface Displacement Monitoring

Operational near-real-time monitoring of Earth's surface deformation using Interferometric Synthetic Aperture Radar (InSAR) requires processing algorithms that efficiently incorporate new acquisitions without reprocessing historical archives. We present a sequential phase linking approach using compressed single-look-complex images (SLCs) capable of producing surface displacement estimates within hours of a new acquisition. Our key algorithmic contribution is a mini-stack reference scheme that maintains phase consistency across processing batches without adjusting or re-estimating previous time steps, enabling straightforward operational deployment. We introduce online methods for persistent and distributed scatterer identification that adapt to temporal changes in surface properties through incremental amplitude statistics updates. The processing chain incorporates multiple complementary metrics for pixel quality that are reliable for small SLC stack sizes, and an L1-norm network inversion to limit propagation of unwrapping errors across the time series. We use our algorithm to produce the OPERA Surface Displacement from Sentinel-1 product, the first continental-scale surface displacement product over North America. Validation against GPS measurements and InSAR residual analysis demonstrates millimeter-level agreement in velocity estimates in varying environmental conditions. We demonstrate our algorithm's capabilities with a successful recovery of meter-scale co-eruptive displacement at Kilauea volcano during the 2018 eruption, as well as detection of subtle uplift at Three Sisters volcano, Oregon -- a challenging environment for C-band InSAR due to dense vegetation and seasonal snow. We have made all software available as open source libraries, providing a significant advancement to the open scientific community's ability to process large InSAR data sets in a cloud environment.


[173] 2511.12540

A mixed-signal analogue front-end for brain-implantable neural interfaces using a digital fixed-point IIR filter and bulk offset cancellation

Advances in miniaturised implantable neural electronics have paved the way for therapeutic brain-computer interfaces with clinical potential for movement disorders, epilepsy, and broader neurological applications. This paper presents a mixed-signal analogue front end (AFE) designed to record both extracellular action potentials (EAPs) and local field potentials (LFPs). The feedforward path integrates a low-noise amplifier (LNA) and a successive-approximation-register (SAR) analogue-to-digital converter (ADC), while the feedback path employs a fixed-point infinite-impulse-response (IIR) Chebyshev Type II low-pass filter to suppress sub-mHz components via bulk-voltage control of the LNA input differential pair using two R-2R pseudo-resistor digital-to-analogue converters (DACs). The proposed AFE achieves up to 41.42dB gain, consumes 2.178uA per channel, occupies 0.198mm2 per channel, and supports neural signal monitoring from 0.1Hz to 10kHz with 3.59uVrms input-referred integrated noise.


[174] 2511.12931

cryoSENSE: Compressive Sensing Enables High-throughput Microscopy with Sparse and Generative Priors on the Protein Cryo-EM Image Manifold

Cryo-electron microscopy (cryo-EM) enables the atomic-resolution visualization of biomolecules; however, modern direct detectors generate data volumes that far exceed the available storage and transfer bandwidth, thereby constraining practical throughput. We introduce cryoSENSE, the computational realization of a hardware-software co-designed framework for compressive cryo-EM sensing and acquisition. We show that cryo-EM images of proteins lie on low-dimensional manifolds that can be independently represented using sparse priors in predefined bases and generative priors captured by a denoising diffusion model. cryoSENSE leverages these low-dimensional manifolds to enable faithful image reconstruction from spatial and Fourier-domain undersampled measurements while preserving downstream structural resolution. In experiments, cryoSENSE increases acquisition throughput by up to 2.5$\times$ while retaining the original 3D resolution, offering controllable trade-offs between the number of masked measurements and the level of downsampling. Sparse priors favor faithful reconstruction from Fourier-domain measurements and moderate compression, whereas generative diffusion priors achieve accurate recovery from pixel-domain measurements and more severe undersampling. Project website: this https URL.


[175] 2511.15238

Computing Sound Lower and Upper Bounds on Hamilton-Jacobi Reach-Avoid Value Functions

Hamilton-Jacobi (HJ) reachability analysis is a fundamental tool for the safety verification and control synthesis of nonlinear control systems. Classical HJ reachability analysis methods compute value functions over grids which discretize the continuous state space. Such approaches do not account for discretization errors and thus do not guarantee that the sets represented by the computed value functions over-approximate the backward reachable sets (BRS) when given avoid specifications or under-approximate the reach-avoid sets (RAS) when given reach-avoid specifications. We address this issue by presenting an algorithm for computing sound upper and lower bounds on the HJ value functions that guarantee the sound over-approximation of BRS and under-approximation of RAS. Additionally, we develop a refinement algorithm that splits the grid cells which could not be classified as within or outside the BRS or RAS given the computed bounds to obtain corresponding tighter bounds. We validate the effectiveness of our algorithm in two case studies.


[176] 2512.17286

OpenPathNet: An Open-Source RF Multipath Data Generator for AI-Driven Wireless Systems

The convergence of artificial intelligence (AI) and sixth-generation (6G) wireless technologies is driving an urgent need for large-scale, high-fidelity, and reproducible radio frequency (RF) datasets. Existing resources, such as CKMImageNet, primarily provide preprocessed and image-based channel representations, which conceal the fine-grained physical characteristics of signal propagation that are essential for effective AI modeling. To bridge this gap, we present OpenPathNet, an open-source RF multipath data generator accompanied by a publicly released dataset for AI-driven wireless research. Distinct from prior datasets, OpenPathNet offers disaggregated and physically consistent multipath parameters, including per-path gain, time of arrival (ToA), and spatial angles, derived from high-precision ray tracing simulations constructed on real-world environment maps. By adopting a modular, parameterized pipeline, OpenPathNet enables reproducible generation of multipath data and can be readily extended to new environments and configurations, improving scalability and transparency. The released generator and accompanying dataset provide an extensible testbed that holds promise for advancing studies on channel modeling, beam prediction, environment-aware communication, and integrated sensing in AI-enabled 6G systems. The source code and dataset are publicly available at this https URL.


[177] 2512.19846

A Class of Axis-Angle Attitude Control Laws for Rotational Systems

We introduce a new class of attitude control laws for rotational systems; the proposed framework generalizes the use of the Euler axis--angle representation beyond quaternion-based formulations. Using basic Lyapunov stability theory and the notion of extended class $\mathcal{K}$ function, we develop a method for determining and enforcing the global asymptotic stability of the single fixed point of the resulting closed-loop (CL) scheme. In contrast with traditional quaternion-based methods, the introduced generalized axis--angle approach enables greater flexibility in the design of the control law, which is of great utility when employed in combination with a switching scheme whose transition state depends on the angular velocity of the controlled rotational system. Through simulation and real-time experimental results, we demonstrate the effectiveness of the developed formulation. According to the recorded data, in the execution of high-speed tumble-recovery maneuvers, the new method consistently achieves shorter stabilization times and requires lower control effort relative to those corresponding to the quaternion-based and geometric-control methods used as benchmarks.


[178] 2602.01537

LMI Optimization Based Multirate Steady-State Kalman Filter Design

This paper presents an LMI-based design framework for multirate steady-state Kalman filters in systems with sensors operating at different sampling rates. The multirate system is formulated as a periodic time-varying system, where the Kalman gains converge to periodic steady-state values that repeat every frame period. Cyclic reformulation transforms this into a time-invariant problem; however, the resulting measurement noise covariance becomes semidefinite rather than positive definite, preventing direct application of standard Riccati equation methods. I address this through a dual LQR formulation with LMI optimization that naturally handles semidefinite covariances. The framework enables multi-objective design, supporting pole placement for guaranteed convergence rates and $l_2$-induced norm constraints for balancing average and worst-case performance. Numerical validation using an automotive navigation system with GPS and wheel speed sensors, including Monte Carlo simulation with 500 independent noise realizations, demonstrates that the proposed filter achieves a position RMSE well below the GPS noise level through effective multirate sensor fusion, and that the LMI solution provides valid upper bounds on the estimation error covariance.


[179] 2602.07029

Guidestar-Free Adaptive Optics with Asymmetric Apertures

This work introduces the first closed-loop adaptive optics (AO) system capable of optically correcting aberrations in real-time without a guidestar or a wavefront sensor. Nearly 40 years ago, Cederquist et al. demonstrated that asymmetric apertures enable phase retrieval (PR) algorithms to perform fully computational wavefront sensing, albeit at a high computational cost. More recently, Chimitt et al. extended this approach with machine learning and demonstrated real-time wavefront sensing using only a single (guidestar-based) point-spread-function (PSF) measurement. Inspired by these works, we introduce a guidestar-free AO framework built around asymmetric apertures and machine learning. Our approach combines three key elements: (1) an asymmetric aperture placed at the system's pupil plane that enables PR-based wavefront sensing, (2) a pair of machine learning algorithms that estimate the PSF from natural scene measurements and reconstruct phase aberrations, and (3) a spatial light modulator that performs optical correction. We experimentally validate this framework on dense natural scenes imaged through unknown obscurants. Our method outperforms state-of-the-art guidestar-free wavefront shaping methods, using an order of magnitude fewer measurements and three orders of magnitude less computation.


[180] 2603.17499

A Tutorial on Learning-Based Radio Map Construction: Data, Paradigms, and Physics-Awareness

The integration of artificial intelligence into next-generation wireless networks necessitates the accurate construction of radio maps (RMs) as a foundational prerequisite for electromagnetic digital twins. An RM provides the digital representation of the wireless propagation environment, mapping complex geographical and topological boundary conditions to critical spatial-spectral metrics that range from received signal strength to full channel state information matrices. This tutorial presents a comprehensive survey of learning-based RM construction, systematically addressing three intertwined dimensions: data, paradigms, and physics-awareness. From the data perspective, we review physical measurement campaigns, ray tracing simulation engines, and publicly available benchmark datasets, identifying their respective strengths and fundamental limitations. From the paradigm perspective, we establish a core taxonomy that categorizes RM construction into source-aware forward prediction and source-agnostic inverse reconstruction, and examine five principal neural architecture families spanning convolutional neural networks, vision transformers, graph neural networks, generative adversarial networks, and diffusion models. We further survey optics-inspired methods adapted from neural radiance fields and 3D Gaussian splatting for continuous wireless radiation field modeling. From the physics-awareness perspective, we introduce a three-level integration framework encompassing data-level feature engineering, loss-level partial differential equation regularization, and architecture-level structural isomorphism. Open challenges including foundation model development, physical hallucination detection, and amortized inference for real-time deployment are discussed to outline future research directions.


[181] 2603.19153

Mobile Radio Networks and Weather Radars Dualism: Rainfall Measurement Revolution in Densely Populated Areas

This study demonstrates, for the first time, how a network of cellular base stations (BSs) - the infrastructure of mobile radio networks - can be used as a distributed opportunistic radar for rainfall remote sensing. By adapting signal-processing techniques traditionally employed in Doppler weather radar systems, we demonstrate that BS signals can be used to retrieve typical weather radar products, including reflectivity factor, mean Doppler velocity, and spectral width. Due to the high spatial density of BS infrastructure in urban environments, combined with intrinsic technical features such as electronically steerable antenna arrays and wide receiver bandwidths, the proposed approach achieves unprecedented spatial and temporal resolutions, on the order of a few meters and several tens of seconds, respectively. Beyond limitations related to low transmitted power, limited antenna gain, and other system constraints, a major challenge arises from ground clutter contamination, which is exacerbated by the nearly horizontal orientation of BS antenna beams. This work provides a thorough assessment of clutter impact and demonstrates that, through appropriate processing, the resulting clutter-filtered radar moments reach a satisfactory level of quality when compared with raw observations and with measurements from independent BSs with overlapping fields of view. The findings highlight a transformative opportunity for urban hydrometeorology: leveraging existing telecommunications infrastructure to obtain rainfall information with unprecedented spatial granularity and temporal immediacy.


[182] 2603.20013

Steady State Distributed Kalman Filter

This paper addresses the synthesis of an optimal fixed-gain distributed observer for discrete-time linear systems over wireless sensor networks. The proposed approach targets the steady-state estimation regime and computes fixed observer gains offline from the asymptotic error covariance of the global distributed BLUE estimator. Each node then runs a local observer that exchanges only state estimates with its neighbors, without propagating error covariances or performing online information fusion. Under collective observability and strong network connectivity, the resulting distributed observer achieves optimal asymptotic performance among fixed-gain schemes. In comparison with covariance intersection-based methods, the proposed design yields strictly lower steady state estimation error covariance while requiring minimal communication. Numerical simulations illustrate the effectiveness of the approach and its advantages in terms of accuracy and implementation simplicity.
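The idea of computing a fixed gain offline from an asymptotic error covariance can be illustrated on the simplest possible case. The sketch below iterates the scalar discrete-time Riccati recursion for a single, centralized Kalman filter until the predicted covariance stops changing, then returns the resulting steady-state gain. It does not implement the distributed BLUE-based design of the paper; the function name and the scalar model `x[t+1] = a*x[t] + w`, `y[t] = c*x[t] + v` are assumptions made for illustration.

```python
def steady_state_gain(a, c, q, r, tol=1e-12, max_iter=10_000):
    """Steady-state Kalman gain for a scalar system (illustrative sketch).

    a, c: state and measurement coefficients; q, r: process and
    measurement noise variances. Returns (gain, predicted covariance).
    """
    p_pred = q  # any positive starting covariance works here
    for _ in range(max_iter):
        k = p_pred * c / (c * c * p_pred + r)   # gain for current covariance
        p_post = (1.0 - k * c) * p_pred         # measurement update
        p_next = a * a * p_post + q             # time update
        if abs(p_next - p_pred) < tol:
            return p_next * c / (c * c * p_next + r), p_next
        p_pred = p_next
    raise RuntimeError("Riccati iteration did not converge")
```

Once the gain is computed, the online observer only needs the update `x_hat += k * (y - c * x_hat)` after each prediction, with no covariance propagation at run time.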


[183] 2603.20146

A Controller Synthesis Framework for Weakly-Hard Control Systems

Deadline misses are more common in real-world systems than one may expect. The weakly-hard task model has become a standard abstraction to describe and analyze how often these misses occur, and has been especially used in control applications. Most existing control approaches check whether a controller manages to stabilize the system it controls when its implementation occasionally misses deadlines. However, they usually do not incorporate deadline-overrun knowledge during the controller synthesis process. In this paper, we present a framework that explicitly integrates weakly-hard constraints into the control design. Our method supports various overrun handling strategies and guarantees stability and performance under weakly-hard constraints. We validate the synthesized controllers on a Furuta pendulum, a representative control benchmark. The results show that constraint-aware controllers significantly outperform traditional designs, demonstrating the benefits of proactive and informed synthesis for overrun-aware real-time control.
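To make the weakly-hard abstraction concrete, the sketch below checks whether a recorded sequence of deadline outcomes respects a window constraint. It assumes the convention "at most `m` misses in any `K` consecutive jobs"; other conventions (for example, a minimum number of hits per window) exist in the literature, and the abstract does not say which one the paper uses. The function names are hypothetical.

```python
def max_misses_in_window(outcomes, K):
    """Largest number of deadline misses in any K consecutive jobs.

    outcomes: sequence of booleans, True meaning the deadline was met.
    """
    misses = [0 if met else 1 for met in outcomes]
    if len(misses) <= K:
        return sum(misses)
    window = sum(misses[:K])
    worst = window
    for i in range(K, len(misses)):
        window += misses[i] - misses[i - K]  # slide the window by one job
        worst = max(worst, window)
    return worst


def satisfies_weakly_hard(outcomes, m, K):
    """True if no window of K consecutive jobs contains more than m misses."""
    return max_misses_in_window(outcomes, K) <= m
```

A controller synthesized for a given `(m, K)` pair is only guaranteed to behave as designed on executions for which this check holds.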


[184] 2603.24596

X-OPD: Cross-Modal On-Policy Distillation for Capability Alignment in Speech LLMs

While the shift from cascaded dialogue systems to end-to-end (E2E) speech Large Language Models (LLMs) improves latency and paralinguistic modeling, E2E models often exhibit a significant performance degradation compared to their text-based counterparts. The standard Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) training methods fail to close this gap. To address this, we propose X-OPD, a novel Cross-Modal On-Policy Distillation framework designed to systematically align the capabilities of Speech LLMs to their text-based counterparts. X-OPD enables the Speech LLM to explore its own distribution via on-policy rollouts, where a text-based teacher model evaluates these trajectories and provides token-level feedback, effectively distilling the teacher's capabilities into the student's multi-modal representations. Extensive experiments across multiple benchmarks demonstrate that X-OPD significantly narrows the gap in complex tasks while preserving the model's inherent capabilities.


[185] 2603.25161

Distributed Event-Triggered Consensus Control of Discrete-Time Linear Multi-Agent Systems under LQ Performance Constraints

This paper proposes a distributed event-triggered control method that not only guarantees consensus of multi-agent systems but also satisfies a given LQ performance constraint. Taking the standard distributed control scheme with all-time communication as a baseline, we consider the problem of designing an event-triggered communication rule such that the resulting LQ cost satisfies a performance constraint with respect to the baseline cost while consensus is achieved. The main difficulty is that the performance requirement is global, whereas triggering decisions are made locally and asynchronously by individual agents, which cannot directly evaluate the global performance degradation. To address this issue, we decompose allowable degradation across agents and design a triggering rule that uses only locally available information to satisfy the given LQ performance constraint. For general linear agents on an undirected graph, we derive a sufficient condition that guarantees both consensus and the prescribed performance level. We also develop a tractable offline design method for the triggering parameters. Numerical examples illustrate the effectiveness of the proposed method.


[186] 2603.25430

Four-Transistor Four-Diode (4T4D) Series/Parallel Chopper Module for Auto-Balancing STATCOM and Low Control and Development Complexity

Static synchronous compensators (STATCOMs) manage reactive power compensation in modern power grids and have become essential for the integration of renewable energy sources such as wind farms. Cascaded H bridges have become the preferred topology for high-power STATCOMs, but balancing module capacitor voltages remains a persistent challenge. Conventional solutions equip every module with a voltage sensor -- a component that is costly, temperature-sensitive, and prone to aging-related failures. Recent parallel-capable module topologies can balance voltage through switched-capacitor operation. The latest developments reduced the sensor requirement from one per module to one per arm. However, these implementations require twice as many individual transistors as series-only topologies. We present a STATCOM solution based on the four-transistor four-diode (4T4D) series/parallel chopper cell. This topology achieves bidirectional parallelization with only four transistors per module -- exactly as many as a conventional full bridge. Furthermore, we propose a dual-loop control strategy that fully eliminates module voltage sensors by inferring voltage levels from the modulation index. This scheme also improves output quality by regulating the modulation depth. We validated our proposal through simulation and experiments. We built a prototype to interface with the grid. The prototype further passed robustness tests with step change, current direction reversal, and grid disturbance. This work demonstrates the first modular STATCOM implementation that combines minimum transistor count with complete elimination of module voltage sensors.


[187] 2411.11549

Sound Value Iteration for Simple Stochastic Games

Algorithmic analysis of Markov decision processes (MDP) and stochastic games (SG) in practice relies on value-iteration (VI) algorithms. Since the basic version of VI does not provide guarantees on the precision of the result, variants of VI have been proposed that offer such guarantees. In particular, sound value iteration (SVI) not only provides precise lower and upper bounds on the result, but also converges faster in the presence of probabilistic cycles. Unfortunately, it is neither applicable to SG, nor to MDP with end components. In this paper, we extend SVI and cover both cases. The technical challenge consists mainly in proper treatment of end components, which require different handling than in the literature. Moreover, we provide several optimizations of SVI. Finally, we also evaluate our prototype implementation experimentally to confirm its advantages on systems with probabilistic cycles.


[188] 2412.16175

Mean--Variance Portfolio Selection by Continuous-Time Reinforcement Learning: Algorithms, Regret Analysis, and Empirical Study

We study continuous-time mean--variance portfolio selection in markets where stock prices are diffusion processes driven by observable factors that are also diffusion processes, yet the coefficients of these processes are unknown. Based on the recently developed reinforcement learning (RL) theory for diffusion processes, we present a general data-driven RL approach that learns the pre-committed investment strategy directly without attempting to learn or estimate the market coefficients. For multi-stock Black--Scholes markets without factors, we further devise an algorithm and prove its performance guarantee by deriving a sublinear regret bound in terms of the Sharpe ratio. We then carry out an extensive empirical study implementing this algorithm to compare its performance and trading characteristics, evaluated under a host of common metrics, with a large number of widely employed portfolio allocation strategies on S&P 500 constituents. The results demonstrate that the proposed continuous-time RL strategy is consistently among the best, especially in a volatile bear market, and decisively outperforms the model-based continuous-time counterparts by significant margins.


[189] 2501.00191

Equilibria in Network Constrained Markets with System Operator

We study a networked economic system composed of $n$ producers supplying a single homogeneous good to a number of geographically separated markets and of a centralized authority, called the market maker. Producers compete à la Cournot, by choosing the quantities of good to supply to each market they have access to in order to maximize their profit. Every market is characterized by its inverse demand function, which returns the unit price of the considered good as a function of the total available quantity. Markets are interconnected by a dispatch network through which quantities of the considered good can flow within finite capacity constraints and possibly satisfying additional linear physical constraints. Such flows are determined by the action of a system operator, who aims at maximizing a designated welfare function. We model such competition as a strategic game with $n+1$ players: the producers and the system operator. For this game, we first establish the existence of pure-strategy Nash equilibria under standard concavity assumptions. We then identify sufficient conditions for the game to be exact potential with an essentially unique Nash equilibrium. Next, we present a general result that connects the optimal action of the system operator with the capacity constraints imposed on the network. For the commonly used Walrasian welfare, our finding proves a connection between capacity bottlenecks in the market network and the emergence of price differences between markets separated by saturated lines. This phenomenon is frequently observed in real-world scenarios, for instance in power networks. Finally, we validate the model with data from the Italian day-ahead electricity market.


[190] 2501.02595

Rotatable Antenna-Enabled Wireless Communication: Modeling and Optimization

Non-fixed flexible antenna architectures, such as fluid antenna system (FAS), movable antenna (MA), and pinching antenna, have garnered significant interest in recent years. In this paper, we propose a new rotatable antenna (RA) model to improve the performance of wireless communication systems. Different from conventional fixed antennas, the proposed RA system can flexibly and independently alter the boresight direction of each antenna via mechanical or electronic means to exploit new spatial degrees-of-freedom (DoFs). Specifically, we investigate an RA-enabled uplink communication system, where the receive beamforming and the boresight directions of all RAs at the base station (BS) are jointly optimized to maximize the minimum signal-to-interference-plus-noise ratio (SINR) among all the users. In the special single-user and free-space propagation setup, the optimal boresight directions of RAs are derived in closed form with the maximum-ratio combining (MRC) beamformer applied at the BS. In the general multi-user and multipath channel setup, we first propose an alternating optimization (AO) algorithm to alternately optimize the receive beamforming and the boresight directions of RAs in an iterative manner. Then, a two-stage algorithm that solves the formulated problem without the need for iteration is proposed to further reduce computational complexity. Moreover, we extend the channel model to incorporate polarization effects and frequency-selective fading while catering to antenna boresight rotation. Simulation results are provided to validate our analytical results and demonstrate that the proposed RA system can significantly improve the communication performance as compared to other benchmark schemes.


[191] 2505.02004

Triple-identity Authentication: The Future of Secure Access

In a typical authentication process, the local system verifies the user's identity using a stored hash value generated by a cross-system hash algorithm. This article shifts the research focus from traditional password encryption to the establishment of gatekeeping mechanisms for effective interactions between a system and the outside world. Here, we propose a triple-identity authentication system to achieve this goal. Specifically, this local system opens the inner structure of its hash algorithm to all user credentials, including the login name, login password, and authentication password. When a login credential is entered, the local system hashes it and then creates a unique identifier using intermediate hash elements randomly selected from the open algorithm. Importantly, this locally generated unique identifier (rather than the stored hash produced by the open algorithm) is utilized to verify the user's combined identity, which is generated by combining the entered credential with the International Mobile Equipment Identity and the International Mobile Subscriber Identity. The verification process is implemented at each interaction point: the login name field, the login password field, and the server's authentication point. Thus, within the context of this triple-identity authentication system, we establish a robust gatekeeping mechanism for system interactions, ultimately providing a level of security that is equivalent to multi-factor authentication.


[192] 2507.18514

On the Role of Age and Semantics of Information in Remote Estimation of Markov Sources

This paper studies semantics-aware remote estimation of Markov sources. We leverage two complementary information attributes: the urgency of lasting impact, which quantifies the significance of consecutive estimation error at the transmitter, and the age of information (AoI), which captures the predictability of outdated information at the receiver. The objective is to minimize the long-run average lasting impact subject to a transmission frequency constraint. The problem is formulated as a constrained Markov decision process (CMDP) with potentially unbounded costs. We show the existence of an optimal simple mixture policy, which randomizes between two neighboring switching policies at a common regeneration state. A closed-form expression for the optimal mixture coefficient is derived. Each switching policy triggers transmission only when the error holding time exceeds a threshold that depends on both the instantaneous estimation error and the AoI. We further derive sufficient conditions under which the thresholds are independent of the instantaneous error and the AoI. Finally, we propose a structure-aware algorithm, Insec-SPI, that computes the optimal policy with reduced computation overhead. Numerical results demonstrate that incorporating both the age and semantics of information significantly improves estimation performance compared to using either attribute alone.


[193] 2508.01277

Foundation Models for Bioacoustics -- a Comparative Review

Automated bioacoustic analysis is essential for biodiversity monitoring and conservation, requiring advanced deep learning models that can adapt to diverse bioacoustic tasks. This article presents a comprehensive review of large-scale pretrained bioacoustic foundation models and systematically investigates their transferability across multiple bioacoustic classification tasks. We overview bioacoustic representation learning by analysing pretraining data sources and benchmarks. On this basis, we review bioacoustic foundation models, dissecting the models' training data, preprocessing, augmentations, architecture, and training paradigm. Additionally, we conduct an extensive empirical study of selected models on the BEANS and BirdSet benchmarks, evaluating generalisability under linear and attentive probing. Our experimental analysis reveals that Perch~2.0 achieves the highest BirdSet score (restricted evaluation) and the strongest linear probing result on BEANS, building on diverse multi-taxa supervised pretraining; that BirdMAE is the best model among probing-based strategies on BirdSet and second on BEANS after BEATs$_{NLM}$, the encoder of NatureLM-audio; that attentive probing is beneficial to extract the full performance of transformer-based models; and that general-purpose audio models trained with self-supervised learning on AudioSet outperform many specialised bird sound models on BEANS when evaluated with attentive probing. These findings provide valuable guidance for practitioners selecting appropriate models to adapt them to new bioacoustic classification tasks via probing.


[194] 2509.19601

Learning Genetic Circuit Modules with Neural Networks: Full Version

In several applications, including in synthetic biology, one often has input/output data on a system composed of many modules, and although the modules' input/output functions and signals may be unknown, knowledge of the composition architecture can significantly reduce the amount of training data required to learn the system's input/output mapping. Learning the modules' input/output functions is also necessary for designing new systems from different composition architectures. Here, we propose a modular learning framework, which incorporates prior knowledge of the system's compositional structure to (a) identify the composing modules' input/output functions from the system's input/output data and (b) achieve this by using a reduced amount of data compared to what would be required without knowledge of the compositional structure. To achieve this, we introduce the notion of modular identifiability, which allows recovery of modules' input/output functions from a subset of the system's input/output data, and provide theoretical guarantees on a class of systems motivated by genetic circuits. We demonstrate the theory on computational studies showing that a neural network (NNET) that accounts for the compositional structure can learn the composing modules' input/output functions and predict the system's output on inputs outside of the training set distribution. By contrast, a neural network that is agnostic of the structure is unable to predict on inputs that fall outside of the training set distribution. By reducing the need for experimental data and allowing module identification, this framework offers the potential to ease the design of synthetic biological circuits and of multi-module systems more generally.


[195] 2510.06961

Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation

We present the Open ASR Leaderboard, a reproducible benchmarking platform with community contributions from academia and industry. It compares 86 open-source and proprietary systems across 12 datasets, with English short- and long-form and multilingual short-form tracks. We standardize word error rate (WER) and inverse real-time factor (RTFx) evaluation for consistent accuracy-efficiency comparisons across model architectures and toolkits (e.g., ESPNet, NeMo, SpeechBrain, Transformers). We observe that Conformer-based encoders paired with transformer-based decoders achieve the best average WER, while connectionist temporal classification (CTC) and token-and-duration transducer (TDT) decoders offer superior RTFx, making them better suited for long-form and batched processing. All code and dataset loaders are open-sourced to support transparent, extensible evaluation. We present our evaluation methodology to facilitate community-driven benchmarking in ASR and other tasks.
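
The two metrics named in this abstract can be illustrated with a short sketch. It uses the usual definitions: WER as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, and RTFx as audio duration divided by processing time. The function names are hypothetical, and the sketch splits on whitespace without any text normalization, so it is not the leaderboard's own evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def inverse_real_time_factor(audio_seconds: float, processing_seconds: float) -> float:
    """RTFx: seconds of audio transcribed per second of compute (higher is faster)."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    print(inverse_real_time_factor(audio_seconds=60.0, processing_seconds=2.0))
```

Because WER is normalized by the reference length, it can exceed 1 when the hypothesis contains many insertions.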


[196] 2512.10270

Optimality Deviation using the Koopman Operator

This paper investigates the impact of approximation error in data-driven optimal control of nonlinear systems based on the Koopman operator. While the Koopman operator enables a simplified representation of nonlinear dynamics through a lifted state space, the presence of approximation error inevitably leads to deviations in the computed optimal controller and the resulting value function. We derive explicit upper bounds for these optimality deviations, which characterize the worst-case effect of approximation error. Supported by numerical examples, these theoretical findings provide a quantitative foundation for improving the robustness of data-driven optimal controller design.


[197] 2512.21051

Energy-Gain Control of Time-Varying Systems: Receding Horizon Approximation

Standard formulations of prescribed worst-case disturbance energy-gain control policies for linear time-varying systems depend on all forward model data. In discrete time, this dependence arises through a backward Riccati recursion. This article is about the infinite-horizon $\ell_2$ gain performance of state feedback policies with only finite receding-horizon preview of the model parameters. The proposed synthesis of controllers subject to such a constraint leverages the strict contraction of lifted Riccati operators under uniform controllability and observability. The main approximation result is a sufficient number of preview steps for the incurred performance loss to remain below any set tolerance, relative to the baseline gain bound of the associated infinite-preview controller. Aspects of the result are explored in a numerical example.
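
To illustrate how a backward Riccati recursion makes the feedback gain at time 0 depend on all forward model data, here is a minimal scalar sketch. It uses the standard linear-quadratic recursion, not the energy-gain ($\ell_2$ gain) recursion studied in the article, and the system x_{t+1} = a_t x_t + b_t u_t, the weights q and r, and the function name are all assumptions made for illustration.

```python
from typing import List, Tuple


def backward_riccati(
    a: List[float],
    b: List[float],
    q: float,
    r: float,
    p_terminal: float = 0.0,
) -> Tuple[List[float], List[float]]:
    """Scalar time-varying LQ Riccati recursion, run backward from t = T.

    Returns the cost-to-go coefficients p[0..T] and the gains k[0..T-1]
    of the feedback law u_t = -k[t] * x_t.
    """
    horizon = len(a)
    p = [0.0] * (horizon + 1)
    k = [0.0] * horizon
    p[horizon] = p_terminal
    for t in range(horizon - 1, -1, -1):
        s = r + b[t] * b[t] * p[t + 1]
        k[t] = a[t] * b[t] * p[t + 1] / s
        p[t] = q + a[t] * a[t] * p[t + 1] - (a[t] * b[t] * p[t + 1]) ** 2 / s
    return p, k


if __name__ == "__main__":
    # Same model except for one parameter late in the horizon.
    _, k_nominal = backward_riccati([1.0] * 5, [1.0] * 5, q=1.0, r=1.0)
    _, k_changed = backward_riccati([1.0, 1.0, 1.0, 2.0, 1.0], [1.0] * 5, q=1.0, r=1.0)
    # The gain at time 0 differs, because the recursion propagates the
    # change at t = 3 all the way back to t = 0.
    print(k_nominal[0], k_changed[0])
```

In this sketch the influence of a late parameter change on the initial gain is small but nonzero, which is the kind of dependence a finite receding-horizon preview has to approximate.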


[198] 2602.11478

Defining causal mechanism in dual process theory and two types of feedback control

Mental events are considered to supervene on physical events. A supervenient event does not change without a corresponding change in the underlying subvenient physical events. Since wholes and their parts exhibit the same supervenience-subvenience relations, inter-level causation has been expected to serve as a model for mental causation. We proposed an inter-level causation mechanism to construct a model of consciousness and an agent's self-determination. However, a significant gap exists between this mechanism and cognitive functions. Here, we demonstrate how to integrate the inter-level causation mechanism with the widely known dual-process theories. We assume that the supervenience level is composed of multiple supervenient functions (i.e., neural networks), and we argue that inter-level causation can be achieved by controlling the feedback error defined through changing algebraic expressions combining these functions. Using inter-level causation allows for a dual laws model in which each level possesses its own distinct dynamics. In this framework, the feedback error is determined independently by two processes: (1) the selection of equations combining supervenient functions, and (2) the negative feedback error reduction to satisfy the equations through adjustments of neurons and synapses. We interpret these two independent feedback controls as Type 1 and Type 2 processes in the dual process theories. As a result, theories of consciousness, agency, and dual process theory are unified into a single framework, and the characteristic features of Type 1 and Type 2 processes are naturally derived.


[199] 2603.25559

Rotatable Antenna-Empowered Wireless Networks: A Tutorial

Non-fixed flexible antenna architectures, such as fluid antenna system (FAS), movable antenna (MA), and pinching antenna, have garnered significant interest in recent years. Among them, rotatable antenna (RA) has emerged as a promising technology for enhancing wireless communication and sensing performance through flexible antenna orientation/boresight rotation. By enabling mechanical or electronic boresight adjustment without altering physical antenna positions, RA introduces additional spatial degrees of freedom (DoFs) beyond conventional beamforming. In this paper, we provide a comprehensive tutorial on the fundamentals, architectures, and applications of RA-empowered wireless networks. Specifically, we begin by reviewing the historical evolution of RA-related technologies and clarifying the distinctive role of RA among flexible antenna architectures. Then, we establish a unified mathematical framework for RA-enabled systems, including general antenna/array rotation models, as well as channel models that cover near- and far-field propagation characteristics, wideband frequency selectivity, and polarization effects. Building upon this foundation, we investigate antenna/array rotation optimization in representative communication and sensing scenarios. Furthermore, we examine RA channel estimation/acquisition strategies encompassing orientation scheduling mechanisms and signal processing methods that exploit multi-view channel observations. Beyond theoretical modeling and algorithmic design, we discuss practical RA configurations and deployment strategies. We also present recent RA prototypes and experimental results that validate the practical performance gains enabled by antenna rotation. Finally, we highlight promising extensions of RA to emerging wireless paradigms and outline open challenges to inspire future research.