New articles on Electrical Engineering and Systems Science


[1] 2605.22851

VAMP-Diff: VampPrior Latent Diffusion for Photoplethysmography Modeling

Photoplethysmography (PPG) has become a ubiquitous physiological signal; however, current generative models still struggle to preserve realistic waveform morphology and learn a latent structure that captures cardiac and respiratory physiology. PPG generators trained with adversarial losses can produce plausible waveforms, but provide no inference path from a real signal to a latent representation. Variational autoencoders, on the other hand, map the PPG data to latent codes, although their decoders often blur systolic upstrokes and dampen amplitude and spectral details. Diffusion models improve waveform fidelity, but typically lack an inference path for reconstruction and physiological analysis. We propose VampPrior Latent Diffusion (VAMP-Diff), a jointly trained variational diffusion model that combines a temporal PPG encoder, a conditional one-dimensional diffusion decoder, and VampPrior regularization on a compact pooled latent. The model uses the full temporal latent during diffusion reconstruction, giving the decoder access to beat timing and morphology, while generating samples from learned VampPrior components instead of a fixed Gaussian prior. We demonstrate on the CapnoBase dataset that VAMP-Diff produces realistic PPG signals, reconstructs sharper physiological waveforms than Gaussian-prior baselines, preserves heart-rate information, maintains respiratory-rate consistency, and is sensitive to waveform corruptions through reconstruction error.


[2] 2605.22853

Topological Signal Processing: An Application-Oriented Tutorial

Many modern datasets are large and carry complex structural relationships. Graph-based methods have traditionally been used to represent networked data, modeling individual elements as nodes and pairwise interactions as edges. Furthermore, Graph Signal Processing (GSP) has been developed to analyze signals on graph nodes, such as temperature measurements (node signals) across different regions of a country represented as a graph. Topological Signal Processing (TSP) is an emerging field that generalizes GSP, enabling the analysis of signals defined not only on nodes but also on edges, triangles, and higher-dimensional network elements, modeled as simplicial complexes and related topological structures. This makes TSP naturally well-suited for studying higher-order interactions in complex systems by extending classical signal processing concepts, such as filtering and Fourier transforms, to the topological level. Despite its versatility, TSP remains challenging for many practitioners. Therefore, we present an accessible overview of TSP foundations while drawing connections with application-oriented settings. We focus on processing techniques based on the combinatorial Hodge Laplacian, which generalizes the graph Laplacian to simplicial complexes. In particular, we review key TSP concepts, relate them to real-world examples, and discuss how higher-order structures and signals can be derived from datasets. For instance, we introduce an edge-level signal capturing lagged interactions between nodal signals, and demonstrate its use in a case study on TSP-based analysis of brain imaging data, revealing nontrivial interactions between sets of brain regions. Overall, we aim to promote a broader adoption of TSP by bridging methodological developments with applications, fostering its use among a wide community of theoretical and applied researchers.
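
As a rough illustration of the combinatorial Hodge Laplacian mentioned in the abstract above, the sketch below builds the incidence matrices of a tiny simplicial complex and forms L0 = B1 B1^T and L1 = B1^T B1 + B2 B2^T. The complex (four nodes, four edges, one filled triangle), the orientation convention, and all helper names are assumptions made for this example; none of it is taken from the tutorial itself.

```python
# Minimal sketch: Hodge Laplacians of a small, hand-built simplicial complex.
# The complex below is a made-up example, not data from the tutorial.

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

nodes = [0, 1, 2, 3]
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]  # each edge oriented low -> high
triangles = [(0, 1, 2)]                   # one filled triangle

# B1: node-to-edge incidence (-1 at the tail node, +1 at the head node).
B1 = [[0] * len(edges) for _ in nodes]
for e, (i, j) in enumerate(edges):
    B1[i][e] = -1
    B1[j][e] = 1

# B2: edge-to-triangle incidence.
# Boundary of (a, b, c) is (b, c) - (a, c) + (a, b).
B2 = [[0] * len(triangles) for _ in edges]
for t, (a, b, c) in enumerate(triangles):
    B2[edges.index((b, c))][t] = 1
    B2[edges.index((a, c))][t] = -1
    B2[edges.index((a, b))][t] = 1

# L0 is the ordinary graph Laplacian acting on node signals.
L0 = matmul(B1, transpose(B1))

# L1 acts on edge signals: a "lower" part (edges coupled through shared
# nodes) plus an "upper" part (edges coupled through shared triangles).
L1 = matadd(matmul(transpose(B1), B1), matmul(B2, transpose(B2)))

# Sanity check: the boundary of a boundary is zero, so B1 @ B2 vanishes.
boundary_of_boundary = matmul(B1, B2)
```

In this toy complex the diagonal of L0 holds the node degrees, and the edge (2, 3), which belongs to no triangle, receives no contribution from the upper part of L1.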


[3] 2605.22856

PilotWiMAE: Pilot-Native Representation Learning for Wireless Channels

Channel foundation models assume access to fully observed channels, an assumption that fails in deployment. We introduce PilotWiMAE, a self-supervised framework whose encoder ingests noisy pilot observations directly and whose attention factorizes along the axis separating temporal from joint space-frequency processing, an inductive bias inspired by the physics of the problem. Pilot input shrinks the observation space by up to two orders of magnitude and also removes the unrealistic assumption of full-CSI availability while incurring lower latency. The factorized design generates robust representations by exploiting the separable channel structure and allows a pretraining mask ratio of $99\%$. We pair patch-normalized reconstruction, which captures small-scale fading structure, with an auxiliary scale loss that recovers the large-scale fading features, and use an AWGN curriculum to match pilot noise at pretraining and deployment. Pretrained solely on $3.5$\,GHz and evaluated at $28$\,GHz across in-distribution and out-of-distribution settings, PilotWiMAE's cross-frequency beam selection and channel characterization beat supervised baselines despite operating on a smaller observation space. To weaken the coupling between decoder capacity and representation quality, we further propose a decoder-centric pretraining stage following the encoder-decoder joint pretraining, which allows PilotWiMAE to demonstrate competitive channel estimation without sacrificing representation quality. To foster further work in this direction, we release the PilotWiMAE pretrained weights and training pipeline, together with CSIGen, our Sionna-based ray-tracing channel-generation tool, and the channel datasets used in this work.


[4] 2605.22857

JointHRRP-Net: A Statistically Constrained Decoupling Network for Joint Target and Jamming Recognition in Composite Jamming

High-resolution range profile (HRRP)-based radar automatic target recognition suffers from severe performance degradation in composite jamming environments. Active jamming introduces suppression- and deception-related components into the received range profile. After pulse compression, these components are coupled with target echoes in the HRRP domain, making target-related scattering peaks difficult to distinguish and weakening feature separability. To address this problem, this paper proposes JointHRRP-Net, a unified framework for joint target-jamming recognition. A statistically constrained decoupling module is first developed to generate target-dominant and jamming-dominant latent branches from the mixed HRRP representation. Correlation-guided statistical constraints are imposed to suppress redundant cross-branch information and alleviate target-jamming feature entanglement. A multi-scale temporal encoding module is then designed to model local scattering structures and long-range range-cell dependencies, followed by a dual-expert decision module for single-label target classification and multi-label jamming classification. Experiments under diverse signal-to-jamming ratio (SJR) and signal-to-noise ratio (SNR) levels demonstrate that JointHRRP-Net outperforms representative baseline methods in both target recognition and composite jamming recognition. Open-set evaluation further shows that the learned target representation remains discriminative for unknown-target rejection. These results demonstrate the effectiveness and robustness of JointHRRP-Net in composite jamming scenarios.


[5] 2605.22858

Classification of IED-free EEG Responses for Assisted Epilepsy Diagnosis

Diagnosing epilepsy is challenging when routine EEGs lack interictal epileptiform discharges (IEDs). Intermittent photic stimulation (IPS) and hyperventilation (HV) can increase diagnostic yield, but their interpretation is subjective. We propose a reproducible pipeline that classifies EEG recordings acquired during stimulation procedures, using machine-learning features spanning temporal, spectral, wavelet, and connectivity domains, and a stacked ensemble to combine complementary feature sets. Performance is evaluated with leave-one-subject-out (LOSO) cross-validation on the TUH Epilepsy Corpus and a clinical Erasmus MC (EMC) cohort, including IED-free analyses on TUH. On TUH, ensembles achieve up to 97.8\% AUC / 93.1\% BAC on IED-free resting-state EEG and 94.1\% AUC / 86.8\% BAC on IED-free IPS. On EMC, IPS provides the strongest discrimination (79.4\% AUC / 73.9\% BAC), while HV performance benefits from stratifying subjects by responsiveness. These results indicate that stimulation-evoked activity, particularly IPS, contains meaningful discriminative information for IED-free epilepsy classification and that multi-domain ensembling improves robustness.


[6] 2605.22859

Staging by the Book: Automatic Sleep Stage Classification Using Scoring Rules

Automated sleep staging is commonly approached as a supervised machine learning problem, with deep learning methods dominating recent research. While machine learning models achieve near-human level agreement with human-scored reference sleep stages, their decisions are typically opaque and not designed to follow clinical scoring rules. We propose a transparent alternative: a deterministic, rule-based sleep staging method that explicitly operationalizes the American Academy of Sleep Medicine's (AASM) scoring logic as executable code, coupled with epoch-level natural-language justifications derived from an explanation trace. We evaluate the approach on 50 polysomnography recordings with a 10-scorer majority-vote consensus as reference. Across all recordings, the method agreed with the majority-vote reference in 60.5% of epochs ($\kappa=0.42$), with substantially higher agreement on a dataset used during development (77.1%, $\kappa=0.61$). Agreement with the reference was highest for sleep stage N2 (recall 83.5%) and moderate for sleep stage R (recall 68.7%), while Wake and N1 recall were low. Despite lower agreement with the reference than contemporary deep learning models, the method provides deterministic decisions and natural language explanations aligned with AASM scoring rules, making it a complementary tool for auditing, debugging, and governing deep learning-based sleep staging.
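
The agreement figures quoted above pair raw percent agreement with Cohen's kappa, which corrects for agreement expected by chance. A minimal sketch of that statistic follows, using the standard two-rater definition kappa = (p_o - p_e) / (1 - p_e); the function name and the toy stage labels in the usage note are illustrative assumptions and do not come from the paper's data.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two equally long label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    # Observed agreement: fraction of epochs with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        # Both sequences use one and the same label; treat as full agreement.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)
```

For example, `cohen_kappa(["W", "W", "N2", "N2"], ["W", "N2", "N2", "N2"])` has 75% raw agreement but a kappa of 0.5, because half of that agreement is expected by chance given the label frequencies.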


[7] 2605.22861

Statistical Characterization of Wind-Induced Beam Refraction in Water-to-Air Optical Channels

Direct water-to-air (W2A) optical communications experience strong beam refraction at the dynamic sea surface. This letter proposes a novel and tractable statistical channel model for a vertical W2A link between an underwater node and an unmanned aerial vehicle under varying wind speeds, modeling wind-induced pointing errors with a Beta mixture fitted via the Expectation-Maximization algorithm. By accounting for link interruptions due to total internal reflection (TIR) and receiver field-of-view limitations, we derive closed-form expressions for the channel distribution and link outage probability. Our analysis reveals a fundamental TIR-induced outage floor limiting link reliability and providing insight for robust W2A system design.
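
The TIR-induced interruptions mentioned above follow from Snell's law: a ray leaving the water is refracted away from the surface normal, and beyond a critical incidence angle no light is transmitted. The sketch below illustrates this for a locally flat interface. The refractive indices (1.33 for water, 1.00 for air) are typical textbook values assumed here, and the function names are hypothetical; the letter's wind-dependent surface statistics are not modeled.

```python
import math

N_WATER = 1.33  # assumed refractive index of water
N_AIR = 1.00    # assumed refractive index of air

def critical_angle_deg(n_in=N_WATER, n_out=N_AIR):
    """Incidence angle beyond which total internal reflection occurs.

    Only meaningful when n_in > n_out (denser to rarer medium).
    """
    return math.degrees(math.asin(n_out / n_in))

def refraction_angle_deg(incidence_deg, n_in=N_WATER, n_out=N_AIR):
    """Refraction angle from Snell's law, or None under TIR.

    Angles are measured from the local surface normal, in degrees.
    """
    s = n_in * math.sin(math.radians(incidence_deg)) / n_out
    if abs(s) > 1.0:
        return None  # no transmitted ray: total internal reflection
    return math.degrees(math.asin(s))
```

With these assumed indices the critical angle is a little under 49 degrees, so any surface tilt that pushes the local incidence angle past that value interrupts the link regardless of transmit power.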


[8] 2605.22893

L-FAME: Longitudinal Focused Attention Meditation EEG Dataset and Benchmark

We introduce a novel Longitudinal Focused Attention Meditation Electroencephalography (L-FAME) dataset and an accompanying benchmark, designed to foster research into the neural effects of various meditation practices and the evolution of these effects over a six-week training period. The dataset contains EEG recordings and psychological assessments from 74 healthy college participants, collected at two distinct time points: pre-intervention and post-intervention. Participants were randomly assigned to one of three distinct meditation groups: two mantra-based techniques (SA-TA-NA-MA and Hare Krishna) and one Breath Focus practice. Leveraging this unique longitudinal and comparative dataset, we propose a benchmark suite comprising three distinct classification tasks: (1) cognitive state decoding to distinguish between resting and meditation states, (2) fine-grained classification of the specific meditation techniques, and (3) cross-session adaptation to evaluate model generalization across the longitudinal time gap. We provide comprehensive baseline results for these tasks utilizing a range of classical machine learning algorithms and deep learning architectures. The complete dataset, preprocessing pipelines, and benchmark evaluation code will be publicly released, offering a valuable resource and a standardized framework for the development and comparison of new analytical methods in computational meditation research and EEG-based machine learning. The dataset is available at this https URL


[9] 2605.22961

OctCGS: Octree-Contextual Gaussian Splatting with Explicit Multi-Order Propagation Modeling for Channel Knowledge Map Construction

Channel knowledge maps (CKMs) learn the relation between transmitter (Tx) and receiver (Rx) positions and channel knowledge to support environment-aware wireless communications. Implicit neural methods can model continuous channel variation but often incur high training and inference cost, while existing Gaussian-splatting-based CKM methods improve efficiency yet still compress wireless multipath interactions into aggregated scattering representations. Consequently, explicit modeling of multi-bounce wireless propagation remains absent from CKM construction. We propose OctCGS, an octree-contextual Gaussian splatting framework that explicitly models the order of bounce jointly over Tx/Rx positions and carrier frequencies. OctCGS partitions the environment into a multi-resolution octree and anchors one Gaussian primitive to each leaf node. Rather than having each Gaussian independently encode all multi-path propagations, it models complex electromagnetic interactions among scatterers through tree attention over the octree hierarchy with controlled complexity. Experiments on simulated benchmarks show that OctCGS achieves a 2.99 dB channel-gain mean absolute error (MAE) and 0.065 channel gain normalized mean absolute error (NMAE), outperforming the strongest baseline by 0.88 dB MAE and 0.021 NMAE.


[10] 2605.23030

A Methodology for Impedance-based Stability Margin Analysis for Interconnected Offshore Wind Clusters

With recent developments in offshore grid architectures, power park modules (PPMs) such as clusters of offshore wind power plants (OWPPs) are increasingly interconnected offshore. Consequently, it is necessary to assess how integrating a new OWPP affects the stability margins of an existing OWPP at the point of connection. Although impedance-based methods are widely used for small-signal stability assessment of interconnected converter-based systems, many studies rely primarily on Nyquist encirclements and do not explicitly quantify stability margins. Thus, this paper proposes a general impedance-based methodology to (i) evaluate the stability margins of an existing connection after a new PPM is integrated and (ii) derive a maximum allowable impedance for the new connection such that the minimum stability margin requirements specified by system operators are satisfied and stable operation is maintained. In addition, new Nyquist-based stability regions are introduced to complement the generalized Nyquist criterion, providing analytical indications of margin compliance and headroom. The proposed method is validated through case studies using vendor-based frequency-domain models of two interconnected OWPPs and an HVDC system.


[11] 2605.23041

Holistic Grid-Forming Control to Enhance the Frequency Support from HVDC-Connected Offshore Wind Power Plants

To address the frequency stability challenges posed by the rising penetration of power electronics in power systems, HVDC-connected offshore wind power plants (OWPPs) are increasingly expected to provide inertial response and frequency containment reserve (FCR). In this paper, an improved holistic grid-forming (GFM) control is proposed, aiming to enhance the frequency support by coordinating the GFM controls implemented at all AC and DC terminals of an HVDC-OWPP system, without requiring communication. Firstly, the model of a typical HVDC-OWPP system is developed for control design. Accordingly, the proposed controllers are formulated, followed by an analytical tuning method, where the upper bound of the bandwidth at each AC or DC terminal is identified. Finally, simulations are conducted to verify the functionality and compare the performance with that of representative control configurations. The results show that the proposed holistic GFM control achieves faster response and thus more effective frequency support, while the utilization of the inherent energy storage of each converter is minimized, thereby supporting a new design philosophy for converter control in converter-dominated systems.


[12] 2605.23042

Open-Source METANET Calibration for Reproducible Freeway Traffic Macroscopic Simulation

METANET is a widely used second-order macroscopic traffic flow model for freeway networks, supporting applications across traffic simulation, ramp metering, and variable speed limit control. The predictive accuracy of any traffic model, however, hinges on careful calibration to real-world conditions. Despite its widespread use, there have not been open-source tools for calibrating METANET's parameters. Without open-source calibration, results cannot be easily reproduced or extended to other networks. This work provides an open-source METANET calibration, simulation, and data visualization tool. The calibration is formulated as a nonlinear program (NLP) solved via the interior-point method (IPOPT), with joint ramp flow estimation. We validate our calibration on real-world freeway data from two widely used traffic monitoring systems: Interstate-24 MObility Technology Interstate Observation Network (I-24 MOTION), one of the largest open-road trajectory instruments in the country, and loop detector data from the Caltrans Performance Measurement System (PeMS), which spans nearly 40,000 detectors across California freeways and serves as a standard benchmark in traffic research. Models calibrated using our method are able to reproduce these datasets' observed traffic patterns across diverse network geometries and traffic conditions including complex stop-and-go congestion waves. As large-scale traffic monitoring infrastructure continues to expand, open-source calibration tools are essential for translating growing volumes of sensor data into validated models that can support real-world traffic control. The complete code is publicly available at this https URL to support reproducible research in freeway traffic modeling and control.


[13] 2605.23044

Copula-Induced Correntropy for Robust Conjugate Gradient Learning

Robust learning in the presence of non-Gaussian and statistically dependent noise remains a fundamental challenge in signal processing and adaptive systems. Although information-theoretic learning criteria such as correntropy offer strong robustness against impulsive and heavy-tailed disturbances, existing formulations are commonly applied componentwise and therefore do not explicitly exploit the dependence structures inherent in multivariate, multi-sensor, and temporal signals. In this paper, we propose a learning framework, termed \textit{copula-induced information-theoretic learning} (CITL), which extends correntropy by embedding a copula space representation of residual dependence into the similarity measure. Unlike conventional correntropy-based approaches that operate pointwise on raw residuals, the proposed criterion is defined in a copula-transformed residual space, thus separating marginal robustness from dependence weighting. We derive a copula-induced correntropy (CIC) objective and a mixed marginal--dependence objective used in the implementation, provide information-theoretic and Bayesian interpretations, and develop a robust conjugate gradient (CG) learning algorithm tailored to this criterion. For fixed smooth marginal estimators, a fixed copula-space metric, and a regularized radial penalty, we establish sufficient descent and global stationarity guarantees for the corresponding fixed-estimator subproblem under standard line-search conditions. Experiments on synthetic multivariate signal processing regression problems demonstrate that the proposed method consistently outperforms mean squared error (MSE), Huber, Student's-$t$, and classical correntropy-based approaches, particularly in the presence of dependent heavy-tailed noise.
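
For context on why correntropy is considered robust, the sketch below compares the classical sample correntropy with a Gaussian kernel, V = (1/N) sum_i exp(-e_i^2 / (2 sigma^2)), against mean squared error on residuals containing one large outlier. This is the conventional pointwise criterion that the abstract contrasts with, not the copula-induced variant proposed in the paper; the kernel width and the residual values are arbitrary assumptions.

```python
import math

def correntropy(residuals, sigma=1.0):
    """Sample correntropy of residuals with an (unnormalized) Gaussian kernel."""
    return sum(math.exp(-(e * e) / (2.0 * sigma * sigma))
               for e in residuals) / len(residuals)

def mse(residuals):
    """Mean squared error of the same residuals, for comparison."""
    return sum(e * e for e in residuals) / len(residuals)

# Made-up residuals: a clean set, and the same set plus one impulsive outlier.
clean = [0.1, -0.2, 0.05, 0.0]
with_outlier = clean + [50.0]

mse_ratio = mse(with_outlier) / mse(clean)
correntropy_drop = correntropy(clean) - correntropy(with_outlier)
```

Each kernel term lies between 0 and 1, so a single outlier can shift the correntropy by at most 1/N, whereas its squared value dominates the MSE.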


[14] 2605.23062

AFDM as a Software Upgrade of OFDM: One Firmware Patch, a New Frontier

In this white paper, we summarize, for the benefit of the wider wireless communications research community, the two key results on affine frequency division multiplexing (AFDM) that we shared with the attendees of the 2026 IEEE Communication Theory Workshop in the Azores, Portugal. Firstly, we show that, contrary to a perception widely held among researchers, AFDM can be implemented at marginal cost by means of a simple software upgrade (firmware patch) of conventional orthogonal frequency division multiplexing (OFDM), indicating that its adoption can potentially be achieved across a wide range of OFDM-based wireless infrastructure and systems. The most crucial implication of this finding is that such an upgrade would enable, under the specific conditions of the corresponding systems and their applications, exploiting various advantageous features of AFDM, including robustness to doubly dispersive channels (i.e., to support high-mobility use-cases in 6G), inherent integrated sensing and communications (ISAC) compatibility (i.e., to support sensing use-cases in 802.11bf), and the straightforward introduction of low-complexity physical-layer security at the waveform level (as needed in next-generation IoT systems). Secondly, we show that the same mathematical principles underpinning the aforementioned finding also imply an inherent capability of AFDM to reap the full uncoded diversity of static linear time-invariant (LTI) channels, demonstrating that this simple upgrade taps into previously undiscovered strengths of multicarrier waveforms.


[15] 2605.23094

Do Synthetic Brain MRIs Reliably Improve Tumour Classification? A StyleGAN2-ADA Class-Plane Augmentation Study on BRISC 2025

Generative augmentation is often proposed as a remedy for small medical-image datasets, but synthetic images are only useful when they improve downstream task performance. "Augmentation" here means synthetic supplementation: GAN-generated samples added to the real training pool, not geometric or photometric transforms of existing images. Twelve class-plane StyleGAN2-ADA generators were trained on constrained BRISC 2025 partitions to test whether their output, with or without InceptionV3 feature-space filtering, improves held-out tumour classification across three classifier families: a random forest (RF) on InceptionV3 features, a compact two-headed convolutional neural network (CNN), and MobileViTV2, a mobile hybrid convolutional-transformer. Each was evaluated at 1:1 and 1:2 real-to-synthetic ratios. An independent GPT-5.5 blind test placed gated real-versus-synthetic discrimination at 57.73% (95% CI: 54.48--60.92%) on the model-legible subset -- modestly above chance. The RF classifier did not benefit from the synthetic MRIs. The CNN showed consistent mean gains that did not survive Holm correction. MobileViTV2 showed the clearest benefit: filtered 1:1 augmentation improved tumour classification accuracy by 1.02% absolute (95% CI: 0.54--1.54%; Holm-corrected p = 0.0104). A secondary efficiency analysis found that every augmented CNN condition selected its checkpoint 42--64% earlier than baseline, while compute-matched MobileViTV2 runs reached selection after 50--67% fewer real-data epochs. Overall, augmentation utility was found to be architecture- and ratio-dependent, not guaranteed by visual fidelity alone.


[16] 2605.23124

Deep-Learning-Aided Successive Cancellation List Flip Decoding for Polar Codes

Polar codes are the first class of error-correcting codes proven to achieve channel capacity as the code length tends to infinity. The Successive Cancellation List Flip (SCLF) decoding algorithm flips a bit identified as erroneous during the next decoding attempt. To identify the erroneous bits, the Log-Likelihood Ratio (LLR) is used to indicate the reliability of each bit decision. To improve the accuracy of erroneous-bit prediction, we propose deep-learning-aided (DL-aided) SCLF decoding algorithms. We first present a stacked LSTM network that uses new features to train our models, improving the accuracy with which the positions of erroneous bits are predicted. We then separately train the stacked LSTM models to predict the positions of the first and second erroneous bits and whether to continue flipping. As a result, the DL-aided SCLF decoding algorithms based on the proposed stacked LSTM \mbox{flip-1} model, stacked LSTM \mbox{flip-2} model, and the stacked LSTM \mbox{continue-flipping} check (CFC) model provide better performance with a lower average number of decoding attempts than other state-of-the-art decoding algorithms.


[17] 2605.23129

Deception and Counter Deception in Adversarial Graph Traversal Game

We study deception in adversarial graph traversal, where a mobile agent seeks to reach a goal with minimum cost while an adversary alters edge costs to increase the total traversal cost. Unlike prior works that assume fixed observer-deceiver roles, we model this problem with two-sided incomplete information in which both players possess private information and update beliefs from observed actions. To solve the resulting indefinite-horizon game, we develop an adaptation of the Extensive-Form Double Oracle (XDO) algorithm. While the standard XDO algorithm is designed for finite games, the proposed adaptation ensures bounded computation despite endogenous game termination. We show that the proposed algorithm terminates in finite time and returns an epsilon-Nash equilibrium. Finally, we use Value of Information to characterize the deceptive and counter-deceptive behaviors that emerge from equilibrium strategies.


[18] 2605.23137

STAMBRIDGE: Spectral-Temporal Amplitude-aware Mid-Feature Bridge for EEG Visual Decoding

Electroencephalography (EEG) visual decoding remains challenging due to the modality gap between low-SNR neural signals and highly structured vision--language spaces, making direct cross-modal alignment unstable. To address this, we propose STAMBRIDGE, a versatile two-stage framework that sequentially tackles feature conditioning and cross-modal alignment. First, we introduce a Spectral-Temporal Amplitude-aware Modulation (STAM) to extract well-conditioned EEG representations. By replacing hard frequency masking with amplitude-derived soft channel weighting and multi-scale temporal convolutions, STAM explicitly preserves frequency-aware transients while reducing the risk of time-domain ringing artifacts. Building upon these robust neural features, we further introduce a model-agnostic Mid-Feature Semantic Bridge (MFSB) that constructs a regularized intermediate space through directed cross-modal interactions, enabling staged distillation and more stable semantic alignment. Experiments on the THINGS-EEG benchmark show competitive 200-way zero-shot retrieval performance, with 34.50\% Top-1 and 65.95\% Top-5 accuracy. In addition, embeddings learned by STAMBRIDGE produce semantically coherent image reconstructions with a diffusion model, demonstrating robust EEG-to-vision semantic alignment. The code is available at: this https URL.


[19] 2605.23140

Self-Calibration DOA Estimation for Movable Antenna Systems with Antenna Position Errors

In this letter, we investigate the direction-of-arrival (DOA) estimation problem for wireless sensing with movable antenna (MA) systems in the presence of unknown antenna position errors (APE). To achieve robust wireless sensing, we transform the DOA estimation problem with APE into an optimization problem via the orthogonality between the steering vector and the noise subspace. Then we propose an alternating optimization (AO)-based self-calibration estimation, which consists of two stages and iteratively estimates the APE and DOA. Specifically, in the first stage, by fixing the APE, the problem reduces to the classical DOA estimation problem, which is solved using the multiple signal classification (MUSIC) algorithm. In the second stage, we fix the DOA to estimate the APE. By applying the Lagrange multiplier technique to the subproblem, we obtain a closed-form expression for the APE estimation. Simulation results demonstrate the superior DOA estimation performance of the proposed self-calibration algorithm for MA systems compared to the existing approaches.


[20] 2605.23151

Convex Hybrid Modeling: An Operator-Based Approach

While machine learning can accurately model process systems, models for decision making should also be structurally simple and physically interpretable. In process control, for example, (nearly) linear models are favored over nonlinear ones, promoting the use of operator theory, which ``universally'' represents a nonlinear system by a nonparametric operator. On the other hand, interpretability requires a ``non-universal'', parametric nonlinear model family satisfying first principles; these constraints tend to complicate the learning procedure. This paper considers hybrid modeling by formulating convex learning problems that account for interpretability systematically and yield surrogate models efficiently. Three settings are discussed -- (i) regularization around a particular ``reference model'', (ii) restriction to an ``interpretable subspace'', and more generally, (iii) restriction to an ``interpretable manifold'' that is nonlinearly parameterized. In the more general setting, by introducing an operator-theoretic technique to re-parameterize models in the ``lifted'' parameters (``canonical features'', potentially infinite-dimensional), the system is regarded as a kernel-based mixture of interpretable models. Applications to both static and dynamic models are exemplified in numerical studies.


[21] 2605.23155

Physics-Informed Digital Twins for Channel Estimation and Traffic Prediction of Non-Terrestrial Networks

In non-terrestrial networks (NTN), high-speed satellite orbital motion, limited pilot signaling resources, and spatiotemporally heterogeneous traffic make accurate channel and traffic state characterization particularly challenging. In this paper, we propose a physics-informed digital twin (DT) framework for channel estimation and traffic prediction. Particularly, it formulates channel state information (CSI) reconstruction as a controllable generative process guided by physical-prior tensors. Through a physics-aware attention mechanism, it effectively reconstructs the real-time full-resolution CSI from highly sparse and outdated pilots. Then, we develop an orbit-adaptive spatiotemporal graph neural network for traffic prediction. By leveraging a dual-stream attention mechanism to capture intra- and inter-plane spatial dependencies and a gated recurrent unit to model temporal evolution, the neural network effectively predicts stochastic traffic residuals, which are integrated with the deterministic physical traffic baseline to form the complete traffic state. To evaluate the proposed DT framework, we establish a high-fidelity NTN DT simulation platform based on real-world Starlink ephemeris, global population, and ERA5 weather data. Experimental results demonstrate that our framework significantly outperforms state-of-the-art baselines in both CSI reconstruction and traffic prediction accuracy.


[22] 2605.23183

GMENet: Generative Mixture of Experts Network for Multi-Center Glioma Diagnosis with Incomplete Imaging Sequences

Contemporary glioma diagnosis integrates molecular features with histopathology to guide clinical decision-making. However, in clinical settings, divergent imaging protocols result in incomplete MRI sequences, which creates two primary problems: existing frameworks are forced to discard a large portion of clinical data during training, and their clinical applicability is consequently limited. To address these limitations, we propose GMENet, a Generative Mixture of Experts Network for multi-center glioma diagnosis with incomplete imaging sequences. Firstly, we design a Cross-attention-based Gated Generation Module that synthesizes missing sequence features from available sequences via cross-attention and dynamic gating mechanisms, incorporating a cycle-consistency loss to preserve semantic integrity. Secondly, we introduce a Dynamically Weighted Experts Fusion Module that performs mixture-of-experts interaction and confidence-aware fusion over original and synthesized dual-sequence features for multi-task prediction. We evaluate GMENet on a multi-center cohort of 1,241 subjects from four in-house datasets and two public repositories. Experiments show that GMENet expands clinically usable training data by 97\% relative to complete-sequence-only data. Furthermore, it consistently outperforms state-of-the-art methods trained on complete data, demonstrating improved robustness under cross-center distribution shifts.


[23] 2605.23210

Fundamental Bounds and Efficient Estimation for Dead-Time-Constrained Event Detection, with Application to Single-Photon Lidar

We develop an asymptotic statistical theory for parameter estimation from a class of non-i.i.d. periodic binary event-detection processes subject to nonparalyzable dead time and gating, which we call "dead-time event detection" (DED) processes. Such processes arise in single-photon lidar, fluorescence lifetime imaging, X-ray astronomy, and particle or radiation flux measurements in nuclear physics, where each detection renders the radiation/particle detector inactive for a recovery interval. Our theory quantifies how dead time and gating affect the fundamental lower bounds of estimation and identifies practical estimators that attain these bounds. First, we identify a sufficient statistic, showing in particular that activation counts can carry statistically useful information discarded by conventional histogramming hardware. We then prove local asymptotic normality and derive the corresponding Fisher-information rate, thereby obtaining fundamental lower bounds for estimation from DED processes. We prove that the maximum likelihood estimator (MLE), widely used in DED applications, attains these lower bounds. Since computing the MLE typically requires solving a nonconvex optimization problem, we also propose Le Cam one-step estimators, which attain the same asymptotic bounds with only a single local correction rather than iterative optimization. We illustrate the validity of our asymptotic theory and the practical usefulness of one-step estimators through the example of single-photon lidar in both simulations and real-data experiments.
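
The one-step idea mentioned above can be illustrated on a much simpler model than the DED likelihood. The sketch below is only an assumption-laden toy: it uses i.i.d. exponential samples (not the dead-time process of the paper), a median-based pilot estimate, and a hypothetical helper named `one_step_rate`. It shows a single Fisher-scoring correction landing very close to the maximum likelihood estimate without any iterative optimization.

```python
import math
import random
import statistics


def one_step_rate(samples, pilot):
    """Apply one Fisher-scoring correction to a pilot estimate of an exponential rate."""
    mean = sum(samples) / len(samples)
    score = 1.0 / pilot - mean      # average score of the exponential model at the pilot
    fisher = 1.0 / pilot ** 2       # per-sample Fisher information at the pilot
    return pilot + score / fisher


random.seed(0)
true_rate = 2.0  # hypothetical ground truth for this toy model only
data = [random.expovariate(true_rate) for _ in range(20000)]

mle = len(data) / sum(data)                      # closed-form MLE, used here only as a reference
pilot = math.log(2) / statistics.median(data)    # crude but root-n-consistent pilot estimate
refined = one_step_rate(data, pilot)

print(pilot, refined, mle)
```

In this toy model the gap between the refined estimate and the MLE shrinks quadratically in the pilot error, which is the intuition behind replacing a nonconvex search with a single local correction.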


[24] 2605.23223

Experimental Evaluation of Data Upload Efficiency and Guiding Challenges for a Vehicular-to-Road System Using 60-GHz mmWave Ultra-Spots

Maximizing data uploading efficiency in a vehicular-to-road data uploading system using millimeter-wave communication is a challenging issue, as the wireless zone is often critically narrow, and vehicles can easily fail to pass through it without the aid of an autonomous guiding system. Variations in driving routes, speeds, approach angles, and distances to the ultra-spot can significantly affect data transmission performance, leading to either efficient or suboptimal results. This study presents a comprehensive analysis based on 75 experimental cases to identify the optimal travel trajectory and conditions that allow the vehicle to pass through the ultra-spot and enhance data transmission effectively. Experimental results show that with an optimal travel trajectory, appropriate movement speed, antenna placement, and prior estimation of the ultra-spot area, the amount of transferred data can be improved by 6 to 8 times.


[25] 2605.23261

UniSRM: A Unified Speech Reward Model for Reasoning-Based Fine-grained Assessment

Evaluating speech generation still relies heavily on human judgments, such as Mean Opinion Score (MOS), which are expensive, subjective, and difficult to reproduce at scale. While a few recent studies have begun to explore AudioLLM-based judge models, existing efforts typically target only a narrow set of scenarios (e.g., utterance-level quality or single-turn dialogue) and provide limited coverage of diverse speech generation tasks and evaluation dimensions. In this work, we propose UniSRM, a unified speech reward model that can support multi-dimensional, interpretable reward signals with reliable reasoning. To support training and evaluation, we introduce UniSRM-Data and UniSRM-Bench, covering speech evaluation tasks from utterance-level quality to context-level coherence. Based on this dataset, we present the unified speech reward model, UniSRM, with a two-stage pipeline that enables reasoning-based fine-grained assessment. Furthermore, we introduce Reasoning-Consistent Rewards to improve the reliability of the reasoning process. Experiments show that UniSRM delivers more reliable and human-aligned judgments across a broad range of speech evaluation tasks, offering a practical foundation for scalable and unified evaluation of speech quality.


[26] 2605.23282

Discontinuous Galerkin Neural Operator for Pathology Defocus Deblurring

Defocus deblurring in pathological microscopy remains challenging due to the spatially varying and locally discontinuous nature of optical blur induced by a position-dependent integral imaging process. Existing deep learning methods, constrained by shift-invariance assumptions and limited interpretability, are not well suited to such heterogeneous blur patterns. Neural operators provide a principled alternative by modeling defocus formation directly as an integral operator, offering a new perspective on defocus deblurring. However, most existing neural operator architectures for low-level vision rely on globally parameterized kernels that assume smoothness and stationarity, limiting their ability to model heterogeneous and locally discontinuous blur patterns. To address this limitation, we propose the Discontinuous Galerkin Neural Operator (DGNO), which parameterizes the integral kernel using a discontinuous Galerkin formulation with element-local volume operators and interface numerical fluxes. DGNO provides a principled combination of locality, heterogeneity modeling, and global coherence while preserving the underlying physics of optical image formation. Extensive experiments demonstrate that DGNO surpasses state-of-the-art methods, delivering sharper reconstructions, robust handling of spatially varying blur, and scalable high-resolution performance. The code will be released at this https URL.


[27] 2605.23293

Evaluating the Temporal Detection Capability of Integrated Gradients Applied on Sound Classifier

Gradient-based attribution methods can highlight input regions important for neural network predictions, but their effectiveness for temporal sound event detection in audio classification has not been systematically evaluated. This paper assesses whether integrated gradients (IG) can temporally detect sound events when applied to a classifier trained without temporal supervision. We use synthetic polyphonic audio with ground truth timestamps to measure alignment between IG attributions and event boundaries. On a 10-class domestic sound dataset, IG achieves mean Intersection over Union (IoU) of 0.39, frame-level F1 of 0.52, and Pointing Game accuracy of 82.6%. For comparison, a framewise CNN trained with weak supervision (FW-WS, clip-level training labels) achieves 0.42 IoU, 0.55 F1, and 97.3% PG, while a strongly supervised variant (FW-SS, frame-level training labels) reaches 0.45 IoU, 0.58 F1, and 97.9% PG. Overall, these results suggest that post-hoc IG captures meaningful temporal activity patterns of sound events, with localization performance approaching models that explicitly produce frame-level predictions. All methods substantially outperform random and energy-based baselines.
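
For readers unfamiliar with the three metrics named above, the following minimal sketch shows how they could be computed for a single event from a per-frame attribution curve. The attribution values, the 0.5 threshold, and the function names are all hypothetical; the abstract does not state how attributions are binarized.

```python
def temporal_iou(pred, truth):
    """Intersection over Union between two per-frame binary activity masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    return inter / union if union else 1.0


def frame_f1(pred, truth):
    """Frame-level F1 score between predicted and reference activity masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def pointing_hit(attributions, truth):
    """Pointing Game: does the frame with the largest attribution fall inside the event?"""
    peak = max(range(len(attributions)), key=lambda i: attributions[i])
    return bool(truth[peak])


# Hypothetical per-frame attribution magnitudes and ground-truth activity
attributions = [0.0, 0.1, 0.7, 0.9, 0.6, 0.2, 0.0, 0.0]
truth = [0, 0, 1, 1, 1, 1, 0, 0]
pred = [1 if a >= 0.5 else 0 for a in attributions]  # assumed threshold of 0.5

print(temporal_iou(pred, truth))   # 3 overlapping frames out of 4 in the union: 0.75
print(frame_f1(pred, truth))       # 6/7, about 0.857
print(pointing_hit(attributions, truth))
```

Averaging these per-event values over a dataset would give figures comparable in form to the mean IoU, frame-level F1, and Pointing Game accuracy quoted in the abstract.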


[28] 2605.23323

Efficient Learned Image Compression without Entropy Coding

Entropy coding is widely used in learned image compression (LIC) to convert latents into a compact bitstream. However, entropy coding is typically sequential and becomes the coding latency bottleneck. To overcome this, we present Entropy-Coding Free Learned Image Compression (EF-LIC), a multi-rate framework that generates a compact representation by removing statistical and correlation redundancy with low coding latency. First, we introduce unconstrained vector quantization and prove that its index distribution approaches the maximum-entropy bound, yielding minimal statistical redundancy. Second, we propose a context-conditioned autoregressive transform that directly reparameterizes the latents to reduce inter-dependency. Theoretical analysis shows that EF-LIC can remove correlation redundancy as effectively as typical LIC with entropy coding, leading to comparable compression performance. Experiments show EF-LIC achieves up to 67.86% bitrate reduction over MS-ILLM on Kodak with LPIPS. Ablation studies further show EF-LIC matches the compression performance of its entropy-coding based variant while achieving over $3\times$ faster encoding and $5\times$ faster decoding.


[29] 2605.23333

Safety-Assured Arrival Scheduling in Sequential UAM Corridor Sections under Speed and Separation Constraints

This paper presents a safety-assured arrival-scheduling framework for Urban Air Mobility (UAM) corridor operations. We propose an analytical method to compute a sufficient ETA gap at Constrained Waypoints (CWPs) that guarantees longitudinal separation along sequential corridor sections with heterogeneous speed limits. The resulting ETA-gap condition depends on section-specific speed bounds and the required separation distance, providing an efficiently computable rule suitable for integration into future digital ETA-scheduling and air traffic management systems. We show that the computed ETA gap ensures safe separation across all corridor sections under prescribed section travel times and speed limits. Numerical simulations for a decreasing-speed corridor confirm that vehicles coordinated with the proposed mechanism adjust their speeds to maintain the required spacing, avoid potential collisions, and support improved traffic flow compared with unscheduled operations.


[30] 2605.23343

From Visual to Digital: Coordination Scheduling and Its Effect on Safety and Efficiency in UAM Corridors

This paper explores scalable coordination strategies for urban air mobility (UAM) corridors by comparing two representative approaches. The first, inspired by visual flight rules (VFR), is a local coordination strategy relying on spatial information available to each vehicle. The second, conceptually aligned with digital flight rules (DFR), is a global coordination strategy based on shared estimated times of arrival (ETAs) at constrained waypoints (CWPs). To support this comparison, we introduce a lightweight disturbance-avoidance mechanism that enables vehicles to adjust their ETAs in response to forecasted disruptions using shared information. We evaluate these approaches through numerical simulations under varying disturbance levels, comparing the locally reactive VFR-style scheme with the globally coordinated DFR-style scheme. Results show that VFR achieves high throughput in low-traffic scenarios but becomes increasingly prone to collisions at higher traffic densities unless conservative separation is enforced, which reduces traffic efficiency. In contrast, DFR maintains more consistent safety performance and traffic efficiency, even under moderate ETA update propagation delays. These findings highlight the advantages of DFR-style global coordination in managing high-density air traffic control (ATC) operations within UAM corridors.


[31] 2605.23354

Physics-informed sparse identification-based tube model predictive control for aerial vehicles

Autonomous aerial vehicles necessitate control strategies that balance computational efficiency with robust performance in dynamic operational environments. This paper proposes a model predictive control (MPC) framework for aerial platforms that leverages physics-informed machine learning (PIML) to achieve an optimal balance between computational tractability and robust performance. At the core of the proposed approach lies a sparse, control-affine model identified via the PIML method, which provides a parsimonious yet interpretable representation of the system dynamics by embedding first-principles knowledge and learning residual uncertainties from operational data. This model is incorporated within a robust MPC scheme that adopts a high-order Runge-Kutta discretization to ensure prediction accuracy and an adaptive tube-based mechanism to guarantee constraint satisfaction under uncertainty. The online adaptation of the tube, directly informed by the residual error of the PIML model, ensures robust stability without introducing excessive conservatism. Rigorous theoretical proofs are provided to establish recursive feasibility and stability. Numerical simulations and experiments on a quadrotor demonstrate that our method significantly reduces computational load compared to nonlinear MPC and robust MPC using a high-fidelity model, while outperforming PID, nonlinear MPC, neural-network-based MPC, and fixed-tube robust MPC in tracking performance and robustness, showcasing the practical efficiency of the proposed PIML-based control synthesis for resource-constrained aerial systems.
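
As a rough illustration of the prediction step described above, the sketch below rolls a control-affine model forward with a Runge-Kutta discretization and a zero-order hold on the input. It assumes the classical fourth-order scheme and a scalar toy plant; the abstract only says "high-order Runge-Kutta", and the functions `rk4_step` and `predict` are hypothetical names, not the authors' code.

```python
import math


def rk4_step(f, g, x, u, h):
    """Advance x' = f(x) + g(x) * u by one step of size h, holding u constant."""
    def dyn(s):
        return f(s) + g(s) * u

    k1 = dyn(x)
    k2 = dyn(x + 0.5 * h * k1)
    k3 = dyn(x + 0.5 * h * k2)
    k4 = dyn(x + h * k3)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def predict(f, g, x0, inputs, h):
    """Roll the model forward over a horizon of piecewise-constant inputs."""
    traj = [x0]
    for u in inputs:
        traj.append(rk4_step(f, g, traj[-1], u, h))
    return traj


# Hypothetical scalar plant x' = -x + u, standing in for an identified sparse model
f = lambda x: -x
g = lambda x: 1.0
horizon = predict(f, g, 0.0, [1.0] * 10, 0.1)

# Compare the final predicted state with the exact solution 1 - exp(-t) at t = 1
print(horizon[-1], 1.0 - math.exp(-1.0))
```

In an MPC setting a routine like `predict` would supply the nominal trajectory inside the optimizer, with the tube handling the mismatch between this model and the true dynamics.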


[32] 2605.23356

A Distributed Framework for Data-Driven Safe Coordination in Leader-Follower Networks

This paper addresses connectivity preservation in leader-follower multi-agent systems with unknown control-affine dynamics and local state information. We introduce the distributed data-driven zeroing control barrier function (3D-ZCBF) framework, which ensures the controlled invariance of safety sets by identifying derivative bounds from input-state data without requiring explicit models of high-dimensional agent dynamics. In this work, we derive the explicit, decoupled safety conditions necessary to maintain connectivity for leader-leader and follower-follower pairings. These individual constraints, along with the leader-follower conditions, are aggregated into explicit system-wide conditions that formally guarantee the preservation of the entire communication network. Furthermore, we provide a quantitative analysis demonstrating how the size of the collected data set and the accuracy of the learned Jacobian bounds impact the feasibility of the safety certificates. The proposed conditions are implemented via a projection-based controller, and simulations confirm that these explicit 3D-ZCBF requirements effectively maintain system-level connectivity using only local, two-hop information.


[33] 2605.23366

Stochastic Geometry Analysis of Uplink CUMA-Enabled Cellular Networks

Uplink cellular networks are interference-dominated but interference channel state information (CSI) is rarely available at scale. The emerging fluid antenna system (FAS) concept, which provides additional spatial degrees of freedom through multi-port reconfiguration, offers a promising alternative to CSI-intensive multi-antenna processing. Building on this concept, compact ultra-massive arrays (CUMA) exploit large-scale port selection with low implementation complexity. In each uplink transmission, CUMA activates a subset of ports based on only the desired-link CSI and combines the selected ports via simple superposition, yielding coherent enhancement of the desired user signal, while inter-cell interference aggregates largely non-coherently due to the random superposition effect. Consequently, CUMA is well suited to multi-cell uplink scenarios where CSI is limited. In this paper, we analyze uplink CUMA in multi-cell cellular networks using a stochastic geometry framework. We derive a tight approximate expression for the signal-to-interference ratio (SIR) coverage probability, and further characterize the average user rate and cell sum-rate. The analysis quantifies how key design parameters impact performance and reveals the scaling behavior with network densification. Simulation results validate the accuracy of the derived expressions and show that uplink CUMA achieves competitive, and often superior, performance relative to conventional schemes under practical CSI constraints, highlighting its potential as a low-complexity, hardware-efficient uplink solution for future large-scale cellular networks.


[34] 2605.23368

Energy-Efficient THz Sensing with Hybrid THz/VLC Communication Under Human Blockage Effects

This paper presents an energy-efficient indoor system integrating terahertz (THz) with visible light communication (VLC). THz communication offers ultra-high-capacity links but is limited by severe path loss, atmospheric absorption, and susceptibility to blockages. In contrast, VLC provides robust, wide indoor coverage with illumination support, thereby enabling reliable, high-speed hybrid connectivity. To leverage their respective strengths, we propose a hybrid framework that integrates $THz_s-AP$ with hybrid $THz_c/VLC_c-AP$, enabling reliable coverage and enhancing the energy efficiency (EE) from an integrated sensing and communication (ISAC) perspective. We first perform optimal power allocation between the $THz_s-AP$ and $THz_c-AP$ to optimize the set of users served by the $THz_c-AP$ link, while evaluating monostatic sensing performance metrics such as $P_d$, $FA_p$, and $SC_p$ under the impact of human blockages. Subsequently, the overall network power consumption is minimized via a mixed-integer linear programming (MILP) optimization that optimally selects the active $VLC_c-APs$ and assigns transmit powers. Furthermore, extensive performance evaluations are conducted to analyze key metrics, including average energy efficiency, average spectral efficiency, average sensing rate, and average communication rate. Simulation results demonstrate that, under THz sensing, most users are connected to the $THz_c-AP$ in the absence of blockages, whereas in the presence of blockages, the majority are served by the $VLC_c-APs$. Overall, all users maintain reliable coverage with high EE.


[35] 2605.23427

Movable-Antenna-Enhanced ISAC: Optimal Antenna Trajectory and Beamforming Design

Integrated sensing and communication (ISAC) is a key enabling technology for next-generation wireless networks. However, most existing ISAC systems rely on fixed-position antennas, which restrict performance when balancing sensing and communication objectives. Movable antenna (MA) technology introduces additional spatial degrees of freedom through antenna mobility, yet existing studies on MA-enabled ISAC schemes mainly consider static antenna repositioning and fail to fully exploit this capability. By leveraging spatio-temporal sampling enabled by antenna motion, optimized MA trajectories can synthesize large virtual aperture arrays, thereby improving angular resolution and reducing sensing ambiguity. To this end, this paper investigates a dynamic MA-enabled ISAC system and studies the joint design of MA trajectories and transmit beamforming. We formulate a joint trajectory and beamforming optimization problem to minimize sensing beampattern mismatch under communication quality-of-service constraints. A branch-and-bound-based algorithm is developed to obtain the globally optimal solution. Numerical results show that the proposed framework significantly outperforms baseline schemes with only one or two antenna repositioning steps, demonstrating its practical feasibility.


[36] 2605.23429

Communication Security and Sensing Privacy in FMCW-Based ISAC Through Signal Modulation

This study proposes a novel radar-centric signaling design and architecture for secure integrated sensing and communication (ISAC) systems. The proposed framework is designed to provide robust physical layer security for data transmission while simultaneously enhancing sensing privacy. It employs index modulation (IM) and phase coding (PC) over frequency-modulated continuous-wave (FMCW) radar chirps, where IM provides an outer layer of data security and the PC is explicitly designed to perturb the resulting signal's ambiguity function (AF) to enhance sensing privacy. This design reduces the risk of unauthorized surveillance by rendering target velocity estimation practically infeasible for unauthorized passive sensing hardware (i.e., a sensing eavesdropper, S-Eve) and significantly impairing its range estimation capabilities. Furthermore, this study presents the transmitter and receiver architectures required for effective modulation and demodulation of the proposed ISAC signaling and for performing sensing at the legitimate sensing hardware. Simulation results show that the proposed approach achieves high data throughput while enhancing communication security and sensing privacy.


[37] 2605.23463

StepAudio 2.5 Technical Report

Unified audio-language modeling has emerged as a prominent trend in modern speech systems, promising to bring the reasoning capabilities of large language models to auditory tasks. However, existing unified foundations often struggle to match the depth of specialized systems across automatic speech recognition (ASR), text-to-speech synthesis (TTS), and realtime spoken interaction. Bridging this gap remains an open challenge. This report presents StepAudio 2.5, a unified audio-language foundation model that matches or exceeds specialized systems across all three capabilities. Rather than treating these tasks as architecturally distinct, we operate on the premise that once text and audio share a multimodal representational space, task specialization becomes a matter of operational regimes: data construction, optimization targets, and decoding constraints. Guided by this insight, we advance the post-training paradigm from standard supervised learning to task-tailored Reinforcement Learning from Human Feedback (RLHF), using it as the primary mechanism to define complex optimization targets. We leverage this RLHF-centric alignment, alongside specialized decoding, to shape a shared backbone into three distinct operational modes. Concretely, the ASR branch advances transcription efficiency via verifiable multi-token decoding; the TTS branch achieves controllable, expressive synthesis through preference-based RLHF and context-rich supervision; and the Realtime branch realizes low-latency, persona-consistent dialogue via generative reward modeling within an RLHF framework. On standard benchmarks, StepAudio 2.5 achieves state-of-the-art results across ASR, TTS, and Realtime, demonstrating that a singular audio-language foundation can successfully internalize the distinct deployment objectives of speech understanding, generation, and live interaction.


[38] 2605.23468

ComHymba: Low-Complexity Domain-Informed Foundation Model for Wireless Communications

Wireless foundation models are a promising route to unify channel reconstruction, sensing, and beam management in future wireless communication systems, but existing designs often inherit LLM-style Transformers with quadratic token complexity and weak integration of propagation priors. This paper proposes ComHymba, a domain-informed wireless foundation model built on an asymmetric masked autoencoder for large-scale self-supervised pre-training on Channel State Information (CSI). ComHymba introduces (i) 3D spatio-temporal-frequency patchification with rotary positional embedding, (ii) domain-informed masking strategies that emulate realistic CSI sparsity and fading patterns, and (iii) a decoupled amplitude-phase weighted objective tailored to channel statistics. Architecturally, we employ Hymba blocks that fuse windowed self-attention with state space models (SSMs), enabling linear-time modeling with respect to the overall channel input size. Experiments on eight downstream tasks spanning channel state information reconstruction, environmental sensing, and beam management show consistent accuracy gains over strong task-specific baselines, together with up to a $3.3\times$ inference speedup versus Transformer backbones. Overall, ComHymba provides a scalable and efficient backbone for AI-native physical-layer intelligence.


[39] 2605.23481

Optimal Design Framework for Distributed Array Using Magnetically-Actuated Satellite Swarm

Distributed space antennas using electromagnetic formation flight (EMFF) are a promising architecture for large-aperture, long-life space communication systems. Their feasible aperture, however, is governed by coupled constraints on antenna performance, satellite mass, power generation, coil geometry, and formation-keeping power. This paper proposes a system-level design framework for EMFF-based distributed space antennas. It links phased-array requirements with satellite-level sizing constraints and provides a static grid-based reference for designing feasible apertures under a fixed system mass. Unlike our previous bucket-brigade disturbance-compensation model, the formation-maintenance requirement is incorporated through a control index derived from distributed-control simulations. This index is integrated into an antenna-aperture maximization problem with sizing, power, coil, and sidelobe-envelope constraints. Parametric case studies examine margin magnetic moment, prescribed transmit power, and large inter-satellite spacing. Results show that increasing system mass improves footprint reduction or effective isotropic radiated power only while satellite-level design headroom remains. In direct-to-device cases with 0.15 m spacing, generated-power and coil-geometry constraints dominate the feasible aperture. In the 0.60 m large-spacing case, the required coil burden can exceed satellite-level mass, size, and power capacities, making the design infeasible despite favorable communication performance. The proposed framework enables the design and evaluation of feasible static grid-based EMFF distributed antennas under coupled antenna, satellite, and control constraints.


[40] 2605.23495

Broad learning system with robust adaptive kernel

The performance of the broad learning system (BLS) degrades in non-Gaussian noise environments, and BLS variants based on M-estimators show good robustness in such settings. However, in most cases, determining the optimal loss function is very time-consuming because prior knowledge of the sample data is lacking. Therefore, this paper constructs a BLS variant based on an adaptive robust kernel (AR-BLS) to improve the generalization performance of the model in non-Gaussian noise environments. The adaptive robust kernel function is a general loss function that includes many common M-estimator paradigms. By alternately optimizing the model weights and the adaptive robust kernel parameters, AR-BLS adaptively adjusts model robustness under different outlier noise distributions without human intervention. In addition, the iterative convergence of the AR-BLS algorithm is proved based on Zangwill's global convergence theorem. Simulation experiments on multiple public datasets and practical application scenarios verify the effectiveness of the proposed method.


[41] 2605.23496

Decentralized Variational Bayesian UKF with Maximum Generalized Student's t-kernel Correntropy for Wide-Area Power System state estimation

Conventional centralized state estimators exhibit limited robustness in large-scale grids and face practical deployment hurdles. To overcome these challenges, this paper proposes a decentralized maximum generalized Student's t-kernel correntropy variational Bayesian unscented Kalman filter (D-MGST-VBUKF). The algorithm improves estimation performance at three levels to meet regionalized state estimation needs. First, to address non-Gaussian measurement noise in practical systems, we propose a cost function based on MGST, which retains Student's t robustness while improving adaptability to complex noise by expanding the degree-of-freedom parameter. Second, a variational Bayesian (VB) inference framework is constructed to model the unknown noise distribution online, and the noise statistics and state estimates are jointly optimized by constructing a conjugate prior distribution. Finally, a regional state fusion mechanism is established based on the topological correlation characteristics of the power grid, and global consistency correction of the local estimation results is achieved by constructing state coordination equations for the boundary nodes. Simulation experiments on the IEEE 14-bus and IEEE 39-bus systems show that the method is more robust than traditional algorithms under non-Gaussian and unknown noise environments.


[42] 2605.23498

Constant-Envelope Quantized Precoding with Power Control for Cell-Free Massive MIMO-OFDM

Cell-free massive MIMO has matured into a key candidate technology for 6G and beyond, owing to its ability to provide nearly uniform service quality to many user equipments (UEs) over the same time-frequency resources. Unlike conventional cellular massive MIMO, the core idea is to distribute a large number of low-cost access points (APs) across the network and enable joint coherent transmission and reception. While early works largely assumed ideal hardware, hardware impairments become inevitable when APs are implemented with low-cost components. In this context, this paper investigates the adverse impact of low-resolution digital-to-analog converters (DACs) on the downlink performance of cell-free massive MIMO-OFDM systems. In contrast to prior studies that mainly quantify spectral-efficiency degradation under low-resolution DACs, we consider the design of quantized constant-envelope (CE) precoding, which additionally enables the use of highly power-efficient amplifiers. To the best of our knowledge, this is the first work on quantized CE precoding for cell-free massive MIMO-OFDM. Beyond adapting the classical maximum-antenna-power method, we propose a novel power-control strategy across APs that mitigates the detrimental effects of severely quantized transmitters by reducing the contribution of harmful APs. Simulation results demonstrate that the proposed power-control mechanism significantly improves the uncoded bit error rate performance.


[43] 2605.23499

Outlier-Robust unscented Kalman filter based on generalized correntropy induced

Conventional Kalman filtering (KF) approaches exhibit significant limitations in addressing nonlinear state estimation problems contaminated by non-Gaussian noise disturbances. To overcome these challenges, this work proposes a robust iterative square-root unscented Kalman filter based on the generalized correntropy induced (SR-GCI-IUKF). While sharing the maximum correntropy criterion's (MCC) ability to characterize higher-order noise statistics, the proposed GCI framework exhibits intrinsic kernel bandwidth insensitivity, a critical advantage enabling robust adaptation to diverse complex noise environments through its generalized kernel structure. For nonlinear state estimation challenges, the algorithm constructs a nonlinear error generalization model that dynamically corrects measurement-induced errors during the state update phase, thereby significantly enhancing estimation accuracy in strongly nonlinear regimes. Furthermore, the square-root decomposition implementation ensures numerical robustness by preserving covariance matrix positive definiteness throughout recursive operations. Theoretical stability guarantees are established through rigorous error dynamics analysis, demonstrating bounded estimation variance under non-Gaussian disturbances. Finally, experiments on nonlinear systems, land vehicle navigation systems, and power system FASE compare the proposed algorithm with other robust algorithms and show that it has stronger robustness.


[44] 2605.23502

Distributed Two-Phase Processing for Modular XL-MIMO with Wireless Fronthaul under Hardware Impairments

Modular extremely large-scale MIMO (XL-MIMO) architectures combined with wireless fronthaul provide a scalable alternative to monolithic arrays, but their performance is sensitive to hardware impairments and resource allocation strategies. In this paper, we consider a distributed two-phase processing framework for modular XL-MIMO systems employing amplify-and-forward wireless fronthaul under practical hardware constraints. We jointly model access-side and fronthaul-side distortions and formulate a weighted minimum mean-square error (WMMSE)-based optimization problem that maximizes the uplink sum spectral efficiency (SE) by jointly adjusting UE transmit powers and fronthaul amplification levels. The resulting algorithm alternates between distortion-aware receiver design and convex power-control updates. Numerical results demonstrate that the proposed joint optimization significantly improves spectral efficiency compared to fixed transmission strategies, particularly when the CPU has a moderate number of antennas, while also quantifying the relative impact of access and fronthaul impairments.


[45] 2605.23505

OptiQU: Coordinated Multi-Level Voltage and Reactive Power Control for Enhanced Voltage Quality and Secure Grid Operation

Modern low-voltage (LV) distribution grids face rising shares of photovoltaic generation and high-power loads such as heat pumps and electric vehicle charging stations. Due to high simultaneity, voltage constraints often become binding before thermal limits, triggering costly conventional grid reinforcement measures. Existing voltage and reactive power control in LV grids - e.g., fixed cos($\phi$) or Q(V) control of distributed generators, on-load tap-changing distribution transformers, and line voltage regulators - is typically applied locally and independently, leaving reactive power flexibility potential unused. This paper presents OptiQU, a coordinated voltage and reactive power control concept for medium-voltage (MV) and LV distribution grids, combining centralised optimisation with decentralised local control and fallback strategies. The approach coordinates operational targets and setpoints across MV and LV (e.g., DER reactive power and substation equipment) to mitigate voltage violations and curtailment and to increase hosting capacity, while enabling robust operation under limited communication. The concepts are being evaluated using representative MV/LV models in simulation and lab environments and will be validated in field tests with two German DSOs. Based on existing research, the coordinated approach is expected to increase the exploitable flexibility for upstream voltage and reactive power control. The planned evaluation will quantify this potential and investigate trade-offs between performance, communication effort, and resilience.


[46] 2605.23516

Comprehensive Dataset and Signal Processing Framework for Phonocardiogram-Based Heart Rate and Blood Pressure Estimation

Cardiovascular diseases (CVDs) represent significant global health challenges today, necessitating regular and reliable monitoring to enable early intervention. Phonocardiogram (PCG) signals present a promising non-invasive method for assessing cardiovascular health. While recent studies have focused on estimating heart rate (HR) from PCG signals and blood pressure (BP) through multimodal combinations with other physiological data, reliable and cost-effective systems that can predict both HR and BP using only PCG signals remain largely unexplored. In this study, we proposed and developed a lab-scale cost-effective Phonocardiogram Tracking (PhonoTrack) system that can measure both HR and BP using only the PCG signal. We also introduced a corresponding dataset collected from 15 participants to evaluate the effectiveness of the proposed system. HR was determined using several peak detection methods, such as Hilbert Transform (HT), Shannon Entropy (SE), and WES, achieving notable Pearson correlation coefficients of 0.965, 0.973, and 0.955, respectively. The corresponding root mean square errors (RMSEs) were 2.467 bpm, 1.688 bpm, and 1.992 bpm for HT, SE, and WES, respectively. Additionally, we developed an advanced semi-empirical model based on multiple regression techniques to estimate systolic blood pressure (SBP) and diastolic blood pressure (DBP). This model demonstrated standard deviations of 2.10 mmHg for SBP and 3.20 mmHg for DBP across all subjects, with Pearson correlation coefficients of 0.89 and 0.70, respectively. These findings pave the way for developing a non-invasive, low-cost, and portable PhonoTrack device, positioning it as a promising solution for continuous cardiovascular monitoring settings.
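
The two agreement metrics quoted above (RMSE in bpm and the Pearson correlation coefficient) can be computed as in the following minimal sketch. The function names and the sample heart-rate values are illustrative assumptions and are not taken from the PhonoTrack dataset or code.

```python
import math

def rmse(estimates, references):
    """Root mean square error between paired estimates and references."""
    n = len(estimates)
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimates, references)) / n)

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical heart-rate values in bpm, for illustration only.
hr_reference = [60.0, 72.0, 85.0, 90.0]
hr_estimated = [62.0, 70.0, 85.0, 92.0]
print(rmse(hr_estimated, hr_reference), pearson_r(hr_estimated, hr_reference))
```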


[47] 2605.23524

Beyond Shrinkage: Foundations of Data-Driven Control for Piecewise Affine Systems

Data-enabled predictive control (DeePC) has recently attracted attention as a promising approach for controlling systems directly from raw data, without requiring an explicit identification step. However, DeePC has not yet been extended to piecewise affine (PWA) systems, despite their extensive use in the (predictive) control literature and their universal approximation capabilities. To address this gap, in this work, we lay the foundations for data-enabled predictive control of PWA systems, providing: $(i)$ their behavioral characterization; $(ii)$ an extension of Willems' Fundamental Lemma to represent their behavior from raw data; $(iii)$ an analysis of the coherence of DeePC strategies using a linear predictor and shrinkage regularizers; and $(iv)$ a study of the impact of misclassification errors on structuring data for prediction. Our theoretical findings are validated by numerical results on a simple example, emphasizing the need to extend beyond a regularized version of the foundational DeePC framework to design control actions that are both effective and coherent with a PWA system's behavior, thus ensuring the controller's explainability.


[48] 2605.23536

Utilizing Missed Detections in Directional Sensitivity-Based DOA Estimation

This paper introduces a signal strength-based direction of arrival (DOA) estimation approach for directional sensors that explicitly accounts for missed detections. In traditional phase-based DOA estimation frameworks, negative information from expected emitters that fall below the detection threshold falls outside the scope of standard measurement models. Unlike phase-based DOA estimation methods, the proposed approach relies only on received signal strength measurements. As a result, missed detections arise naturally from the sensing and detection process and convey valuable information via the known detection thresholds. By incorporating both detected signals and missed detections into the likelihood function, we develop a probabilistic estimation method that fully leverages the underlying measurement and detection models. Simulation results show that the proposed method significantly improves DOA estimation accuracy compared to baseline techniques, particularly in challenging scenarios with high missed-detection rates. Real-world experiments using Bluetooth Low Energy (BLE) signals and directional antennas further validate the effectiveness of the approach, demonstrating substantial performance gains. These findings highlight the value of modeling missed detections in sensor array processing and open new avenues for enhancing localization performance in wireless communication systems.


[49] 2605.23560

SafeSABR: Risk-Calibrated Adaptive Bitrate Streaming over Starlink Networks

Starlink, as a representative low Earth orbit (LEO) satellite broadband system, makes high-bitrate video streaming possible in regions where terrestrial broadband is unavailable. However, its access links exhibit rapid throughput fluctuations caused by satellite mobility and handovers. Existing learned adaptive bitrate (ABR) algorithms can achieve high average quality of experience (QoE), yet high-bitrate Starlink streaming exposes severe session-level rebuffering that is not captured by average QoE alone. To address this issue, this paper proposes SafeSABR, a risk-calibrated learned ABR framework for Starlink networks. SafeSABR formulates Starlink ABR as a QoE--severe-risk tradeoff and follows a three-stage design: behavior-cloning pretraining learns a high-QoE ABR prior, risk-calibrated reinforcement learning (RL) fine-tuning reduces severe-tail action tendencies, and a runtime safety auditor uses safe-capacity lower bounds to check policy-requested bitrates before execution. Experiments on real Starlink traces compare SafeSABR with online, prediction-assisted, and learned ABR baselines. Compared with advanced methods, SafeSABR reduces severe-stall sessions from 22.8% to 7.2% and worst-5% session rebuffering from 54.30 s to 22.68 s, with a 1.8% QoE cost. Component analyses further show that risk-calibrated fine-tuning and safe-capacity auditing reduce unsafe bitrate decisions and downstream severe-session rebuffering. These results show that combining risk-calibrated policy learning with decision-aware safe throughput forecasting can move learned ABR toward a safer QoE--severe-risk operating point under volatile Starlink networks.
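
One plausible reading of the runtime safety auditor is a check that compares the policy-requested bitrate against a safe-capacity lower bound and falls back to a lower rung of the bitrate ladder when the request exceeds it. The sketch below illustrates that idea only; the function name, the fallback rule, and the example ladder are assumptions and may differ from the SafeSABR design.

```python
def audit_bitrate(requested_kbps, ladder_kbps, safe_capacity_kbps):
    """Accept the requested bitrate if it does not exceed the safe-capacity
    lower bound; otherwise return the highest ladder rung that fits, or the
    lowest rung when no rung fits (assumed fallback rule)."""
    if requested_kbps <= safe_capacity_kbps:
        return requested_kbps
    feasible = [rate for rate in ladder_kbps if rate <= safe_capacity_kbps]
    return max(feasible) if feasible else min(ladder_kbps)

# Hypothetical bitrate ladder in kbps, for illustration only.
ladder = [1000, 2500, 5000, 8000]
print(audit_bitrate(8000, ladder, safe_capacity_kbps=6000))
```

Keeping the audit separate from the learned policy means the policy can stay aggressive on average while the check bounds individual decisions, which is the trade-off the abstract describes.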


[50] 2605.23561

Reliable UAV Detection with ISAC

Unmanned Aerial Vehicle (UAV) detection is one prominent use case of Integrated Sensing and Communication (ISAC) systems in 5G-Advanced and future 6G networks. In this paper, we present experimental results for the detection of a small UAV using unmodified commercial 5G hardware for mono-static Orthogonal Frequency-Division Multiplexing (OFDM) radar and compare them with the expected performance based on models for link budget and hardware impairments. We show that reliable detection with sub-meter accuracy is still possible at distances of over 500 meters in a challenging radio environment rich in strong clutter.


[51] 2605.23564

FMCW-Based Integrated Sensing and Communication System: Design, Implementation, and Experimental Measurements

This study proposes a radar-centric integrated sensing and communication (ISAC) system utilizing a two-layer modulation scheme for vehicular networks. Frequency-modulated continuous wave (FMCW) chirps are jointly modulated via phase modulation (PM) and index modulation (IM) to transmit data while maintaining sensing as the primary function. To support this, a novel radar signal processing technique is developed to mitigate the impacts of IM and PM on sensing accuracy, alongside a communication receiver architecture designed to successfully demodulate IM and PM data within FMCW chirps. System performance is evaluated through simulations in the 2.4 GHz and 24 GHz bands under Doppler effects, achieving communication throughputs of 25 Mbps and 50 Mbps, respectively. Furthermore, a proof-of-concept hardware implementation is realized, and experimental measurements via a loopback cable are performed to verify the feasibility of the architecture. Finally, the study evaluates the fundamental trade-off between communication throughput, sensing accuracy, and out-of-band emission, demonstrating the system's flexibility to dynamically adjust waveform parameters to meet varying operational requirements.


[52] 2605.23588

Low-cost Parallel Transmission for Dense Indoor Data Collection with LoRaWAN: Time Synchronization and Resource Allocation

LoRaWAN is a compelling low-cost solution for large-scale indoor Internet of Things (IoT) data backhaul, owing to its strong penetration capability and low power consumption. However, its default pure ALOHA access mechanism leads to severe channel contention, substantial packet loss, and reduced throughput under dense, concurrent transmissions. To overcome this, we propose a lightweight out-of-band (OOB) synchronization scheme that integrates a time division multiple access (TDMA) mechanism into commercial LoRaWAN Class A networks. Unlike approaches requiring gateway scheduling, frequent downlink signaling, or custom hardware, our method introduces a single low-cost node providing millisecond-level alignment via a dedicated OOB synchronization channel. End devices seamlessly access this channel by briefly retuning their existing LoRa transceivers. Consequently, the scheme imposes zero downlink overhead during the steady-state reporting phase, requires no hardware modifications to gateways or end devices, and remains fully backward-compatible. This design enables collision-free scheduled channel access within the configured nominal resource capacity, thereby improving throughput and reducing contention. Real-world experiments using an indoor positioning prototype demonstrate that the proposed TDMA-LoRaWAN architecture improves system throughput by over 30% and reduces the packet loss rate from 25.8% to 5.02% in a 20-node indoor deployment. Furthermore, large-scale simulations corroborate these empirical findings, support the scalability analysis under larger network sizes, and indicate improved energy efficiency per successful packet in dense network settings. These combined results demonstrate the effectiveness of the proposed approach for dense indoor IoT data collection and indicate its practical potential under high uplink reporting demands.
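
The idea of collision-free scheduled access within a nominal resource capacity can be sketched as a simple slot table: once every node shares a common time reference, each node transmits at its own offset within a repeating frame. The sketch assumes a single channel, fixed-length slots, and a guard interval; all names and numbers are hypothetical and are not taken from the paper.

```python
def build_schedule(num_nodes, frame_ms, slot_ms, guard_ms):
    """Map each node index to its transmit offset (ms) after the sync instant.

    Raises ValueError when the requested node count exceeds the number of
    slots that fit in one frame (the nominal capacity in this sketch).
    """
    capacity = frame_ms // (slot_ms + guard_ms)
    if num_nodes > capacity:
        raise ValueError(f"{num_nodes} nodes exceed frame capacity {capacity}")
    return {node: node * (slot_ms + guard_ms) for node in range(num_nodes)}

# Hypothetical parameters: 20 nodes, 10 s frame, 400 ms slots, 100 ms guard.
schedule = build_schedule(num_nodes=20, frame_ms=10_000, slot_ms=400, guard_ms=100)
print(schedule[0], schedule[19])
```

Because every offset is distinct and separated by at least one slot plus guard time, no two transmissions overlap as long as clock error stays below the guard interval.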


[53] 2605.23593

A study on weakly-supervised training approaches for phoneme-level pronunciation scoring

Phoneme-level computer-assisted pronunciation training systems typically rely on phoneme-level annotations, which are costly and scarce. In this work, we investigate whether phoneme-level mispronunciation information can be learned without phoneme-level supervision by exploiting higher-level pronunciation labels. Specifically, we study a weakly supervised setting in which models are trained using only utterance- or word-level pronunciation labels and analyze whether this supervision induces useful phoneme-level score predictions. We further consider a two-stage training scenario in which a model trained only with utterance-level labels is finetuned using a limited number of carefully selected phoneme-level labeled utterances. We find that, using our proposed architecture and selection process, the two-stage process leads to results comparable to those obtained with full phoneme-level supervision while requiring only a small fraction of the phoneme-level labels.


[54] 2605.23604

Word-Level Modeling with Alignment-Aware Acoustic Fusion for Text-Assisted Intelligibility Prediction in Listeners with Hearing Loss

We address text-assisted speech intelligibility prediction for hearing-impaired listeners in CPC3. Although the target is a sentence-level percentage, it is determined by reference-word recognition outcomes. We formulate prediction as reference-conditioned word-level correctness modeling: a frozen Whisper encoder analyzes degraded speech, a teacher-forced decoder conditions on the canonical transcript, and sentence intelligibility is obtained by averaging predicted correctness probabilities over valid reference words. To complement transcript-conditioned decoder states, we add a word-aligned local acoustic branch based on character-level cross-attention alignment and an utterance-level global acoustic branch for calibration. On the official evaluation set, the decoder baseline obtains RMSE 24.92 and correlation 0.795, while joint fusion improves to incorrect-word F1 0.778, MCC 0.626, correlation 0.806, and RMSE 24.39. A similar trend with Whisper medium suggests that the gain comes from prediction granularity and alignment-aware fusion.
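
The aggregation step described above, averaging predicted word-correctness probabilities over valid reference words to obtain a sentence-level percentage, can be sketched as follows. The function name and the mask convention are assumptions for illustration; the probabilities are made up.

```python
def sentence_intelligibility(word_probs, valid_mask):
    """Return the sentence score (0-100) as the mean predicted correctness
    probability over reference words flagged as valid."""
    valid = [p for p, keep in zip(word_probs, valid_mask) if keep]
    if not valid:
        raise ValueError("no valid reference words to average over")
    return 100.0 * sum(valid) / len(valid)

# Hypothetical per-word probabilities; the last word is masked out.
print(sentence_intelligibility([0.9, 0.5, 0.1], [True, True, False]))
```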


[55] 2605.23619

Frame-Aligned Fusion of Canary and WavLM for Non-Intrusive Intelligibility Prediction of Hearing-Aid-Processed Speech

Non-intrusive intelligibility prediction estimates how well hearing-impaired listeners understand hearing-aid-processed speech without a clean reference. We study this task in the 3rd Clarity Prediction Challenge using two frozen speech encoders, Canary and WavLM. The central question is not only whether complementary pretrained representations should be combined, but where their interaction should occur. We compare single-backbone baselines, uniform score averaging, pool-late fusion, cross-attention, frame-aligned fusion, and reverse alignment under a shared left/right-preserving binaural framework. Among the compared systems, the best model temporally prepares WavLM with a learnable strided convolution and fuses it with Canary on the coarser Canary timeline before pooling, reaching Eval RMSE 24.96$\pm$0.06 and Eval Corr 0.796$\pm$0.001. Severity, enhancement-system, layer-window, and temporal-shift analyses indicate that coarse local temporal correspondence before pooling is a useful inductive bias for this task.


[56] 2605.23636

RF Instrument Agent (RFIA): Empowering RF Instruments with Natural Language Understanding, Scheduling and Execution of Complex Tasks

Modern radio-frequency (RF) instruments, such as vector network analyzers (VNAs), already provide mature remote-control interfaces. However, practical RF measurement workflows still rely on manual operation or custom scripting, which is time-consuming and expertise-intensive. This paper presents RF Instrument Agent (RFIA), a natural-language agent framework for reliable task-driven RF instrument control. RFIA adopts a decoupled intent--planning--execution architecture, where the LLM is used only for task understanding and high-level planning, while instrument-facing operations are handled by a deterministic runtime. Verified skills, workflow templates, RF analysis tools, instrument-specific rules, and retrieval-assisted SCPI knowledge are organized in a structured knowledge base, and hybrid execution graphs are used for closed-loop measurement tasks. A hardware-in-the-loop prototype is implemented on a commercial VNA and evaluated using a 16-task benchmark covering configuration, query, acquisition, rule-aware operation, RF-data analysis, and closed-loop measurement. RFIA handles all benchmark tasks under predefined execution and safety policies, including one expected safety rejection. Hardware-in-the-loop results with both a 230B-scale MiniMax-M2.7 model and a smaller 27B-scale Qwen3.6-27B model confirm that the decoupled architecture supports reliable natural-language RF measurement automation across different LLM backends.


[57] 2605.23642

Fast Fluid Antenna Multiple Access

Fast fluid antenna multiple access (FAMA) is an idea that promises to overcome severe interference in massive access scenarios by reconfiguring the antenna's position at the receiver side on a symbol-by-symbol basis, without the need for precoding or any other interference mitigation technique. However, this idea is commonly studied under a \emph{genie-aided} premise: each user terminal (UT) can probe \emph{all} fluid-antenna ports in every symbol instance and ideally knows the instantaneous signal-interference split for the received signals at all the ports. Such an assumption is unrealistic since it implies impractical hardware and switching limits, pilot overhead, as well as an unknown ability to determine the signal-interference split. This paper revisits the fast FAMA communication problem and asks a key question: can a UT act \emph{as if} it had full per-port interference knowledge while observing only a small fraction of ports? To this end, we propose a \emph{copula-aided FAMA} framework that learns the joint dependence structure of the complex triplets $(r_k,h_k,I_k)$ across ports, where $r_k$, $h_k$ and $I_k$ denote, respectively, the received signal, the channel coefficient and the aggregate interference signal at the $k$-th port, and uses this learned model to infer unobserved channels and interference. Concretely, we devise an attention-copula time-series model that is trained under random partial-observation masks and evaluated under both rich and finite-scattering channel models. Simulation results indicate that the reconstruction normalized mean-square-error (NMSE) for $h$, $r$, and $I$ drops to the order of $10^{-4}$ once the number of observed ports, $M$, exceeds the spatial degrees of freedom (DoF).


[58] 2605.23649

Diffusion Fluid Antenna Systems for Resilient ISAC

Most existing integrated sensing and communication (ISAC) studies focus on enabling a base station (BS) to support sensing and communication over shared resources through advanced waveform design and power allocation. In contrast, the object-side perspective remains underexplored. For example, an object may wish to remain difficult to detect for security reasons, while another object in close proximity may generate dominant reflections that confuse the BS and impair sensing reliability for the intended target. These challenges motivate the fluid antenna system (FAS) paradigm which introduces a reconfigurable spatial degree of freedom (DoF) that can reshape sensing signatures via port selection, beyond what waveform and power control alone can provide. In this paper, we devise diffusion FAS, a generative artificial intelligence (AI)-driven framework that exploits spatial agility to steer ISAC performance over the electromagnetic fading manifold. Instead of optimizing ISAC solely in the power domain, diffusion FAS casts ISAC as a \emph{dynamic spatial selection} problem in which antenna states (i.e., ports) are chosen to shape sensing signatures while maintaining communication objectives. To work under sparse measurements, we employ a conditional denoising diffusion probabilistic model (DDPM) to reconstruct the latent spatial correlation structure from a small set of observed ports, enabling efficient exploration of the reconfigurable aperture. We demonstrate two FAS-enabled ISAC modes: (1) \emph{generative spatial stealth}, which identifies localized deep fades to suppress a target's sensing visibility by up to two orders of magnitude, and (2) \emph{target isolation}, which synthesizes spatial nulls that reject interference from adjacent objects.


[59] 2605.23661

Output Feedback MPC with Adaptive Tubes

An output feedback model predictive control (MPC) framework with adaptive tubes is proposed for linear time-invariant systems subject to parametric and additive uncertainties. An adaptive observer provides point estimates of the system state, model parameters, and initial condition, while jointly updating the corresponding sets containing the true parameters and initial state. These estimates parameterize the constrained optimal control problem, enabling constraint tightening, terminal ingredients, and tube geometry to be updated as the estimates evolve. In contrast to standard robust tube-based MPC formulations, the proposed approach does not require a common quadratically stabilizing linear feedback gain across the parametric uncertainty set. As the available uncertainty information improves, the tube geometry evolves accordingly, resulting in an adaptive tube MPC framework with improved performance over time. Recursive feasibility and robust exponential stability are established, and a numerical example is presented.


[60] 2605.23682

Tri-Domain Multiuser MIMO Precoding Optimization and Channel Estimation with Spatial-EM Reconfigurable Antenna

In this paper, we propose a tri-domain reconfigurable multiuser multiple-input multiple-output (MIMO) communication system that integrates the electromagnetic (EM) reconfigurable antenna (EMRA) with the spatially movable antenna (SMA), termed the spatial-EM reconfigurable antenna (SEMRA). The proposed system offers EM, spatial, and digital domain degrees of freedom (DoFs) for joint channel reconfiguration, yet introduces new challenges in channel estimation (CE) and precoding optimization. Specifically, for multiuser orthogonal frequency division multiplexing (OFDM) downlink, the precoding design is formulated as a tri-domain optimization problem over antenna positions, EM-domain radiation-pattern weights, and digital precoders. We first develop a zero-forcing (ZF)-based baseline algorithm to decouple the design of spatial reconfiguration, and then propose a weighted minimum mean square error (WMMSE)-based tri-domain joint optimization algorithm for further improving the spectral efficiency (SE). Furthermore, we propose a low-overhead movement-aided channel estimation scheme in which coordinated antenna repositioning across pilot slots synthesizes a denser virtual array, enabling more accurate angle-of-departure (AoD) estimation and EM-domain channel state information (eCSI) reconstruction under the same per-user pilot overhead as the EMRA baseline. The resulting parametric representation enables eCSI assembly at desired antenna positions without additional pilots. Simulation results show that the proposed CE scheme improves eCSI estimation accuracy and the proposed SEMRA achieves higher SE than the EMRA baseline under the same pilot overhead.


[61] 2605.23713

Stacked Intelligent Metasurfaces (SIM) in the Nonlinear Regime: A Multiport Network Model Approach

We present a physically consistent multiport framework for stacked intelligent metasurfaces (SIMs) with linear and explicit nonlinear terminations. The model provides closed-form input--output relations in the linear case and fixed-point forward evaluation in the nonlinear case, with adjoint-based gradients for optimization in both settings. Under a stage-isolated SIM structure, complexity remains $\mathcal{O}(QK^3)$. In a 28 GHz near-field localization case study, nonlinear terminations improve transfer-function matching and reduce the mean localization error to a level close to the ideal benchmark.


[62] 2605.23770

Reachability for Low-Thrust Trajectories via Maximum Initial Mass

Reachability analysis plays a central role in low-thrust spacecraft trajectory optimization by identifying which target states can be achieved under constraints on time, thrust, and propellant. Classical approaches construct reachable sets by solving many optimal control problems over grids of terminal states, requiring extensive forward simulations with fixed initial conditions. While effective, this approach is computationally expensive and becomes impractical for high-dimensional systems or strongly nonlinear dynamics, such as those encountered in cislunar environments or solar sail missions. This work introduces a dual formulation of the reachability problem. Instead of computing reachable sets directly, we determine, for fixed transfer time and boundary conditions, the maximum allowable initial mass (or, for solar sails, a scalar sail-strength parameter) that permits a successful transfer. A target is reachable if the spacecraft's initial mass does not exceed this threshold. This reformulation reduces reachability assessment to a scalar optimization problem for each target, producing a smooth scalar field that encodes equivalent feasibility information to classical reachable sets. We develop indirect maximum-initial-mass (MIM) formulations for both electric low-thrust and solar-sail dynamics and show how they can serve as efficient reachability oracles. Building on this formulation, we construct data-driven surrogate models to approximate the MIM-based reachability indicator. We investigate fully connected neural networks and demonstrate that residual networks provide the best trade-off between accuracy, training stability, and model complexity. The resulting surrogates enable rapid reachability evaluation while preserving the numerical advantages of the dual formulation, offering a practical tool for preliminary mission design and feasibility assessment.


[63] 2605.23779

SIM-Aided Near-Field Channel and Localization Estimation With Dimensionality Reduction: A Multiport Network Theory Approach

The deployment of Extremely Large-Scale Antenna Arrays for 6G enables radiative near-field sensing but poses significant challenges in terms of hardware complexity and interference. Stacked Intelligent Metasurfaces (SIMs) address these limitations by enabling wave-domain dimensionality reduction. This paper proposes a rigorous SIM-aided framework for near-field channel and localization estimation based on Multiport Network Theory, which provides an electromagnetically consistent characterization accounting for mutual coupling and non-unilateral inter-layer propagation effects. An indirect estimation approach is adopted, where the SIM is optimized to perform analog spatial filtering by projecting the received signal onto a relevant subspace identified through coarse prior location information. Within this realistic setting, we analytically characterize the impact of SIM approximation errors on channel estimation and quantify the resulting effects on localization performance. The results show that the proposed architecture preserves the essential wavefront curvature information required for accurate near-field localization, achieving performance comparable to fully digital solutions while drastically reducing the number of radio-frequency chains.


[64] 2605.23795

A Measurement-Based Parameterization of Physics Reflection Models for Terahertz Communication

The accurate modeling of reflection coefficients is pivotal for developing reliable channel models in emerging terahertz (THz) communications. This study establishes a 300$\sim$400 GHz channel measurement platform to measure the reflection coefficients of various materials. Based on the analysis of measured data, we propose the single-layer interference with an extended-parameterized Lorentz/Drude (SLI-EPLD) reflection coefficient model. In this model, a sub-band modeling strategy is adopted to characterize the variation of reflection coefficients with frequency, while a parameterized mapping approach is employed to ensure the stability of model parameters. Furthermore, the weighted sub-band fitting for trend regression (WF-TREND) algorithm is introduced to achieve precise sub-band parameter fitting. Validation results demonstrate superior performance to existing models across multiple materials. The reflection coefficient model established in this work serves as a critical foundation for channel modeling in 300$\sim$400 GHz for high-THz communication.


[65] 2605.23809

Advanced AI Service Provisioning in O-RAN through LLM Engine Integration

The Open Radio Access Network (O-RAN) architecture allows AI to be embedded directly into the RAN through modular xApps and rApps, yet creating these applications (collecting data, training models, writing code, and deploying them safely) remains slow and largely manual. Large Language Models (LLMs) offer strong reasoning and code-generation capabilities but are unsuited for the fast, deterministic inference required in real-time RAN control. We present a proof-of-concept Dual-Brain architecture that combines both strengths: an LLM-based orchestrator translates operator intents into data-collection policies and deployment code, while an automated ML engine, NeuralSmith, trains lightweight classifiers on demand via an API. We describe the architecture and provisioning workflow, share practical insights from a containerized O-RAN 5G SA testbed, and discuss open research directions.


[66] 2605.23811

A Machine Learning Framework for Large-Scale Static Wireless Mesh Networks

This paper presents a system design methodology for a large-scale static wireless mesh network for 155 commercial off-the-shelf (COTS) radio nodes at fixed infrastructure sites in a challenging island environment. The architecture consists of approximately ten 15-node clusters, each with designated primary and secondary gateway nodes to support inter-cluster communication. A structured, multi-stage planning methodology was developed to guide network design. Site-specific radio frequency (RF) path loss predictions were generated using Remcom's Wireless InSite ray-tracing platform, incorporating terrain, buildings, and dense foliage effects. To optimize connectivity under physical-layer and operational constraints, spectral embedding combined with balanced k-means clustering was applied to partition the nodes into clusters of comparable size. A link budget analysis determined the maximum tolerable path loss under waveform and hardware constraints, defining the connectivity threshold used in the clustering framework. This work integrates deterministic RF propagation modeling with constrained clustering optimization to provide a scalable framework for planning static wireless mesh networks in complex geographic environments. Node mobility and higher-layer networking protocols were outside the scope of this study.
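
The link budget step, which turns waveform and hardware constraints into a maximum tolerable path loss and then into a connectivity threshold, follows the standard decibel bookkeeping sketched below. The parameter names and all numeric values are illustrative assumptions, not figures from the study.

```python
def max_path_loss_db(tx_power_dbm, tx_gain_dbi, rx_gain_dbi,
                     rx_sensitivity_dbm, margin_db=0.0):
    """Largest path loss (dB) at which the received power still meets the
    receiver sensitivity plus a fade margin."""
    return tx_power_dbm + tx_gain_dbi + rx_gain_dbi - rx_sensitivity_dbm - margin_db

def is_connected(predicted_path_loss_db, threshold_db):
    """A link counts as usable when its predicted loss is within the threshold."""
    return predicted_path_loss_db <= threshold_db

# Hypothetical radio parameters, for illustration only.
threshold = max_path_loss_db(30, 3, 3, -95, margin_db=10)
print(threshold, is_connected(118.0, threshold), is_connected(125.0, threshold))
```

Applying `is_connected` to every pair of predicted path losses would give the adjacency structure on which a clustering step could then operate.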


[67] 2605.23831

Ray-Tracing vs. 3GPP TDL: Power Delay Profile Analysis in Outdoor-to-Indoor and Indoor Channels

3rd Generation Partnership Project (3GPP) Technical Report (TR) 38.901 channel models (Releases 15-19) are widely used for physical-layer design and system-level evaluation in dense urban outdoor-to-indoor (O2I) and indoor environments. These models capture ensemble-averaged channel statistics but do not account for site-specific geometry. In this paper, we compare Power Delay Profiles (PDPs) derived from a deterministic ray-tracing model (Remcom Wireless InSite software) with those from the 3GPP TR 38.901 Tapped Delay Line (TDL) channel models. This comparative analysis is performed using a dense urban O2I scenario and a representative single-story indoor layout modeled in Washington, D.C., under matched link-distance and Non-Line-of-Sight (NLOS) conditions. All Wireless InSite PDPs are power-normalized to enable comparison of relative multipath delay structure. We evaluate root-mean-square (RMS) delay spread, mean excess delay, effective maximum delay, and Kullback-Leibler (KL) distribution divergence. Results indicate that 3GPP TDL models generally exhibit longer delay spreads and often fail to capture deterministic, site-specific features such as late-arriving energy and irregular spikes. While TDL models can approximate primary channel features in some cases, their reliance on ensemble-averaged statistics rather than geometry limits their representation of fine multipath structures. We conclude that while 3GPP TDL models are suitable for large-scale system evaluation, deterministic or hybrid approaches are more appropriate for site-specific physical-layer design.
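
Two of the metrics listed above, mean excess delay and RMS delay spread, are power-weighted moments of the PDP. The sketch below uses the standard definitions with delays measured relative to the first arriving tap; the function name and the example taps are assumptions for illustration.

```python
import math

def delay_statistics(delays_ns, powers_linear):
    """Return (mean excess delay, RMS delay spread) in ns for a PDP given as
    tap delays and linear-scale tap powers."""
    total_power = sum(powers_linear)
    first_arrival = min(delays_ns)
    excess = [t - first_arrival for t in delays_ns]
    mean_delay = sum(p * t for p, t in zip(powers_linear, excess)) / total_power
    second_moment = sum(p * t * t for p, t in zip(powers_linear, excess)) / total_power
    # Clamp at zero to guard against tiny negative values from rounding.
    rms_spread = math.sqrt(max(second_moment - mean_delay ** 2, 0.0))
    return mean_delay, rms_spread

# Hypothetical two-tap profile with equal powers.
print(delay_statistics([0.0, 100.0], [1.0, 1.0]))
```

Because both quantities are ratios of power-weighted sums, they are unaffected by the power normalization applied to the ray-tracing PDPs.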


[68] 2605.23851

A Manifold-Based Framework for Coupling-Aware Surrogate Optimization of Antenna Arrays Using Characteristic Modes

A surrogate-based synthesis framework for antenna arrays is presented that incorporates mutual coupling while keeping optimization computationally efficient. The method combines a common characteristic-mode basis, a global modal coupling model, and element-wise generalized scattering matrices (GSMs). Array design variables are formulated and optimized on physically meaningful manifolds, in particular the manifold of unitary symmetric matrices for reciprocal and lossless element GSMs. A staged penalty strategy is used to progressively enforce sidelobe and cross-polarization constraints during multi-beam optimization. The framework is demonstrated for an 8x8 left-handed circularly polarized patch phased array with scan behavior in one principal plane. Different degree-of-freedom assignment strategies are compared, showing that constrained non-identical element classes can satisfy stringent pattern requirements where equal-element designs fail. For the demonstrated case, the optimization converges within seconds on a single CPU core, and full-wave verification of the realized arrays confirms the predicted trends, with good agreement for the SLL and useful accuracy for the XPR. The results indicate that the proposed formulation is a practical and scalable route for coupling-aware array synthesis and realization.


[69] 2605.23859

Natural Yet Challenging to Detect: Robust In-the-Wild TTS through EMA and Dual-Scoring Prompt Selection -- Submission for WildSpoof 2026 TTS Track

In this technical report, we describe our submission for the WildSpoof Challenge TTS Track: Text-to-Speech with In-the-Wild Data. We introduce F5-TTS-DPS, a model built upon the F5-TTS architecture. Our approach integrates Exponential Moving Average (EMA) into supervised fine-tuning to stabilize training and improve generalization. To enhance synthesis fidelity, we leverage large language models (LLMs) and large audio language models (LALMs) for dual-scoring prompt selection, filtering reference audio and text prompts to ensure quality while addressing alignment issues in noisy datasets. Experimental evaluation demonstrates that F5-TTS-DPS achieves strong performance with UTMOS of 3.20 and speaker similarity of 0.51 on the development set. More importantly, our model achieves the best a-DCF scores of 0.1582, 0.5233, and 0.2562 across three advanced SASV systems among all submissions, indicating our synthesized speech is the most difficult to detect and exhibits the highest degree of naturalness and authenticity. Combined with competitive WER performance, these results validate the effectiveness of our approach in generating natural-sounding speech with strong spoofing capabilities.
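
For readers unfamiliar with weight averaging, a generic EMA update over model parameters can be sketched as follows. This is the standard recurrence only; the dictionary layout and the decay value are assumptions, and the report's actual training code may differ.

```python
def ema_update(shadow, params, decay=0.999):
    """Blend current parameters into the running EMA copy, in place.

    shadow and params map parameter names to floats (a simplification of
    real tensors); decay close to 1 makes the average change slowly.
    """
    for name, value in params.items():
        shadow[name] = decay * shadow[name] + (1.0 - decay) * value
    return shadow


# Hypothetical single-parameter model after one fine-tuning step.
shadow = {"w": 0.0}
ema_update(shadow, {"w": 1.0}, decay=0.9)
```

The averaged copy, rather than the raw weights, would typically be the one used for evaluation.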


[70] 2605.22837

Evaluating PhaseNet on Teleseismic Data with MsPASS

Numerous studies have shown that the machine-learning picker PhaseNet produces accurate P and S picks on local earthquake signals, but its performance can degrade sharply on teleseismic signals. To address this limitation, we present a reproducible MsPASS workflow that (i) enables scalable data preparation and management for large seismic archives and (ii) supports standardized PhaseNet training and inference. We assembled a control dataset of 1.6 million waveforms linked to teleseismic P-wave picks made by analysts at the USArray Array Network Facility (ANF). The control dataset confirms that the PhaseNet model trained on regional signals performs poorly on these data. We then trained PhaseNet from scratch on the training split of the ANF control dataset and evaluated it on a non-overlapping held-out test split, increasing P-pick recall by 741.5% and yielding 683.9% more picks within a 0.1s residual window. We also evaluated PhaseNet across different model sizes on both CPUs and GPUs. Increasing the model size by about 120 times improved precision and recall by 15.6% and 23.2%, respectively. However, the scaled model reduced inference throughput by 87.2% on an NVIDIA A100 GPU and by 97.3% on a 128-core high-performance CPU node. These results indicate that scaling PhaseNet is more practical on GPUs than on CPUs, and that simply enlarging the model is not an efficient way to achieve large accuracy gains.


[71] 2605.22899

ROI Extraction in Thermographic Breast Images Using Genetic Algorithms

This work proposes the use of Genetic Algorithms (GA) to separate the breast area from the background in thermographic breast images. The proposed method uses color information, a fitness function based on cardioids, and GA. This is the first work in the literature to propose a Region of Interest (ROI) extraction based on GA and cardioids. ROI extraction can improve the accuracy of cancer detection and assist with the standardization of acquisition protocols. The method successfully separates the breast region in 52 out of 58 images while being fully automatic and not requiring manual selection of seed points.


[72] 2605.22965

Performance Bounds for Rollout Policies in Stochastic Shortest Path Problems

This paper concerns rollout and certainty-equivalent rollout policies for stochastic shortest path problems with absorbing terminal states. The main result provides a direct non-asymptotic performance certificate for a fixed rollout policy: the loss relative to the optimal value is controlled by the uniform accuracy of the value approximation and by the expected time for which the rollout closed loop remains away from the terminal state. Thus, in the undiscounted transient setting, the expected hitting time plays the role of a discount or finite-horizon parameter in more standard approximate dynamic programming bounds. This paper also gives a performance-difference identity showing that suboptimality is exactly accumulated through the transient occupation measure, and a deterministic sharpness example showing that the hitting-time factor is unavoidable. Finally, consequences under uniform hitting-time and Foster-Lyapunov drift conditions are given, and the argument is extended to certainty-equivalent rollout by adding a separate local model-mismatch term.


[73] 2605.22988

Active Sensing Subserves Task-Level Control

Active sensing is traditionally defined as the expenditure of energy, typically in the form of movement, for obtaining information. Here, we propose that the combination of reliance on adaptive sensors, the linkage between movement and sensing, and task-level control inevitably gives rise to the emergence of active sensing movements. In this way, active sensing is not driven by sensory goals, such as minimizing uncertainty about the state, but rather is necessary for task-level control. This hypothesis, that active sensing subserves control, is supported by both empirical data from organisms and mathematical theory. Interestingly, active sensing behaviors often occur in discrete epochs, interspersed with goal-oriented behavior. This suggests that animals switch between two behavioral modes with distinct control policies, an `explore' mode in which animals produce dynamic movements to shape sensory feedback, and an `exploit' mode in which animals produce slower compensatory movements that are directly related to achieving task goals. This strategy for feedback control that relies on adaptive sensors, active sensing, and mode switching is not commonly used in engineered systems despite being ubiquitous in biology. Engineered systems comprising state-of-the-art sensors, actuators, and mechanical designs can outperform animals with respect to ``cost functions'' such as maximum force generation, precision, and speed. Nevertheless, animals routinely achieve robust, graceful behaviors that are currently unmatched by engineered systems, suggesting that current control systems are insufficient. These insights, expressed in the language of control theory, may be critical for improving robotic sensing and control.


[74] 2605.23240

Signal Temporal Logic Motion Planning via Graphs of Convex Sets

This paper investigates continuous-time motion planning under Signal Temporal Logic (STL) specifications. The goal is to generate smooth robot trajectories that satisfy high-level logical and timing requirements while respecting low-level motion constraints. To this end, we propose an efficient framework that combines timed-automata reasoning with graphs of convex sets (GCS). An STL specification is first represented by a timed automaton, which is then coupled with a convex decomposition of the configuration space to form a joint transition system encoding both task progress and region occupancy. Based on this joint transition system, the STL motion-planning problem is reformulated as a shortest-path problem over a GCS, whose solution induces a smooth Bézier-spline trajectory satisfying the STL specification, smoothness requirements, and velocity bounds. We establish the soundness of the proposed formulation and analyze its computational complexity, showing that, once the timed automaton and convex decomposition are fixed, the convex relaxation scales polynomially with the configuration-space dimension and the Bézier degree. We further develop a compact timed-automaton construction for an expressive STL fragment using dedicated templates and Boolean composition. Numerical experiments on low-dimensional benchmarks, a $3$-D quadrotor, a $30$-DoF humanoid, and a hardware experiment on a UR-3 robot arm demonstrate that the proposed method efficiently solves complex STL motion-planning problems and produces smooth executable trajectories.


[75] 2605.23263

6G Communication Networks Enabling Embodied Agents: Architecture and Prototype

Embodied agents, which couple intelligent decision-making with physical actuation in the real world, impose far more stringent and heterogeneous communication requirements than purely software-based agents. While 6G promises sub-millisecond latency, ultra-high reliability, native intelligence, and integrated sensing, systematic studies on how to exploit these capabilities for embodied agent communication remain limited. This article investigates 6G-enabled communication systems for embodied agents from both conceptual and engineering perspectives. First, we review the concept and embodiment value of embodied agents and clarify their distinctions from disembodied agents. Then, we analyse the symbiotic relationship between embodied agents and 6G networks. We highlight how key 6G enablers can support the stringent requirements of human-robot interaction. Furthermore, we demonstrate the proactive role of embodied agents in bolstering communication networks through coverage extension, environmental sensing, and physical world understanding. Building on these insights, we propose a hierarchical communication architecture for human-robot remote interaction, comprising a human-intent perception layer, an open radio access network (O-RAN)-based transport layer, an intelligent intermediary layer, and an embodiment layer. To validate its feasibility, we implement an end-to-end prototype that integrates a haptic device, an industrial robotic arm, an intermediary platform, and a 5G O-RAN testbed. Experimental results demonstrate millisecond-level latency and stable closed-loop operation, confirming the practicality of the proposed architecture and providing a reference for future 6G-embodied agent research and industrial deployments.


[76] 2605.23306

SpinFlow: A Physics-Informed Spin Field Framework for Traffic Phase Inference and Transition Detection

Active traffic management (ATM) is frequently hindered by traditional macroscopic models and rigid empirical thresholds that fail to capture metastable phase precursors, resulting in delayed, reactive interventions. To address this, we propose SpinFlow, a physics-informed spin-field framework unifying Kerner's three-phase theory with statistical physics for continuous macroscopic traffic phase inference. Inspired by the Heisenberg model, SpinFlow parametrizes spatially varying phase weights via a latent spin vector and a competitive-equilibrium mapping, allowing synchronized flow to emerge naturally. A physics-regularized Expectation-Maximization algorithm inverts this latent structure from high-resolution trajectories, jointly optimizing the spin field while softly enforcing mass conservation and spatial smoothness. We introduce the Phase Equilibrium Degree (PED) to quantify structural alignment and topologically localize phase-transition points. Across four real-world trajectory datasets, SpinFlow achieves $R_{q}^{2}$ up to 0.940, PED drops of 94.9-100%, and interpretable phase maps that outperform three heterogeneous baselines on forward accuracy, physics consistency, and bottleneck localization. SpinFlow pinpoints congestion nucleation without prior network topology, yielding a data-driven, physics-consistent trigger for ATM.


[77] 2605.23419

Generalized Stochastic Approximation of the Log-Likelihood Ratio for Robust Sequential Change-Point Detection

Sequential change-point detection in non-Gaussian stochastic processes is challenging because the underlying densities are rarely known in real time. Classical parametric procedures such as CUSUM lose optimality under distributional mismatch, whereas nonparametric alternatives often react slowly. We develop a unified framework that approximates the log-likelihood ratio (LLR) on a generalized stochastic basis -- polynomial, logarithmic, or fractional-power -- using only moments up to order 3s, with no analytic form of the distribution, and thereby adapts the classical CUSUM, GRSh, and SRP procedures to non-Gaussian data. The convergence functional J(s) = K^T Y is interpreted as the projection of the Kullback-Leibler divergence onto the basis span, yielding a formal criterion for selecting the approximation order. We target the regime of small relative change-points, where the signal energy changes little but the shape of the distribution -- tail structure and modality -- does. A robust threshold follows from Kunchenko's probability-error bound (KU-PE), which controls the false-alarm rate without empirical tuning. On nine public benchmarks across four domains, the method is, to our knowledge, the only one operative on extremely heavy-tailed data (excess kurtosis gamma_4 > 20), where classical methods produce 100% false alarms, while reducing the detection delay at a guaranteed false-alarm level. The core theorems are formally verified in Lean 4.
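
For context, the classical parametric CUSUM that the abstract takes as its starting point can be sketched for a known Gaussian mean shift. This is the textbook baseline, not the proposed moment-based approximation; the function name, parameters, and threshold are illustrative assumptions.

```python
def cusum_alarm_index(samples, mu0, mu1, sigma, threshold):
    """Return the index of the first CUSUM alarm, or None if none fires.

    Assumes i.i.d. Gaussian data with known standard deviation sigma and a
    mean shift from mu0 to mu1 (the classical, fully specified setting).
    """
    stat = 0.0
    for i, x in enumerate(samples):
        # Exact log-likelihood ratio of N(mu1, sigma^2) against N(mu0, sigma^2).
        llr = (mu1 - mu0) / sigma ** 2 * (x - 0.5 * (mu0 + mu1))
        # Reset at zero so evidence gathered before the change does not accumulate.
        stat = max(0.0, stat + llr)
        if stat >= threshold:
            return i
    return None
```

With `samples = [0]*5 + [1]*5`, `mu0=0`, `mu1=1`, `sigma=1`, and `threshold=1.5`, the statistic stays at zero before the change and grows by 0.5 per sample afterwards, so the alarm fires at index 7. When the true density is unknown, the `llr` line is the piece that an approximation would have to replace.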


[78] 2605.23508

DrawVideo: Generating Long Video from Storyboard Keyframe Sketches

Long video generation requires high-fidelity synthesis, coherent narrative structure, and user control over extended time spans. Existing text-to-video methods often rely on a single long prompt, limiting control over pose, composition, layout, and motion. We propose DrawVideo, a sketch-guided, storyboard-driven framework for controllable long-video generation. DrawVideo decomposes long videos into independently controllable shots, each defined by a black-and-white sketch, an appearance prompt, and a motion prompt. The sketch controls pose and layout, the appearance prompt defines identity, scene, and style, and the motion prompt guides temporal dynamics. DrawVideo follows a hierarchical 'global multi-shot, local single-sketch' strategy: it first generates a structure-aligned reference keyframe, then expands the motion prompt into derivative keyframes representing action states, and finally synthesizes clips between adjacent keyframes to build each shot. We also introduce SketchLongVideo, the first dataset for sketch-guided text-to-long-video generation, constructed from animation videos via shot detection, keyframe extraction, vision-language recognition, prompt decomposition, and sketch conversion. Experiments show that DrawVideo achieves strong structural controllability, appearance consistency, visual stability, and coherent long-video generation.


[79] 2605.23537

Concomitant DAG Learning: On the Roles of Noise Adaptivity, Sparsity, and Non-negativity

Directed acyclic graphs (DAGs) constitute a central modeling tool to enable principled reasoning about cause-effect interactions in complex systems. However, since the causal structure underlying a group of variables is often unknown and interventions may be infeasible or ethically challenging to implement, there is a need to address the task of inferring DAGs from observational data. Most classical structure identification approaches face two key obstacles: the combinatorial challenge of enforcing acyclicity, which severely limits scalability, and identifiability challenges arising from latent confounding or heterogeneous noise. This tutorial offers an overview of recent signal processing and optimization advances that address these issues by recasting DAG structure learning as a continuous, score-based estimation problem over adjacency matrices. We begin with a didactic introduction to structural equation models and the formulation of causal graph recovery, followed by a historical survey of score-based methods ranging from early combinatorial search schemes and greedy heuristics to modern continuous frameworks that leverage smooth characterizations of acyclicity. Building on this foundation, we describe concomitant DAG estimation methods that jointly infer sparse causal structure and exogenous noise levels, improving robustness under heteroscedasticity and distribution shifts by rendering the estimator noise adaptive. All in all, the tutorial introduces readers to challenges and opportunities for signal processing research at the crossroads of causal inference, high-dimensional statistics, and scalable graph learning, while outlining emerging directions including online, nonlinear, and neural causal discovery.


[80] 2605.23568

TactileReflex: Noise-Statistics-Driven Vision-Tactile Reflex Control for Force-Sensitive Manipulation

Manipulating fragile deformable containers, such as disposable plastic cups filled with liquid, demands real-time grip-force adaptation within an extremely narrow force margin: insufficient force causes slip, while excessive force irreversibly deforms the thin wall. Existing approaches struggle to achieve such force-sensitive manipulation tasks. We propose a noise-statistics-based calibration-driven reflex control paradigm with vision-based tactile sensing: by analyzing the sensor's intrinsic noise characteristics (via a brief static-hold-and-unload protocol), we directly derive all controller thresholds, eliminating external force calibration, trial-and-error manual tuning, or material-specific physical models. Instantiating this paradigm, we present TactileReflex, a three-channel closed-loop controller that extracts three image-level proxies, shear intensity ($S_y$), contact intensity ($F_n$), and center of pressure ($C$), from dual visuo-tactile sensors and drives prioritized reflex channels at ~12 Hz for slip suppression, weight-adaptive release, and force protection. Each channel closes the loop directly on its proxy via noise-derived thresholds. Ablation demonstrates that only the full three-channel system is able to prevent irreversible container deformation (5/5 success vs. at most 1/5 for partial configurations). In a dynamic pouring task, fixed-effort baselines fail in all 10 attempts due to pose drift, while TactileReflex achieves 9/10 success across two water volumes. As a self-contained and interpretable controller, TactileReflex can serve as a plug-and-play safety layer beneath high-level manipulation pipelines, including haptic-free VR teleoperation and vision-language-action (VLA) policies.


[81] 2605.23671

A Non-Iterative Algorithm for Clearing Two-Layer Energy-Sharing Markets with Voltage Constraints

Real-time hierarchical energy-sharing markets are promising to coordinate large numbers of prosumers. Still, most existing clearing methods rely on linearized or DC power-flow models and do not explicitly handle reactive power or voltage-security constraints. With AC network constraints, the problem becomes a large-scale bilevel Mathematical Program with Equilibrium Constraints (MPEC) that is difficult to solve in real time. This paper develops a non-iterative clearing algorithm for two-layer energy-sharing markets with voltage constraints. We first derive an efficient best-response function for each lower-layer energy-sharing market and reduce the equilibrium search to one dimension by exploiting the pricing-coupling structure. We then embed this function into the upper-layer network-constrained problem and reformulate the bilevel MPEC as a single-level mixed-integer second-order cone program (MISOCP), which is computationally tractable. Case studies on the IEEE 123-bus system with 12,300 prosumers show that the proposed method preserves nodal voltages within prescribed limits and delivers solutions with maximum errors below 0.01\% in 0.829 s.


[82] 2605.23708

Learning Dynamic Stability Landscapes in Synchronization Networks

The robustness of synchronization is typically characterized by scalar, per-node stability indices whose dependence on topology is studied via network science or graph neural networks (GNNs). We propose a novel upstream task, learning stability landscapes, which provide deeper insights into synchronization behavior and from which many such scalar indices can be derived. Crucially, we pioneer a graph-to-image prediction paradigm: learning image-like landscapes as per-node targets directly from graph topology, a formulation we are not aware of having been established elsewhere in the literature. To support this task, we release two datasets of 10,000 graphs each at 20 and 100 nodes with per-node landscape labels, based on a conceptual oscillator model, capturing power grid synchronization behavior. A GNN encodes topology and a CNN decoder renders per-node images, learned end-to-end with good in-distribution accuracy, generalizing across graph sizes and to realistic power grid topologies. This demonstrates that stability landscapes, while beyond the reach of conventional network science, are learnable from topology and open new avenues for moving beyond scalar stability indices in biology, neuroscience, and power grids.


[83] 2605.23782

Routing Equilibrium in Mixed-Autonomy Traffic Networks with Altruistic Autonomous Agents

Recent advancements in vehicle autonomy have drawn interest in understanding the impact of autonomous vehicles on traffic systems. In this paper, we study a traffic assignment problem in a mixed-autonomy setting where both human-driven and autonomous vehicles coexist. We model the interaction as a simultaneous routing game where human drivers are self-interested and aim to minimize their own travel times, while autonomous agents are altruistic and aim to minimize the total social cost. The standard nonatomic congestion game analysis establishes the existence of an equilibrium of this game under convex cost functions, but implies nothing about its uniqueness. In this work, we formulate the equilibrium as a variational inequality (VI), which enables us to establish equilibrium existence without a convexity assumption and guarantees the uniqueness of the aggregated link flow and social cost at equilibrium under a specific class of cost functions. Leveraging this VI framework, we provide sufficient conditions under which including autonomous agents improves, deteriorates, or has no effect on social cost. While the possibility of deterioration has been established in prior work, our results complement existing worst-case bounds by explicitly characterizing sufficient conditions under which each outcome occurs, thereby providing a deeper understanding of mixed-autonomy traffic systems. Furthermore, we consider a centralized scenario where a social planner optimizes the routing of autonomous agents, and show that the same equilibrium is achieved as in the decentralized scenario when assuming convex cost functions. Finally, we conduct numerical experiments that illustrate how social cost changes with the amount of autonomous vehicles under different system parameters.


[84] 2605.23804

Perceptually Lossless Tactile Texture Synthesis with Compact Spectral Envelope Models

Modern audio-visual media rely on compact representations for efficient storage and transmission, whereas realistic digital touch still depends on high-resolution tactile recordings. Existing approaches for representing tactile signals constrain manipulation and limit the generation of new content. Here, we introduce two compact representations, spectral beta and spectral slope, that capture the temporal spectral structure of finger-surface friction signals while preserving perceptually relevant information. Spectral beta models spectral skewness using a two-parameter beta distribution, whereas spectral slope approximates the spectrum with an asymmetric bandpass filter defined by low- and high-pass orders. We evaluated these representations in a perceptual study with 14 participants using five virtual textures rendered on a friction-modulation display and compared them with physical textures and high-fidelity reproductions of recorded signals. Spectral beta achieved perceptual similarity ratings comparable to those of the original high-fidelity reproductions. Regression analysis further showed that matching spectral energy across nine critical frequency bands was the strongest predictor of perceived realism. Together, these findings suggest that tactile texture perception depends primarily on fundamental temporal spectral patterns and that modeling these patterns is sufficient for perceptually realistic rendering. These results establish an efficient and scalable framework for haptic compression, communication, and synthetic texture generation.


[85] 2605.23813

Minimum Effort Control Using Variational Methods of Analytical Mechanics: A New Approach For Optimal Control

Modern optimal control theory involves adjoining the already known equations of motion of a dynamic system to the objective function using dynamic costates; this is done in order to constrain the optimal control solutions to satisfy the equations of motion. The use of costates increases the number of variables and hence increases the complexity of the problem. On the other hand, variational methods of analytical mechanics find the equations of motion by minimizing an action functional of the dynamic system, realizing control forces as external input to the system. In this paper, a new disruptive approach for computing the optimal control is presented. This approach adopts the variational methods of analytical mechanics to derive equations for the control, in addition to the equations of motion. This is achieved by recognizing the control actuator as part of the dynamic system. In addition to the kinetic energy and potential energy, the action functional in this new approach includes additional energy terms that represent the control energy of the system. Two different methods are presented to write the modified action functional. The proposed approach is a significant departure from modern optimal control theory, and it eliminates the need for costates when solving for the control. A case study is presented to demonstrate the new approach.


[86] 2605.23864

Harnessing Individual Motivation for Collective Efficiency: A Mechanism-Driven Distributed Optimization Method

In industrial scenarios involving multi-agent collective decision-making, centralized decision-making may not be admissible due to restrictive access to individual local information, while the conflicts between participants' self-interest and global performance may also impede collaborative distributed decision-making. This paper proposes a mechanism-driven distributed decision-making method, wherein incentives are employed and designed to motivate participants to collaborate in a distributed fashion even though each participant's decision is driven primarily by self-interest. Focusing on optimization problems with coupled objective functions and coupled constraints, we design a distributed optimization algorithm tailored for this class of problems and provide guarantees for its convergence. Furthermore, we design two incentive mechanisms, the shadow pricing mechanism and the Vickrey-Clarke-Groves mechanism, and demonstrate that participants are willing to engage in distributed collaboration under these mechanisms. The mechanism drives the execution of the distributed algorithm, and the optimal result of distributed computation guides the determination of incentives in the mechanism, both of which are interrelated to form a closed loop. Finally, numerical experiments illustrate the effectiveness of the proposed algorithm and mechanisms.


[87] 2310.05999

Sustainable and Efficient Renewable-Driven Energy Trading via Neural-Enhanced Time-Adaptive Robust Nash Bargaining between Hydrogen-Enriched Gas and Active Distribution Networks

Integrating hydrogen-enriched compressed natural gas (HCNG) with an active distribution network (ADN) provides efficient and sustainable flexibility for consuming renewable energy. Yet cross-sector privacy and uncertain high-renewable scenarios block stable coordination. They also worsen decision performance and convergence. To overcome this barrier, a neural-enhanced time-adaptive robust Nash bargaining strategy is proposed. In the first stage, to clear energy trading between the ADN and the gas distribution network (GDN) and promote its sustainability, a privacy-preserving Nash bargaining scheme based on the alternating direction method of multipliers (ADMM) is applied. The next robust dispatch stage explores the worst renewable scenarios and reduces the risk of the ADN's profit collapsing under uncertainty. The convergence of the entire energy trading scheme is theoretically proved. As such, sustainable returns from the participation of solid oxide fuel cell (SOFC) and HCNG are facilitated. Finally, a time complexity and social welfare co-driven neural-enhanced time-adaptive strategy is proposed. The strategy assesses the influence of time resolution on social benefits and solving time in multi-energy trading. Based on the assessment, a neural network surrogate model is trained to accelerate the trading process in a closed-loop manner. Numerical assessment reveals that the proposed strategy reaps a stable social welfare of nearly 1.6% relative to total cost, and benefit-steady situations for both ADN and GDN, even in the worst renewable scenarios. Moreover, it reduces runtime to 102.47s, improving computational efficiency by over 69.86% versus the fixed time-scale baseline, almost without sacrifice in economy.


[88] 2311.13056

Simultaneous Online System Identification and Control using Composite Adaptive Lyapunov-Based Deep Neural Networks

Although deep neural network (DNN)-based controllers are popularly used to control uncertain nonlinear dynamic systems, most results use DNNs that are pretrained offline, and the corresponding controller is implemented post-training. Recent advancements in adaptive control have developed controllers with Lyapunov-based update laws (i.e., control and update laws derived from a Lyapunov-based stability analysis) for updating the DNN weights online to ensure the system states track a desired trajectory. However, the update laws are based on the tracking error and offer guarantees on only the tracking error convergence, without providing any guarantees on system identification. This paper provides the first result on simultaneous online system identification and trajectory tracking control of nonlinear systems using adaptive updates for all layers of the DNN. A combined Lyapunov-based stability analysis is provided, which guarantees that the tracking error, state-derivative estimation error, and DNN weight estimation errors are uniformly ultimately bounded. Under the persistence of excitation (PE) condition, the tracking and weight estimation errors are shown to exponentially converge to a neighborhood of the origin, where the rate of convergence and the size of this neighborhood depend on the gains and a factor quantifying PE, thus achieving system identification and enhanced trajectory tracking performance. As an outcome of the system identification, the DNN model can be propagated forward to predict and compensate for the uncertainty in dynamics under intermittent loss of state feedback. Comparative simulation results are provided on a two-link manipulator system and an unmanned underwater vehicle system with intermittent loss of state feedback, where the developed method yields significant performance improvement compared to baseline methods.


[89] 2410.19842

A comprehensive evaluation of pretraining strategies for channel-agnostic contrastive self-supervision of biosignals

Contrastive learning yields impressive results for self-supervision in computer vision. The approach relies on the creation of positive pairs, which is often achieved through augmentations. However, for multivariate time series, effective augmentations can be difficult to design. Additionally, the number of input channels for biosignal datasets often varies from application to application, limiting the usefulness of large self-supervised models trained with specific channel configurations. Motivated by these challenges, we set out to investigate strategies for creation of positive pairs for channel-agnostic self-supervision of biosignals. We introduce contrastive random lead coding (CRLC), where random subsets of the input channels are used to create positive pairs, and compare it with using augmentations and neighboring segments in time as positive pairs. We validate our approach by pre-training models on EEG and ECG data, and then fine-tuning them for downstream tasks. CRLC outperforms competing strategies in both scenarios in the channel-agnostic setting. Notably, for EEG tasks CRLC surpasses the current state-of-the-art reference model. While the state-of-the-art reference model is superior in the ECG task, incorporating CRLC allows us to obtain comparable results. In conclusion, CRLC helps generalization across variable channel setups when training our channel-agnostic model. The code is available at this https URL.
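
The positive-pair idea behind CRLC can be sketched roughly as follows: two random channel subsets are drawn from the same multichannel segment and treated as a positive pair. This is only an illustration under stated assumptions. The function name `random_lead_pair`, the subset size `k`, the independent sampling of the two subsets (which may overlap), and the 12-channel, 500-sample segment are all hypothetical choices, not details from the abstract.

```python
import random

def random_lead_pair(segment, k, rng):
    """Build two views of one segment, each from k randomly chosen channels.

    segment: list of channels, each channel a list of samples.
    The two subsets are sampled independently, so they may overlap
    (an assumption; the paper may sample them differently).
    """
    n_channels = len(segment)
    idx_a = rng.sample(range(n_channels), k)  # k distinct channels for view A
    idx_b = rng.sample(range(n_channels), k)  # k distinct channels for view B
    view_a = [segment[i] for i in idx_a]
    view_b = [segment[i] for i in idx_b]
    return view_a, view_b

rng = random.Random(0)
# Made-up segment: 12 channels, 500 samples each.
segment = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(12)]
view_a, view_b = random_lead_pair(segment, k=4, rng=rng)
```

In a contrastive setup, the two views would be encoded separately and pulled together by the loss, while views from other segments serve as negatives.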


[90] 2503.11967

A Profit Sharing Mechanism for Coordinated Power Traffic System

The transportation network operator (TNO) and the power distribution network operator (DNO) act non-cooperatively during the scheduling process. Under the TNO's management, the distribution of charging load may exacerbate the local supply-demand imbalance in the power distribution network (PDN), which negatively impacts the secure and economic operation of the PDN. This paper proposes a profit sharing mechanism based on the principle of incentive compatibility for coordinating the transportation network (TN) and the PDN to minimize the total operation cost of the PDN. In this mechanism, the scheduling process of the power transportation system is divided into two stages. At the prescheduling stage, the TNO allocates traffic flow and charging load without considering the operation of the PDN, after which the DNO schedules and obtains the original cost. At the rescheduling stage, the DNO shares part of the saved dispatch cost to motivate the TNO to reallocate the EV charging load in a way that is more beneficial to the operation of the PDN. This two-stage process is then simulated by two single-level models and a bilevel model. Finally, the optimal sharing ratio is identified, at which the total scheduling cost of the DNO reaches its minimum when gaming with the TNO. The efficiency of the proposed mechanism is simulated via a coupled network with 12 traffic nodes and 18 electric buses. Numerical results demonstrate that the DNO can achieve the minimum total cost. Simultaneously, the TNO can also benefit from the proposed profit-sharing mechanism.


[91] 2503.24152

Quantifying Grid-Forming Behavior: Bridging Device-level Dynamics and System-Level Strength

Grid-forming (GFM) technology is widely regarded as a promising solution for future power systems dominated by power electronics. However, a universally accepted definition of GFM behavior and a precise method for its quantification remain elusive. Moreover, the impact of GFM converters on system stability is not precisely quantified, creating a significant disconnect between device and system levels. To address these gaps from a small-signal perspective, at the device level, the paper introduces a novel metric, the Forming Index (FI), to quantify a converter's response to grid voltage fluctuations. Rather than enumerating various control architectures, the FI provides a metric for the converter's GFM ability by quantifying its sensitivity to grid variations. At the system level, a new quantitative measure of system strength that captures the multi-bus voltage stiffness is proposed, which quantifies the voltage and phase angle responses of multiple buses to current or power disturbances. The paper further extends this concept to grid strength and bus strength to identify weak areas within the system. Finally, the device and system levels are bridged by formally proving that GFM converters enhance system strength. The proposed framework provides a unified benchmark for GFM converter design, optimal placement, and system stability assessment.


[92] 2506.00474

A European Multi-Center Breast Cancer MRI Dataset

Early detection of breast cancer is critical for improving patient outcomes. While mammography remains the primary screening modality, magnetic resonance imaging (MRI) is increasingly recommended as a supplemental tool for women with dense breast tissue and those at elevated risk. However, the acquisition and interpretation of multiparametric breast MRI are time-consuming and require specialized expertise, limiting scalability in clinical practice. Artificial intelligence (AI) methods have shown promise in supporting breast MRI interpretation, but their development is hindered by the limited availability of large, diverse, and publicly accessible datasets. To address this gap, we present a publicly available, multi-centre breast MRI dataset collected across six clinical institutions in five European countries. The dataset comprises 741 examinations from women undergoing screening or diagnostic breast MRI and includes malignant, benign, and non-lesion cases. Data were acquired using heterogeneous scanners, field strengths, and acquisition protocols, reflecting real-world clinical variability. In addition, we report baseline benchmark experiments using a transformer-based model to illustrate potential use cases of the dataset and to provide reference performance for future methodological comparisons.


[93] 2507.14300

Distributed consensus-based observer design for target state estimation with bearing measurements

This paper introduces a novel distributed consensus-based observer design that enables a group of agents in an undirected communication network to solve the problem of target tracking, where the target is modelled as a chain of integrators of arbitrary order. Each agent is assumed to know its own position and simultaneously measure bearing vectors relative to the target. We start by introducing a general continuous time observer design tailored to systems whose state dynamics are modelled as chains of integrators and whose measurement model follows a particular nonlinear but observer-suited form. This design leverages a correction term that combines innovation and consensus components, allowing each agent to broadcast only a part of the state estimate to its neighbours, which effectively reduces the data flowing across the network. To provide uniform global exponential stability guarantees, a novel result for a class of nonlinear closed-loop systems in a generalized observer form is introduced and subsequently used as the main tool to derive stability conditions on the observer gains. Then, by exploring the properties of orthogonal projection matrices, the proposed design is used to solve the distributed target tracking problem and provide explicit stability conditions that depend on the target-agents geometric formation. Practical examples are derived for a target modelled as first-, second-, and third-order integrator dynamics, highlighting the design procedure and the stability conditions imposed. Finally, numerical results showcase the properties of the proposed algorithm.


[94] 2509.21406

A Crime/S.I.R. optimal control problem

This paper presents and discusses a mathematical model inspired by control theory to derive optimal public policies for minimizing costs associated with the reduction and control of criminal activity in a population. Specifically, we analyze the optimal control problem \begin{equation*} \min G(u_1, u_2, u_3) = \int_{0}^{t_{\text{F}}} \left( I(t) - R(t) + \frac{B_1}{2} u_1^2(t) + \frac{B_2}{2} u_2^2(t) + \frac{B_3}{2} u_3^2(t) \right) \, dt, \end{equation*} where $I=I(t)$ and $R=R(t)$ satisfy the system of equations \begin{equation*} \left\{ \begin{aligned} \dot{S} &= \Lambda - (1-u_1)SI - \mu S + ((1+u_3)\gamma_2)I + \rho \Omega R,\\ \dot{I} &= (1-u_1)SI - (\mu + \delta_1)I - ((1+u_2)\gamma_1)I - ((1+u_3)\gamma_2)I + (1-\Omega)\rho R,\\ \dot{R} &= ((1+u_2)\gamma_1)I - (\mu + \delta_2 + \rho)R. \end{aligned} \right. \end{equation*} Our approach assumes that the social and economic effects of criminal behavior can be modeled by a dynamic SIR-type system, which serves as a constraint on a cost functional associated with the strategies implemented by government and law enforcement authorities to reduce criminal behavior. Using optimal control theory, the proposed controls, i.e., preventive policies (such as community and social cohesion programs), are expected to have a significant and positive impact on crime reduction, generating opportunities for the most disadvantaged sectors of Cali society and contributing to long-term security. Given that resources to address this problem are limited, this research aims to determine an optimal combination of public interventions and policies that minimizes criminality at the lowest possible economic cost, using an SIR model, tools from variational calculus, and optimal control theory.
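
To make the structure of this system concrete, the sketch below encodes the right-hand side of the three state equations and the integrand of the cost functional exactly as written in the abstract. The function names, the dictionary layout, and every numeric value (parameters, state, controls) are hypothetical placeholders chosen only so the code runs. They are not taken from the paper.

```python
def crime_sir_rhs(state, u, p):
    """Right-hand side (dS/dt, dI/dt, dR/dt) of the controlled SIR-type system."""
    S, I, R = state
    u1, u2, u3 = u
    dS = (p["Lambda"] - (1 - u1) * S * I - p["mu"] * S
          + (1 + u3) * p["gamma2"] * I + p["rho"] * p["Omega"] * R)
    dI = ((1 - u1) * S * I - (p["mu"] + p["delta1"]) * I
          - (1 + u2) * p["gamma1"] * I - (1 + u3) * p["gamma2"] * I
          + (1 - p["Omega"]) * p["rho"] * R)
    dR = (1 + u2) * p["gamma1"] * I - (p["mu"] + p["delta2"] + p["rho"]) * R
    return dS, dI, dR

def running_cost(state, u, B):
    """Integrand of G: I - R + sum_k (B_k / 2) * u_k^2."""
    _, I, R = state
    return I - R + sum(0.5 * b * uk ** 2 for b, uk in zip(B, u))

# Hypothetical values, for illustration only.
params = {"Lambda": 0.02, "mu": 0.01, "delta1": 0.005, "delta2": 0.002,
          "gamma1": 0.1, "gamma2": 0.05, "rho": 0.03, "Omega": 0.4}
state = (0.8, 0.15, 0.05)   # (S, I, R)
u = (0.1, 0.2, 0.3)         # (u1, u2, u3)
dS, dI, dR = crime_sir_rhs(state, u, params)
```

Summing the three equations gives a useful consistency check: the interaction and transfer terms cancel, leaving $\dot{S}+\dot{I}+\dot{R} = \Lambda - \mu(S+I+R) - \delta_1 I - \delta_2 R$, independent of the controls.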


[95] 2512.01101

Supervisory control synthesis for multilevel DES with local buses

In multilevel supervisor synthesis, dependency structure matrix techniques can be used to transform the models of plants and requirements into a tree-structured hierarchical decomposition of the synthesis problem and thus efficiently synthesize local supervisors. A bus component, which has many dependencies across a system, tends to lead to an undesirable clustering of many components in one synthesis subproblem. Prior work showed how to recognize and properly treat a global bus structure. In this paper we leverage this work from global to local bus structures through a novel multilevel discrete-event system (MLDES) architecture. Specifically, the hierarchical system decomposition is revisited by allowing bus detection not only on the top level but at each level of the system hierarchy. Given this architecture, an algorithm is introduced that constructs a tree-structured MLDES. A case study on a production line shows the effectiveness of the proposed method through significantly improved synthesis performance, measured by the sum of the controlled state-space sizes of the local supervisors.


[96] 2512.01182

Autonomous Navigation and Station-Keeping on Near-Rectilinear Halo Orbits

This article develops an optical navigation (OPNAV) and station-keeping pipeline for the near-rectilinear halo orbit (NRHO) in high-fidelity ephemeris model dynamics, using synthetic images of the Moon in a non-iterative horizon-based OPNAV algorithm, applying the result in a navigation filter, and using the obtained estimates in a station-keeping control scheme that keeps the spacecraft in the vicinity of a reference orbit. We study differential correction-based and minimization-based implementations of the so-called x-axis and propose an improved targeting prediction scheme by incorporating the filter's state covariance with an unscented transform. We also introduce a hysteresis mechanism, which improves station-keeping cost and provides insight into the difference in performance between the differential correction-based and minimization-based approaches. We perform Monte-Carlo experiments to assess the pipeline's tracking and Delta-V performances. We report several key findings, including the variability of the filter performance with the sensor field of view and measurement locations, station-keeping cost reduction achieved by the unscented transform-based prediction and hysteresis, as well as the variability of the cumulative Delta-V as a function of maneuver location due to the periodic structure in the OPNAV-based filter's estimation accuracy.


[97] 2512.01386

Joint CFO-Channel Estimation under Strong Inter-Cell Interference for Low-Altitude Radio Mapping

Extending terrestrial networks into low-altitude airspace is a practical way to support aerial services, and accurate low-altitude radio maps are essential for characterizing terrestrial base station (BS) coverage and guiding system design. This work targets per-cell per-beam radio mapping from 5G new radio (NR) synchronization signal (SS) burst sets. Conventional processing treats interference as noise and focuses on the strongest link, which is insufficient for comprehensive awareness of the radio environment and ineffective in dense multi-cell low-altitude scenarios. We propose a successive waveform reconstruction and cancellation framework that iteratively estimates, reconstructs, and subtracts the SSs of stronger BSs, thereby enabling reliable detection and estimation of ultra-weak signals. To support this, we introduce the notion of a carrier frequency offset (CFO)-coherent block within which a common-CFO/per-synchronization signal block (SSB)-channel model holds and design a joint CFO-channel estimator that coherently aggregates multiple SSBs within each CFO-coherent block. We further derive closed-form scaling laws that relate estimation accuracy to unmanned aerial vehicle (UAV) speed, motion geometry, burst periodicity, and the length of the CFO-coherent block. Simulations show that the proposed framework can detect and estimate SSs at signal-to-interference-and-noise ratio (SINR) levels down to -30 dB. Field tests at 150 m altitude demonstrate per-beam coverage maps for more than ten overlapping BSs and reveal that, despite strong received power, the measured SINR rarely exceeds 10 dB, underscoring the need for careful interference management in low-altitude airspace.


[98] 2512.09827

Energy-Efficient Federated Learning with Relay-Assisted Aggregation in IIoT Networks

This paper presents an energy-efficient transmission framework for federated learning (FL) in industrial Internet of Things (IIoT) environments with strict latency and energy constraints. Machinery subnetworks (SNs) collaboratively train a global model by uploading local updates to an edge server (ES), either directly or via neighboring SNs acting as decode-and-forward relays. To enhance communication efficiency, relays perform partial aggregation before forwarding the models to the ES, significantly reducing overhead and training latency. We analyze the convergence behavior of this relay-assisted FL scheme. To address the inherent energy efficiency (EE) challenges, we decompose the original non-convex optimization problem into sub-problems addressing computation and communication energy separately. An SN grouping algorithm categorizes devices into single-hop and two-hop transmitters based on latency minimization, followed by a relay selection mechanism. To improve FL reliability, we further maximize the number of SNs that meet the roundwise delay constraint, promoting broader participation and improved convergence stability under practical IIoT data distributions. Transmit power levels are then optimized to maximize EE, and a sequential parametric convex approximation (SPCA) method is proposed for joint configuration of system parameters. We further extend the EE formulation to the case of imperfect channel state information (ICSI). Simulation results demonstrate that the proposed framework significantly enhances convergence speed, reduces outage probability from $10^{-2}$ in single-hop transmission to $10^{-6}$, and achieves substantial energy savings, with the SPCA approach reducing energy consumption by at least 2x compared to unaggregated cooperation and up to 6x over single-hop transmission.


[99] 2602.03070

ProOPF: Benchmarking and Improving LLMs for Professional-Grade Power Systems Optimization Modeling

Growing renewable penetration introduces substantial uncertainty into power system operations, necessitating frequent adaptation of dispatch objectives and constraints and challenging expertise-intensive, near-real-time modeling workflows. Large Language Models (LLMs) provide a promising avenue for automating this process by translating natural-language (NL) operational requirements into executable optimization models via semantic reasoning and code synthesis. Yet existing LLM datasets and benchmarks for optimization modeling primarily target coarse-grained cross-domain generalization, offering limited rigorous evaluation in power-system settings, particularly for Optimal Power Flow (OPF). We therefore introduce ProOPF-D and ProOPF-B, a dataset and benchmark for professional-grade OPF modeling: ProOPF-D contains 12K instances pairing NL requests with parameter adjustments and structural extensions to a canonical OPF, together with executable implementations; ProOPF-B provides 121 expert-annotated test cases with ground-truth code, enabling end-to-end evaluation under both concrete and abstract OPF modeling regimes.


[100] 2602.14671

Data Augmentation for Pathological Speech Enhancement

The performance of state-of-the-art speech enhancement (SE) models considerably degrades for pathological speech due to atypical acoustic characteristics and limited data availability. This paper systematically investigates data augmentation (DA) strategies to improve SE performance for pathological speakers affected by Parkinson's disease, evaluating both predictive and generative SE models. We examine three DA categories, i.e., transformative, generative, and noise augmentation, assessing their impact with objective SE metrics. Experimental results show that noise augmentation consistently delivers the largest and most robust gains, transformative augmentations provide moderate improvements, while generative augmentation yields limited benefits and can harm performance as the amount of synthetic data increases. Furthermore, we show that the effectiveness of DA varies depending on the SE model, with DA being more beneficial for predictive SE models. While our results demonstrate that DA improves SE performance for pathological speakers, a performance gap between neurotypical and pathological speech persists, highlighting the need for future research on targeted DA strategies for pathological speech.


[101] 2603.03895

Constellation Selection and Power Control for OFDM-based ISAC: From Theory to Prototype

Integrated sensing and communication (ISAC) techniques can leverage existing, wide-coverage communication networks to perform sensing tasks, enabling large-scale and low-cost target sensing. However, the inherent randomness of communication data payloads introduces undesired sidelobes in the ambiguity function that may degrade target detection and parameter estimation performance. This paper develops a communication-centric ISAC framework that is standards-compliant and compatible with existing devices. Specifically, we propose a low-complexity constellation selection scheme over a finite, off-the-shelf alphabet, achieving an efficient sensing-communication trade-off without custom waveforms or frame-structure changes. To this end, we analyze two classical sensing receivers for ranging measurements, matched filtering (MF) and reciprocal filtering (RF), and derive closed-form sensing laws that link constellation statistics to sensing performance. Under any finite-alphabet constellation combination, MF sidelobes depend on the weighted sum of the kurtosis values of the per-subcarrier constellations, while RF noise enhancement depends on the inverse second moment of the transmit symbol, providing a tractable expression for tuning the sensing-communication trade-off. The analysis extends to multi-symbol coherent integration and achieves the expected processing gain. We prove that in flat-fading channels, any Pareto-optimal solution activates no more than three constellations. For frequency-selective channels, a bilevel algorithm with closed-form inner updates attains near-optimal performance while sharply reducing computational complexity. We validate the entire theoretical pipeline with numerical simulations as well as experimental results.
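
The two constellation statistics named above can be computed directly for standard alphabets. The sketch below does so for unit-power QPSK and 16-QAM. It assumes the kurtosis is defined as $E|x|^4 / (E|x|^2)^2$ and reads "inverse second moment" as $E[1/|x|^2]$. Both are our reading of the abstract and the paper's exact normalization may differ. The function names are illustrative.

```python
def normalize(points):
    """Scale a constellation to unit average power."""
    power = sum(abs(x) ** 2 for x in points) / len(points)
    return [x / power ** 0.5 for x in points]

def kurtosis(points):
    """E|x|^4 / (E|x|^2)^2 over equiprobable constellation points."""
    m2 = sum(abs(x) ** 2 for x in points) / len(points)
    m4 = sum(abs(x) ** 4 for x in points) / len(points)
    return m4 / m2 ** 2

def inverse_second_moment(points):
    """E[1 / |x|^2] over equiprobable constellation points (all nonzero)."""
    return sum(1.0 / abs(x) ** 2 for x in points) / len(points)

qpsk = normalize([complex(a, b) for a in (-1, 1) for b in (-1, 1)])
qam16 = normalize([complex(a, b) for a in (-3, -1, 1, 3) for b in (-3, -1, 1, 3)])

for name, c in (("QPSK", qpsk), ("16-QAM", qam16)):
    print(name, round(kurtosis(c), 4), round(inverse_second_moment(c), 4))
# QPSK 1.0 1.0
# 16-QAM 1.32 1.8889
```

Under these definitions the constant-modulus QPSK alphabet attains the minimum value of 1 for both statistics, while 16-QAM is larger on both, which matches the abstract's picture of a trade-off between sensing quality and the higher rate of denser constellations.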


[102] 2603.10743

Scaling and Trade-offs in Multi-agent Autonomous Systems

Designing autonomous drone swarms is hampered by a vast design space spanning platform, algorithmic, and numerical-strength choices. We perform large-scale agent-based simulations in three canonical scenarios: swarm-on-swarm battle, cooperative area search with attrition, and pursuit of scattering targets. We demonstrate how dimensional analysis and data scaling can be leveraged to collapse performance data onto scaling functions that are mathematically simple, yet counterintuitive and therefore difficult to predict a priori. These scaling laws reveal success-failure boundaries, including sharp break points which we show can be framed as an "effective swarm size." Additionally, we show how this technique can be used to quantify trade-offs between agent count and platform parameters such as velocity, sensing or weapon range, and attrition rate. Furthermore, we show the benefits of embedding an optimal path planning loop within this framework, which can qualitatively improve the scaling laws that govern the outcome. The methods we demonstrate are highly flexible and would enable rapid, budget-aware sizing and algorithm selection for large autonomous swarms.


[103] 2603.15278

Encirclement Guaranteed Finite-Time Capture against Unknown Evader Strategies

We consider a pursuit-evasion scenario involving a group of pursuers and a single evader in a two-dimensional unbounded environment. The pursuers aim to capture the evader in finite time while ensuring the evader remains enclosed within the convex hull of their positions until capture, without knowledge of the evader's heading angle. Prior works have addressed the problem of encirclement and capture separately in different contexts. In this paper, we present a class of strategies for the pursuers that guarantee capture in finite time while maintaining encirclement, irrespective of the evader's strategy. Furthermore, we derive an upper bound on the time to capture. Numerical results highlight the effectiveness of the proposed framework against a range of evader strategies.


[104] 2603.18123

Understanding Task Aggregation for Generalizable Ultrasound Foundation Models

Foundation models promise to unify multiple clinical tasks within a single framework, but recent ultrasound studies report that unified models can underperform task-specific baselines. We hypothesize that this degradation arises not from model capacity limitations, but from task aggregation strategies that ignore interactions between task heterogeneity and available training data scale. In this work, we systematically analyze when heterogeneous ultrasound tasks can be jointly learned without performance loss, establishing practical criteria for task aggregation in unified clinical imaging models. We introduce M2DINO, a multi-organ, multi-task framework built on DINOv3 with task-conditioned Mixture-of-Experts blocks for adaptive capacity allocation. We systematically evaluate 27 ultrasound tasks spanning segmentation, classification, detection, and regression under three paradigms: task-specific, clinically-grouped, and all-task unified training. Our results show that aggregation effectiveness depends strongly on training data scale. While clinically-grouped training can improve performance in data-rich settings, it may induce substantial negative transfer in low-data settings. In contrast, all-task unified training exhibits more consistent performance across clinical groups. We further observe that task sensitivity varies by task type in our experiments: segmentation shows the largest performance drops compared with regression and classification. These findings provide practical guidance for ultrasound foundation models, emphasizing that aggregation strategies should jointly consider training data availability and task characteristics rather than relying on clinical taxonomy alone.


[105] 2604.15524

Safe and Energy-Aware Multi-Robot Density Control via PDE-Constrained Optimization for Long-Duration Autonomy

This paper presents a novel density control framework for multi-robot systems with spatial safety and energy sustainability guarantees. Stochastic robot motion is encoded through the Fokker-Planck Partial Differential Equation (PDE) at the density level. Control Lyapunov and control barrier functions are integrated with PDEs to enforce target density tracking, obstacle region avoidance, and energy sufficiency over multiple charging cycles. The resulting quadratic program enables fast in-the-loop implementation that adjusts commands in real time. A multi-robot experiment and extensive simulations were conducted to demonstrate the effectiveness of the controller under localization and motion uncertainties.


[106] 2605.12408

From EEG Cleaning to Decoding: The Role of Artifact Rejection in MI-based BCIs

Motor imagery (MI) BCIs are sensitive to EEG artifacts, yet the practical impact of automated artifact rejection on downstream MI decoding performance remains unclear. While most work focuses on decoder design, the contribution of data curation, particularly automated rejection policies, has received comparatively less attention, despite its importance for robust machine learning pipelines. Here, we propose Fast Automatic Artifact Rejection (FAAR), a lightweight method that computes a compact set of artifact-sensitive features, derives an epoch-level Signal Quality Index, adaptively selects rejection thresholds, and automatically rejects contaminated epochs without requiring prior knowledge of artifact types or manual threshold tuning. We evaluate FAAR on 13 publicly available MI datasets and compare it to a no-rejection baseline, AutoReject, and Isolation Forest. We show that rejection effects are strongly subject- and regime-dependent, with the largest gains in low-baseline/low-SNR conditions, so rejection should be applied adaptively. FAAR reduces inter-subject performance variability, an important property for MI-BCI reliability and BCI-illiteracy, without aggressive data removal. Finally, FAAR's lightweight and fully automated thresholding yields consistent rejection behavior across offline curation, training, and online filtering, and supports real-time BCI constraints.


[107] 2605.17905

Curriculum-Guided Heterogeneous Multi-Agent Intelligence for Multi-UAV Cooperative ISAC

Seamlessly unifying communication and sensing, sixth-generation (6G) networks are poised to transform into intelligent platforms with high spectral-energy efficiency and real-time environmental awareness. In the low-altitude economy, unmanned aerial vehicles (UAVs) enable air-ground integrated sensing and communication (ISAC) for applications such as logistics and inspection, yet most studies focus on single-UAV or homogeneous-agent designs. In contrast, this paper proposes a multi-UAV cooperative ISAC system that enables heterogeneous-agent collaboration between multiple UAVs and a ground base station (BS) for joint target sensing, tracking, and communication. The system design is formulated as a posterior Cramér-Rao bound (PCRB) minimization problem under communication performance constraints, utilizing joint trajectory-beamforming optimization. To tackle the NP-hard nature of this problem, we design a curriculum-based heterogeneous-agent proximal policy optimization (C-HAPPO) algorithm, where curriculum learning guides progressive policy refinement and Kronecker/QR decomposition mitigates action dimensionality. Simulation results show that the proposed approach achieves more than a 30% improvement in sensing performance, faster convergence, and higher tracking accuracy than existing baselines, demonstrating its scalability and effectiveness for complex multi-UAV ISAC scenarios.


[108] 2502.04230

XAttnMark: Learning Robust Audio Watermarking with Cross-Attention

The rapid proliferation of generative audio synthesis and editing technologies has raised serious concerns about copyright infringement, data provenance, and the spread of misinformation via deepfake audio. Watermarking offers a proactive solution by embedding imperceptible yet identifiable and traceable signals into audio content. While recent neural network-based watermarking methods like WavMark and AudioSeal have improved robustness and quality, they struggle to jointly optimize both robust detection and accurate attribution. This paper introduces Cross-Attention Robust Audio Watermark (XATTNMARK), which bridges this gap by leveraging partial parameter sharing between the generator and the detector, a cross-attention mechanism for efficient message retrieval, and a temporal conditioning module for improved message distribution. Additionally, we propose a psychoacoustic-aligned time-frequency (TF) masking loss that captures fine-grained auditory masking effects, improving watermark imperceptibility. XATTNMARK achieves state-of-the-art performance in both detection and attribution, demonstrating superior robustness against a wide range of audio transformations, including challenging generative editing at varying strengths. This work advances audio watermarking for protecting intellectual property and ensuring authenticity in the era of generative AI.


[109] 2503.04929

Neural Configuration-Space Barriers for Manipulation Planning and Control

Planning and control for high-dimensional robot manipulators in cluttered dynamic environments require computational efficiency and robust safety guarantees. Inspired by recent advances in learning configuration-space distance functions (CDFs) as representations of robot bodies, we propose a unified approach for motion planning and control that formulates safety constraints as CDF barriers. A CDF barrier approximates the local free configuration space, substantially reducing the number of collision-checking operations during motion planning. However, learning a CDF barrier with a neural network and relying on online sensor observations introduces uncertainties that must be considered during control synthesis. To address this, we develop a distributionally robust CDF barrier formulation for control that accounts for modeling errors and sensor noise without assuming a known underlying distribution. Simulations and hardware experiments on a UFactory xArm6 manipulator show that our neural CDF barrier formulation enables efficient planning and robust safe control in cluttered and dynamic environments, relying only on onboard point-cloud observations.


[110] 2506.10118

Data-driven balanced truncation for second-order systems with generalized proportional damping

Structured reduced-order modeling is a central component in the computer-aided design of control systems in which cheap-to-evaluate low-dimensional models with physically meaningful internal structures are computed. In this work, we develop a new approach for the structured data-driven surrogate modeling of linear dynamical systems described by second-order time derivatives via balanced truncation model-order reduction. The proposed method is a data-driven reformulation of position-velocity balanced truncation for second-order systems and generalizes the quadrature-based balanced truncation for unstructured first-order systems to the second-order case. The computed surrogates encode a generalized proportional damping structure, and we propose a computational procedure for inferring the damping coefficients from data by minimizing a least-squares error over the coefficients. Several numerical examples demonstrate the effectiveness of the proposed method.


[111] 2603.01655

Transform-Invariant Generative Ray Path Sampling for Efficient Radio Propagation Modeling

Ray tracing has become a standard for accurate radio propagation modeling, but suffers from exponential computational complexity, as the number of candidate paths scales with the number of objects raised to the interaction order. This bottleneck limits its use in large-scale or real-time applications, forcing traditional tools to rely on heuristics that reduce path candidates at the cost of potentially reduced accuracy. To overcome this limitation, we propose a machine-learning-assisted framework that replaces exhaustive path searching with intelligent sampling via Generative Flow Networks. Applying these generative models to this domain presents challenges, particularly sparse rewards due to the rarity of valid paths, which can lead to convergence failures and trivial solutions when evaluating high-order interactions in complex environments. To ensure robust learning and efficient exploration, our framework incorporates three key components. First, an experience replay buffer captures and retains rare valid paths. Second, a uniform exploratory policy improves generalization and prevents overfitting to simple geometries. Third, a physics-based action masking strategy filters out physically impossible paths before the model considers them. Validated on idealized street-canyon scenarios, our model achieves substantial speedups over exhaustive search -- up to $10\times$ faster on GPU and $100\times$ faster on CPU -- while maintaining high coverage accuracy and successfully uncovering complex propagation paths. However, out-of-distribution evaluations on real-world Manhattan street geometries reveal that generalizing to substantially different urban morphologies requires further advancement in model capacity or alternative training strategies. Source code, tests, and a tutorial are available at this https URL.
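
The scaling claim can be made concrete with a small sketch that enumerates candidate paths as ordered sequences of object indices, so that the count is the number of objects raised to the interaction order. The masking rule used here (a path may not interact with the same object twice in a row) is an illustrative assumption and is not necessarily the physics-based mask used in the paper. The function names are hypothetical.

```python
from itertools import product

def candidate_paths(num_objects, order):
    """All ordered sequences of `order` object indices (exhaustive search space)."""
    return list(product(range(num_objects), repeat=order))

def masked_paths(num_objects, order):
    """Candidates that survive a toy mask: no two consecutive interactions
    with the same object (an assumed rule, for illustration only)."""
    return [
        path
        for path in candidate_paths(num_objects, order)
        if all(a != b for a, b in zip(path, path[1:]))
    ]

for order in (1, 2, 3):
    total = len(candidate_paths(10, order))
    kept = len(masked_paths(10, order))
    print(f"order={order}: {total} candidates, {kept} after masking")
```

With 10 objects and interaction order 3, exhaustive search visits 1000 candidates, and even this simple mask only reduces that to 810, which suggests why sampling a small set of promising paths is attractive at higher orders.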


[112] 2603.24489

Model Predictive Path Integral Control as Preconditioned Gradient Descent

Model Predictive Path Integral (MPPI) control is a widely used sampling-based method for trajectory optimization, yet its convergence properties remain only partially understood. This paper provides a direct convergence analysis using variational optimization. By lifting constrained trajectory optimization to a Kullback-Leibler (KL) regularized problem over decision distributions, we derive a reduced free-energy objective defined over a parametric sampling family. For general parametric families, we derive gradient and Hessian representations of this reduced objective and analyze preconditioned gradient descent on the sampling-distribution parameters. In the fixed-covariance Gaussian case, the classical MPPI update is recovered exactly as a unit-step preconditioned gradient update. We prove descent and stationarity guarantees for the exact expectation-based iteration when the Hessian of the reduced objective is bounded in the metric induced by the preconditioner. For the Gaussian family, we further show that the preconditioned Hessian is governed by the covariance of the Gibbs-tilted distribution relative to the covariance of the sampling distribution, yielding a covariance-dependent sufficient condition for the descent of exact unit-step MPPI. Numerical experiments illustrate the theory and the effect of key hyperparameters.
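
For readers unfamiliar with the classical MPPI update mentioned above, the following is a minimal sketch of its usual sample-based form for a fixed-covariance Gaussian: draw samples around the current mean, weight each by the exponential of its negative cost divided by a temperature, and take the weighted average as the new mean. It illustrates only the update rule, not the paper's analysis. The quadratic test cost, the isotropic covariance, and the names `mppi_update`, `sigma`, `lam`, and `num_samples` are assumptions chosen for illustration.

```python
import numpy as np

def mppi_update(mean, cost_fn, rng, sigma=0.5, lam=1.0, num_samples=256):
    """One MPPI step: Gibbs-weighted average of Gaussian samples around `mean`."""
    samples = mean + sigma * rng.standard_normal((num_samples, mean.size))
    costs = np.array([cost_fn(u) for u in samples])
    # Subtracting the minimum cost leaves the normalized weights unchanged
    # and avoids underflow in the exponential.
    weights = np.exp(-(costs - costs.min()) / lam)
    weights /= weights.sum()
    return weights @ samples

def quadratic_cost(u):
    """Toy cost with its minimum at u = (1, 1, 1)."""
    return float(np.sum((u - 1.0) ** 2))

rng = np.random.default_rng(0)
mean = np.zeros(3)
initial_cost = quadratic_cost(mean)
for _ in range(20):
    mean = mppi_update(mean, quadratic_cost, rng)
final_cost = quadratic_cost(mean)
```

For this quadratic cost, the exact expectation-based version of the iteration contracts the mean toward the minimizer at every step; the sampled version follows the same trend up to Monte Carlo noise.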


[113] 2605.06958

Hybrid Multiport Receivers for Slow Fluid Antenna Multiple Access

We propose a novel receiver architecture that preserves the performance benefits of multiport selection in fluid-antenna systems while requiring only a very small number of radio-frequency (RF) chains. The resulting fluid-antenna hybrid multiport (FAHM) receiver effectively decouples port selection from signal combining by integrating a low-complexity analog combining network similar to those used in conventional hybrid multiantenna designs. We develop a stopping criterion to determine the number of selected ports, which limits the performance loss associated with port selection, and then design the hybrid combiner for a given RF-chain budget. The FAHM architecture is evaluated in a multiuser setup operating under slow fluid-antenna multiple access (FAMA). In this scenario, a FAHM implementation with only 2 RF chains achieves performance comparable to that of a fully digital conventional multiport scheme with a much larger number of RF chains. Additionally, the proposed receiver architecture attains a reduction of over 60% in computational burden when integrated with a novel efficient implementation of the state-of-the-art generalized-eigenvector port-selection method.


[114] 2605.07880

Robust Capacity Expansion under Wildfire Ignition Risk and High Renewable Penetration

In power systems, the risk of wildfire ignition has increased significantly in recent years. The impact and severity of these events on energy dispatch, as well as their societal ramifications, make wildfire prevention critical for power system planning and operation. A common intervention by system operators is to de-energize transmission lines to mitigate the risk of fire caused by equipment failures. With the growing integration of variable renewable generation, managing and preparing the system for de-energization under wildfire risk has become even more challenging. In this context, mitigation decisions such as installing battery energy storage systems and undergrounding transmission lines can reduce the risk and adverse effects associated with de-energization and renewable generation variability. This paper presents a robust optimization model to determine the optimal investments in battery storage siting and transmission line undergrounding, utilizing representative weeks and uncertainty sets to capture the temporal relationship of uncertain variables. Specifically, this paper addresses: (i) the worst-case realization of ignition risk leading to the de-energization of transmission lines, combined with the worst-case realization of renewable energy availability, and (ii) the optimal investment decisions for energy storage capacity and undergrounding of transmission lines that are exposed to ignition risk. The proposed model is formulated as a mixed-integer linear programming (MILP) problem, employing duality theory and binary decomposition to address nonlinearities, and is solved using a column-and-constraint generation algorithm. The proposed framework is evaluated on a model of the San Diego power system, demonstrating its practical effectiveness in improving resilience to wildfire risk.