New articles on Electrical Engineering and Systems Science


[1] 2606.13694

Efficient Temporal Modeling for Mobile Sleep Staging via Lightweight Random Attention

Mobile sleep staging serves as a foundational infrastructure for in-home sleep monitoring and closed-loop modulation. However, existing sequential models such as RNNs and Transformers are computationally expensive for mobile deployment. In this paper, we propose Random Attention (RA), a lightweight temporal modeling module based on fixed random projections, which replaces learnable sequence modeling with similarity-based aggregation. RA introduces few additional parameters beyond the epoch encoder while enabling effective temporal smoothing. We further provide a theoretical interpretation via the Random Attention Prior Kernel (RAPK), which decomposes RA into a global smoothing term and a feature similarity term, offering an interpretable view of temporal sleep structure. Experiments on Sleep-EDF-20 and Sleep-EDF-78 show that RA consistently improves epoch-wise baselines by 1-3\% in accuracy and F1 score, while achieving competitive performance compared with LSTM, GRU, and Transformer models. RA also demonstrates strong generalization across different backbone encoders and improved robustness over conventional temporal smoothing methods. These results indicate that efficient sleep staging can be achieved through lightweight similarity-based temporal aggregation, making RA suitable for real-time wearable applications.
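
The idea of replacing learned sequence modeling with similarity-based aggregation over fixed random projections can be illustrated with a minimal sketch. The abstract does not give the exact RA formulation, so everything below is an assumption made for illustration: the Gaussian projection, the dot-product similarity, the softmax weighting, and the function names are all hypothetical.

```python
import math
import random


def random_projection(dim_in, dim_out, seed=0):
    """Fixed (never trained) Gaussian projection matrix; an assumed choice."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(dim_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(dim_in)]
            for _ in range(dim_out)]


def project(matrix, vec):
    """Matrix-vector product for plain Python lists."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_attention_smooth(features, probs, dim_proj=8, seed=0):
    """Smooth per-epoch class probabilities across a sequence.

    features: one feature vector per sleep epoch (from any epoch encoder).
    probs:    one class-probability vector per epoch.
    Each output row is a similarity-weighted average of all input rows,
    with similarity measured in the fixed random projection space.
    """
    w = random_projection(len(features[0]), dim_proj, seed)
    z = [project(w, f) for f in features]
    n_classes = len(probs[0])
    smoothed = []
    for zi in z:
        scores = [sum(a * b for a, b in zip(zi, zj)) for zj in z]
        weights = softmax(scores)
        smoothed.append([sum(wt * p[c] for wt, p in zip(weights, probs))
                         for c in range(n_classes)])
    return smoothed
```

Because the projection is fixed, this sketch adds no trainable parameters on top of the epoch encoder, which is the property the abstract emphasizes.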


[2] 2606.13698

Active Inference for Adaptive Traffic Signal Control in Noisy Nonstationary IoT Environments

Urban traffic signal control at IoT-instrumented intersections must remain effective under sensor occlusion, weather attenuation, and nonstationary demand. Conventional controllers degrade under these conditions, and learned policies remain difficult to audit. To address these challenges, we propose an active inference controller for a four-arm signalized intersection that dynamically selects phases by minimizing expected free energy (EFE) over Gaussian beliefs about per-direction congestion levels, yielding a fully traceable decision pipeline. We benchmark the controller in a SUMO traffic simulator against a rule-based heuristic and a deep Q-network (DQN) across four scenarios that progressively increase noise and nonstationarity, spanning sensor occlusion, adverse weather, and stochastic accidents. Across 100 independent random evaluations per scenario, active inference attains the lowest idle times and CO2 emissions in the noisiest scenarios (56,977 s and 29.12 kg vs. 71,741 s and 30.56 kg for DQN). These gains come at a modest cost in bus priority service rate and phase switch frequency.


[3] 2606.13700

C-MambaPose: A Physics-Informed Complex Mamba Framework for Cross-Environment WiFi Human Pose Estimation

Human pose estimation (HPE) utilizing wireless WiFi signals has emerged as a promising technology owing to its device-free nature, privacy preservation, and robustness against occlusion and poor lighting. However, existing methods often overlook the physical complex phase information of WiFi signals and fail to generalize across diverse environments due to severe domain shifts. In this paper, we present C-MambaPose, a physics-informed complex-valued Mamba-GraFormer hybrid framework for robust cross-environment WiFi-based 3D HPE. Our framework first sanitizes raw WiFi Channel State Information (CSI) phase errors and constructs a phase-preserving complex-valued representation. We then employ a Spatiotemporal Complex Mamba encoder with a dynamic selective receptive field to capture fine-grained phase dynamics. A cross-attention joint-query mapper maps the unstructured sequence tokens to human joints, which are decoded by a Graph Convolutional Network (GCN) to predict anatomically coherent 3D coordinates. Extensive evaluations on the MM-Fi dataset show that C-MambaPose achieves competitive or superior performance to state-of-the-art baselines across all settings and sets a new state of the art on the challenging cross-environment split while requiring only 3.78 M parameters, an 83.1\% reduction compared to GraphPose-Fi~\cite{chen2026graph} and an 85.7\% reduction compared to MetaFi++~\cite{zhou2023metafi++}. Its size is comparable to that of DT-Pose~\cite{chen2025towards} (which is only 18\% smaller), yet it achieves significantly superior performance without requiring any pretraining. Our code is publicly available at this https URL.


[4] 2606.13701

Synergistic Blood Pressure Estimation via Contactless mmWave Radar and Imaging Photoplethysmography: A Feasibility Study

Continuous, non-contact blood pressure (NCBP) monitoring holds significant promise for pervasive cardiovascular care, yet single-modality approaches -- such as imaging photoplethysmography (iPPG) -- remain constrained by environmental artifacts, skin-tone sensitivity, and the absence of proximal cardiac mechanical information. This study investigates the feasibility of a dual-modality sensing paradigm that synergistically integrates facial iPPG with posterior-facing frequency-modulated continuous wave (FMCW) millimeter-wave radar to capture complementary hemodynamic cues: distal optical volumetric fluctuations and proximal cardiac micro-motions (radar motion signals, RMS). To bridge the morphological disparity between these heterogeneous streams, we develop an end-to-end deep learning architecture, BiLSTM-MS-DiCNN, which leverages multi-scale dilated convolutions for spatial feature extraction and bidirectional long short-term memory for temporal dependency modeling. In a controlled feasibility study involving 15 healthy participants across distinct hemodynamic states (resting, deep breathing, and post-exercise), the proposed framework achieved a Mean Absolute Difference (MAD) of 4.71 mmHg for systolic BP (SBP) and 4.60 mmHg for diastolic BP (DBP) under resting conditions, with consistent performance during physiological perturbations. These preliminary findings demonstrate the viability of mmWave-iPPG fusion as a promising pathway toward robust, unobtrusive NCBP monitoring.


[5] 2606.13702

Uniform Asymptotics of the Pseudo Wigner-Ville Distribution for Nonlinear Chirps

The analysis of non-stationary signals in complex physical systems often relies on time-frequency distributions. Among these, the Pseudo Wigner-Ville Distribution (PWVD) stands out for its superior resolution but is mathematically challenging due to its inherent quadratic nonlinearity. This nonlinearity generates complex interference artifacts and cross terms in the phase space, potentially obscuring the physical features of the signal, particularly for nonlinear chirps. In this work, we establish a mathematically grounded framework for the PWVD for general windowed nonlinear chirps. By leveraging the theory of oscillatory integrals with coalescing stationary points, we derive a uniform asymptotic expansion that bridges the gap between heuristic signal processing and semiclassical geometric approaches (Berry's chord construction). The resulting closed-form representation, expressed in terms of symmetric incomplete Airy functions, provides a unified description of the nonlinear transform's behavior, regularizing the transition across the instantaneous frequency caustics. While the framework is general, we show its power on two illustrative examples: the high-precision nonlinear chirps of coalescing binaries in gravitational-wave astronomy and radar nonlinear chirps for pulse compression applications. The analytical results successfully predict the structure of interference patterns and quantify the systematic bias in peak-based frequency estimation. Therefore, this study establishes a systematic bridge between nonlinear mathematical analysis and precision experimental physics, validating the PWVD as a robust tool for detailed source characterization in high signal-to-noise regimes.


[6] 2606.13711

Wi-Fi Self-Coexistence in the 6 GHz Band: An ns-3 Evaluation of LPI and SP Usage

The U.S. has adopted four power regimes for operation in the shared unlicensed 6 GHz band -- standard power (SP), low-power indoor (LPI), geofenced variable power (GVP), and very low power (VLP) -- with maximum permitted EIRP levels of 36 dBm, 30 dBm, 24 dBm, and 14 dBm, respectively. Although these regimes are primarily intended to protect incumbent services, their heterogeneous transmit power levels also introduce additional coexistence challenges within 6 GHz Wi-Fi networks. In this paper, we develop an ns-3 Wi-Fi 6E/802.11ax coexistence testbed to study coexistence under heterogeneous power regimes and to provide a reproducible simulation methodology. To the best of our knowledge, prior work has not specifically examined self-coexistence issues within 6 GHz Wi-Fi networks. We evaluate two coexistence scenarios: one in which both the LPI AP and the SP AP are indoors, and another in which the LPI AP is indoors while the SP AP is outdoors. Results are compared against an indoor LPI--LPI baseline when applicable. Our findings show that: (i) the presence of an indoor SP AP can significantly degrade the goodput of an LPI AP; (ii) channel bandwidth is a key factor in determining the extent of SP-to-LPI impact, with the degradation being most severe at 20 MHz and partially alleviated at 160 MHz; (iii) physical blockage between outdoor SP and LPI APs improves fairness; and (iv) BSS coloring does not necessarily improve fairness in mixed-regime deployments. The simulation framework can be extended to study coexistence between Wi-Fi and cellular systems, as recently proposed by Ofcom in the U.K.
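
To give a feel for how far apart the four regimes are, the EIRP limits quoted in the abstract can be converted from dBm to linear power. This is plain unit arithmetic rather than anything from the paper's ns-3 testbed; the dictionary and function names are illustrative.

```python
# Maximum EIRP per regime in dBm, as quoted in the abstract.
EIRP_DBM = {"SP": 36, "LPI": 30, "GVP": 24, "VLP": 14}


def dbm_to_mw(dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (dbm / 10)


def power_ratio(regime_a: str, regime_b: str) -> float:
    """Linear ratio of the maximum EIRP of regime_a to that of regime_b."""
    return dbm_to_mw(EIRP_DBM[regime_a]) / dbm_to_mw(EIRP_DBM[regime_b])


if __name__ == "__main__":
    for name, dbm in EIRP_DBM.items():
        print(f"{name}: {dbm} dBm = {dbm_to_mw(dbm):.1f} mW")
    print(f"SP/LPI linear ratio: {power_ratio('SP', 'LPI'):.2f}")
```

The 6 dB gap between SP and LPI corresponds to roughly a fourfold difference in radiated power, which is the kind of asymmetry the coexistence scenarios probe.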


[7] 2606.13794

An integrated interpretable control effectiveness learning and nonlinear control allocation methodology for overactuated aircrafts

Nonlinear dynamics and the strong couplings that arise between multiple effectors undermine the assumptions behind conventional, linear control allocation techniques. When flight enters regimes where nonlinear effects dominate, linear allocators exhibit reduced accuracy due to increased model mismatch, which subsequently degrades performance and robustness of the flight control system. High-fidelity onboard models and black-box data-driven approaches can recover accuracy across the flight envelope, but respectively impose computational burdens prohibitive for real-time allocation and sacrifice the interpretability required for verification and fault diagnosis. This paper addresses these limitations by learning an explicit, physics-constrained analytical model of the control effectiveness mapping from representative flight data using Sparse Identification of Nonlinear Dynamics. The resulting mapping is compact, interpretable, and admits analytical derivatives, enabling efficient computation within nonlinear solvers that additionally incorporate actuator dynamics, without requiring an onboard model. An online adaptation mechanism monitors prediction residuals and refreshes the model when significant plant changes are detected, providing graceful reconfiguration under actuator failures and varying operating conditions. The methodology is evaluated on a high-fidelity nonlinear benchmark aircraft across a range of aggressive maneuvers, achieving accuracy comparable to a full nonlinear onboard model while substantially reducing computational cost relative to established baselines.


[8] 2606.13847

Modal Analysis of Spatial Load Correlation in AI Data Center-Dominated Power Systems

Hyperscale AI data centers induce spatially and temporally correlated load fluctuations that violate classical independence assumptions and are not captured by time-averaged spectral methods. These correlations are episodic and non-stationary, requiring analysis that resolves transient structure. This paper applies Dynamic Mode Decomposition (DMD) to the temporal evolution of pairwise inter-bus correlation coefficients to form a low-dimensional state representation that enables modal analysis without a stationarity assumption. DMD eigenvalues encode the correlation regime: their location in the complex plane distinguishes sustained coherence, decaying transients, and intensifying events, while oscillation frequency maps to underlying physical coupling mechanisms. Using an IEEE 39-bus Real-Time Digital Simulator (RTDS) testbed with three converter-interfaced AI data center loads driven by synthetic workload profiles, global DMD provides a time-averaged modal baseline in which a slow thermal band ($f \approx 0.005$\,Hz, $|\mu| = 0.91$) captures 93.6\% of total correlation energy. A sliding-window DMD formulation identifies transient intensification events: 51 of 775 windows (6.6\%) satisfy the $|\mu_k^{(n)}| > 1$ criterion, which aligns with stochastic workload coincidences. Cross-validation with RTDS voltage coherence confirms elevated coupling during these intervals. The proposed modal growth indicator provides an early-warning signal of correlation intensification prior to peak pairwise coherence.
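
The eigenvalue interpretation in this abstract follows the standard DMD construction, summarized below as a reading aid. The snapshot notation $x_n$, the sampling interval $\Delta t$, and the mode symbol $\phi_k$ are assumptions made here; only $\mu$ and the $|\mu_k^{(n)}| > 1$ criterion appear in the abstract.

```latex
% x_n: vector of pairwise inter-bus correlation coefficients at time step n (assumed notation)
\begin{align}
X &= [\,x_1, \dots, x_{m-1}\,], \qquad X' = [\,x_2, \dots, x_m\,], \\
A &= X' X^{+}, \qquad A\,\phi_k = \mu_k\,\phi_k, \\
f_k &= \frac{|\arg \mu_k|}{2\pi\,\Delta t}.
\end{align}
```

Here $X^{+}$ is the pseudoinverse of $X$. A mode with $|\mu_k| < 1$ decays, $|\mu_k| = 1$ persists, and $|\mu_k| > 1$ grows from one step to the next, which is why the sliding-window formulation can flag intensification by testing $|\mu_k^{(n)}| > 1$ in window $n$.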


[9] 2606.13853

Spatial Load Correlation in AI Data-Center-Dominated Power Systems

The proliferation of large-scale data centers introduces spatially correlated demand profiles that challenge the long-standing assumption of statistical independence of loads in power system analysis. This paper examines the emergence of such load correlations and evaluates their impact on data-center-dominated grids. Analytical derivations reveal that correlated load fluctuations amplify aggregate stochastic disturbances, reduce voltage stability margins through weakened reactive power stiffness, and degrade frequency stability margin by erosion of natural load diversity effects. Real-time digital simulation studies confirm that moderate spatial correlation in distributed data centers produces simultaneous frequency deviations and voltage fluctuations across multiple buses. The findings offer transmission system operators a physics-based perspective to interpret emerging oscillatory phenomena and establish stability planning criteria grounded in measurable load-correlation structures rather than traditional diversity assumptions.
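
The claim that correlation amplifies aggregate stochastic disturbances can be checked with the textbook variance-of-a-sum identity. The sketch below assumes a single uniform pairwise correlation coefficient, which is a simplification chosen for illustration and not the paper's model; the function name is hypothetical.

```python
import math


def aggregate_std(sigmas, rho):
    """Standard deviation of the summed load.

    sigmas: per-site standard deviations of load fluctuation.
    rho:    pairwise correlation coefficient, assumed identical for all pairs.
    Uses Var(sum) = sum(sigma_i^2) + 2 * sum_{i<j} rho * sigma_i * sigma_j.
    """
    variance = sum(s * s for s in sigmas)
    for i in range(len(sigmas)):
        for j in range(i + 1, len(sigmas)):
            variance += 2.0 * rho * sigmas[i] * sigmas[j]
    return math.sqrt(variance)


if __name__ == "__main__":
    sites = [1.0, 1.0, 1.0]  # three sites with equal fluctuation, arbitrary units
    for rho in (0.0, 0.5, 1.0):
        print(f"rho={rho}: aggregate std = {aggregate_std(sites, rho):.3f}")
```

With independent loads the aggregate fluctuation grows like the square root of the number of sites, while fully correlated loads add linearly; this loss of diversity is the effect the abstract refers to.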


[10] 2606.13885

Learning Graph Topology with Functional Priors via Bilevel Optimization

Learning graph topology of complex networks is challenging due to limited data availability and imprecise data models. Different from prior works that focus on structural priors with explicit control on macroscopic properties such as sparsity, this paper proposes a novel functional prior approach for graph topology learning. We postulate that complex networks are inherently optimized to perform a certain task (e.g., social networks specialize at optimizing a welfare function, biological networks are resilient towards node/edge deletion), which can be incorporated as a regularizer to assist in graph learning. Mathematically, we formulate a bilevel optimization problem where the lower-level problem solves the associated task on a candidate graph topology and the upper-level problem trades off between data fitting and task performance. We design a two-timescale gradient descent (TTGD) algorithm and show that under verifiable conditions, it finds a stationary point to the bilevel graph learning problem with a sublinear convergence rate. We provide theoretical insights on the graph topology learned from the functional priors and show that the resulting regularizers subsume a broad class of graph filter regularizers, including polynomial graph regularizers as special cases. We show via extensive experiments on synthetic and real datasets that the proposed formulation gives rise to reliable estimates of graph topology, even with insufficient data.
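
The two-timescale idea can be seen on a toy scalar bilevel problem: the lower-level variable takes large steps toward its own optimum while the upper-level variable takes small steps along an estimated hypergradient. The quadratic objectives, step sizes, and names below are assumptions chosen only to make the mechanics visible; they are not the paper's graph-learning formulation or its TTGD algorithm.

```python
def two_timescale_gd(a=2.0, b=3.0, c=1.0, alpha=0.05, beta=0.5, iters=500):
    """Toy two-timescale gradient descent on a scalar bilevel problem.

    Lower level: y*(x) = argmin_y 0.5 * (y - a*x)**2, so y*(x) = a*x.
    Upper level: minimize f(x, y*(x)) with f(x, y) = 0.5*(y - b)**2 + 0.5*c*x**2.
    The hypergradient is df/dx + (dy*/dx) * df/dy = c*x + a*(y - b),
    evaluated at the current lower-level iterate y instead of the exact y*(x).
    """
    x, y = 0.0, 0.0
    for _ in range(iters):
        y -= beta * (y - a * x)               # fast timescale: lower-level step
        x -= alpha * (c * x + a * (y - b))    # slow timescale: upper-level step
    return x, y


if __name__ == "__main__":
    x, y = two_timescale_gd()
    # Closed-form stationary point for comparison: x = a*b / (a**2 + c), y = a*x.
    print(x, y)
```

The step-size gap (beta much larger than alpha) is what lets the lower-level iterate track its optimum closely enough for the approximate hypergradient to be useful.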


[11] 2606.13891

TetraRL: A Self-Adaptive Runtime for On-Device Deep Reinforcement Learning Systems

Autonomous robotic systems, including autonomous vehicles, drones, and mobile robots, increasingly rely on on-device Deep Reinforcement Learning (DRL) to adapt to dynamic environments. Unlike cloud-based solutions, embedded DRL must perform training and inference directly on resource-constrained hardware while maintaining timely decision-making. This creates a fundamental challenge: balancing four tightly coupled objectives, real-time performance, task reward, memory utilization, and energy consumption. Optimizing these objectives independently often leads to suboptimal behavior, while conventional multi-objective methods may violate resource constraints and compromise reliability. This paper presents TetraRL, a self-adaptive runtime framework for tetra-objective on-device DRL. TetraRL formulates embedded DRL as a unified optimization problem over real-time, reward, RAM, and reserve (energy) objectives, and employs a preference-conditioned reinforcement learning controller to dynamically navigate the resulting trade-off space. The framework integrates a unified resource-management abstraction, hardware-aware DVFS control, and a runtime Override Layer for robust constraint enforcement. We implement TetraRL on NVIDIA Jetson AGX Orin and Orin Nano platforms and evaluate it across diverse DRL environments. Results show that TetraRL effectively balances all four objectives, achieves competitive trade-offs under varying runtime preferences, and incurs negligible overhead. Moreover, a single trained policy can support runtime-switchable optimization goals, providing a practical foundation for resource-aware and self-adaptive on-device DRL.


[12] 2606.13919

GMN4AD: Graph Matching Network for Alzheimer's Disease Diagnosis with Test-Time Domain Adaptation using Multi-centered Structure Magnetic Resonance Imaging

Alzheimer's Disease (AD) is a progressive neurodegenerative disorder that affects millions of older adults, with prevalence expected to rise significantly in the coming years. Early diagnosis, particularly during the mild cognitive impairment (MCI) stage, is critical for timely intervention. Structural Magnetic Resonance Imaging (sMRI) has emerged as a key modality for detecting AD-related brain changes, but traditional graph-based approaches often struggle with modality and inter-site heterogeneity, limiting diagnostic performance. In this paper, we propose Graph Matching Network for Alzheimer's Disease Diagnosis (GMN4AD), designed to model interactions between heterogeneous brain graphs derived from neuroimaging data. Unlike conventional methods that treat each brain graph independently, GMN4AD leverages graph matching to capture cross-graph relationships, enhancing diagnostic precision. Furthermore, we introduce a test-time domain adaptation strategy that combines contrastive learning to mitigate domain shifts during inference. Extensive experiments on three public AD datasets demonstrate that GMN4AD achieves superior performance compared to state-of-the-art methods, offering a robust and generalizable solution for AD diagnosis.


[13] 2606.13951

Accuracy of Joint Time-Based and Carrier-Phase Positioning in 5G Networks under Correlated Measurement Errors

High-accuracy positioning is critical for emerging applications such as autonomous driving, industrial automation, augmented reality, and smart cities. 3GPP Release 18 introduced carrier-phase (CP) positioning for 5G that offers superior accuracy compared to conventional time-based methods such as time of arrival (ToA). However, CP-based positioning requires resolving the integer phase ambiguity, which refers to the unknown number of full-wavelength cycles completed during signal propagation. Joint processing of ToA and CP can mitigate this integer ambiguity by narrowing down the search space of possible integers, particularly for short wavelengths. This paper investigates the performance of a positioning method that integrates ToA and CP measurements. As a main contribution, the analysis explicitly accounts for the error correlation between ToA and CP measurements. Furthermore, the study analyzes the impact of key 5G system parameters on positioning accuracy using this correlation-aware joint method in both factory and urban environments, where many 5G positioning applications are expected to emerge. The results highlight that exploiting this correlation can further improve positioning performance by approximately 7 percent. Moreover, the findings of this study provide insight into how 5G system parameters can be tuned to achieve centimeter-level accuracy under favorable conditions.
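
How a coarse ToA range narrows the integer search for carrier phase can be shown with a one-dimensional sketch. It ignores clock offsets, phase noise, multipath, and the ToA/CP error correlation that the paper analyzes; the search-window width `k` and all names are illustrative assumptions.

```python
import math


def resolve_range(phase_cycles, wavelength_m, toa_range_m, toa_sigma_m, k=3.0):
    """Pick the integer cycle count consistent with a ToA range estimate.

    phase_cycles: measured fractional carrier phase, in cycles (0 <= value < 1).
    wavelength_m: carrier wavelength in meters.
    toa_range_m:  coarse range from time of arrival, in meters.
    toa_sigma_m:  standard deviation of the ToA range, in meters.
    k:            half-width of the search window in sigmas (assumed).
    Returns (best_integer, refined_range_m, candidate_integers).
    """
    lo = math.ceil((toa_range_m - k * toa_sigma_m) / wavelength_m - phase_cycles)
    hi = math.floor((toa_range_m + k * toa_sigma_m) / wavelength_m - phase_cycles)
    candidates = list(range(lo, hi + 1))
    if not candidates:
        raise ValueError("ToA window contains no integer candidate")
    # Choose the candidate whose implied range is closest to the ToA estimate.
    best = min(candidates,
               key=lambda n: abs((n + phase_cycles) * wavelength_m - toa_range_m))
    return best, (best + phase_cycles) * wavelength_m, candidates


if __name__ == "__main__":
    n, rng_m, cands = resolve_range(0.4, 0.1, 12.36, 0.05)
    print(n, round(rng_m, 4), cands)
```

The number of candidates scales with the ratio of ToA uncertainty to wavelength, so the same ToA quality leaves more integers to test when the wavelength is short.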


[14] 2606.13957

High-Fidelity Video Compression based on Invertible Neural Transform and Implicit Conditioning

Learning-based video compression has recently achieved competitive rate-distortion performance compared to conventional video codecs. However, most existing methods rely on non-invertible analysis-synthesis transforms, with reconstruction quality subject to both quantization and transform approximation errors. This limitation becomes particularly restrictive at higher quality points, where quantization errors are small and transform-induced distortion dominates. To address this, we propose InnVC, an Invertible neural network based Video Codec for wide-range and high-fidelity compression. The core idea is to preserve an invertible main transform path prior to quantization, while injecting content-adaptive context through a compact implicit conditioning field. This decouples strongly correlated video content from harder-to-model fine details, allowing different components to specialize in complementary reconstruction tasks for more efficient compression. To further improve compressibility, we introduce a scheduled masking strategy that progressively concentrates informative content into fewer latent channels for more effective entropy coding. Experiments on the UVG and MCL-JCV benchmarks show that InnVC achieves strong compression performance over a broad quality range, being particularly effective in the high-quality regime, yielding BD-rate reductions of 21.66% in PSNR and 46.06% in MS-SSIM relative to x265 on UVG. To the best of our knowledge, InnVC is the first neural video codec that covers operating points from low bitrate to high fidelity within a single architecture scale, spanning more than 20 dB in PSNR.


[15] 2606.14004

Unsupervised Approaches for Global Prosodic Embedding Extraction

Prosody is central to oral communication, conveying information like the emotional state of the speaker and cues needed for meaning disambiguation. Many self-supervised models of speech produce embeddings that encode prosodic as well as linguistic and speaker information. This entanglement of information is problematic in scenarios where prosody is the main distinguishing factor while other factors may vary between training and deployment; in such cases, a purely prosodic representation would be more robust. Such a representation could also be used for analyzing the role of prosody in a given task or as input to speech synthesis systems. In this work, we propose a variety of approaches for producing global prosodic embeddings based on auto-encoder models of pitch and energy. We develop a benchmark for assessing the performance of these representations, showing that our embeddings provide competitive or superior performance under challenging conditions, compared to various alternatives.


[16] 2606.14056

Space-Based GNSS Radio Frequency Interference Detection Evaluation Through Multi-Satellite Data Integration

Space-based GNSS reflectometry (GNSS-R) can detect terrestrial radio frequency interference (RFI) through elevated noise power in delay-Doppler map forbidden zones. This study evaluates how constellation size affects detection performance using Level 1 delay-Doppler observations from seven CYGNSS spacecraft collected over three months from the NASA this http URL archive. Four metrics are analysed: detection latency, spatial coverage, spatial coherence, and persistence monitoring reliability. Results show that the full seven-satellite constellation reduces median detection latency by a factor of 4.7 compared with a single satellite and increases interception probability for a 5-minute emission from 2\% to 11.5\%. Median footprint revisit time improves from 5.8 hours to under 2.0 hours. Spatial coherence analysis indicates that a single satellite leaves up to 72\% of source structure unresolved. Persistence monitoring confirms interference onset 39 days earlier than single-satellite deployment. The largest gains occur between one and three satellites, establishing three satellites as the minimum effective constellation size.


[17] 2606.14064

HPC-Enabled Generator Importance Assessment for RTO-Scale Resource Adequacy Planning

Modern power systems are increasingly under stress as aging assets approach retirement and load growth outpaces new generation construction. The severity of this challenge varies by region: in the EU, the transmission grid can partially compensate for local generation shortfalls, while in the US, generation tends to be more localized, making retirements harder to offset. Retirement of generation has consequences for system reserves, fuel supply chain, and public health. We present a high-performance computing (HPC) framework for rapidly assessing the grid importance of individual generating units and ranking them by primary fuel type, operating cost, or grid impact. Historically, such studies were computationally intensive and therefore conducted infrequently. This work demonstrates that such assessments can be completed in minutes, enabling planners to evaluate a much broader range of generation portfolio scenarios than was previously possible.


[18] 2606.14067

Vision-Based Efficient Joint Trajectory and Channel Tracking in Near-Field XL-MIMO Systems

Accurate joint tracking of mobile users, surrounding scatterers, and dynamic channels is a critical task for sixth-generation (6G) wireless systems, essential for both ensuring high-quality communications and empowering advanced sensing applications such as autonomous driving and immersive extended reality. While extremely large-scale multiple-input multiple-output (XL-MIMO) inherently offers strong support for this task through its high spatial resolution and spectral efficiency, its massive scale of antenna arrays, coupled with near-field propagation characteristics, makes joint trajectory and channel tracking time-consuming and hardware-intensive. To address these challenges, we rethink the problem from a vision-based signal perspective. Specifically, we design a subarray-based partially connected hybrid beamforming (PC-HBF) architecture with a tailored time-multiplexed (TM) mechanism. This effectively compensates for the aperture loss caused by limited radio frequency (RF) chains, generating high-fidelity Cartesian-domain signal images that inherently capture near-field spatial features. Based on this visual representation, we propose an improved CenterNet to perform accurate one-shot path localization, circumventing the path-iterative search required by conventional compressed-sensing-based methods. Building upon this to further improve the accuracy and exploit temporal correlation, a local small-scale orthogonal matching pursuit (OMP) refiner and a lightweight cascaded OMP tracker are developed. Finally, a Hungarian-based trajectory association module is incorporated to maintain track continuity and provide trajectory-level information for environment monitoring. Simulation results show that the proposed framework consistently outperforms representative baselines in position and channel tracking accuracy, especially under low-SNR and limited-hardware conditions.


[19] 2606.14091

Who Spoke When in Multi-Conversation: Target Speaker Tagging Task and Benchmark

We present target speaker tagging (TST), a task that integrates speaker diarization, verification, and identification into a unified workflow for multi-speaker conversations. Given long recordings and pre-enrolled speakers, TST detects and labels speech segments of known speakers while rejecting unknown ones. Despite its practical importance, research has been limited by the absence of suitable evaluation resources. To address this, we introduce TST-Bench, a large-scale synthetic benchmark with over 150 enrolled speakers, 300 sessions of 20-60 minutes, and reference annotations with global speaker labels. We define an evaluation protocol encompassing diarization and full-pipeline scenarios. Experiments on both real and synthetic data show that TST poses challenges not captured by conventional benchmarks, and that dedicated system design yields significant gains over naive integration of existing solutions. The benchmark dataset and evaluation protocols are publicly released.


[20] 2606.14107

Generalized Linear Graph Representation: A Compact Operator Space for Graph Signal Processing and Graph Neural Networks

Graph Signal Processing (GSP) and Graph Neural Networks (GNNs) rely fundamentally on the matrix representation of the underlying graph topology. This representation defines key operators such as the graph Fourier transform, spectral filtering, and convolution. Existing parameterized operator families interpolate only partial subsets of classical graph matrices, while broader formulations become non-compact when representing transition-type operators, limiting both theoretical analysis and stable learning. To address this issue, we propose the Generalized Linear Graph Representation (GLGR), denoted by $\mathbf{Q}_{\alpha,l}$, as a compact two-parameter operator family defined on a bounded linear domain. GLGR unifies major classical operators together with transition-type operators without requiring asymptotic parameters. Theoretically, we show that $\mathbf{Q}_{\alpha,l}$ admits a variational decomposition balancing local smoothness and global degree-weighted energy, derive spectral perturbation bounds, and establish graph-aware sufficient conditions for positive semi-definiteness. Building on this formulation, we develop Adaptive GLGR Convolution (AG-Conv), which makes the propagation operator itself learnable within end-to-end GNNs. Experiments on graph classification and node classification benchmarks show that GLGR improves both fixed-operator representation search and adaptive graph learning across multiple backbones.


[21] 2606.14114

Digital Twin-Based Channel Generation Toolchain and Foundation Model for Low-Altitude XL-MIMO

The rapid development of the low-altitude economy (LAE) has created growing demand for reliable aerial communication systems. Extremely large-scale multiple-input multiple-output (XL-MIMO) is a promising enabler for such systems due to its high spatial resolution and robust connectivity. However, three-dimensional (3D) mobility together with near-field propagation makes it difficult to obtain dedicated high-fidelity wireless datasets, hindering systematic algorithm development and evaluation. To address this issue, we develop LAETwin-XL, a digital twin (DT)-based toolchain and dataset for XL-MIMO research in LAE scenarios. Built on the Sionna ray-tracing (RT) module, the proposed toolchain simulates near-field and far-field channels with diverse wireless labels for practical environments. Building on this dataset, we further develop a conditional denoising diffusion implicit model (CDDIM)-based generative foundation model that is pretrained to learn transferable XL-MIMO channel representations from incomplete channel observations. Unlike conventional task-specific or foundation models that rely on relatively complete channel inputs, the proposed model can generatively infer informative channel representations from partially observed channels. Experimental results demonstrate that the proposed framework achieves effective zero-shot channel extrapolation performance. Furthermore, using lightweight task heads and limited training data, it enables parameter-efficient transfer to various downstream tasks (e.g., channel estimation, classification, and localization), delivering high accuracy and robustness even under sparse antenna observations. The codes and dataset are available at this https URL.


[22] 2606.14120

FAConformer: Frequency-Aware Convolutional Transformer for Auditory Attention Decoding

Auditory attention decoding (AAD) aims to infer the attended speaker from neural responses in multi-speaker acoustic environments and is a key problem for neuro-steered hearing systems. Although recent studies have achieved encouraging progress, existing AAD models still do not fully exploit frequency domain electroencephalography (EEG) information. In particular, most approaches introduce multi-band information through handcrafted feature extraction or direct cross-band feature concatenation, which mainly exploit frequency information at a shallow level and may overlook band-specific patterns and cross-band interactions. To address these limitations, this paper proposes FAConformer, a frequency-aware CNN-Transformer framework for AAD that explicitly integrates band-specific encoding and adaptive cross-band interaction. Specifically, FAConformer first decomposes EEG signals into multiple frequency bands and assigns each band to an independent CNN-Transformer encoder for band-specific modeling. The resulting band-wise features are then adaptively fused by a carefully designed frequency-aware attention (FAA) module that models cross-band dependencies by treating band-wise features as tokens. Further, band-wise auxiliary supervision (BAS) is introduced to prevent weakly contributing branches from being under-optimized during joint training. In this way, FAConformer performs frequency-aware modeling that more effectively exploits frequency domain information. Extensive experiments on two public AAD datasets with three decision-window lengths demonstrated that FAConformer consistently outperformed 12 competitive baselines, surpassing the current state-of-the-art model by 4.9%. Further analyses of band importance, ablation, and parameter sensitivity verify the effectiveness, robustness, and interpretability of the proposed framework. Code is available at this https URL.


[23] 2606.14136

Environment-Aware Stable Neural Koopman Dynamics Learning for Input-Driven Systems under Environmental Constraints

Constructing predictive models of nonlinear dynamical systems from measurement data is a longstanding problem in systems identification and control. Although Neural ordinary differential equations (Neural ODEs), Koopman operator approximations, and input-aware architectures have each moved the field forward, none simultaneously addresses environment-varying operating conditions, rigorous stability guarantees, and input-to-state stability (ISS) certification within a unified trainable framework. This paper introduces Environment-Aware Stable Neural Koopman Dynamics Learning (ESNKD), which integrates four components: (i) a bundle-structured encoder that maps environmental observations to a geometrically regularized latent manifold, drawing on the fiber bundle framework; (ii) an input-conditioned Neural ODE whose residual term handles arbitrary external signals, extending the input concomitant philosophy; (iii) a contraction synthesis layer enforcing convergence via Persidskii-type tractable linear inequalities, analogous to the certification mechanism; and (iv) a Koopman lifting stage with LMI-based ISS verification that follows the theoretical pipeline of. Theoretical guarantees cover solution existence and uniqueness, incremental exponential stability, ISS with explicit gain bounds, and robustness to environmental perturbation. Experiments on five benchmark systems, including two robotic manipulation platforms, show consistent improvements over five competitive baselines in both prediction accuracy and safety certification rates.


[24] 2606.14175

HIDVAS: A Hearing Instrument Dataset in Various Acoustical Scenarios for Algorithm Evaluation and Training

To evaluate the performance of audio signal processing algorithms and to train data-driven algorithms, e.g., as applied in hearing instruments, either simulated or recorded data can be used. While large batches of simulated data can be generated using mathematical models, recorded data provide a more adequate representation of real-life scenarios. Therefore, in this paper, the Hearing Instrument Dataset in Various Acoustical Scenarios (HIDVAS) is introduced. This dataset consists of both impulse responses and audio recordings using eight external loudspeakers, two external microphones, and a dummy head. On this dummy head behind-the-ear (BTE) hearing instrument shells with two microphones per shell are mounted, and in the dummy head's ears receiver-in-canal (RIC) hearing instrument loudspeakers are inserted. The dummy head also contains microphones located at its eardrum. The impulse responses have been computed from a swept-sine recording for each microphone-loudspeaker pair, and the audio recordings have been obtained by playing back audio (male and female speech, speech shaped noise, singing voice, stringed instrument, wind instrument, and percussion instrument) through each individual loudspeaker and recording simultaneously using all microphones. These recordings have been repeated for four hearing instrument domes (open, semi-open, closed, and no-RIC) in three reverberation conditions in one room (T30 = 0.09 s, T30 = 0.47 s, and T30 = 0.73 s), and in one reverberation condition in a different room (T30 = 1.48 s). The usage of the dataset as a `hearing instrument in a box' is exemplified with three example use cases.


[25] 2606.14198

SOS-based Stability Verification for Saturated INDI Control of Hybrid-VTOL Aircraft Pitch Rate Dynamics

Incremental nonlinear dynamic inversion (INDI) is a prominent flight-control strategy valued for its robust disturbance rejection; however, its formal stability verification has traditionally been limited to linearized dynamical models. This paper presents a formal nonlinear stability certificate for a saturated INDI pitch-rate controller for a hybrid vertical take-off and landing (VTOL) aircraft by representing the INDI controller via an equivalent recurrent equilibrium network (REN). By casting the saturated INDI architecture as a REN, the closed-loop dynamics are exactly mapped to an augmented state-feedback system. This structural equivalence enables the use of sum of squares (SOS) programming to synthesize a locally valid Lyapunov function without relying on conservative bounding approximations. The resulting certificate yields an inner estimate of the region of attraction (RoA) that explicitly accounts for actuator saturation, formally verifying the controller's stability in operating regimes where standard linear margins lose their validity.


[26] 2606.14223

Event-Level Sensing for Intelligent 6G ISAC

The intelligent evolution of mission-critical networks, such as the Internet of vehicles (IoV) and the low-altitude economy (LAE), requires sixth-generation (6G) networks to move beyond discrete physical parameter estimation toward deeper environmental understanding. However, existing integrated sensing and communications (ISAC) studies mainly focus on target-level sensing, which provides fragmented snapshots of the physical world and lacks the behavioral semantic capability to interpret intent. This limitation hinders the intelligent evolution of such networks and prevents 6G from acquiring the essential sensing foundation to evolve into an "intelligent service engine". To bridge this gap, ISAC must advance toward event-level sensing, which models continuous-time states to enable persistent recognition and prediction of target intent and behavioral semantics. This article presents a comprehensive overview of event-level sensing in 6G ISAC networks. We first introduce its fundamental concepts, sensing types, and representative scenarios. We then review key enabling techniques across waveform design, target state estimation and tracking, and event recognition. Furthermore, focusing on IoV and LAE scenarios, we discuss representative applications of ISAC event-level sensing and the intelligent enhancement of downstream operational functions enabled by event-level information. Finally, we highlight future research trends and potential directions to further advance ISAC event-level sensing toward intelligent and proactive 6G networks.


[27] 2606.14248

Spectrum Aware Illumination Estimation Using Multispectral Image

Multispectral (MS) imaging extends beyond conventional RGB imaging by capturing more spectral bands, thereby improving illuminant spectrum estimation (ISE). However, existing methods often fail to fully exploit spectral information, resulting in suboptimal performance under diverse lighting conditions and across different sensor domains. Hence, we propose a deep learning framework with a spatio-spectral feature extraction block, which incorporates spectral attention mechanisms to enhance spectral correlation and preserve illuminant-relevant spatial features. Through the inclusion of an illuminant prior (IP), our approach prioritizes specific channels that provide more meaningful information in an MS image. We also propose a spectral-domain transform across different MS sensor spaces. The results demonstrate that illuminant spectra learned in high-dimensional sensor spaces can be effectively transformed to various lower-dimensional camera sensor spaces without any additional training. To facilitate evaluation, we introduce a real-world MS dataset containing high-dimensional ground-truth illumination spectra captured under diverse lighting conditions. Through extensive experiments, we demonstrate that our method achieves superior accuracy compared to existing models, thus providing a practical solution for real-world ISE. The code and dataset are available at this https URL.


[28] 2606.14286

Topology Optimization for DC Circuit Breaker Placement in HVDC Switching Stations

HVDC protection will be required in future multiterminal HVDC grids to prevent large outages caused by DC faults. Therefore, system-level protection design is essential for the development of HVDC switching stations that connect several converter stations and lines within these grids. This paper presents an optimization method for the design of HVDC circuit breaker (DCCB) configurations in HVDC switching stations and electrical energy hubs. This approach builds on the current practice of using selected configurations based on pre-defined protection strategies. In contrast to these existing methods, the DC switching station design in the proposed method offers significantly more flexibility and allows the consideration of large numbers of relevant operating conditions, leading to more effective, optimal design outcomes. A mixed-integer linear optimization problem is formulated to design the DC protection and minimize the risk of high impact DC faults. An example case study demonstrates that the optimization method allows the calculation of the optimal number of DCCBs for a given DC switching station, based on the failure rates of DC grid components and the DCCB cost relative to the fault impact. With these results, the marginal benefit to risk reduction of each additional DCCB included in a DC switching station is calculated. Moreover, the result of the optimization problem provides the optimal breaker configuration for the required number of DCCBs and can consequently be used as a topological design tool for DC switching stations.


[29] 2606.14291

Intelligent Domain Adaptation for Power System Transient Stability Assessment Under Varying Operating Scenarios

While deep learning-based transient stability assessment (TSA) approaches have exhibited great potential in power system stability monitoring, they are prone to undergo performance degradation in practical contexts with frequent variations of operating conditions. To address this issue, this work develops an adaptive TSA framework via domain adaptation-enabled deep transfer learning. First, for the sake of capturing the primary transient stability characteristics, a robust metric, i.e., heterogeneous hybrid distribution metric (HHDM), is designed through mathematical means to effectively handle multi-scale Gaussian and long-tail distributions of transient responsive data and to precisely quantify the intrinsic distributional discrepancies between the source and target domains corresponding to different operating scenarios. With the help of the HHDM, a Bayesian theory-based dual-distribution domain adaptation method is constructed, aligning not only marginal probability distributions between domains but also the distributions of sub-domain categories. Such alignments enable fine-grained transient stability feature transfer, helping significantly improve the adaptability of a well-trained TSA model to target domains. Furthermore, a multilayer sparse regularization algorithm is introduced to mitigate feature volatility caused by variations in operating scenarios, thereby enhancing the model's generalization in the presence of unforeseen scenarios. Numerical tests on three test systems illustrate that, compared with conventional methods, the proposed framework improves online TSA accuracy by 0.5% to 5% in a cost-effective manner, with the learning cost for TSA model update largely reduced.


[30] 2606.14293

On the Feasibility of Passive Bistatic ISAC Based on Unmodified LoRa

Integrated Sensing and Communication (ISAC) enables sensing capabilities by reusing communication signals, making it particularly attractive for large-scale deployments through signals of opportunity. While most existing ISAC research targets wideband systems, Low Power Wide Area Network (LPWAN) technologies such as LoRa remain largely unexplored from a radar-like sensing perspective. Existing LoRa-based approaches mainly focus on motion detection or require modifications of the communication waveform, limiting their applicability in deployed networks. This paper investigates the feasibility of radar-like sensing using unmodified LoRa communication signals as signals of opportunity in a purely passive bistatic ISAC configuration. The proposed approach focuses on Doppler-based sensing to enable target separation and super-resolved target estimation without interfering with existing LoRa network operation. The analytically derived sensing capabilities are compared against simulation results and validated through bistatic measurements using two USRP B210 software-defined radios, confirming the feasibility of Doppler-based LoRa sensing under practical conditions and revealing relevant implementation challenges. The results demonstrate that LoRa-based ISAC enables highly scalable, large-area, low-resolution sensing by leveraging existing infrastructure, providing a complementary sensing capability to area-limited high-resolution 6G ISAC systems, and a foundation for future multi-node and data fusion extensions.


[31] 2606.14372

$\kappa$: A Geometry-Quality Metric Complementary to GDoP for Closed-Form TDoA Multilateration

The Geometric Dilution of Precision (GDoP) characterizes the noise sensitivity of a Time-Difference-of-Arrival (TDoA) localization system, but does not capture every way the analytical multilateration solution can become ill-conditioned. We introduce a complementary geometry-quality metric $\kappa$, the leading coefficient of the closed-form TDoA solver's quadratic, and derive its $N$-dimensional generalization through a vectorized formulation. Two closed-form algebraic identities relate $\kappa$ to the Jacobian determinant of the measurement model and to the quadratic's discriminant, establishing that the system exhibits exactly two distinct singularity loci: branch divergence and the Jacobian/branch-merge locus flagged by GDoP. A Cramér--Rao-bound-linked closed form for the noise sensitivity $\sigma_\kappa$ under the standard Gaussian ToA model is validated against Monte Carlo to 2% median relative error. An empirical atlas over a dimensionless geometry parameter space confirms both identities at machine precision and shows that $\kappa$-bad regions and GDoP-bad regions are non-trivially disjoint in target space, establishing the two metrics as genuinely complementary. A case study on a four-node operational array, with per-sensor time of arrival (ToA) noise estimated empirically from Automatic Dependent Surveillance Broadcast (ADS-B)-paired over-the-air captures, shows that the theory-predicted threshold and a Monte-Carlo-measured operational threshold agree on the per-subsystem ordering at the deployment noise level. Their ratio is approximately constant across the three two-dimensional subsystems, serving as a deployment-specific calibration constant between the algebraic $\kappa$-noise floor and the downstream operational threshold, analogous in spirit to the standard relation linking GDoP to the circular error probable.


[32] 2606.14401

A Feedback Stability Theorem for Frequency-dependent Compensation of Excess and Lack of Passivity

This article studies the stability of feedback interconnections of linear time-invariant systems based on frequency-dependent passivity indices. Using these frequency-dependent passivity indices, we show that the feedback interconnection of two systems can be certified to be stable even if both systems have a lack of passivity in terms of their scalar passivity indices. The main contribution of this paper is a new stability theorem based on frequency-dependent passivity indices. Moreover, we discuss the connection of the proposed feedback stability theorem to prior results based on scalar passivity indices. A numerical case study showcases the advantages of frequency-dependent passivity indices over scalar indices for feedback interconnections of linear systems.


[33] 2606.14412

Repeater-Assisted Massive MIMO Downlink Performance with Calibration Errors

Reciprocity-based downlink beamforming is imperative for a scalable time-division duplex massive multiple-input multiple-output (MIMO) deployment. Specifically, for a dual-antenna repeater-assisted massive MIMO system, a mismatch between forward and reverse path gains at the repeater can exacerbate the overall calibration error between the user equipments (UEs) and the base station (BS), which potentially also contains calibration errors of their individual radio-frequency chains. This paper models the effects of such calibration errors, characterizes the relations between the uplink and downlink channels for repeater-assisted systems with calibration errors combined with over-the-air channel estimation errors, and derives analytical expressions of the downlink spectral efficiency. The presented results can then be simplified to several special cases, underscoring situations wherein such errors can become pronounced.


[34] 2606.14419

On Optimal Strategies for Joint Reciprocity Calibration in Distributed MIMO

This paper investigates the impact of reciprocity calibration errors on the downlink spectral efficiency (SE) of multi-user large antenna systems. Specifically, we consider two calibration approaches: (a) global calibration, in which all antennas (can be distributed access-points (APs)) in the system cooperatively perform calibration, and (b) local calibration, wherein only a subset of antennas involved in downlink beamforming performs calibration. We derive the downlink SE considering the use-and-then-forget bound and side-information bound, and then demonstrate that, when downlink pilots are employed (in the case of side-information bound), the global calibration outperforms local calibration for arbitrary calibration topologies.


[35] 2606.14426

A Floquet Mode LQR for Orbital Station-Keeping in Cislunar Space

A linear optimal control law for orbital station-keeping in the Earth-Moon Restricted Three-Body Problem (R3BP) is developed via Linear Quadratic Regulator (LQR) theory. First, the cost function is established considering a periodic state-weight matrix, leveraging stability information of the target orbits retrieved through Floquet theory. Then, the resulting periodic Riccati differential equation is solved and local asymptotic stability guarantees are shown. Finally, the performance of the proposed LQR when tracking periodic orbits in the circular and elliptic R3BPs is analyzed numerically.


[36] 2606.14434

Orbital Station-Keeping in the Earth-Moon System via Nonlinear Backstepping

A nonlinear orbital station-keeping solution for the circular and elliptic versions of the Earth-Moon Restricted Three-Body Problem (R3BP) is developed via a backstepping technique. Formal guarantees for global asymptotic stability (GAS) are attained, as shown through Lyapunov's stability theory. The adequacy of the proposed control law is evaluated by means of numerical trials over closed periodic solutions of the circular and elliptic R3BPs. The ramifications of the control gain choice are carefully studied and simulated.


[37] 2606.14471

A Generalized Plant Perspective on Linear-Convex Feedback Optimization

Feedback optimization is a control approach for driving a dynamical system to the solution of an optimization problem by interconnecting the plant with an algorithm. Existing stability guarantees typically rely on timescale separation, enforced by conservative gain bounds that limit transient performance and require a pre-stabilized plant. This paper revisits the robust control perspective on feedback optimization. We formulate the plant-optimizer interconnection as a generalized plant, where the cost gradients are characterized by Zames--Falb Integral Quadratic Constraints (IQCs). Classical timescale-separation bounds are recovered as a special case of static multipliers, with dynamic multipliers yielding substantially tighter stability margins. The formulation also enables IQC-based synthesis of dynamic output feedback controllers that jointly stabilize the plant and optimize transient performance, with possible model uncertainty absorbed into an uncertainty channel. For constrained problems, the framework extends to dynamic controllers that generalize projected gradient flows. Numerical examples illustrate the benefits and flexibility of the proposed approach.
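To make the plant-optimizer interconnection concrete, here is a minimal sketch of feedback optimization on a toy example. Everything in it is an assumption chosen for illustration and is not taken from the paper: the scalar plant, the quadratic cost, the gradient-integrating controller, and the name `simulate`. It does not implement the IQC analysis; it only shows how the controller gain trades convergence speed against transient overshoot.

```python
def simulate(gain, a=1.0, b=2.0, ref=3.0, dt=1e-3, steps=40000):
    """Forward-Euler simulation of a toy feedback-optimization loop.

    Assumed plant:      x' = -a*x + b*u,  y = x   (stable for a > 0)
    Assumed cost:       phi(y) = 0.5 * (y - ref)**2
    Assumed controller: u' = -gain * (b/a) * (y - ref)
    where b/a is the steady-state sensitivity dy/du of the plant.

    Returns the final state, the final input, and the peak output.
    """
    x, u = 0.0, 0.0
    sensitivity = b / a
    peak = x
    for _ in range(steps):
        y = x                      # measured plant output
        grad = y - ref             # gradient of the cost at the measurement
        dx = -a * x + b * u        # plant dynamics
        du = -gain * sensitivity * grad  # gradient flow driven by feedback
        x += dt * dx
        u += dt * du
        peak = max(peak, x)
    return x, u, peak


if __name__ == "__main__":
    for g in (0.05, 0.5):
        x, u, peak = simulate(g)
        print(f"gain={g}: y_final={x:.4f}, u_final={u:.4f}, peak={peak:.4f}")
```

In this toy loop both gains converge to the minimizer of the assumed cost, but the larger gain overshoots noticeably, whereas the small gain (controller much slower than the plant) converges monotonically. A first-order plant like this one cannot be destabilized by the gain, so the sketch illustrates only the transient-performance side of the trade-off.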


[38] 2606.14478

Optimization Models and Steady-State Minimum-Fuel Operating Strategies for Hydrogen-based Hybrid Electric Aerospace Propulsion Systems

This paper presents an optimization framework for the operation of hydrogen-based hybrid electric aerospace propulsion systems consisting of a hydrogen gas turbine and an electric motor powered by a solid oxide fuel cell, connected to the gas turbine via multiple gas channels and heat exchangers. Our framework computes the minimum-fuel optimal operating strategies over a flight mission accounting for the complex propulsion system with strong thermodynamic and mechanical coupling between components. First, we identify surrogate optimization models of the components employing high-fidelity model simulations. Second, we frame the minimum-fuel optimal control problem over a given flight mission and parse it into a static nonlinear optimization problem that can be efficiently solved with off-the-shelf nonlinear programming algorithms. Finally, we apply our optimization framework to a typical flight mission of an advanced commuter aircraft (Beechcraft 1900D market segment), considering a parallel propulsion system architecture with four different configurations that share a common baseline but differ in the inclusion of an additional battery and by-pass valves around the two heat exchangers. The resulting optimal trajectories are validated against high-fidelity simulation results, demonstrating the accuracy of our framework. Results show that adding by-pass valves around the air and hydrogen heat exchangers can reduce fuel consumption by 19.11% without the battery, and by 19.56% with the battery. We show that adding a battery yields a slight reduction in fuel consumption (below 1%) for future projected energy densities under steady-state conditions. Conversely, when considering state-of-the-art energy densities, the additional battery weight outweighs the benefits, limiting its potential applicability to only assisting transients, which are not considered in the present work.


[39] 2606.14486

Implications of the Reciprocity Theorem for Reconfigurable Intelligent Surfaces

Reciprocity between a transmitter and receiver is a foundational requirement in wireless communications. A few recent works have suggested that reciprocity is broken under reflection by reconfigurable intelligent surfaces (RIS) when the reflection phase becomes incident angle dependent. In this work, we rigorously show that these claims are based on the use of idealized reflection coefficients that ignore mutual coupling between heterogeneous unit cells, surface-truncation effects, and structural scattering contributions from the RIS. Full-wave electromagnetic simulations of transmit/receive antennas and a finite-size RIS implemented via a particular unit cell design are performed to quantitatively demonstrate that reciprocity holds even in the presence of incident-angle dependent reflection phases. To show this, we calculate two-port antenna scattering parameters and evaluate the electromagnetic reciprocity integral to support our claims.


[40] 2606.14499

Beamforming Design for Stem-Connected Microwave Linear Analog Computer (MiLAC)-Aided Multiuser MISO Downlinks

A microwave linear analog computer (MiLAC) is a tunable microwave network that performs computation through wave propagation in the analog domain. In beamforming, data streams pass through a reconfigurable admittance network and emerge as antenna signals. For communications, MiLACs are preferably lossless and reciprocal to avoid power dissipation and non-reciprocal components, but these constraints limit the analog beamformers they can realize. Fully-connected MiLACs offer broad flexibility at the cost of a quadratic number of tunable admittances in the antenna count. Stem-connected MiLACs reduce this scaling to linear and preserve point-to-point capacity, but their role in multiuser downlink beamforming and under bounded, discrete hardware constraints has remained open. This paper addresses both questions for the multiuser multiple-input single-output downlink. We show that a stem-connected MiLAC can realize every beamformer on the complex Stiefel manifold and prove that, when $N\ge 2K-1$, this Stiefel-restricted design achieves the same sum-rate as the fully-connected MiLAC, where $N$ and $K$ are the numbers of transmit antennas and users. We then develop a weighted minimum mean-square error solver with a Riemannian Stiefel update, together with a closed-form projection baseline and an alternating refinement for bounded, discrete susceptances. Simulations show that the stem-connected MiLAC matches fully-connected MiLAC performance, approaches the fully digital sum-rate upper bound without symbol-rate digital processing, and recovers most of the loss caused by direct hardware-grid quantization.


[41] 2606.14568

Trimodal Glioma Representation Alignment via Volumetric Contrastive Learning

Glioma grading and survival prediction require the integration of heterogeneous information collected at different spatial and biological scales. Histopathology describes tissue morphology, mRNA expression captures molecular activity, and magnetic resonance imaging provides a non-invasive view of tumor extent and radiological heterogeneity. Existing glioma prognosis models often combine only two of these sources, while their alignment objectives remain mostly pairwise. This paper introduces GLORIA, a novel trimodal framework for GLioma Omics - Radiology - hIstopathology Alignment. GLORIA processes whole-slide image regions, gene-expression profiles, and 3D MRI volumes through modality-specific encoders, projects them into a shared latent space, and aligns them with a Gramian contrastive loss that measures the volume spanned by the three modality embeddings. The aligned representations are fused through a cross-modal gating module and optimized jointly for three-class glioma grading and overall survival prediction. We evaluate GLORIA on a matched TCGA-GBM/LGG and BraTS21 cohort, comprising 132 patients with all three modalities. On the shared trimodal test set, GLORIA improves over the bimodal WSI-mRNA baseline in all the metrics considered.


[42] 2606.13823

A Stationarity-and-Coupling Criterion for Training-Free Time-Lagged Spectral Embeddings of Multivariate Time Series

We study training-free fixed-length descriptors for multivariate time series and ask not merely whether such a descriptor performs well, but when it can be expected to work at all. Our object of study is $D(\tau)$, built from a time-lagged correlation matrix truncated at the Marchenko-Pastur edge so that only signal-bearing eigenvalues survive and classified by cosine similarity to class centroids with zero learned parameters. The central contribution is not the descriptor but a falsifiable applicability criterion for it. Working from a stationary Gaussian VAR(1) model, we argue that $D(\tau)$ separates two classes when the signals are approximately stationary and the class information lives in their cross-channel temporal coupling rather than in marginal per-channel power. We derive, semi-formally, three consequences: a distinguishability condition, why the static ($\tau=0$) covariance collapses to chance, and why a stationary but power-discriminated paradigm defeats the descriptor. The criterion is operational: a two-part pre-flight test -- an augmented Dickey-Fuller stationarity check and a power-baseline saturation check -- predicts applicability before any training. We validate both halves on a mixed assortment. On four paradigms that satisfy the criterion (Sleep-EDF, BCI-IV-2a, MIT-BIH, ESC-50) the descriptor is competitive with strong baselines at a fraction of their cost, reaching $88.5\pm4.5\%$ under 20-subject leave-one-subject-out on Sleep-EDF on a single CPU thread. On three that violate it -- non-stationary ERPs, and financial-volatility and wearable-stress regimes that are power-discriminated -- it fails exactly as the pre-flight predicts, and these negatives are the more informative half. We are explicit that $D(\tau)$ is not the most accurate representation; its value is a compact, training-free embedding whose domain of validity is known in advance.
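The two training-free ingredients named in the abstract, truncation at the Marchenko-Pastur edge and cosine similarity to class centroids, can be sketched as follows. This is a hedged illustration, not the paper's construction of $D(\tau)$: the edge formula is the standard Marchenko-Pastur upper edge for a sample correlation matrix of unit-variance noise, the eigenvalues are taken as given rather than computed from a time-lagged matrix, and all function names and toy numbers are hypothetical.

```python
import math


def mp_upper_edge(n_channels, n_samples):
    """Marchenko-Pastur upper edge (1 + sqrt(N/T))**2 for unit-variance noise."""
    q = n_channels / n_samples
    return (1.0 + math.sqrt(q)) ** 2


def signal_eigenvalues(eigenvalues, n_channels, n_samples):
    """Keep only eigenvalues above the noise edge (assumed truncation rule)."""
    edge = mp_upper_edge(n_channels, n_samples)
    return [lam for lam in eigenvalues if lam > edge]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Element-wise mean of a list of equal-length descriptors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def classify(descriptor, centroids):
    """Return the label whose centroid is most cosine-similar to the descriptor."""
    return max(centroids, key=lambda label: cosine(descriptor, centroids[label]))


if __name__ == "__main__":
    # Toy eigenvalues for 4 channels and 400 samples (hypothetical numbers).
    print(signal_eigenvalues([2.5, 1.3, 1.1, 0.4], 4, 400))
    centroids = {
        "A": centroid([[1.0, 0.0], [0.8, 0.2]]),
        "B": centroid([[0.0, 1.0], [0.1, 0.9]]),
    }
    print(classify([0.9, 0.1], centroids))
```

The classifier has no learned parameters beyond the class means, which is the sense in which the pipeline is training-free.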


[43] 2606.13834

Solving Subgraph Extraction Problems Using $\Delta$Search

Many NP-hard graph problems can be modeled as optimal subgraph extraction problems with feasibility constraints. From Network Design to Facility Location, from Robotics to Graph Drawing, the subgraph extraction pattern emerges across diverse domains. Despite this commonality, these problems are typically solved with domain-specific heuristics. Usually, these problems balance competing objectives such as maximizing coverage or minimizing cost while satisfying structural constraints such as connectivity, planarity and reachability. In this work, we introduce $\Delta$Search, a general and fast heuristic framework that exploits the insight of Reward-Penalty optimization for solving a large class of subgraph extraction problems. The framework is easy to use as it only requires feasibility constraints and optimality criteria to be provided by the user to express the subgraph extraction problem. We also show how exact methods can be augmented with $\Delta$Search to improve their performance by aggressive pruning of the search space. We evaluate our framework on monotone graph problems such as Maximum Planar Subgraph (MPS) and Minimum Connected Dominating Set, weighted monotone problems such as Maximum Weighted Independent Set and Minimum Weighted Steiner Tree, and non-monotone graph problems such as Prize Collecting Vertex Cover (PCVC) and Uncapacitated Facility Location Problem (UFLP). Our results show that $\Delta$Search matches or surpasses state-of-the-art heuristics for MPS, UFLP and PCVC problems with similar runtime. For the remaining problems, $\Delta$Search achieves approximately 89% of the solution quality of the state-of-the-art algorithms without any problem-specific tuning.


[44] 2606.13839

Explaining RhythmFormer: A Systematic XAI Analysis of Periodic Sparse Attention for Remote Photoplethysmography

Remote photoplethysmography (rPPG) transformers achieve low heart-rate error on benchmarks, yet their decisions remain opaque--a growing concern as rPPG moves toward clinical heart rate estimation. Existing rPPG XAI is dominated by qualitative heatmap inspection without quantitative faithfulness metrics or physiology-grounded validation, leaving a gap between visual plausibility and auditable evidence. We address this gap. First, we adapt four attribution methods (raw attention, rollout, flow, Beyond Intuition) to RhythmFormer's bi-level routing attention with top-$k$ selection. Second, we introduce a skin coverage metric quantifying how much attribution mass falls on skin regions. Third, we adapt the SaCo faithfulness coefficient from its original classification setting to rPPG regression by using the MAE between original and perturbed predicted rPPG waveforms as the perturbation impact. Applying these tools, we quantify a multi-hop leakage effect under sparse top-$k$ routing: attention rollout and flow almost completely restore the connections that individual refined-attention layers explicitly set to zero. Beyond Intuition mitigates this via its value-projection-weighted rollout and gradient-supported mask, attaining the highest median refined skin coverage ($0.83$ vs. $0.57$ for vanilla rollout) and faithfulness ($F=0.92$) among the evaluated methods on UBFC-rPPG. Validation across diverse datasets and model variants is needed. A case study on a low-SaCo outlier further shows all four methods recovering consistently once an artefactual region is replaced, suggesting consistent SaCo behavior across attribution families in this illustrative case. Together, these metrics move XAI for rPPG toward auditable numerical evidence about spatial alignment and perturbation faithfulness, i.e. trustworthy rPPG XAI.


[45] 2606.13915

Learning Dynamic Swing-Up of an Inverted Pendulum using Remote Magnetic Actuation

Electromagnetic Navigation Systems (eMNS) have gained considerable attention for minimally invasive surgery and targeted drug delivery. While most of the literature relies on quasi-static control of these systems, recent work has demonstrated the benefits of dynamic approaches. However, trajectory tracking far from equilibrium states remains largely unaddressed. We close this gap by demonstrating the first swing-up of a magnetically actuated inverted pendulum using the clinically-ready Navion eMNS. Although the inverted pendulum is not clinically relevant in itself, the proposed method utilizes torques and forces as control objectives, making it applicable to other magnetically actuated devices such as catheters and guidewires. Our approach combines trajectory optimization that accounts for internal eMNS dynamics with time-varying Linear Quadratic Regulator (LQR) state feedback and Iterative Learning Control (ILC), which leverages previous trial data and the system's dynamic model to progressively refine the feedforward command. While LQR alone fails due to the complex phenomena of magnetic actuation, ILC enables successful swing-up within six iterations. Furthermore, post-experimental analysis reveals that the learned ILC correction closely matches the torque discrepancy predicted by high-fidelity magnetic field model calibration, suggesting learning and adaptation as a promising tool to deal with uncertainties in electromagnetic actuation arising, e.g., from patient-specific physiological motion patterns and field model calibration inaccuracies.


[46] 2606.14008

Pseudonym Scheme Based on Hybrid Certificates for Security Credential Management System in Vehicular Communications

In recent years, the Institute of Electrical and Electronics Engineers (IEEE) and the European Telecommunications Standards Institute (ETSI) have developed a series of security communication standards for vehicular communications. These standards include mechanisms such as the Security Credential Management System (SCMS) and Butterfly Key Expansion (BKE) to protect vehicle privacy. However, these standards are mainly based on the Elliptic-Curve Cryptography (ECC), which may be vulnerable to attacks from quantum computing in the future. In response to this potential risk, this study proposes a hybrid certificate that combines the ECC with Post-Quantum Cryptography (PQC). This approach enables infrastructure systems to be built on cryptographic foundations that are more resilient to quantum-based attacks. Furthermore, this study presents a generalized pseudonym scheme that is compatible with various cryptographic algorithms for generating pseudonym certificates. This design aims to eliminate the possibility of inferring any correlation between the public key in a pseudonym certificate and that in an enrollment certificate. This study also conducts a comprehensive performance evaluation of the RSA, ECC, and PQC algorithms, particularly those standardized by the National Institute of Standards and Technology (NIST). The comparison considers factors such as message length and computation time. Based on the findings, this study recommends suitable pseudonym schemes that adopt hybrid certificates for secure and efficient use in vehicular communications.


[47] 2606.14027

Same-Origin Policy for Agentic Browsers

Agentic browsers integrate autonomous AI agents into web browsers, enabling users to accomplish web tasks through natural-language instructions. The same-origin policy (SOP) is a fundamental browser security mechanism that prevents unauthorized automated cross-origin data flows induced by scripts. However, whether SOP remains effective in agentic browsers is an open question that has not been systematically studied. In this work, we bridge this gap. We first observe that an agentic browser can itself serve as an automated channel for cross-origin data flows, potentially leading to SOP violations. To investigate this phenomenon, we construct SOPBench, a benchmark for evaluating SOP violations in agentic browsers. Our evaluation shows that existing agentic browsers frequently violate SOP, both in benign settings and under attacks. To address this problem, we propose SOPGuard, an SOP enforcement mechanism tailored to agentic browsers. We implement SOPGuard in BrowserOS, an open-source agentic browser. Extensive evaluations demonstrate that SOPGuard effectively enforces SOP while preserving utility and incurring only a small runtime overhead. Our code and data are available at this https URL.


[48] 2606.14050

Battery Bidding under Price Uncertainty in Wholesale Electricity Markets

Grid-scale batteries increasingly influence outcomes in wholesale electricity markets, but their observed bid patterns remain difficult to interpret. In particular, bids that appear to reflect strategic withholding may instead arise from rational operations under price uncertainty and risk management. We develop an asset-level model of a price-taking battery that submits stepwise buy and sell bid curves in the day-ahead market under a finite set of price scenarios. The battery chooses quantity--price pairs to maximize a mean--CVaR objective subject to physical and market constraints. A direct formulation is a mixed-integer linear program, but we show that its integer decisions can be removed, yielding an exact linear programming reformulation suitable for empirical analysis. Our empirical results deliver three insights. First, withholding behavior can arise even without market power, because scarce stored energy and uncertain future prices increase the value of holding energy. Second, the effect of uncertainty depends on the state of charge: when stored energy is scarce, greater uncertainty raises sell bid prices, whereas when stored energy is abundant it can lower them. Third, risk management reshapes bid curves into layered structures that secure profitable execution across a broad set of scenarios while preserving some exposure to rare but valuable price spikes.
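
To make the mean--CVaR objective concrete, the following sketch evaluates it for a stepwise sell curve over a handful of price scenarios. It is an illustration under stated assumptions rather than the paper's formulation: the helper names, the convex weighting `lam` between mean and CVaR, and the simplification that a price-taking seller is paid the clearing price on every accepted step are all hypothetical, and the battery's physical constraints and buy side are omitted.

```python
def sell_profit(steps, price):
    """Revenue of a stepwise sell curve at one clearing price.

    steps is a list of (bid_price, quantity); a step is accepted when its
    bid price does not exceed the clearing price (assumed uniform pricing).
    """
    return price * sum(q for p, q in steps if p <= price)

def cvar_lower_tail(profits, probs, alpha):
    """CVaR of the worst alpha-fraction of profits (Rockafellar-Uryasev form).

    For a finite scenario set the maximum over the threshold eta is attained
    at one of the scenario profits, so a scan over them is sufficient.
    """
    return max(
        eta - sum(p * max(eta - x, 0.0) for x, p in zip(profits, probs)) / alpha
        for eta in profits
    )

def mean_cvar(profits, probs, alpha, lam):
    """Weighted objective (1 - lam) * mean + lam * CVaR; the weighting is assumed."""
    mean = sum(p * x for x, p in zip(profits, probs))
    return (1.0 - lam) * mean + lam * cvar_lower_tail(profits, probs, alpha)

# Hypothetical example: two bid steps, four equally likely price scenarios.
steps = [(30.0, 1.0), (80.0, 1.0)]
prices = [20.0, 40.0, 60.0, 100.0]
probs = [0.25, 0.25, 0.25, 0.25]
profits = [sell_profit(steps, p) for p in prices]
objective = mean_cvar(profits, probs, alpha=0.25, lam=0.5)
```

With these made-up numbers the scenario profits are 0, 40, 60 and 200, so the mean is 75, the 25% lower-tail CVaR is 0, and the weighted objective is 37.5. Raising `lam` shifts weight toward the worst scenarios, which is the kind of trade-off behind the layered bid curves the abstract describes.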


[49] 2606.14063

Semidefinite Relaxations for Collision-Free Motion Planning

We study semidefinite relaxations for collision-free motion planning. We focus on a point robot moving from start to goal through spherical obstacles in $\mathbb{R}^n$, subject to path continuity constraints and squared derivative costs; a setting that is conceptually simple yet captures the hardness of collision-free motion planning. We formulate this problem exactly as a nonconvex problem over polynomial curves, and present a natural semidefinite relaxation. We contribute two key theoretical insights; to our knowledge this is the first theoretical analysis of semidefinite relaxations for collision-free motion planning. First, we show that solving the convex relaxation is equivalent to solving, to global optimality, a related motion planning problem in a potentially higher-dimensional space. This geometric interpretation yields necessary and sufficient conditions for tightness, and a clear intuition for when the relaxation is loose. Second, we show that the relaxation admits a symmetry reduction that makes it significantly smaller than one might expect, with positive semidefinite cone sizes that scale linearly with the polynomial degree and are independent of the ambient dimension. The resulting relaxation is 10 to 100 times faster than direct nonlinear programming transcriptions solved with SNOPT and IPOPT, exhibits significantly lower variance in solve times, and reliably finds a locally optimal path for the original problem. We demonstrate its effectiveness as a convex steering function in an RRT planner for minimum-snap quadrotor planning with $C^4$ continuous trajectories.


[50] 2606.14081

Clay-CNN Hybrids: Leveraging Geo-Foundational Models as Auxiliary Context for Landslide Detection

Rapid post-event landslide mapping is essential for disaster response but remains difficult to automate due to extreme class imbalance. This study evaluates whether Clay v1.5, a Geo-Foundational Model (GFM), can improve pixel-level landslide segmentation on the Landslide4Sense (L4S) benchmark, which contains 3,799 training chips with 14 Sentinel-2 and terrain bands and approximately 2% positive pixels. We compare three strategies: Clay as the primary encoder with multi-scale residual terrain fusion, a U-Net backbone augmented with Clay semantic context at the bottleneck, and a standard U-Net baseline. The hybrid U-Net + Clay model with two-stage Low-Rank Adaptation (LoRA) achieved the best test F1 of 64.5 +/- 1.8% over three seeds, surpassing the Clay-only backbone (55.2 +/- 3.6%) and the U-Net baseline (59.9%). Clay as a standalone encoder underperformed the U-Net due to the absence of multi-scale skip connections, but its pretrained representations consistently improved performance when injected as auxiliary context. These findings suggest that GFMs are most effective for landslide detection when they complement spatially detailed convolutional architectures rather than replace them.


[51] 2606.14188

Robustness without Wrinkles: Parallel Simulation and Robust MPC for Certified Deformable Manipulation

We present CORD-SLS, a real-time control method for safe deformable object manipulation, with a focus on ropes and cloth. At its core is a GPU-parallel differentiable simulator with contact smoothing which enables efficient gradient-based planning through intermittent contact. To robustly satisfy constraints under model and sensing uncertainty, we develop a real-time, GPU-parallel output-feedback robust model predictive control (MPC) algorithm that plans with this simulator. We further show that the simulator accelerates model-based RL for training neural manipulation policies. To improve real-world robustness, we use conformal prediction to calibrate visual-feedback and perception-error bounds for MPC, producing reachable tubes that enable high-probability safe control. We evaluate CORD-SLS on high-dimensional, contact-rich rope and cloth manipulation tasks in simulation and hardware, including obstacle avoidance, routing, folding, and smoothing. Across settings, CORD-SLS achieves millisecond-speed planning, exceeding baselines in safety, speed, and task success.


[52] 2606.14214

StreamRTPS: Increasing DDS Bandwidth Efficiency by Reducing Protocol Overhead

In this paper, we propose three extensions to the Real-Time Publish Subscribe (RTPS) wire protocol, on which the Data Distribution Service (DDS) is based, to improve bandwidth efficiency. First, a stream negotiation mechanism exchanges static header information during discovery, replacing the full RTPS header at runtime with a compact 2 B identifier. Second, a payload aggregation scheme aggregates samples for the same locator into single UDP packets, reducing IP and UDP header costs. Third, a predictive heartbeat suppression strategy reduces control traffic by omitting heartbeats for periodic communication patterns, falling back upon detected loss or timing violations. All three mechanisms preserve RTPS compatibility by extending DDS discovery to activate these features when supported. Experimental results show that stream headers reduce bandwidth consumption by up to 27.9 % compared to conventional RTPS under best-effort transport, and that heartbeat suppression yields a further 22.7 % reduction on top of stream headers under reliable transport, while preserving transmission latency in both cases.


[53] 2606.14216

Short-Horizon Position Accuracy of Single-Track Models: Implications for Motion Planning of Autonomous Vehicles

Accurate and computationally efficient vehicle models are essential for motion planning of autonomous vehicles, where positional accuracy directly affects trajectory feasibility and safety. However, the positional accuracy has not been systematically evaluated against real measurements. Therefore, this paper compares the short-horizon positional accuracy of three single-track vehicle models against vehicle measurements across various driving maneuvers. Model parameters are identified through dedicated experiments with the instrumented test vehicle. Rather than identifying a single best model, this work aims to provide insight into the trade-offs between model complexity, parameterization quality, and positional accuracy for informed model selection in Model Predictive Control applications.


[54] 2606.14355

Point Cloud Upsampling through Patch-based Frequency Superposition

In recent years, neural networks have become the dominant models in most point cloud upsampling methods. Although these approaches are achieving good results, they do have drawbacks, such as a lack of interpretability and data dependency. Moreover, they have to be trained on a dataset that is similar to the test data in order to perform well. To avoid these disadvantages, we propose Point Cloud Upsampling through Patch-based Frequency Superposition (PUtPFS), an optimization-based approach that selects subsets of points and estimates the surface of this set through superpositioning spatial frequencies. Then, new points are placed on this surface. By successively selecting points in the least dense regions of the point cloud, a uniform upsampling can be reached. With this method, we surpass the current best upsampling results in the commonly considered point-to-surface distance. Furthermore, we achieve the best Chamfer and Hausdorff distance among the optimization-based approaches. As an additional advantage, our method does not need any training data and is mathematically interpretable.


[55] 2606.14403

A Deep Zero-Inflated Model of North Atlantic Right Whale Presence To Support Blue Economy Management in the U.S. East Coast

Effective modeling of endangered marine mammal species, such as the North Atlantic Right Whale, is critical for balancing marine conservation with the growing blue economy. Passive acoustic monitoring data collected by autonomous underwater vehicles provide new opportunities for localized marine species detection and oceanographic sensing, but introduce complex statistical challenges such as zero inflation, imperfect detection, and intricate dependence structures. In response, we propose the Deep Zero-Inflated Bernoulli (DeepZIB) model--a deep statistical method which jointly models latent species presence and conditional detection probabilities while learning complex habitat relationships from heterogeneous covariate information. We establish theoretical results on the model's structural properties and conduct simulation experiments to demonstrate its ability to recover underlying parameters and latent presence fields. Application to real-world passive acoustic monitoring data on the North Atlantic Right Whale along the U.S. East Coast demonstrates improved model adequacy and predictive performance in capturing the species' dynamic and spatially varying habitat. A key advantage of DeepZIB is its ability to generate high-resolution, spatially and temporally varying presence maps, providing valuable insights for targeted and risk-aware management of blue economy industries, ranging from offshore and marine energy, to fisheries management and maritime transport.


[56] 2606.14421

ForestBack: Breadcrumb-Based Pedestrian Dead Reckoning for Infrastructure-Free Return Navigation

Reliable return navigation remains an important challenge in GPS-denied environments where external positioning infrastructure may be unavailable or unreliable. This paper presents ForestBack, an infrastructure-free pedestrian return navigation framework based on breadcrumb-based pedestrian dead reckoning (PDR). The system records a user's walking route as a sequence of reversible breadcrumb nodes and generates reverse-path guidance without requiring GPS, Wi-Fi, Bluetooth beacons, or pre-installed infrastructure. ForestBack integrates acceleration-based step detection, adaptive step-length estimation, magnetometer-assisted heading estimation, barometric-altitude correction, and bidirectional breadcrumb path reconstruction. The system was evaluated using an indoor obstacle-avoidance route with five checkpoints, where the user navigated around a central obstacle. A dataset of 36 walking trials and 42,474 time-series samples was used for evaluation, including IMU signals, magnetometer readings, barometric variables, turn-event labels, ground-truth trajectories, baseline PDR outputs, proposed ForestBack outputs, and power-related measurements. Experimental results show that ForestBack reduced the mean RMSE from 1.129 m to 0.965 m compared with traditional PDR, corresponding to a 15.76% improvement. The mean final-position error was reduced from 1.781 m to 1.388 m, while turn-event detection consistency reached approximately 99.90%. These results indicate that ForestBack improves trajectory reconstruction and route-preserving return guidance in obstacle-avoidance scenarios. The released dataset and analysis notebook support reproducibility and future benchmarking of infrastructure-free PDR-based return navigation systems.


[57] 2606.14448

Generalized Framework for a Fair Comparison of Cellular and Cooperative Massive MIMO Systems

Cooperative massive multiple-input multiple-output (MIMO) promises large gains over cellular deployments, but existing comparisons of different architectures often mix antenna distribution, inter-site coordination, and processing assumptions. This paper introduces a graph-based framework for fair comparison of cellular, coordinated, and cell-free massive-MIMO systems. We differentiate between two key properties, namely antenna distribution and inter-site cooperation, which yields seven representative system types. We derive compatible uplink and downlink spectral efficiency (SE) expressions, including an uplink bound for detectors with mixed instantaneous and statistical effective channel state information (CSI), and adapt scalable user association and processing rules to all considered architectures. We evaluate these systems using extensive numerical simulations and show that for a fair comparison much larger simulation areas (at least $2.5 \times 2.5$ km$^2$) than commonly used are required. We introduce the relative capacity, which measures how closely each architecture approaches centralized cell-free processing. The results show that coordinated, phase-aligned beamforming across spatially distributed antennas is the main source of cooperation gains. In dense deployments with few antennas per access point (AP), coordinated Distributed Antenna System (DAS) and hybrid cell-free architectures achieve much of the centralized cell-free performance while requiring substantially weaker midhaul assumptions.


[58] 2606.14528

BayLing-Duplex: Native Full-Duplex Speech Dialogue with a Single Autoregressive LLM

Real-time, full-duplex speech interaction is a key feature of next-generation spoken chatbots, allowing the model to listen and speak at the same time and to handle natural phenomena such as overlap, hesitation, and barge-in. Existing speech language models (SpeechLMs) such as LLaMA-Omni and GLM-4-Voice are still turn-based and rely on an external Voice Activity Detection (VAD) module to mark the end of the user's turn, which fundamentally limits their interactive ability. In this paper, we introduce BayLing-Duplex, a native full-duplex SpeechLM where a single autoregressive LLM decides when to listen, when to speak, and when to stop, with no auxiliary turn-taking module. The design adds only a few special tokens to the standard vocabulary, so it transfers across LLMs and reuses existing training and serving stacks with no architectural adaptation. Starting from the public GLM-4-Voice checkpoint and using only 400K full-duplex samples for fine-tuning followed by a lightweight DPO stage, BayLing-Duplex reaches 92% turn-taking success and 100% interruption success on InstructS2S-Eval, while improving the speech-response score from 2.17 to 3.39 over Moshi. BayLing-Duplex also matches or surpasses its turn-based counterpart on Llama Questions, Web Questions, and Alpaca-Eval, showing that simultaneous listen-and-speak modeling does not sacrifice response quality.


[59] 2606.14536

Provably Safe, Yet Scalable Reinforcement Learning

Safe reinforcement learning (RL) aims to learn policies that optimize rewards while satisfying constraints. Predominant approaches rely on soft-constrained policy optimization, which has achieved empirical success but does not provide formal safety guarantees for the learned policy. In contrast, methods with strict guarantees typically rely on explicit certificate functions, whose construction requires the direct synthesis and verification of control-invariant sets, a process that scales poorly with state dimension and often yields overly conservative behavior. In this paper, we present the Provably Safe, yet Scalable RL (PS2-RL) framework, a novel two-phase architecture for learning provably safe policies in a scalable manner, designed to overcome the key bottlenecks of prior methods. Rather than explicitly computing invariant sets, PS2-RL leverages a learned backup policy to forward-integrate the system dynamics, generating an implicit control-invariant set online. In the first phase, the backup policy is trained with our proposed safe-arrival value function, which characterizes the optimal backup policy for invariant-set construction. In the second phase, an RL policy is trained end-to-end through a differentiable projection layer that strictly enforces the safety guarantees induced by the learned backup policy. By maximizing the volume of the implicit control-invariant set in the first phase, the resulting PS2 policy from the second phase is performant and scalable, while maintaining provable safety. Crucially, PS2-RL imposes no restrictions on the underlying RL algorithm and can be plugged into any existing training pipeline. We establish theoretical guarantees for the proposed framework and evaluate it on robotic control tasks with state dimensions up to 10, a regime in which prior provably safe RL methods struggle or become impractical.


[60] 2606.14601

A Statistical and Machine Learning Framework for Operational Threshold Detection and Deployable Dispatch Controller Development in Hydrogen Multi-Energy Systems

This study presents a statistical and machine learning framework for characterizing a hydrogen-based multi-energy system (H-MES) using one year of high-resolution operational data. Statistical analysis revealed a binary operation driven by renewable surplus, with solar irradiance explaining 45.7% of rank-based variance in hydrogen production, a large effect by conventional standards. Only high-irradiance periods triggered meaningful electrolyzer engagement, while electricity demand exerted a weaker inverse suppression effect ($\epsilon^2 = 0.126$). Multiple regression confirmed electrolyzer power as the dominant linear predictor, with a synergistic solar-wind interaction. Notably, Random Forest analysis ranked wind output first in predictive importance despite its weak bivariate correlation (r = 0.167), revealing non-linear dynamics invisible to parametric methods. A sequence model exploited strong 24-hour autocorrelation (r = 0.845) for operational forecasting, while a reinforcement learning agent optimized hydrogen revenue dispatch. The core contribution is demonstrating that statistical and machine learning approaches are complementary for H-MES modeling and control.


[61] 2606.14606

Impedance MPC with Disturbance Estimation for Dexterous Hand Control

Dexterous hands must simultaneously track precise finger trajectories and maintain safe, compliant contact -- objectives in tension for any fixed-gain controller. We present an actuator-agnostic Impedance Model Predictive Control (Impedance MPC) framework for dexterous fingers, instantiating the constant-$A_d$ offset-free architecture established for physical human-robot interaction (pHRI); its stability, recursive-feasibility, and input-to-state-stability guarantees are inherited by preserving the architectural assumptions. An algebraic feedforward reduces the tendon transmission -- hydraulic, cable, pneumatic, twisted-string, or series-elastic -- to a constant-coefficient double integrator, so the QP cost inverse is precomputed offline and a 10-step receding-horizon quadratic program runs at 500\,Hz while enforcing hard constraints on contact force (ISO/TS 15066), actuation limits, and jerk. An encoder-only augmented-Kalman disturbance state drives steady-state error to zero under any constant contact load. On a hydraulically actuated finger -- the worked example platform, adding pressure and cavitation constraints -- the 500\,Hz Kalman MPC attains 0.5\,mrad RMS, 0.1\,mrad steady-state, and 6.6\,mrad peak deflection under 1.5\,Nm contact: 183$\times$, 1500$\times$, and 23$\times$ better than classical impedance. The realized first-move stiffness (18$\to$323\,Nm/rad with update rate) is independently verified. The architecture scales to a 16-DOF LEAP Hand MuJoCo simulation, recovering from 2.5\,N grasp-load disturbances within 0.7\,s.


[62] 2606.14612

Moonlight in Latent Space: Chirality and Structural Correspondence Between Beethoven's Op. 27 No. 2 and Machine Learning Mechanisms

We show that the three movements of Beethoven's "Moonlight Sonata" (Op. 27 No. 2) instantiate three distinct machine learning architectures -- not by analogy, but by structural correspondence. Through computational analysis of the score (entropy, Jensen-Shannon divergence, dissonance, hand distributional overlap, self-similarity matrices, temporal memory decay, and contextual pitch embeddings), we establish four counterintuitive findings: (1) perceived musical "temperature" is governed by throughput, not distributional width; (2) the lightest movement carries the highest dissonance; (3) the movements implement streaming, recurrent, and periodic positional encoding memory architectures; and (4) the same pitch class acquires different contextual identities across movements, analogous to contextual embeddings in NLP -- and unsupervised clustering recovers the tonal structure without music-theoretic input. We construct a reverse sonification (decoding analytical features back into MIDI) and quantify the chirality of the encode-decode cycle: what distributions preserve and sequential ordering destroys. Prompted by a listener's observation that the decoded piece sounds like "mirror isomers that can't be superimposed," the chirality measurement reveals reconstruction loss increasing monotonically with n-gram order. Bootstrap baselines and subsample checks confirm all movements carry sequential information above noise, though raw values are confounded by sample size. Cross-domain comparison shows natural language has higher chirality than music, reflecting stronger sequential constraints.


[63] 2606.14614

Decoding Semantic Categories from Picture-Naming EEG

Picture naming requires the transformation of visual object information into a spoken lexical response through perceptual, semantic, lexical, and articulatory processes. This study asked whether semantic-category information is recoverable from high-density EEG during overt picture naming. Sixteen native French-speaking participants performed a picture-naming task using line drawings. Picture labels were embedded with a multilingual text-embedding model and organized into nine interpretable semantic categories, providing a data-driven semantic target space for neural decoding. EEG activity was represented channel-wise using a pre-trained single-channel EEG encoder over an early post-stimulus window, a later naming-related window, and their combination. Nine-class decoding showed above-chance semantic-category discrimination in all temporal representations. Balanced accuracy increased from 0.562 in the early window to 0.610 in the naming-related window, and reached 0.781 when both windows were combined, with a maximum Macro-F1 of 0.784. Class-level F1 scores showed consistent gains across semantic categories, and sensor-level decoding maps indicated spatially distributed category information. These findings suggest that semantic-category structure is reflected in EEG activity during overt picture naming and that early and naming-related temporal windows provide complementary information. The results support the use of modern neural decoding methods as tools for investigating lexical-semantic processing in spoken language production.


[64] 2606.14617

Whole-Body Impedance Model Predictive Control for Safe Physical Human--Robot Interaction on Floating-Base Platforms

Floating-base robots must balance under rigid contact constraints while interacting safely with humans. Existing whole-body control (WBC) frameworks allocate the full joint space to locomotion or rely on fixed-gain impedance feedback that accumulates steady-state error under sustained physical human--robot interaction (pHRI) forces. This paper extends the authors' fixed-base two-layer Impedance MPC to floating-base platforms through a three-level architecture: a centroidal MPC plans contact forces over a 500\,ms horizon; a priority-driven WBC layer resolves balance into joint torques through contact-consistent null-space projection; and the residual null space is governed by a receding-horizon quadratic program (QP) that predicts and rejects pHRI disturbances using a Kalman-augmented state. A contact-consistent feedback linearization reduces the arm end-effector plant to a double integrator with a constant state matrix within each contact mode, enabling offline precomputation of the QP cost and ${\geq}1$\,kHz operation. A covariance-inflation protocol preserves the disturbance estimate across contact-mode switches, guaranteeing zero steady-state error under bounded constant pHRI loads, and an Impedance Equivalence Theorem shows the infinite-horizon limit recovers a classical task-space impedance law whose effective mass, damping, and stiffness adapt to posture and contact configuration. Simulations on a 17-DOF biped and the Unitree G1 humanoid validate the design.


[65] 2606.14623

Towards unified Geophysical Data Requirements for Magnetic Navigation (MagNav)

Magnetic Navigation (MagNav) has emerged as a vital alternative Positioning, Navigation, and Timing (PNT) solution, leveraging Earth's magnetic field for robust navigation in GPS/GNSS-degraded or denied environments. Despite its potential, the successful deployment of MagNav is currently hindered by the lack of standardized, high-fidelity geomagnetic reference maps. Existing datasets, primarily designed for geological exploration or academic research, do not meet the distinct operational requirements of navigation systems regarding spatial resolution, error quantification, and global accessibility. This paper initiates a community-focused dialogue on future geophysical data requirements for MagNav, grounded in extensive real-world flight trials. We distinguish between two primary use cases with divergent data needs: Operational MagNav, which requires globally consistent, queryable, and uncertainty-aware datasets for field deployment, and MagNav R&D, which demands comprehensive access to raw survey data to foster innovation. We provide a prioritized set of recommendations for future data requirements, including the development of cohesive, merged datasets, the inclusion of localized 3D uncertainty estimates, and the expansion of the World Magnetic Model (WMM) core field model to spherical harmonic degree 13 to improve consistency. Finally, we emphasize the strategic necessity of designated test ranges to validate these requirements and ensure the operational robustness of MagNav infrastructure.


[66] 2606.14679

Optimal Hidden-Target Learning for Online Inventory Optimization on General Convex Sets

Online inventory optimization (OIO) is online convex optimization with physical memory: inventory carryover makes the feasible action set depend on the past. A natural principle, used in stochastic inventory learning and recently in OIO under a single linear capacity constraint, is to maintain a hidden target chosen by an online learner and implement its projection onto the currently feasible order-up-to set. We prove that this simple principle is optimal for OIO on arbitrary bounded convex capacity sets. With online gradient descent as the base learner, the method improves the best known regret guarantee for OIO on general convex sets from inverse to inverse-square-root dependence on the common-demand probability, and we prove a matching lower bound. The same principle gives the first polylogarithmic regret guarantee for strongly convex losses and the first dynamic regret guarantee adapting to Euclidean path variation on general convex capacity sets. The analysis introduces a norm alignment principle: the right state variable is the distance from the hidden target to the feasible set, measured in the same norm as the projection. Under norm alignment, this distance evolves pathwise as a scalar queue, with target movement as arrival and common demand as service. This reduction to one-dimensional queue control resolves the state dependence and extends the guarantees to general convex capacity sets, beyond the reach of prior productwise approaches. Experiments on synthetic and real-world inventory data corroborate the theory.
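
The hidden-target principle described above can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: the box-shaped capacity set, the newsvendor-style loss with costs `h` and `b`, the step size `eta`, and the choice to evaluate the gradient at the implemented level are all assumptions made for the example, and every function name is hypothetical.

```python
def project_box(z, lower, upper):
    """Euclidean projection of z onto the box {y : lower <= y <= upper}."""
    return [min(max(zi, lo), up) for zi, lo, up in zip(z, lower, upper)]


def newsvendor_grad(y, d, h=1.0, b=3.0):
    """Subgradient of h*(y-d)^+ + b*(d-y)^+ per product (assumed loss)."""
    return [h if yi > di else -b for yi, di in zip(y, d)]


def run_hidden_target(demands, cap, eta=0.5):
    """Online gradient descent on a hidden target z; the implemented
    order-up-to level y is the projection of z onto the currently
    feasible set {y : x <= y <= cap}, where x is carried-over inventory."""
    n = len(cap)
    x = [0.0] * n  # inventory carried over from the previous period
    z = [0.0] * n  # hidden target maintained by the learner
    history = []
    for d in demands:
        y = project_box(z, x, cap)  # cannot order below current inventory
        g = newsvendor_grad(y, d)
        # The hidden target moves freely inside the capacity box [0, cap].
        z = project_box([zi - eta * gi for zi, gi in zip(z, g)], [0.0] * n, cap)
        x = [max(yi - di, 0.0) for yi, di in zip(y, d)]  # carryover
        history.append(y)
    return history


levels = run_hidden_target([[0.0], [0.0], [0.0]], cap=[10.0])
```

With zero demand the inventory never drains, so in the third period the implemented level stays above the hidden target; the gap between the two is the distance the abstract tracks as a scalar queue.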


[67] 1906.07435

Performance Analysis of QAM-MPPM in Turbulence-Free FSO Channels: Accurate Derivations and Practical Approximations

Following the trends of index modulated (IM) techniques for optical communications, in the last few years several new waveform proposals have been made, aiming at conveying a higher density of information by driving different signal properties. One of these proposals mixes multi-pulse pulse-position modulation (MPPM) and quadrature amplitude modulation (QAM) in a system called QAM-MPPM. We present here a new way to demodulate its compound waveform, and, for the non-turbulent free space optical (FSO) channel, we provide accurate analytical expressions for its error probabilities, both in the case of the traditional and the new detector. Based on these analytical derivations, we also provide simplified expressions for the estimation of the error probabilities. We show that the new detector offers a gain of some tenths of dB in signal-to-noise ratio with respect to the previously defined one without an increase in complexity, and that our error probability estimations are more accurate than previously published results. To the best of our knowledge, this work is the first to provide simulation results validating the study of the error probability performance of QAM-MPPM.


[68] 2305.19655

Equivalent modelling for the fundamental frequency dynamic variation: State-space, impedance, and power-frequency representations

Stability of power electronic converters connected to power grids is commonly assessed by using the impedance criterion, while the stability of power grids is typically analysed by using the network state-space representation. It is known that the impedance criterion may lead to erroneous results if the grid frequency dynamics are not considered, whereas eigenvalue analysis is considered a reliable method for system stability assessment. The equivalence between these two methods has been recently explored, without considering the effect of network frequency variations. Additionally, the link of the impedance criterion with the power-frequency dynamics of power systems remains largely unexplored. In this paper, the equivalence between the impedance method considering the grid frequency dynamics and the conventional eigenvalue analysis is demonstrated. In addition, the dynamic interaction between the apparent power flow and the network fundamental frequency is formulated and its link with the impedance representation is shown. It is demonstrated that, by using the impedance representation with the network frequency as an additional input port, the network frequency perturbation plot (NFP) can be intuitively expressed by using quantities consistent with the impedance analysis framework. The main findings are verified using detailed numerical simulations of two representative systems.


[69] 2409.04843

Leveraging Sound Source Trajectories for Universal Sound Separation

Existing methods utilizing spatial information for sound source separation require prior knowledge of the direction of arrival (DOA) of the source or utilize estimated but imprecise localization results, which impairs the separation performance, especially when the sound sources are moving. In fact, sound source localization and separation are interconnected problems, that is, sound source localization facilitates sound separation while sound separation contributes to refined source localization. This paper proposes a method utilizing the mutual facilitation mechanism between sound source localization and separation for moving sources. The proposed method comprises three stages. The first stage is initial tracking, which tracks each sound source from the audio mixture based on the source signal envelope estimation. These tracking results may lack sufficient accuracy. The second stage involves mutual facilitation: Sound separation is conducted using preliminary sound source tracking results. Subsequently, sound source tracking is performed on the separated signals, thereby refining the tracking precision. The refined trajectories further improve separation performance. This mutual facilitation process can be iterated multiple times. In the third stage, a neural beamformer estimates precise single-channel separation results based on the refined tracking trajectories and multi-channel separation outputs. Simulation experiments conducted under reverberant conditions and with moving sound sources demonstrate that the proposed method can achieve more accurate separation based on refined tracking results.


[70] 2411.17552

TRUST-UP: Trustworthy Reinforcement learning Using Safe Techniques for UAV Pursuit

Reinforcement Learning (RL) enables autonomous aerial vehicles to adapt quickly and make efficient decisions, making it well-suited for dynamic urban air mobility operations. However, the lack of safety guarantees and transparency hinders the airworthiness certification of RL-based flight control systems, particularly in low-altitude urban environments with human presence. This paper proposes a trustworthy reinforcement learning algorithm that utilizes safe techniques to address the AI trustworthiness requirements for aviation safety, ensuring the transparent and certifiable deployment of RL in safety-critical aerial operations. Specifically, we propose Trustworthy Reinforcement learning Using Safe Techniques for UAV Pursuit (TRUST-UP), which consists of two key components: a safety filter constructed from Control Barrier Functions (CBFs) that transforms unsafe RL actions into provably safe flight commands, and a switching strategy that enhances feasibility while maintaining operational transparency. These components enable trustworthy AI deployment in urban airspace, satisfying technical robustness and transparency requirements for aviation certification. Simulation results demonstrate that TRUST-UP enables autonomous UAVs to safely navigate congested urban environments while maintaining human-interpretable decision logic. This work contributes toward certifiable and explainable AI frameworks for low-altitude aviation, addressing the critical need for trustworthy autonomous flight systems in future urban air mobility.
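
To illustrate what a CBF-based safety filter does, the sketch below projects a proposed action onto the safe half-space for a single circular obstacle. It is not the TRUST-UP filter: the single-integrator dynamics, the barrier h(p) = ||p - c||^2 - r^2, the gain `alpha`, and the function name are assumptions chosen so that the underlying quadratic program has a closed-form solution.

```python
def cbf_filter(u_rl, p, c, r, alpha=1.0):
    """Minimally modify u_rl so that grad_h(p) . u + alpha * h(p) >= 0.

    Assumes single-integrator dynamics p' = u and the barrier
    h(p) = ||p - c||^2 - r^2 (positive outside the obstacle).
    """
    a = [2.0 * (pi - ci) for pi, ci in zip(p, c)]          # gradient of h
    h = sum((pi - ci) ** 2 for pi, ci in zip(p, c)) - r * r
    slack = sum(ai * ui for ai, ui in zip(a, u_rl)) + alpha * h
    if slack >= 0.0:
        return list(u_rl)            # proposed action is already safe
    norm2 = sum(ai * ai for ai in a)
    if norm2 == 0.0:
        return [0.0] * len(u_rl)     # degenerate case: at the obstacle centre
    # Closed-form projection onto the half-space {u : a . u + alpha*h >= 0}.
    return [ui - (slack / norm2) * ai for ui, ai in zip(u_rl, a)]


safe_u = cbf_filter(u_rl=[-5.0, 0.0], p=[2.0, 0.0], c=[0.0, 0.0], r=1.0)
```

An action heading straight at the obstacle is slowed until the barrier condition holds with equality, while an action moving away from it passes through unchanged.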


[71] 2504.16413

Hierarchical Distributed Architecture for the Least Allan Variance Atomic Timing

In this paper, we propose a hierarchical distributed timing architecture based on an ensemble of miniature atomic clocks. The goal is to ensure synchronized and accurate timing in a normal operating mode where Global Navigation Satellite System (GNSS) signals are available, as well as in an emergency operating mode during GNSS failures. At the lower level, the miniature atomic clocks employ a distributed control strategy that uses only local information to ensure synchronization in both modes. The resulting synchronized time or generated time scale has the best frequency stability, as measured by the Allan variance, over the short control period. At the upper level, a supervisor controls the long-term behavior of the generated time scale. In the normal operating mode, the supervisor periodically anchors the generated time scale to the standard time based on GNSS signals, while in the emergency operating mode, it applies optimal floating control to reduce the divergence rate of the generated time scale, which is not observable from the measurable time difference between the miniature atomic clocks. This floating control aims to explicitly control the generated time scale to have the least Allan variance over the long control period. Finally, numerical examples are provided to show that the proposed architecture achieves an Allan variance on the order of $10^{-23}$ over averaging times ranging from one second to several days. This demonstrates its effectiveness and feasibility for high-precision, GNSS-resilient atomic timing.
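
For readers unfamiliar with the stability metric, the following sketch computes the standard non-overlapping Allan variance from a series of fractional-frequency samples. It illustrates the metric only and says nothing about the paper's control design; the function name and the averaging-factor parameter `m` are choices made for this example.

```python
def allan_variance(y, m=1):
    """Non-overlapping Allan variance of fractional-frequency samples y.

    m is the averaging factor: the result corresponds to an averaging
    time of m times the sampling interval.
    """
    k = len(y) // m  # number of complete averaging blocks
    if k < 2:
        raise ValueError("need at least two averaging blocks")
    means = [sum(y[i * m:(i + 1) * m]) / m for i in range(k)]
    # Half the mean squared difference between adjacent block averages.
    return sum((means[i + 1] - means[i]) ** 2 for i in range(k - 1)) / (2 * (k - 1))


avar_short = allan_variance([0.0, 1.0, 0.0, 1.0], m=1)
avar_long = allan_variance([0.0, 1.0, 0.0, 1.0], m=2)
```

In this toy series the alternating samples average out over two-sample blocks, so the variance drops from 0.5 to zero as the averaging time doubles.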


[72] 2509.02426

Frequency-Domain Characterization of Load Demand from Electrified Highways

Electrified roadways (ER) equipped with dynamic wireless power transfer (DWPT) capabilities can patently extend the driving range and reduce the battery size of electric vehicles (EVs). However, due to the spatial arrangement of the transmitter coils in the ER, the DWPT load exhibits frequency content that could excite power system frequency dynamics. In this context, this work aims to study the spectrum of DWPT loads under different traffic conditions. Under simplifying assumptions, we develop statistical models to identify the location and relative magnitude of DWPT load harmonics. Our analysis reveals that the fundamental frequency depends on ER coil spacing and average EV speed. In the worst-case yet unlikely scenario that EVs move in a synchronized fashion, the amplitude of harmonics scales with the EV count. On the contrary, when EVs move freely, harmonics scale with the square root of the EV count. Platoon formations can accentuate harmonics. The spectral content around harmonics decreases in magnitude and increases in bandwidth with the harmonic index. The load of a single EV moving at a time-varying speed can be modeled as a frequency-modulated (FM) signal. Despite the simplifying assumptions, the derived models offer valuable insights for ER planners and grid operators. Dynamic simulations of a modified WECC model with DWPT loads synthesized from realistic EV trajectories and ER specifications corroborate some of these insights.
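
The two scaling regimes mentioned above can be reproduced with a toy phasor model. This is a sketch under stated assumptions, not the paper's statistical model: each EV is taken to contribute a unit-amplitude phasor at the harmonic of interest, free-flowing EVs are given independent uniform phases, and the fundamental is assumed to equal the coil-passing rate (speed divided by coil spacing). All function names are hypothetical.

```python
import cmath
import math
import random


def fundamental_hz(speed_mps, coil_spacing_m):
    """Assumed fundamental: coils passed per second by one EV."""
    return speed_mps / coil_spacing_m


def harmonic_amplitude(phases):
    """Magnitude of the sum of unit phasors with the given phases."""
    return abs(sum(cmath.exp(1j * p) for p in phases))


def rms_amplitude_free_flow(n_ev, trials=2000, seed=0):
    """Root-mean-square harmonic amplitude for independent random phases."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_ev)]
        total += harmonic_amplitude(phases) ** 2
    return math.sqrt(total / trials)


synchronized = harmonic_amplitude([0.0] * 100)  # grows linearly with EV count
free_flow = rms_amplitude_free_flow(100)        # close to sqrt(100)
```

Identical phases add coherently, whereas independent phases add in power, which is why the second estimate stays near the square root of the EV count.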


[73] 2509.25120

Data-Driven Active Power Flow Modeling: A Behavioral Systems Approach

The increasing decentralization of power systems driven by a large number of renewable energy sources poses challenges in power flow optimization: Partially unknown power line properties can render model-based approaches unsuitable. With the increasing deployment of sensors, data-driven methods emerge as a promising alternative, offering flexibility to adapt to changes and deal with unknown properties. In this paper, we propose a novel data-driven representation of nonlinear active power flow equations for radial grids based on Willems' Fundamental Lemma. Our approach allows for direct integration of input/output data into active power flow optimization, enabling cost minimization and constraint enforcement without requiring explicit knowledge of the electrical properties of the grid. Moreover, we derive a computationally tractable convex relaxation and show in a numerical case study that our approaches yield results that are identical to optimal active power flow formulations with known parameters.


[74] 2510.15347

Symmetric Entropy-Constrained Video Coding for Machines

As video transmission increasingly serves machine vision systems (MVS) instead of human vision systems (HVS), video coding for machines (VCM) has become a critical research topic. Existing VCM methods often bind codecs to specific downstream models, requiring retraining or supervised data, thus limiting generalization in multi-task scenarios. Recently, unified VCM frameworks have employed visual backbones (VB) and visual foundation models (VFM) to support multiple video understanding tasks with a single codec. They mainly utilize VB/VFM to maintain semantic consistency or suppress non-semantic information, but seldom explore how to directly link video coding with understanding under VB/VFM guidance. Hence, we propose a Symmetric Entropy-Constrained Video Coding framework for Machines (SEC-VCM). It establishes a symmetric alignment between the video codec and VB, allowing the codec to leverage VB's representation capabilities to preserve semantics and discard MVS-irrelevant information. Specifically, a bi-directional entropy-constraint (BiEC) mechanism ensures symmetry between the process of video decoding and VB encoding by suppressing conditional entropy. This helps the codec to explicitly handle semantic information beneficial to MVS while squeezing out useless information. Furthermore, a semantic-pixel dual-path fusion (SPDF) module injects pixel-level priors into the final reconstruction. Through semantic-pixel fusion, it suppresses artifacts harmful to MVS and improves machine-oriented reconstruction quality. Experimental results on classical video understanding tasks and MLLM-based tasks show SOTA rate-task performance. The framework achieves significant bitrate savings over the H.266/VVC reference software VTM on video instance segmentation (37.4%), video object segmentation (29.8%), object detection (46.2%), multiple object tracking (44.9%), and MLLM-based video grounding (97.6%).


[75] 2511.23201

A Framework for Geometric-based Statistical Channel Modeling in ISAC Systems

This paper proposes a comprehensive framework for a geometry-based statistical model for integrated sensing and communication (ISAC) tailored for bistatic systems. Our dual-component model decomposes the ISAC channel into a target channel encompassing all multipath components produced by a sensing target parameterized by the target's radar cross-section and scattering points, and a background channel comprising all other propagation paths that do not interact with the sensing target. The framework extends TR38.901 via a hybrid clustering approach, integrating spatiotemporally consistent deterministic clusters with stochastic clusters to preserve channel reciprocity and absolute delay alignment for sensing parameter estimation. Extensive simulations across urban macro, urban micro, and indoor factory scenarios demonstrate that the model maintains communication performance parity with the standard TR38.901, validated through bit-error rate analysis obtained via simulated and measured ISAC channels and channel capacity assessment, while enabling sensing performance evaluation, such as target ranging error for localization and receiver operating characteristic curves for detection probability.


[76] 2512.05780

Reformulating dq Impedance Matrices via Pauli Decomposition for Root-Cause Analysis of Instabilities in Grid-Connected Converters

The increasing penetration of converter-interfaced generators in power systems has led to the adoption of impedance-based criteria as an alternative framework for assessing and ensuring stable integration. However, when the impedance criterion is used, identifying the root cause of instabilities is generally more challenging than with other approaches, such as modal analysis. Moreover, the eigenvalues and characteristic equation used in the impedance criterion are non-linear functions, making it difficult to establish a clear relationship between impedance components and closed-loop stability. To address this issue, this paper proposes the application of the Pauli decomposition to analyse dq impedance matrices and minor-loop equations. By using this decomposition technique, the dq representation can be reformulated into a quaternion-like form, which has explicit algebraic relationships with the determinant, trace, eigenvalues, and characteristic equation. Moreover, this decomposition enables systematic assessment of the influence of each impedance term on system stability, thus facilitating identification of the root cause of instabilities. The primary objective of this work is to develop the mathematical foundation of the Pauli decomposition and demonstrate its implications for root-cause analysis. The theoretical contributions are validated using a case study consisting of a converter-interfaced generator connected to a weak grid that has been previously analysed in the literature using existing techniques. The proposed Pauli decomposition provides an algebraic tool that enhances interpretability of impedance-based stability analysis and establishes a basis for further investigation of complex converter interactions.
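
The algebraic relationships mentioned above follow from the textbook Pauli expansion of any 2x2 complex matrix, Z = a0 I + a1 sigma_x + a2 sigma_y + a3 sigma_z. The sketch below evaluates that expansion at a single frequency point; the basis ordering and the function names are assumptions for this example and may differ from the paper's notation.

```python
import cmath


def pauli_coefficients(z):
    """Coefficients (a0, a1, a2, a3) of Z = a0*I + a1*sx + a2*sy + a3*sz."""
    (z11, z12), (z21, z22) = z
    a0 = (z11 + z22) / 2
    a1 = (z12 + z21) / 2
    a2 = 1j * (z12 - z21) / 2
    a3 = (z11 - z22) / 2
    return a0, a1, a2, a3


def trace_det_eigs(z):
    """Trace, determinant and eigenvalues expressed via Pauli coefficients."""
    a0, a1, a2, a3 = pauli_coefficients(z)
    r2 = a1 * a1 + a2 * a2 + a3 * a3   # "vector part" squared (no conjugates)
    root = cmath.sqrt(r2)
    return 2 * a0, a0 * a0 - r2, (a0 + root, a0 - root)


# Example dq impedance at one frequency (values are arbitrary).
z_dq = [[1 + 2j, 3 + 0j], [4 + 0j, 5 - 1j]]
trace, det, eigs = trace_det_eigs(z_dq)
```

The trace depends only on a0 and the eigenvalue split only on the remaining three coefficients, which is the kind of term-by-term separation the abstract refers to.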


[77] 2512.15947

MCR-VQGAN: A Scalable and Cost-Effective Tau PET Synthesis Approach for Alzheimer's Disease Imaging

Tau positron emission tomography (PET) is a critical diagnostic modality for Alzheimer's disease (AD), but its widespread clinical adoption is hindered by radiation exposure, limited availability, high clinical workload, and substantial financial costs. To address these limitations, we propose the Multi-scale CBAM Residual Vector Quantized Generative Adversarial Network (MCR-VQGAN) to synthesize high-fidelity tau PET images from structural T1-weighted MRI. MCR-VQGAN advances the standard VQGAN architecture through three enhancements: multi-scale convolutions, ResNet blocks, and Convolutional Block Attention Modules (CBAM), which collectively improve the capture of local and global features. Using 222 paired T1-weighted MRI and tau PET scans from the ADNI database, we trained and compared MCR-VQGAN against cGAN, WGAN-GP, CycleGAN, and baseline VQGAN. MCR-VQGAN achieved superior image synthesis performance across all metrics (MSE = 0.0056 +/- 0.0061, PSNR = 30.65 +/- 4.47 dB, SSIM = 0.9263 +/- 0.0469). A CNN-based AD classifier trained on real tau PET achieved comparable accuracy on real (63.64%) and synthetic (65.91%) images, indicating that diagnostically relevant features are preserved. Regional SUVR-equivalent analysis across Braak-defined ROIs further indicated strong agreement between real and synthetic tau PET (Pearson r = 0.78-0.88; ICC = 0.71-0.84), with the strongest agreement in Braak V/VI (ICC = 0.838). Together, these results suggest that MCR-VQGAN offers a promising and scalable surrogate for conventional tau PET imaging, potentially improving the accessibility of tau biomarkers for AD research and clinical workflows.


[78] 2601.03427

Dual-Transformer Aided Hierarchical Deep Reinforcement Learning for Robust RIS-Assisted Near-Field Communications

The deployment of extremely large aperture arrays (ELAAs) in sixth-generation (6G) networks is expected to shift communications into the near-field regime, where spherical-wave propagation enables distance-aware beamfocusing but remains highly vulnerable to physical blockages that cause non-line-of-sight (NLoS) conditions. To resolve this inherent vulnerability, reconfigurable intelligent surfaces (RIS) can be utilized to circumvent these blockages and effectively establish reliable NLoS communication links. In envisioned deployment scenarios, accurately acquiring instantaneous CSI and predicting sudden blockages is profoundly challenging due to the prohibitive pilot overhead associated with massive passive arrays and the unpredictable mobility of environmental scatterers. To address this, we propose the Dual-Transformer Hierarchical Deep Reinforcement Learning (DT-HDRL) framework, which integrates two specialized transformer models with a two-timescale hierarchical control agent. The first transformer integrates a ray-tracing digital twin prior with distance-aware geometric correction features to yield rapid and precise CSI estimates, while a complementary vision transformer (ViT) processes sequential camera frames to forecast impending blockages prior to link degradation. These predictive outputs are then fed directly into the hierarchical control agent. Within this architecture, a high-level controller processes the slow-timescale blockage predictions to jointly dictate the user transmission path (line-of-sight (LoS) or RIS-assisted NLoS) and RIS active/sleep scheduling, whereas a low-level controller employs the fast-timescale CSI estimates to perform joint base station (BS) beamfocusing and RIS phase-shift optimization.


[79] 2601.13289

Semantic Communication for the Internet of Underwater Things: Architectures, Applications, Challenges, and Future Directions

The Internet of Underwater Things (IoUT) supports marine sensing, environmental monitoring, subsea inspection, and autonomous underwater operations. However, IoUT communication is constrained by limited bandwidth, long propagation delay, time-varying underwater channels, intermittent connectivity, and strict energy budgets. Semantic Communication (SC) offers a promising alternative by transmitting task-relevant meaning rather than raw data, thereby improving communication efficiency in resource-constrained underwater networks. This paper presents a critical and feasibility-aware survey of SC for IoUT, focusing on opportunities, challenges, limitations, and future research directions. We first review the fundamentals of SC-enabled IoUT systems, including semantic representations, layered architectures, semantic channel modeling, and task-oriented evaluation metrics. We then examine learning-driven approaches based on machine learning (ML), knowledge graphs (KGs), vision-language models (VLMs), generative models, and federated learning (FL), with emphasis on their feasibility under underwater edge constraints. Representative applications, including environmental monitoring, marine ecology, subsea infrastructure inspection, disaster response, and autonomous underwater vehicle (AUV) coordination, are analyzed from an SC perspective. Finally, we identify key research directions involving standardized semantic models, reproducible testbeds, compute--communication trade-offs, trustworthy reconstruction, hybrid underwater links, energy-aware edge intelligence, semantic security, digital twins (DTs), and cross-domain interoperability. This survey provides a structured foundation for developing reliable, efficient, and meaning-driven IoUT communication systems.


[80] 2603.05213

BabAR: from phoneme recognition to developmental measures of young children's speech production

Studying early speech development at scale requires automatic tools, yet automatic phoneme recognition, especially for young children, remains largely unsolved. Building on decades of data collection, we curate TinyVox, a corpus of more than half a million phonetically transcribed child vocalizations in English, French, Portuguese, German, and Spanish. We use TinyVox to train BabAR, a cross-linguistic phoneme recognition system for child speech. We find that pretraining the system on multilingual child-centered daylong recordings substantially outperforms alternatives, and that providing 20 seconds of surrounding audio context during fine-tuning further improves performance. Error analyses show that substitutions predominantly fall within the same broad phonetic categories, suggesting suitability for coarse-grained developmental analyses. We validate BabAR by showing that its automatic measures of speech maturity align with developmental estimates from the literature.


[81] 2603.24596

X-OPD: Cross-Modal On-Policy Distillation for Capability Alignment in Speech LLMs

While the shift from cascaded dialogue systems to end-to-end (E2E) speech Large Language Models (LLMs) improves latency and paralinguistic modeling, E2E models often exhibit a significant performance degradation compared to their text-based counterparts. The standard Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) training methods fail to close this gap. To address this, we propose X-OPD, a novel Cross-Modal On-Policy Distillation framework designed to systematically align the capabilities of Speech LLMs to their text-based counterparts. X-OPD enables the Speech LLM to explore its own distribution via on-policy rollouts, where a text-based teacher model evaluates these trajectories and provides token-level feedback, effectively distilling the teacher's capabilities into the student's multi-modal representations. Extensive experiments across multiple benchmarks demonstrate that X-OPD significantly narrows the gap in complex tasks while preserving the model's inherent capabilities.


[82] 2606.02185

Breaking the Pair: Evaluating Dyadic Interaction via Speaker Switching

Speakers in dialogue continuously adapt their communicative behavior across acoustic, lexical, and semantic dimensions, a phenomenon known as conversational entrainment. Modeling this process requires representations that capture the global structure of interaction, yet prior approaches fail to disentangle dyad-specific patterns from speaker-specific traits, limiting their ability to capture true conversational adaptation. We address this with the Dyadic Distance Matrix (DDM), which encodes all pairwise similarities between the turns of two speakers over an entire conversation, capturing long-range cross-speaker dependencies. This raises a key question: does the DDM represent genuine interaction, or merely reflect individual speaker characteristics? We propose the speaker-switch test, a principled control in which one speaker's turns are replaced with those from an unrelated speaker drawn from a different conversation. This preserves turn-level statistics while disrupting the original dyadic coadaptation. The ability to distinguish real from switched DDMs thus directly evaluates whether the representation encodes interaction-specific structure. Across four embedding types and classifiers including ResNet-50 on the CANDOR corpus, real DDMs are consistently distinguishable from their switched counterparts. Comparisons with LibriSpeech show higher discriminability in read speech, highlighting the role of prosodic variability in naturalistic conversations. GradCAM analysis further reveals distinct structural signatures driving classification. These results establish the speaker-switch test as a robust diagnostic for validating representations of dyadic conversational interaction.
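
The construction of a DDM and of its speaker-switched control can be sketched as follows. This is an illustration under assumptions: turns are represented by precomputed embedding vectors, cosine distance stands in for whichever similarity measure the paper uses, and the function names are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dyadic_distance_matrix(turns_a, turns_b):
    """Entry [i][j] compares turn i of speaker A with turn j of speaker B."""
    return [[cosine_distance(a, b) for b in turns_b] for a in turns_a]


def switched_matrix(turns_a, turns_other):
    """Speaker-switch control: keep A's turns, take the partner's turns
    from an unrelated speaker in a different conversation."""
    return dyadic_distance_matrix(turns_a, turns_other)


real = dyadic_distance_matrix([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0]])
control = switched_matrix([[1.0, 0.0], [0.0, 1.0]], [[0.0, 2.0]])
```

A classifier is then asked to tell matrices like `real` from matrices like `control`; the switched matrix keeps turn-level statistics but not the original pairing.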


[83] 2606.08803

Some Essential Constructive Foundations for Systems and Control

This work develops several constructive foundations for systems and control within Bishop-style constructive mathematics. For an engineer, the guiding principle is that an object claimed to exist, such as a trajectory, an optimal control law, a selector, or a viable solution, should come with finite data and an operation computing approximations to any prescribed precision. The style remains close to classical analysis, but existential statements are organized so that their computational content is visible. The paper begins with elementary geometric data in finite-dimensional Euclidean spaces: blocks, multiblocks, representable sets, regular functions, and certified integrals. This set-first integration route is meant to complement, rather than replace, abstract constructive integration theories such as Daniell-type or integration-space approaches. The developed apparatus is then applied to a constructive functional extremum-value theorem, selector extraction for multifunctions, Filippov-type and viable solutions of differential inclusions, regular probability densities, controlled Markov chains, and empirical density certificates. A short account of resolvent projectors and linear stability is included for completeness.


[84] 2501.11842

Harnessing Rydberg Atomic Receivers: From Quantum Physics to Wireless Communications

The intrinsic integration of Rydberg atomic receivers into wireless communication systems is proposed, by harnessing the principles of quantum physics in wireless communications. More particularly, we conceive a pair of Rydberg atomic receivers: one incorporates a local oscillator (LO) and is referred to as an LO-dressed receiver, while the other operates without an LO and is termed an LO-free receiver. The appropriate wireless model is developed for each configuration, elaborating on the receiver's responses to the radio frequency (RF) signal, on the potential noise sources, and on the signal-to-noise ratio (SNR) performance. The developed wireless model conforms to the classical RF framework, facilitating compatibility with established signal processing methodologies. Next, we investigate the associated distortion effects that might occur, specifically identifying the conditions under which distortion arises and demonstrating the boundaries of linear dynamic ranges. This provides critical insights into the practical implementation of these receivers in wireless systems. Finally, extensive simulation results are provided for characterizing the performance of wireless systems, harnessing this pair of Rydberg atomic receivers. Our results demonstrate that LO-dressed systems achieve a significant SNR gain of approximately 40-50 dB over conventional RF receivers in the standard quantum limit regime. This SNR head-room translates into reduced symbol error rates, enabling efficient and reliable transmission with higher-order constellations.


[85] 2503.14331

ADAPT: An Autonomous Forklift for Construction Site Operation

Efficient material logistics play a critical role in controlling costs and schedules in the construction industry. However, manual material handling remains prone to inefficiencies, delays, and safety risks. Autonomous forklifts offer a promising solution to streamline on-site logistics, reducing reliance on human operators and mitigating labor shortages. This paper presents the development and evaluation of ADAPT (Autonomous Dynamic All-terrain Pallet Transporter), a fully autonomous off-road forklift designed for construction environments. Unlike structured warehouse settings, construction sites pose significant challenges, including dynamic obstacles, unstructured terrain, and varying weather conditions. To address these challenges, our system integrates AI-driven perception techniques with traditional approaches for decision making, planning, and control, enabling reliable operation in complex environments. We validate the system through extensive real-world testing, comparing its continuous performance against an experienced human operator across various weather conditions. Our findings demonstrate that autonomous outdoor forklifts can operate near human-level performance, offering a viable path toward safer and more efficient construction logistics.


[86] 2512.10966

Interpretable Alzheimer's Diagnosis via Multimodal Fusion of Regional Brain Experts

Accurate and early diagnosis of Alzheimer's disease (AD) is critical for effective intervention and requires integrating complementary information from multimodal neuroimaging data. However, conventional fusion approaches often rely on simple concatenation of features, which cannot adaptively balance the contributions of biomarkers such as amyloid PET and MRI across brain regions. In this work, we propose MREF-AD, a Multimodal Regional Expert Fusion model for AD diagnosis. It is a Mixture-of-Experts (MoE) framework that models mesoscopic brain regions within each modality as independent experts and employs a gating network to learn subject-specific fusion weights. Utilizing tabular neuroimaging and demographic information from the Alzheimer's Disease Neuroimaging Initiative (ADNI), MREF-AD achieves competitive performance over strong classic and deep baselines while providing interpretable, modality- and region-level insight into how structural and molecular imaging jointly contribute to AD diagnosis. The source code is available at this https URL.
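
The gated fusion described above can be illustrated with a minimal sketch. It assumes each regional expert emits a vector of class logits and that the gating network has already produced one score per expert for the subject; the function names and the toy numbers are hypothetical and are not taken from the MREF-AD code.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_experts(expert_logits, gate_scores):
    """Combine per-expert class logits with subject-specific gate weights.

    expert_logits: one list of class logits per expert (hypothetical layout)
    gate_scores:   one raw gate score per expert for this subject
    Returns the fused logits and the gate weights, which sum to 1.
    """
    weights = softmax(gate_scores)
    n_classes = len(expert_logits[0])
    fused = [
        sum(w * logits[c] for w, logits in zip(weights, expert_logits))
        for c in range(n_classes)
    ]
    return fused, weights

# Toy example: three experts (e.g. two MRI regions and one PET region), two classes.
fused, weights = fuse_experts(
    [[2.0, -1.0], [0.5, 0.5], [-1.0, 1.5]],
    [1.0, 0.0, 2.0],
)
print(weights, fused)
```

Because the weights are explicit and sum to one, they can be read per subject as the relative contribution of each region and modality, which is the kind of interpretability the abstract refers to.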


[87] 2601.03971

Posterior error bounds for prior-driven balancing in linear Gaussian inverse problems

In large-scale Bayesian inverse problems, it is often necessary to apply approximate forward models to reduce the cost of forward model evaluations, while controlling approximation quality. In the context of Bayesian inverse problems with linear forward models, Gaussian priors, and Gaussian noise, we use perturbation theory for inverses to bound the error in the approximate posterior mean and posterior covariance resulting from a linear approximate forward model. We then focus on the smoothing problem of inferring the initial condition of linear time-invariant dynamical systems, using finitely many partial state observations. For such problems, and for a specific model order reduction method based on balanced truncation, we show that the impulse response of a certain prior-driven system is closely related to the prior-preconditioned Hessian of the inverse problem. This reveals a novel connection between systems theory and inverse problems. We exploit this connection to prove the first a priori error bounds for system-theoretic model order reduction methods applied to smoothing problems. The bounds control the approximation error of the posterior mean and covariance in terms of the truncated Hankel singular values of the underlying system.
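
For orientation, the standard closed-form posterior in the linear Gaussian setting is written out below. The symbols ($G$, $\Gamma_{\mathrm{pr}}$, $\Gamma_{\mathrm{obs}}$, and so on) are assumed notation chosen for this note and may differ from the paper's.

```latex
% Assumed notation: data y = G x + \varepsilon, prior x ~ N(\mu_pr, \Gamma_pr),
% noise \varepsilon ~ N(0, \Gamma_obs), independent of x.
\begin{align}
  \Gamma_{\mathrm{pos}} &= \left( G^{\top} \Gamma_{\mathrm{obs}}^{-1} G + \Gamma_{\mathrm{pr}}^{-1} \right)^{-1}, \\
  \mu_{\mathrm{pos}}    &= \Gamma_{\mathrm{pos}} \left( G^{\top} \Gamma_{\mathrm{obs}}^{-1} y + \Gamma_{\mathrm{pr}}^{-1} \mu_{\mathrm{pr}} \right), \\
  \widetilde{H}         &= \Gamma_{\mathrm{pr}}^{1/2} \, G^{\top} \Gamma_{\mathrm{obs}}^{-1} G \, \Gamma_{\mathrm{pr}}^{1/2}
  \quad \text{(prior-preconditioned Hessian)}.
\end{align}
```

Replacing $G$ by an approximate forward model in the first two expressions gives the approximate posterior mean and covariance whose errors the abstract's bounds control.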


[88] 2602.05670

HyperPotter: Spell the Charm of High-Order Interactions in Audio Deepfake Detection

Advances in AIGC technologies have enabled the synthesis of highly realistic audio deepfakes capable of deceiving human auditory perception. Although numerous audio deepfake detection (ADD) methods have been developed, most rely on local temporal/spectral features or pairwise relations, overlooking high-order interactions (HOIs). HOIs capture discriminative patterns that emerge from multiple feature components beyond their individual contributions. We propose HyperPotter, a hypergraph-based framework designed to capture high-order relations associated with synergistic patterns through clustering-based hyperedges with class-aware prototype initialization. Extensive experiments on 13 test sets show that HyperPotter improves over the baseline on 11 sets, yielding an average relative EER reduction of 12.68% across all test sets and 22.15% on the improved sets. These results demonstrate strong cross-scenario generalization, while also revealing robustness limits under severe codec or channel distortion.


[89] 2604.14193

QualiaNet: An Experience-Before-Inference Network

Human 3D vision involves two distinct stages: an Experience Module, where stereo depth is extracted relative to fixation, and an Inference Module, where this experience is interpreted to estimate 3D scene properties. Paradoxically, although stereo vision does not provide us with absolute distance information, it nonetheless affects our inferences about distance. We propose that the Inference Module exploits a natural scene statistic: near scenes produce vivid disparity gradients, while far scenes appear comparatively flat. QualiaNet implements this two-stage architecture computationally: disparity maps simulating human stereo experience are passed to a CNN trained to estimate distance. The network can recover distance from disparity gradients alone, validating this approach.


[90] 2605.14998

Learning Developmental Scaffoldings to Guide Self-Organisation

From subcellular structures to entire organisms, many natural systems generate complex organisation through self-organisation: local interactions that collectively give rise to global structure without any blueprint of the outcome. Yet a significant portion of the information driving such processes is not produced by self-organisation itself; instead, it is often offloaded to the initial conditions of the system. Biological development is a prime example, where maternal pre-patterns encode positional and symmetry-breaking information that scaffolds the self-organising process. From maternal morphogen gradients in early embryogenesis to tissue-level morphogenetic pre-patterns guiding organ formation, this transfer of information to initial conditions, analogous to a memory-compute trade-off in computational systems, is a fundamental part of developmental processes. In this work, we study this offloading phenomenon by introducing a model that jointly learns both the self-organisation rules and the pre-patterns, allowing their interplay to be varied and measured under controlled conditions: a Neural Cellular Automaton (NCA) paired with a learned coordinate-based pattern generator (SIREN), both trained simultaneously to generate a set of patterns. We provide information-theoretic analyses of how information is distributed between pre-patterns and the self-organising process, and show that jointly learning both components yields improvements in robustness, encoding capacity, and symmetry breaking over purely self-organising alternatives. Our analysis further suggests that effective pre-patterns do not simply approximate their targets; rather, they bias the developmental dynamics in ways that facilitate convergence, pointing to a non-trivial relationship between the structure of initial conditions and the dynamics of self-organisation.


[91] 2605.21136

LoRa and LoRaWAN simulator-cum-emulator with CAD and capture effect in Python

Existing LoRaWAN/LoRa simulators consist of large, complicated C++ codebases and often do not support all device classes. This paper presents the design of a simple-to-use, Python-based discrete-event simulator that addresses these gaps while also introducing a novel method for evaluating real device firmware in the simulator. The simulator is built on a custom asyncio-based simulation kernel, a three-phase packet delivery model that reproduces the capture effect, a full LoRaWAN 1.0.4 stack, and a containerized firmware system that cross-compiles real STM32 C firmware and redirects HAL calls into the simulator via CFFI. The simulator is distributed as a Python package via GitHub (this https URL) and requires no external simulation framework or dependencies.
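
As a rough illustration of what reproducing the capture effect means, the sketch below decides which of several overlapping packets a receiver keeps. It is a simplified power-only rule, not the simulator's three-phase delivery model (which the abstract does not detail); the function name and the 6 dB threshold are assumptions made for this example.

```python
def capture_winner(rssi_dbm, threshold_db=6.0):
    """Return the index of the packet that survives a collision, or None.

    rssi_dbm:     received power of each overlapping packet, in dBm
    threshold_db: assumed power margin the strongest packet needs over the
                  next strongest one to be decoded (illustrative value)
    """
    if not rssi_dbm:
        return None
    # Indices sorted from strongest to weakest received power.
    order = sorted(range(len(rssi_dbm)), key=lambda i: rssi_dbm[i], reverse=True)
    if len(order) == 1:
        return order[0]  # no interferer, packet is received
    strongest, runner_up = order[0], order[1]
    if rssi_dbm[strongest] - rssi_dbm[runner_up] >= threshold_db:
        return strongest  # strongest packet captures the receiver
    return None  # powers too close: all colliding packets are lost

print(capture_winner([-80.0, -95.0]))
print(capture_winner([-80.0, -82.0]))
```

A fuller model would also account for packet timing overlap, spreading factor, and channel, which this sketch deliberately ignores.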


[92] 2605.24795

Lifted Schrödinger Bridges for Gaussian Mixture Endpoints: Projection Gaps and Path-Space Obstructions

We study stochastic density control between Gaussian-mixture endpoint distributions under Brownian prior dynamics. Since the direct Schrödinger bridge between Gaussian mixtures is generally not available in closed form, we introduce a lifted path-space construction in which each trajectory is augmented with a source-target component label. Consequently, the problem decomposes into Gaussian component-to-component Schrödinger bridges with explicit marginal, drift, and cost formulas, while the mixture-level assignment reduces to a finite-dimensional entropic coupling problem with a Sinkhorn scaling form. We then analyze the projection obtained by discarding or forgetting the label. By construction, the projected law satisfies the original Gaussian-mixture endpoint constraints, but its relative entropy generally differs from the lifted relative entropy by a nonnegative conditional label-information gap. This gap reveals a path-space obstruction: the lifted optimizer cannot, in general, be identified with the direct unlabeled Schrödinger bridge after projection. We also derive the posterior-averaged Markov drift associated with the projected marginal flow, prove a kinetic-energy upper bound, and identify a common path-potential condition under which the projection gap vanishes. Several numerical illustrations showing density and shape control are recorded for a self-contained exposition.
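
The finite-dimensional entropic coupling step mentioned above can be sketched with a generic Sinkhorn iteration. This is the textbook scaling scheme, not the paper's derivation: the cost matrix here is a hypothetical placeholder for the component-to-component bridge costs, and `eps` and `iters` are illustrative parameters.

```python
import math

def sinkhorn(cost, a, b, eps=1.0, iters=500):
    """Entropic coupling P = diag(u) K diag(v), with K = exp(-cost / eps).

    cost: n x m list of lists (placeholder for component-to-component costs)
    a, b: source and target mixture weights, each summing to 1
    Returns an n x m coupling whose row sums approach a and column sums equal b.
    """
    n, m = len(a), len(b)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternate scalings so the coupling matches each marginal in turn.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy example: two source components and two target components.
P = sinkhorn([[0.0, 1.0], [1.0, 0.0]], a=[0.3, 0.7], b=[0.6, 0.4])
print(P)
```

Each entry of the returned matrix is the mass assigned to one source-target label pair, which is the quantity the lifted construction attaches to every trajectory.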


[93] 2605.25025

Micro-Swarm Locomotion Optimization in Dynamic Flow using Multi-Objective Multi-Agent Reinforcement Learning

Coordinating micro-robotic swarms in realistic, time-dependent fluid environments remains a major challenge for biomedical and environmental applications. We present a hybrid CFD-MO-MARL (Computational Fluid Dynamics-Multi Objective-Multi Agent Reinforcement Learning) framework that couples a high-fidelity incompressible Navier-Stokes solver with decentralized proximal policy optimization to learn swarm control policies in oscillatory flow. Sixteen magnetically actuated micro-robots were simulated to navigate a pulsatile arterial waveform within a 2 mm channel while jointly optimizing upstream progression, energy efficiency, and motion smoothness. Conflicting objectives are resolved using Projected Conflicting Gradient (PCGrad) surgery. Without PCGrad, energy and smoothness rewards collapse during training, demonstrating that gradient conflict resolution is essential for stable multi-objective learning. The converged policy achieves progress rewards of 6.5-7.0, energy efficiency of 0.63-0.65, and smoothness of 0.97-0.99, outperforming brute-force baselines by more than 8 reward units on the primary objective. Training reveals three emergent behaviors not encoded in the reward function: hydrodynamic throttling formations that reduce peak flow velocities, a cycle-synchronized ratchet mechanism that exploits flow reversals for upstream movement, and individualized final-approach strategies near the target boundary. These results demonstrate that physically realistic fluid-agent interactions can be integrated directly into multi-objective reinforcement learning, providing a scalable framework for micro-swarm control in biomedical navigation, environmental monitoring, and microfluidic systems.
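
The gradient surgery step can be sketched as follows. This is a minimal version of the general PCGrad projection rule on plain Python lists, assuming one flattened gradient per objective. It visits the other objectives in a fixed order, whereas the original PCGrad formulation shuffles that order, and it is not taken from the paper's training code.

```python
def pcgrad(grads):
    """Combine per-objective gradients after removing pairwise conflicts.

    grads: list of gradient vectors (lists of floats), one per objective.
    For each objective, any component that points against another
    objective's gradient (negative dot product) is projected away.
    Returns the sum of the projected gradients.
    """
    projected = []
    for i, g in enumerate(grads):
        g_i = list(g)
        for j, g_j in enumerate(grads):
            if i == j:
                continue
            dot = sum(x * y for x, y in zip(g_i, g_j))
            if dot < 0:  # conflict: project g_i onto the normal plane of g_j
                norm_sq = sum(y * y for y in g_j)
                g_i = [x - (dot / norm_sq) * y for x, y in zip(g_i, g_j)]
        projected.append(g_i)
    return [sum(col) for col in zip(*projected)]

# Toy example: a progress gradient and an energy gradient that conflict.
print(pcgrad([[1.0, 0.0], [-1.0, 1.0]]))
```

When no pair of gradients conflicts, the function reduces to an ordinary sum, so the surgery only changes the update in the cases where one objective would otherwise undo another.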


[94] 2606.08663

Probing Token Spaces under Generator Shift in AI-Generated Music Detection

AI-generated music detectors can appear robust on standard benchmark splits, yet their deployments require transfer to generator sources absent during training. We study this problem with source-restricted evaluation on MoM-open, an open reconstruction of MoM-CLAM that replaces the non-redistributable real corpus with FMA and MTG-Jamendo while preserving the fake-generator protocol. To isolate the role of representation, we introduce CoMoE, a compact fixed classifier for comparing heterogeneous audio token spaces while keeping the downstream architecture and training recipe unchanged. Experiments show that standard and real-source-restricted splits are nearly saturated, whereas fake-source restriction exposes large differences between token spaces: X-Codec tokens are strongest when training on Udio alone, while MERT-derived tokens are stronger when training on Suno-v3.5 alone. These results suggest that codec-style discrete token spaces should be treated as a primary experimental axis under generator shift in AI-generated music detection. Our code and data are available at this https URL.


[95] 2606.12349

Traceable Virtual Sea Trials in the Marine Robotics Unity Simulator for Manoeuvring Assessment of Unmanned Surface Vehicles

Accurate identification of hydrodynamic derivatives is essential for precise control and autonomous navigation of Unmanned Surface Vehicles (USVs). However, acquiring high-fidelity manoeuvring data from physical sea trials is often constrained by cost, safety, and environmental disturbances. Standard manoeuvring trials, particularly Turning Circle (TC) and Zig-Zag (ZZ), remain fundamental to IMO and ITTC assessment procedures because they provide comparable performance metrics reflective of underlying hydrodynamic behaviour. This paper extends the open-source Marine Robotics Unity Simulator (MARUS) by introducing a standardised Virtual Sea Trial framework for automated execution and data generation of TC/ZZ manoeuvres. The framework provides traceable command-actuation logging, system-identification (SI)-focused data conditioning, and automated extraction of IMO/ITTC-aligned manoeuvring metrics. A key contribution is a dedicated TC/ZZ data acquisition and post-processing pipeline, improving the repeatability and auditability of simulator-based manoeuvres while producing SI-ready datasets for hydrodynamic-derivative identification and digital-twin workflows. The framework also provides explicit command-execution separation for differential-thrust steering, where manoeuvre inputs are recorded as ordered rudder-equivalent commands and realised actuation is logged as an execution-level proxy derived from applied thrust. Case study results demonstrate repeatable and IMO-compliant manoeuvre behaviour. For TC tests, the normalised advance differs by approximately 3.9% between port and starboard turns, while the tactical diameter differs by 4.6-4.7%. For ZZ tests, first and second overshoot excesses remain below 1 degree for both +/-10-degree and +/-20-degree manoeuvres, satisfying IMO criteria, while peak yaw rates range from approximately 4.1 to 5.8 degrees/second.