New articles on Electrical Engineering and Systems Science


[1] 2604.20918

EDU-Net: Retinal Pathological Fluid Segmentation in OCT Images with Multiscale Feature Fusion and Boundary Optimization

Objective: Diabetic macular edema (DME) is the leading cause of severe visual impairment in patients with diabetes. Quantification of retinal fluid, particularly intraretinal fluid (IRF) and subretinal fluid (SRF), plays a critical role in the management of DME. Although optical coherence tomography (OCT) can be used for detection, the variable morphology of fluid accumulation and the blurred boundaries caused by noise interference still limit the accuracy of automatic segmentation in OCT images. Methods: Retrospective model development and validation study. This study proposes a novel edge-guided dual-branch encoder-decoder network (EDU-Net) to achieve accurate and efficient automatic segmentation of fluid lesions in OCT images. The local feature extraction branch is based on the EfficientNet model, which precisely captures tiny lesions by leveraging its lightweight separable convolution and high-resolution feature preservation strategy. The global feature extraction branch is based on the large-kernel efficient convolution (LKEC) module and the downsampling layer design to enhance long-range dependencies and global semantics. EDU-Net applies a multi-category edge-guided attention module to fuse high-frequency boundary detail into the features at each resolution, optimizing boundary segmentation performance. Results: Extensive results on the in-house and public datasets demonstrate that EDU-Net achieves state-of-the-art DSC segmentation performance in terms of efficiency and robustness, especially in the segmentation of IRF lesions. Conclusions: EDU-Net integrates local details with global context and optimizes boundaries, achieving an improvement in the accuracy of automatic segmentation of retinal fluid.


[2] 2604.20979

A Complete Approach to Time Varying Linear Systems

This paper presents a unifying theory of linear second-order systems that allows time-varying and time-invariant systems to be treated in the same way for the first time. In the process, a transformation is given that diagonalizes an arbitrary time-varying state matrix in a spectrum-invariant way. A canonical form for the fundamental matrix is given that depends on dynamic eigenvalues and related eigenvectors, which in turn depend on the Riccati Characteristic Equation for the system; this equation intuitively generalizes the standard characteristic equation for time-invariant systems. The technique is shown by examples to give a unified approach to the solutions of time-invariant, time-varying, and periodic systems.


[3] 2604.21022

The Radon Transform, True Time Delay Beamforming, and Ultra-Wideband Antenna Arrays (Invited Paper)

The FR3 band has emerged as the major focus of 6G wireless research. FR3 cellular operation presents the challenge of extreme bandwidth combined with physically large antenna arrays. In this regime, conventional phase-shift beamforming entails a loss of coherence (beam-squint), and has to be replaced by true time delay beamforming (TTD). It happens that TTD is mathematically equivalent to taking the Radon transform of the space/time measurements. We exploit fifty years of research in the application of the Radon transform to computed tomography and to seismic exploration to elucidate the workings of TTD. We use the Radon transform combined with semblance detection and Radon slowness filtering to remove far-field signals from the measured space/time signals from a linear array, leaving only near-field signals. In turn we partition the array into sub-arrays. For each sub-array we estimate, via the semblance Radon transform, the angles-of-arrival of the near-field signals. We then use triangulation to estimate the coordinates of the near-field sources. Finally we integrate the original space/time data along hyperbolic trajectories to extract the individual near-field signal envelopes.


[4] 2604.21030

A Systematic Review and Taxonomy of Reinforcement Learning-Model Predictive Control Integration for Linear Systems

The integration of Model Predictive Control (MPC) and Reinforcement Learning (RL) has emerged as a promising paradigm for constrained decision-making and adaptive control. MPC offers structured optimization, explicit constraint handling, and established stability tools, whereas RL provides data-driven adaptation and performance improvement in the presence of uncertainty and model mismatch. Despite the rapid growth of research on RL--MPC integration, the literature remains fragmented, particularly for control architectures built on linear or linearized predictive models. This paper presents a comprehensive Systematic Literature Review (SLR) of RL--MPC integrations for linear and linearized systems, covering peer-reviewed and formally indexed studies published until 2025. The reviewed studies are organized through a multi-dimensional taxonomy covering RL functional roles, RL algorithm classes, MPC formulations, cost-function structures, and application domains. In addition, a cross-dimensional synthesis is conducted to identify recurring design patterns and reported associations among these dimensions within the reviewed corpus. The review highlights methodological trends, commonly adopted integration strategies, and recurring practical challenges, including computational burden, sample efficiency, robustness, and closed-loop guarantees. The resulting synthesis provides a structured reference for researchers and practitioners seeking to design or analyze RL--MPC architectures based on linear or linearized predictive control formulations.


[5] 2604.21040

Online Long-Term Voltage Stability Margin Estimation for IBR/DER Dominated Power System with Integrated VSM-Aware TSO-DSO Framework

The rapid growth of inverter-based resources (IBRs) and distributed energy resources (DERs) has fundamentally altered the long-term voltage stability characteristics of modern power systems. This article leverages the advantages of machine learning (ML) for the online estimation of long-term voltage stability margin (VSM) and enhancement of VSM through coordinated transmission system operator-distribution system operator (TSO-DSO) optimization. An explicit analytical VSM expression is derived from offline T&D co-simulation data using a physics-informed ML-trained model under probabilistic loading and generation mix scenarios, while accounting for unbalanced distribution modeling. The resulting closed-form VSM representation is linearized and embedded into the TSO optimization problem, enabling real-time enforcement of minimum VSM constraints. We further enhance operational efficiency by incorporating VSM sensitivities into both transmission and distribution optimization, allowing prioritization of the most influential reactive power resources. Simulation studies conducted on the IEEE 30-bus transmission network integrated with multiple IEEE 37-node distribution feeders validate that the proposed framework successfully achieves the desired VSM enhancement while maintaining high estimation accuracy.


[6] 2604.21065

On the dynamic behavior of the network SIRS epidemic model

We study the Susceptible-Infected-Recovered-Susceptible (SIRS) epidemic model on deterministic networks. For connected but otherwise general interaction patterns and heterogeneous recovery and loss-of-immunity rates, we identify a fundamental parameter R_0 (the basic reproduction number), which fully characterizes the qualitative dynamic behavior of the system. This parameter is the dominant eigenvalue of a rescaled version of the interaction matrix, whose rows are normalized by the corresponding recovery rates. We prove that a transcritical bifurcation occurs as R_0 crosses the threshold value 1. Specifically, we show that, if R_0 does not exceed 1, then the disease-free equilibrium is globally asymptotically stable, whereas, if R_0 is larger than 1, then the disease-free equilibrium is unstable and there exists a unique endemic equilibrium, which is asymptotically stable. As a byproduct of our analysis, we also identify key monotonicity properties of the dependence of the endemic equilibrium on the model parameters (the interaction matrix as well as the recovery rates and the loss-of-immunity rates) and obtain a distributed iterative algorithm for its computation, with provable convergence guarantees. Our results extend existing ones available in the literature for network SIRS epidemic models with rank-one interaction matrices and homogeneous recovery rates (including the single homogeneous population SIRS epidemic model).
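As a rough illustration of the R_0 definition in this abstract, the sketch below divides each row of an interaction matrix by the corresponding recovery rate and estimates the dominant eigenvalue by power iteration. The names `A` and `gamma`, the convention that `A[i][j]` is the rate at which node j infects node i, and the example numbers are assumptions made here for illustration; this is not the paper's distributed algorithm. The iteration runs on M + I rather than M so that it also converges on periodic (for example bipartite) networks.

```python
def basic_reproduction_number(A, gamma, iters=500):
    """Estimate R_0 as the dominant eigenvalue of M, where M[i][j] = A[i][j] / gamma[i].

    A     : nonnegative interaction matrix of a connected network (assumed convention)
    gamma : positive recovery rates, one per node
    """
    n = len(A)
    # Row i of the interaction matrix is normalized by the recovery rate of node i.
    M = [[A[i][j] / gamma[i] for j in range(n)] for i in range(n)]
    x = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        # Power iteration on M + I; the shift by I avoids oscillation on periodic graphs.
        y = [x[i] + sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = max(y)
        x = [v / lam for v in y]
    # The dominant eigenvalue of M + I is R_0 + 1.
    return lam - 1.0


def regime(r0):
    """Qualitative behavior stated in the abstract, keyed on the threshold R_0 = 1."""
    return "disease-free equilibrium is stable" if r0 <= 1.0 else "endemic equilibrium exists"


# Two-node example with heterogeneous recovery rates (made-up numbers).
A_example = [[0.0, 2.0], [8.0, 0.0]]
gamma_example = [1.0, 4.0]
r0_example = basic_reproduction_number(A_example, gamma_example)
```

Here the normalized matrix is [[0, 2], [2, 0]], so the estimate should be close to 2 and the example falls in the endemic regime.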


[7] 2604.21115

Complex Approximate Message Passing with Non-separable Denoising

Approximate Message Passing (AMP) is a general framework for iterative algorithms, originally developed for compressed sensing and later extended to a wide range of high-dimensional inference problems. Although recent work has advanced matrix AMP, complex AMP, and AMP for non-separable functions independently, a unified state evolution theory for complex AMP with non-separable denoisers has been lacking. This article fills that gap by establishing state evolution in the setting of complex, non-separable denoising functions. The proposed approach constructs an augmented real-valued system that lifts the problem to a higher-dimensional space, then recovers the complex domain through a many-to-one canonical transformation. Under this construction, the Onsager correction naturally involves Wirtinger derivatives, and the resulting state evolution reduces to scalar complex recursions despite the non-separable structure of the denoisers. The framework extends to the matrix-valued setting, accommodating multiple feature vectors simultaneously. This generalization enables AMP to exploit joint structural constraints, such as simultaneous group and element sparsity, in complex-valued recovery problems. The complex sparse group least absolute shrinkage and selection operator (LASSO) serves as a key instantiation, motivated by preamble detection in Orthogonal Time-Frequency Space (OTFS)-based unsourced random access. Numerical experiments confirm that state evolution accurately predicts performance and show that complex non-separable denoising can produce significant gains over separable and real-valued alternatives.


[8] 2604.21126

Threat Detection and Resilience Techniques in PRS-Assisted OTDOA 5G Positioning Systems

Precise positioning is a key enabler for emerging 5G applications, from autonomous transport to industrial automation. Yet the open physical layer (PL) leaves standard positioning reference signals (PRSs) vulnerable to manipulation. This work addresses the security of downlink observed time difference of arrival positioning (DL-OTDOA) through three contributions. First, we introduce VeriLoc, an open-source system-level simulator designed for realistic channel modeling and PL threat injection. Second, we propose three novel security techniques to enhance resilience and threat detection: encrypted PRS to prevent adversarial waveform synthesis, angular-based source authentication (ABSA), and a cross-layer downlink-uplink handshaking protocol to detect attacks that cannot be mitigated by encryption. Third, utilizing VeriLoc, we evaluate the proposed techniques alongside position tracking and a PRS authentication scheme, which extends the original hash-based message authentication code (HMAC) scheme design to support digital signatures. Simulation results demonstrate that while encryption, authentication schemes, and tracking robustly counter selective PRS spoofing and jamming, the proposed spatial and cross-layer mechanisms are essential for detecting meaconing, collectively maintaining attack detection rates in excess of 90% while keeping false alarm rates minimal.


[9] 2604.21163

Efficient Design of Fronthaul-Constrained Uplink Reception for Cell-Free XL-MIMO

With the evolution of multiple-input multiple-output (MIMO) technology toward extremely large (XL) MIMO systems comprising hundreds of antennas or more, this work investigates scalable and fronthaul-efficient reception design for the uplink of cell-free (CF) XL-MIMO systems. In such systems, the uplink signals transmitted by mobile user equipments (UEs) are jointly decoded at a central processing unit (CPU) connected to distributed access points (APs) via finite-capacity fronthaul links. We address the joint optimization of linear transform matrices, used by the APs to reduce the signal dimension and fronthaul load, and fronthaul compression strategies to maximize the uplink sum rate. A fractional programming (FP)-based iterative algorithm is first developed, followed by a reduced-complexity variant, termed accelerated FP (A-FP), along with its decentralized implementation whose fronthaul overhead remains independent of the number of AP antennas. Numerical results show that the proposed A-FP scheme significantly reduces computational complexity compared to FP implemented with general-purpose solvers, while substantially outperforming scalable baseline schemes that rely solely on local channel state information.


[10] 2604.21234

A Dynamic Phasor Framework for Analysis of IBR-Induced SSOs in Multi-Machine Systems

We propose a generalized dynamic phasor (DP) framework to analyze inverter-based resources (IBRs) connected to multi-machine systems under balanced and unbalanced conditions. It captures subsynchronous oscillations (SSOs) induced by grid-following (GFL) IBRs. The linearizability and time invariance of the framework enable us to perform eigendecomposition, which is a powerful tool for root-cause analysis of the SSO modes and damping controller design. The same framework also enables analysis of excitation of the SSO modes in the presence of data center (DC) loads. The GFL IBRs are modeled in their respective $dq$-frame DPs, and the detailed model of synchronous generators (SGs) along with dynamic transmission network models are represented in $pnz$-frame DPs. Several case studies are performed on the modified IEEE two-area benchmark system, where $2$ SGs are replaced by GFL IBRs, and the results are validated with EMTDC/PSCAD simulations. First, time- and frequency-domain analyses of the SSO mode are presented, followed by the design of a robust decentralized $\mathcal{H}_\infty$ damping controller based on local signals of the GFL IBRs. Second, the dynamic behavior of the system following an unbalanced fault is demonstrated, together with its damping by the proposed controller. Finally, excitation of the SSO mode in the presence of DC load is exhibited and its locational impact is analytically quantified.


[11] 2604.21248

Optimum adaptation of a Steiner network

The Euclidean Steiner tree problem, normally posed in two dimensions, seeks to connect a set of prescribed terminal nodes by placing additional nodes, known as Steiner points, with edges connecting such nodes either to another Steiner point or a terminal node, and with the placements minimising the sum of all the edge lengths of the associated tree. We consider a problem in which we start with a known solution to a Steiner tree problem, and the terminal positions are then perturbed. A first-order approximation theorem is established for efficiently updating the Steiner point positions to recover a Steiner tree solution after the perturbations to terminal nodes. Numerical examples illustrate the effectiveness of our approach (including a stepwise application for large perturbations) as well as its limitations.


[12] 2604.21259

A Convexified Eulerian Framework for Scalable Coordination of Massive DER Populations

This paper proposes a scalable coordination framework with aggregator-side privacy protection for storage-like distributed energy resources (DERs). The framework adopts a two-layer architecture. At the macroscopic layer, building upon an Eulerian modeling perspective, the DER population is represented as a continuum whose density evolution is governed by a partial differential equation (PDE), such that the computational complexity is independent of the population size. To address the bilinear non-convexity in this PDE-constrained optimization problem, we develop a convexification method that combines finite-volume discretization with a flux-lifting technique, reformulating the macroscopic problem into a sparse linear program (LP). The LP solution yields a unified, state-dependent broadcast signal for population coordination. Furthermore, a Wasserstein-based relaxation is introduced to replace rigid cyclic constraints and provide additional operational flexibility for improved economic performance. At the microscopic layer, individual resources autonomously recover local setpoints from the broadcast signal and their local states, while an upstream data-mixing protocol aggregates individual states into a macroscopic density histogram without exposing raw individual states to the aggregator. Numerical studies validate the scalability, feasibility, and economic effectiveness of the proposed framework.


[13] 2604.21262

Frequency Security Assessment in Power Systems With High Penetration of Renewables Considering Spatio-Temporal Frequency Distribution

The increasing integration of renewable energy sources exacerbates the spatial and temporal differences in frequency across the power system, posing a serious challenge to the accurate and efficient assessment of system frequency security. To address this issue, a generic effective nodal frequency (ENF) model is first established to concisely characterize nodal frequency dynamics. This model is characterized by the effective nodal inertia (ENI), damping, and primary regulation parameters, which retain only the dominant constant component governing nodal frequency dynamic performance. This model enables the tractable analytical formulation of the nodal frequency trajectory and the key frequency security indicators. Quantitative analysis under the temporary power disturbance condition reveals that the ENI is the most influential parameter governing frequency security. Consequently, the critical nodal inertia for ensuring nodal frequency security is analytically derived. A system-level frequency security index based on the actual ENI and critical nodal inertia is proposed. On the basis of the proposed index, the system frequency security assessment is carried out with the procedure of "offline calculation and online evaluation", which is achieved using a lookup table approach and an interpolation method. Simulations on the modified IEEE 39-bus system verify the effectiveness of the proposed assessment method.


[14] 2604.21294

Analytical PI Tuning for Second-Order Plants with Monotonic Response and Minimum Settling Time

Background: Tuning proportional-integral (PI) controllers for second-order plants to achieve a monotonic step response with minimum settling time is an important problem in analytical control design. Existing methods address these objectives only partially or require numerical optimization. Methods: A closed-form analytical solution is derived through pole placement in the framework of Åström and Hägglund. The key insight is that designing the closed-loop poles slower than the fast plant pole forces pole-zero cancellation of the slow plant pole as a consequence, not an assumption. The critically damped condition is then applied to minimize settling time. Results: The optimal PI parameters are K=T1/(4KpT2), Ti=T1, where T1 and T2 are the plant time constants and Kp is the plant gain. No free parameter remains. The resulting closed-loop system possesses universal robustness properties independent of plant parameters: maximum complementary sensitivity Mt = 1, maximum sensitivity Ms = 1.155, and phase margin PM = 76.35 degrees. Conclusions: The proposed tuning formulas are explicit, analytically proven, and apply directly to any stable second-order plant with two real poles. Simulation results across six plant configurations confirm the analytical predictions exactly. The notation follows Åström and Hägglund [5] throughout. Keywords: PI controller; second-order plant; pole placement; critically damped; monotonic response; settling time; robustness
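The tuning rule quoted in this abstract can be checked numerically with a short sketch. It assumes the plant is G(s) = Kp / ((1 + s T1)(1 + s T2)) with T1 the slow time constant, and the controller is C(s) = K (1 + 1 / (s Ti)); neither form is spelled out in the abstract, so both are assumptions, as are the function names and the example plant values. Under these assumptions, Ti = T1 cancels the slow pole, the closed-loop characteristic polynomial becomes T1 T2 s^2 + T1 s + K Kp, and setting its discriminant to zero gives K = T1 / (4 Kp T2).

```python
import cmath
import math


def pi_gains(T1, T2, Kp):
    """Gains from the abstract: K = T1 / (4 Kp T2), Ti = T1 (T1 assumed to be the slow constant)."""
    return T1 / (4.0 * Kp * T2), T1


def loop_tf(w, T1, T2, Kp, K, Ti):
    """Open-loop frequency response L(jw) for the assumed plant and PI controller forms."""
    s = 1j * w
    plant = Kp / ((1.0 + s * T1) * (1.0 + s * T2))
    controller = K * (1.0 + 1.0 / (s * Ti))
    return plant * controller


def robustness_metrics(T1, T2, Kp, points=20000):
    """Return (Ms, Mt, phase margin in degrees), estimated on a logarithmic frequency grid."""
    K, Ti = pi_gains(T1, T2, Kp)
    lo, hi = 1e-3 / T2, 1e3 / T2
    ms = mt = 0.0
    for k in range(points):
        w = lo * (hi / lo) ** (k / (points - 1))
        L = loop_tf(w, T1, T2, Kp, K, Ti)
        ms = max(ms, abs(1.0 / (1.0 + L)))  # sensitivity peak
        mt = max(mt, abs(L / (1.0 + L)))    # complementary sensitivity peak
    # |L(jw)| decreases monotonically here, so bisect (geometrically) for the gain crossover.
    a, b = lo, hi
    for _ in range(200):
        mid = math.sqrt(a * b)
        if abs(loop_tf(mid, T1, T2, Kp, K, Ti)) > 1.0:
            a = mid
        else:
            b = mid
    pm = 180.0 + math.degrees(cmath.phase(loop_tf(a, T1, T2, Kp, K, Ti)))
    return ms, mt, pm


# Example plant (made-up numbers): T1 = 10 s, T2 = 1 s, Kp = 2.
K_example, Ti_example = pi_gains(10.0, 1.0, 2.0)
ms_example, mt_example, pm_example = robustness_metrics(10.0, 1.0, 2.0)
```

With the cancellation in place, the normalized loop is 1 / (4 s' (1 + s')) with s' = s T2, which is why the three metrics do not depend on the plant parameters.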


[15] 2604.21302

Scalable Sensor Scheduling for Continuous-Discrete Kalman Filtering via Information-Form Surrogate Dynamics

We study sensor scheduling for continuous-discrete Kalman filtering with Poisson measurement arrivals and propose an information-form deterministic surrogate for scalable offline design. Unlike the covariance-form surrogate, the sensing rates enter through sensor-specific additive information increments, eliminating mixed state-input derivatives in the transcribed nonlinear program and thereby yielding a simpler derivative structure. We further show that, together with the covariance-form surrogate, the proposed surrogate provides computable two-sided performance bounds for a given schedule under stochastic measurement arrivals. Numerical experiments demonstrate substantial computational savings, especially in many-sensor settings, while retaining comparable realized Monte Carlo performance and providing computable two-sided performance bounds for the returned schedule.


[16] 2604.21381

Privacy-Preserving Distributed Stochastic Optimization with Homomorphic Encryption and Heterogeneous Stepsizes

Distributed stochastic optimization enables multi-agent collaboration in applications such as distributed learning and sensor networks, but also raises critical privacy concerns due to the involvement of sensitive data. While existing privacy-preserving approaches often face limitations in balancing accuracy with efficiency, we propose a novel distributed stochastic gradient descent algorithm that integrates Paillier homomorphic encryption with heterogeneous and time-varying random stepsizes. The proposed algorithm provides inherent privacy protection against both internal honest-but-curious agents and external eavesdroppers, without relying on any trusted neighbors. Furthermore, we incorporate an attenuation factor to effectively mitigate quantization error induced by the encryption process, ensuring almost sure convergence to the optimal solution while maintaining privacy preservation. Numerical simulations demonstrate the effectiveness and efficiency of the proposed approach.
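To make the role of Paillier encryption and quantization in this abstract more concrete, the following toy sketch shows the additive homomorphism that such schemes rely on: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. It is not the paper's algorithm. The tiny primes, the fixed-point `SCALE`, and all function names are assumptions chosen for illustration, and the key size is far too small to be secure. Real-valued states must be quantized to integers before encryption, which is the source of the quantization error the abstract mentions.

```python
import math
import random

# Toy key material (insecure, illustration only).
P, Q = 1009, 1013
N = P * Q
N2 = N * N
G = N + 1
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAMBDA, -1, N)  # valid because G = N + 1; requires Python 3.8+
SCALE = 100              # hypothetical fixed-point scale for quantization


def encode(x):
    """Quantize a real number to an integer residue modulo N."""
    return round(x * SCALE) % N


def decode(m):
    """Map a residue back to a signed real number."""
    if m > N // 2:
        m -= N
    return m / SCALE


def encrypt(m):
    """Paillier encryption c = G^m * r^N mod N^2 with a random r coprime to N."""
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (pow(G, m, N2) * pow(r, N, N2)) % N2


def decrypt(c):
    """Paillier decryption m = L(c^LAMBDA mod N^2) * MU mod N, with L(u) = (u - 1) // N."""
    u = pow(c, LAMBDA, N2)
    return ((u - 1) // N * MU) % N


def add_encrypted(c1, c2):
    """Homomorphic addition: the product of ciphertexts decrypts to the sum of plaintexts."""
    return (c1 * c2) % N2


# Two agents' states are summed without either plaintext being revealed to the adder.
c_sum = add_encrypted(encrypt(encode(0.75)), encrypt(encode(-1.5)))
recovered_sum = decode(decrypt(c_sum))
```

In this example both inputs are exactly representable at the chosen scale, so the recovered sum carries no quantization error; inputs with finer resolution than 1/SCALE would be rounded first.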


[17] 2604.21384

Estimation of Unknown Parameters in Presence of Perturbations and Noises with Application to GPEBO Design

The problem of online estimation of unknown parameters is considered for a linear regression equation affected by an additive perturbation, which can be caused by measurement noise (corrupting both the regressor and the regressand) as well as by external perturbations. Known approaches to this problem typically have one of the following disadvantages: 1) they ensure convergence of the parametric error only to a compact set with a non-adjustable bound, 2) all regressor elements must be independent of the perturbation/noise in order to annihilate it, 3) an instrumental variable must be selected. On the basis of a novel perturbation annihilation procedure, in the present paper we propose three new estimation laws, which are free from the above-mentioned drawbacks and ensure exponential convergence of the parametric error to an arbitrarily small neighborhood of zero, in particular when more than half (not all) of the regressor elements are independent of the additive perturbation. One of the proposed estimation laws is used for the design of a Generalized Parameter Estimation-Based Observer (GPEBO) for nonlinear affine systems, to enhance GPEBO performance when the measured system output is corrupted by noise. The theoretical results are supported by examples and mathematical modelling.


[18] 2604.21406

Full-Duplex Interaction in Spoken Dialogue Systems: A Comprehensive Study from the ICASSP 2026 HumDial Challenge

Full-duplex interaction, where speakers and listeners converse simultaneously, is a key element of human communication often missing from traditional spoken dialogue systems. These systems, based on rigid turn-taking paradigms, struggle to respond naturally in dynamic conversations. The Full-Duplex Interaction Track of ICASSP 2026 Human-like Spoken Dialogue Systems Challenge (HumDial Challenge) aims to advance the evaluation of full-duplex systems by offering a framework for handling real-time interruptions, speech overlap, and dynamic turn negotiation. We introduce a comprehensive benchmark for full-duplex spoken dialogue systems, built from the HumDial Challenge. We release a high-quality dual-channel dataset of real human-recorded conversations, capturing interruptions, overlapping speech, and feedback mechanisms. This dataset forms the basis for the HumDial-FDBench benchmark, which assesses a system's ability to handle interruptions while maintaining conversational flow. Additionally, we create a public leaderboard to compare the performance of open-source and proprietary models, promoting transparent, reproducible evaluation. These resources support the development of more responsive, adaptive, and human-like dialogue systems.


[19] 2604.21410

Encrypted Visual Feedback Control Using RLWE-Based Cryptosystem

This study proposes an encrypted visual feedback control algorithm for regulating a one-dimensional stage using Ring Learning With Errors (RLWE) encryption. The proposed algorithm performs both feature extraction and controller computations directly on encrypted images, ensuring that sensitive visual data remain protected throughout the entire control process. Furthermore, an image captured by the camera is encrypted into a single ciphertext leveraging the message packing technique of RLWE encryption, thereby reducing computational cost. The effectiveness of the proposed framework is demonstrated through numerical simulations.


[20] 2604.21484

HyperCEUNet: Parameter-Aware Hypernetwork-Driven UNet for Channel Estimation

Deep learning-based channel estimation has been recognized as a promising technique for sixth-generation wireless systems. However, most existing approaches rely solely on least-squares estimates obtained from demodulation reference signals, which fail to explicitly exploit channel time-frequency correlation parameters. Inspired by the independent channel parameter estimation enabled by semi-static reference signals in modern wireless systems, this letter presents a parameter-aware deep learning-based channel estimation framework termed HyperCEUNet. Specifically, the proposed hypernetwork generates an adaptive front-end convolutional layer based on estimated channel parameters, serving as a pre-filtering stage before the UNet-based estimator. In addition, the Wiener-filtered channel estimates are adopted to provide a correlation-aware initialization for data resources. Simulation results demonstrate that our proposed HyperCEUNet effectively improves channel estimation accuracy compared with its conventional counterparts.


[21] 2604.21487

Monolithically Integrated VO$_2$ Mott Oscillators for Energy-Efficient Spiking Neurons

Brain-inspired non-Boolean computing offers intrinsic error tolerance and parallelism, but its practical deployment is limited by the lack of compact, energy-efficient spiking hardware compatible with large-scale integration. Mott phase-transition materials provide a promising route, as their abrupt insulator-to-metal transitions enable neuron-like thresholding and oscillatory dynamics in compact devices. Among these, vanadium dioxide (VO$_2$) stands out for its near-room-temperature transition, fast switching, and scalability. However, existing VO$_2$-based neuristors rely on discrete components, limiting integration density and system applicability. Here, we report monolithic back-end-of-the-line (BEOL) integration of one-transistor-one-VO$_2$-memristor (1T-1MR) spiking neurons on CMOS-compatible platforms. VO$_2$ nanosheet devices are fabricated by pulsed-laser deposition below 430 °C on dielectrically isolated silicon-on-insulator (SOI) p-type junctionless field-effect transistors (JLFETs) in a compact 1T-1MR configuration. The architecture exhibits gate-tunable oscillations from 40 to 410 kHz in 60 nm-thick VO$_2$ devices with an active area of 6 $\mu$m$^2$, achieving energy consumption as low as 18 pJ per spike at room temperature, with memristor power dissipation of 8 $\mu$W and potential scaling toward sub-3 $\mu$W operation. We further uncover a non-monotonic dependence of oscillation frequency on current and temperature, along with bias-dependent stochastic firing dynamics, highlighting the rich behavior of integrated VO$_2$ memristor systems. Finally, we demonstrate voltage-controlled oscillator functionality and actively tunable resistive coupling of two nano-oscillators mediated by a JLFET. These results establish a pathway toward dense, energy-efficient, and monolithically integrated Mott-based neuromorphic hardware compatible with CMOS technology.


[22] 2604.21507

DiariZen Explained: A Tutorial for the Open Source State-of-the-Art Speaker Diarization Pipeline

Speaker diarization (SD) is the task of answering "who spoke when" in a multi-speaker audio stream. Classically, an SD system clusters segments of speech belonging to an individual speaker's identity. Recent years have seen substantial progress in SD through end-to-end neural diarization (EEND) approaches. DiariZen, a hybrid SD pipeline built upon a structurally pruned WavLM-Large encoder, a Conformer backend with powerset classification, and VBx clustering, represents the leading open-source state of the art at the time of writing across multiple benchmarks. Despite its strong performance, the DiariZen architecture spans several repositories and frameworks, making it difficult for researchers and practitioners to understand, reproduce, or extend the system as a whole. This tutorial paper provides a self-contained, block-by-block explanation of the complete DiariZen pipeline, decomposing it into seven stages: (1) audio loading and sliding window segmentation, (2) WavLM feature extraction with learned layer weighting, (3) Conformer backend and powerset classification, (4) segmentation aggregation via overlap-add, (5) speaker embedding extraction with overlap exclusion, (6) VBx clustering with PLDA scoring, and (7) reconstruction and RTTM output. For each block, we provide the conceptual motivation, source code references, intermediate tensor shapes, and annotated visualizations of the actual outputs on a 30s excerpt from the AMI Meeting Corpus. The implementation is available at this https URL, which includes standalone executable scripts for each block and a Jupyter notebook that runs the complete pipeline end-to-end.


[23] 2604.21518

DiffNR: Diffusion-Enhanced Neural Representation Optimization for Sparse-View 3D Tomographic Reconstruction

Neural representations (NRs), such as neural fields and 3D Gaussians, effectively model volumetric data in computed tomography (CT) but suffer from severe artifacts under sparse-view settings. To address this, we propose DiffNR, a novel framework that enhances NR optimization with diffusion priors. At its core is SliceFixer, a single-step diffusion model designed to correct artifacts in degraded slices. We integrate specialized conditioning layers into the network and develop tailored data curation strategies to support model finetuning. During reconstruction, SliceFixer periodically generates pseudo-reference volumes, providing auxiliary 3D perceptual supervision to fix underconstrained regions. Compared to prior methods that embed CT solvers into time-consuming iterative denoising, our repair-and-augment strategy avoids frequent diffusion model queries, leading to better runtime performance. Extensive experiments show that DiffNR improves PSNR by 3.99 dB on average, generalizes well across domains, and maintains efficient optimization.


[24] 2604.21532

Using Assembly Language for Creating Games

The aim of this paper is to demonstrate some interesting and useful approaches to writing programs in assembly language. To demonstrate the possibilities of assembly language, a project called "Arkanoid" was created. The project is written in assembly language and presents a few interesting algorithms. The game is written in x86 assembly language, which produces object code for the x86 class of processors. Visual Studio 2015 was chosen as the working environment because it provides useful tools for debugging and testing the created software (the game). Executing the program launches an "Arkanoid" game in the Windows OS console.


[25] 2604.21542

A Characterization of Integral Input-to-state Stability for Hybrid Systems with Memory

This paper addresses characterizations of Integral Input-to-State Stability (iISS) for hybrid systems with memory. Based on the Krasovskii approach, a novel Lyapunov characterization of iISS is established to extend the hybrid system theory to the time-delay case. In particular, we introduce the notions of dissipativity, detectability and storage functional to describe the iISS property from different perspectives. Under mild regularity and convexity assumptions, the equivalence relations among diverse stability descriptions are established, which lays a solid foundation for the control design. Finally, a numerical example is presented to illustrate the derived results.


[26] 2604.21585

Scalable Multimodal Beam Alignment in V2X: An Anti-Imbalance Graph Learning Approach

Efficient beam alignment is fundamental to high-throughput and reliable connectivity in Vehicle-to-Everything (V2X) systems. However, conventional beam management in dynamic vehicular topologies incurs prohibitive alignment overhead and struggles to maintain robust links under rapid mobility. To overcome these challenges, this paper proposes a distributed multimodal graph beam alignment (GBA) framework. The core innovation lies in leveraging onboard multimodal sensing data to predict implicit feedback while employing graph neural networks to coordinate multi-user alignment, thereby jointly enhancing scalability and drastically reducing overhead. The architecture adopts a dual-network design with GBA-RSU and GBA-Vehicle units, optimized through a hybrid strategy of centralized learning and federated learning (FL) to balance global performance with local privacy. Furthermore, a dedicated data augmentation (DA) scheme is introduced to address multimodal data imbalance issues in vehicular networks. Negative augmentation applies dominant modality dropout to bolster robustness, while positive augmentation generates underrepresented samples to mitigate label imbalance. Numerical results demonstrate that GBA maintains a competitive sum rate on par with high-resolution codebook-based feedback yet reduces beam alignment overhead by over 90% and scales efficiently in mobile scenarios. Notably, integrating DA enables GBA to consistently outperform state-of-the-art FL-based alignment benchmarks, with particularly pronounced gains under severe label and modality imbalance, establishing a practical solution for V2X beam management.


[27] 2604.21608

ADMM-Based Distributed Kalman-like Observer with Applications to Cooperative Localization

This paper addresses distributed state estimation for multi-agent systems with local and relative measurements, motivated by cooperative localization problems in which the global state dimension scales with the size of the network. We consider a Kalman-like observer in information form and introduce a sparsity-preserving prediction step based on an exponential forgetting factor, thereby avoiding the dense Riccati recursion of the standard information filter. The correction step is recast as a strongly convex quadratic program with structure induced by the sensing graph, which enables a distributed solution based on the alternating direction method of multipliers (ADMM). In the resulting scheme, each agent updates local copies of its own correction variable and those of its neighbors using only local communication, thus avoiding centralized matrix inversion and consensus over full global-state quantities. A two-time-scale stability analysis is developed for the interconnected observer: the reduced estimation-error dynamics are shown to be uniformly exponentially stable, the ADMM dynamics define an exponentially stable fast subsystem, and these properties are combined to establish uniform exponential stability of the overall distributed observer. Numerical simulations in a multi-agent cooperative localization scenario illustrate the performance of the proposed distributed observer.


[28] 2604.21618

Event-Triggered Distributed Target Tracking via PRIMEX

PRIMEX (prime-based graph encoding and extraction) is a recently proposed framework for scalable distributed fusion. In PRIMEX, the information pedigree of state estimates or probability density functions is encoded using the information codes, enabling lightweight arithmetic for redundancy removal and data integration. Building on PRIMEX and its memoryless fusion strategy based on a least-squares approximation, in this paper we present two efficient distributed tracking algorithms: a consensus-based PRIMEX method that fuses information from all neighbors, and a greedy gossip-based PRIMEX method that fuses with the most informative neighbor. To further increase communication efficiency, we incorporate an event-triggered mechanism, in which transmission decisions are driven by information novelty measured using differences between the information codes. The proposed methods are evaluated and compared with covariance intersection and centralized fusion in a distributed single target tracking scenario. Simulation results show that PRIMEX-based methods remain competitive in tracking accuracy while improving communication efficiency.


[29] 2604.21644

An Adaptive Kalman Filter that Learns the Coloring Dynamics of the Process Noise

In many applications of state estimation, the process noise is colored; this case is addressed by applying the standard Kalman filter (KF) to dynamics that are augmented with the coloring dynamics. The present paper considers the case where the coloring dynamics are unknown, which renders the estimates obtained from the standard approach suboptimal. To address this problem, the present paper proposes an adaptive technique based on the principle that, if the measurement noise is white, then the innovations sequence is white if and only if the process noise is white. Leveraging this fact, an Innovations-Whitening Adaptive Kalman Filter (IWAKF) is developed, which learns the process-noise coloring online. By embedding an unknown coloring filter in a state-augmentation framework, IWAKF adapts its parameters by minimizing the empirical autocorrelation of the innovations, thereby driving them toward whiteness and restoring near-optimality without prior knowledge of the coloring dynamics.
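
The whiteness principle above can be made concrete with a small sketch of an empirical-autocorrelation cost on an innovations sequence. The abstract does not give the exact cost used by IWAKF, so the normalization, the lag range, and the names `autocorr` and `whiteness_cost` are assumptions made for illustration only.

```python
def autocorr(x, lag):
    """Normalized empirical autocorrelation of sequence x at a given lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    ck = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return ck / c0


def whiteness_cost(innovations, max_lag):
    """Sum of squared autocorrelations over lags 1..max_lag.

    A white sequence has autocorrelation near zero at every nonzero lag,
    so a smaller cost indicates innovations that are closer to white.
    """
    return sum(autocorr(innovations, k) ** 2 for k in range(1, max_lag + 1))


# A strongly correlated (alternating) toy sequence yields a large cost.
alternating = [1.0, -1.0] * 4
print(autocorr(alternating, 1), whiteness_cost(alternating, 2))
```

In an adaptive scheme of the kind described, the parameters of the assumed coloring filter would be adjusted to reduce such a cost; the optimizer itself is not shown here.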


[30] 2604.21682

PHOTON: Non-Invasive Optical Tracking of Key-Lever Motion in Historical Keyboard Instruments

This paper introduces PHOTON (PHysical Optical Tracking of Notes), a non-invasive optical sensing system for measuring key-lever motion in historical keyboard instruments. PHOTON tracks the vertical displacement of the key lever itself, capturing motion shaped by both performer input and the instrument's mechanically imposed, time-varying load. Reflective optical sensors mounted beneath the distal end of each lever provide continuous displacement, timing, and articulation data without interfering with the action. Unlike existing optical systems designed for modern pianos, PHOTON accommodates the diverse geometries, limited clearances, and non-standard layouts of harpsichords, clavichords, and early fortepianos. Its modular, low-profile architecture enables high-resolution, low-latency sensing across multiple manuals and variable key counts. Beyond performance capture, PHOTON provides real-time MIDI output and supports empirical study of expressive gesture, human-instrument interaction, and the construction of instrument-specific MIDI corpora using real historical mechanisms. The complete system is released as open-source hardware and software, from schematics and PCB layouts developed in KiCad to firmware written in CircuitPython, lowering the barrier to adoption, replication, and extension.


[31] 2604.21685

Resilience Revisited: A Multidimensional Framework Derived from Realistic Attack Scenarios

Power systems are increasingly vulnerable to high-impact, low-probability (HILP) events, including coordinated cyberattacks targeting inverter-based resources. Existing resilience frameworks rely on single-dimensional metrics that fail to capture cross-dimensional coupling effects, underestimating real system degradation under multi-vector attack conditions. This study proposes a Multidimensional Resilience Index (MDRI) that decomposes power system degradation into five interacting dimensions: physical, operational, digital-cyber, climatic, and regulatory, explicitly separating independent and coupled contributions via a calibrated multiplicative interaction term. The framework is validated on the IEEE 39-bus system under two attack scenarios derived from the December 2025 cyberattack on the Polish energy infrastructure. MDRI results show that multi-vector attacks produce degradation exceeding linear expectations by a factor of 5.6, with simultaneous dimensional failures contributing an additional 60.6% through endogenous coupling, and exogenous factors amplifying it by an additional 84%.


[32] 2604.21740

A Case Study in Recovery of Drones using Discrete-Event Systems

Discrete-event systems and supervisory control theory provide a rigorous framework for specifying correct-by-construction behavior. However, their practical application to swarm robotics remains largely underexplored. In this paper, we investigate a topological recovery method based on discrete-event systems within a swarm robotics context. We propose a hybrid architecture that combines a high-level discrete-event systems supervisor with a low-level continuous controller, allowing lost drones to safely recover from fault or attack events and re-enter a controlled region. The method is demonstrated using ten simulated UAVs in the py-bullet-drones framework. We show recovery performance across four distinct scenarios, each with varying initial state estimates. Additionally, we introduce a secondary recovery supervisor that manages the regrouping process for a drone after it has re-entered the operational region.


[33] 2604.21839

A Hidden Markov Framework for Physically Interpretable Arc Stability Dynamics in Welding Systems

Electric arc welding (EAW) exhibits strongly non-stationary and temporally evolving behavior, making reliable assessment of arc stability difficult using conventional frame-based approaches. In this study, arc dynamics are modeled as a sequence of latent operational regimes within a probabilistic state-space framework. The welding current signal is transformed into the time-frequency domain using the Short-Time Fourier Transform (STFT), and a set of physically meaningful spectral descriptors, including energy, entropy, and centroid, is extracted to construct the observation sequence. A Hidden Markov Model (HMM) is employed to capture temporal dependencies and estimate the evolution of arc states. The analysis reveals three dominant regimes, transient, stable, and extinction, with a clear monotonic increase in spectral energy and a corresponding decrease in entropy, indicating reduced variability under stable conditions. Despite partial overlap in the feature space, the inferred state sequence exhibits strong temporal coherence, supported by high state persistence and low transition rates. These findings highlight the limitations of static classification and emphasize the importance of temporal modeling. The proposed framework provides an interpretable and physically consistent representation of arc behavior, enabling more realistic monitoring and analysis of stability dynamics in welding processes.
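
The three spectral descriptors named above (energy, entropy, centroid) can be sketched for a single STFT frame as follows. This is an illustration under stated assumptions: a one-sided power spectrum from a direct DFT, Shannon entropy of the normalized spectrum, and no window function. The paper's exact definitions and the function names used here are not taken from the source.

```python
import cmath
import math


def power_spectrum(frame):
    """One-sided power spectrum of a real frame via a direct DFT."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))) ** 2
        for k in range(n // 2 + 1)
    ]


def spectral_descriptors(frame, fs):
    """Return (energy, entropy, centroid_hz) for one frame sampled at fs."""
    n = len(frame)
    p = power_spectrum(frame)
    energy = sum(p)
    if energy == 0.0:
        return 0.0, 0.0, 0.0
    probs = [v / energy for v in p]
    # Shannon entropy of the normalized spectrum (natural log).
    entropy = -sum(q * math.log(q) for q in probs if q > 0.0)
    # Power-weighted mean frequency.
    centroid = sum((k * fs / n) * q for k, q in enumerate(probs))
    return energy, entropy, centroid


# Toy frame: a pure 2 Hz cosine sampled at 16 Hz for one second.
frame = [math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
energy, entropy, centroid = spectral_descriptors(frame, fs=16)
print(energy, entropy, centroid)
```

A single clean tone concentrates its power in one bin, so its entropy is close to zero and its centroid sits at the tone frequency. Sequences of such per-frame triples would form the observations fed to the HMM.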


[34] 2604.21891

A Multi-Stage Warm-Start Deep Learning Framework for Unit Commitment

Maintaining instantaneous balance between electricity supply and demand is critical for reliability and grid stability. System operators achieve this by solving the Unit Commitment (UC) problem, a high-dimensional, large-scale Mixed-Integer Linear Programming (MILP) problem that is strictly governed by the grid's physical constraints. As grids integrate variable renewable sources and new technologies such as long-duration storage, UC must be solved optimally for multi-day horizons and potentially with greater frequency. Therefore, traditional MILP solvers increasingly struggle to compute solutions within these tightening operational time limits. To bypass these computational bottlenecks, this paper proposes a novel framework utilizing a transformer-based architecture to predict generator commitment schedules over a 72-hour horizon. Because raw predictions in high-dimensional spaces often yield physically infeasible results, the pipeline integrates the self-attention network with deterministic post-processing heuristics that systematically enforce minimum up/down times and minimize excess capacity. Finally, these refined predictions are used as a warm start for a downstream MILP solver, while a confidence-based variable fixation strategy drastically reduces the combinatorial search space. Validated on a single-bus test system, the complete multi-stage pipeline achieves 100% feasibility and significantly accelerates computation times. Notably, in approximately 20% of test instances, the proposed model reached a feasible operational schedule with a lower overall system cost than relying solely on the solver.
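
One possible form of a deterministic post-processing step that enforces minimum up/down times on a predicted on/off schedule is sketched below. The abstract does not specify the heuristic, so this repair rule is an assumption: it only switches a unit on (never off), it ignores the excess-capacity step, and the function name `enforce_min_times` is hypothetical.

```python
def enforce_min_times(schedule, min_up, min_down):
    """Repair a 0/1 commitment schedule for a single generator.

    Assumed rule: short ON runs are extended forward, and short OFF gaps
    between two ON runs are filled. Runs touching the horizon boundary are
    left alone because their true length is unknown.
    """
    fixed = list(schedule)
    n = len(fixed)
    changed = True
    while changed:
        changed = False
        t = 0
        while t < n:
            end = t
            while end < n and fixed[end] == fixed[t]:
                end += 1
            length = end - t
            if fixed[t] == 1 and end < n and length < min_up:
                # Extend a short ON run to the minimum up time.
                for k in range(end, min(n, t + min_up)):
                    fixed[k] = 1
                changed = True
                break  # rescan, since runs may have merged
            if fixed[t] == 0 and t > 0 and end < n and length < min_down:
                # Fill a short OFF gap between two ON runs.
                for k in range(t, end):
                    fixed[k] = 1
                changed = True
                break
            t = end
    return fixed


raw = [0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0]
repaired = enforce_min_times(raw, min_up=3, min_down=2)
print(repaired)
```

Because every repair turns at least one hour from off to on, the loop terminates. A repaired schedule of this kind could then be passed to a MILP solver as a warm start.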


[35] 2604.20878

AITP: Traffic Accident Responsibility Allocation via Multimodal Large Language Models

Multimodal Large Language Models (MLLMs) have achieved remarkable progress in Traffic Accident Detection (TAD) and Traffic Accident Understanding (TAU). However, existing studies mainly focus on describing and interpreting accident videos, leaving room for deeper causal reasoning and integration of legal knowledge. Traffic Accident Responsibility Allocation (TARA) is a more challenging task that requires multi-step reasoning grounded in traffic regulations. To address this, we introduce AITP (Artificial Intelligence Traffic Police), a multimodal large language model for responsibility reasoning and allocation. AITP enhances reasoning via a Multimodal Chain-of-Thought (MCoT) mechanism and integrates legal knowledge through Retrieval-Augmented Generation (RAG). We further present DecaTARA, a decathlon-style benchmark unifying ten interrelated traffic accident reasoning tasks with 67,941 annotated videos and 195,821 question-answer pairs. Extensive experiments show that AITP achieves state-of-the-art performance across responsibility allocation, TAD, and TAU tasks, establishing a new paradigm for reasoning-driven multimodal traffic analysis.


[36] 2604.20898

A Tendon-Driven Wrist Abduction-Adduction Joint Improves Performance of a 5 DoF Upper Limb Exoskeleton -- Implementation and Experimental Evaluation

Wrist function is essential in performing activities of daily living (ADLs). However, there is limited experimental evidence on the functional impact of wrist Abduction-Adduction (Ab-Ad) joint assistance in upper limb exoskeletons (ULEs) for rehabilitation. This study evaluates the effect of implementing an active wrist Ab-Ad joint in a five degree of freedom (DoF) ULE, the EXOTIC2 exoskeleton, to support individuals with severe motor impairments. Methods: A compact, lightweight wrist module with tendon-driven abduction and spring-driven adduction was integrated into the EXOTIC exoskeleton. Eight adults with no motor disabilities completed drinking and scratching tasks under randomized wrist-enabled and wrist-locked conditions, along with a preliminary feasibility test in one individual with amyotrophic lateral sclerosis (ALS). Kinematic and task performance metrics, including wrist range of motion, task completion time, spillage, and leveling metrics, were assessed. Results: Implementing the wrist Ab-Ad DoF improved task success metrics. Spill incidence during the drinking task decreased from 56% to 3%, and leveling success for the scratching task improved from 28% to 75%. Conclusion: Integrating wrist Ab-Ad assistance improved key functional task outcomes without increasing execution time. Significance: The study provides experimental evidence that active wrist Ab-Ad control enhances task-level performance in exoskeleton-assisted ADLs.


[37] 2604.20910

Planetary Exploration 3.0: A Roadmap for Software-Defined, Radically Adaptive Space Systems

The surface and subsurface of worlds beyond Mars remain largely unexplored. Yet these worlds hold keys to fundamental questions in planetary science - from potentially habitable subsurface oceans on icy moons to ancient records preserved in Kuiper Belt objects. NASA's success in Mars exploration was achieved through incrementalism: 22 progressively sophisticated missions over decades. This paradigm, which we call Planetary Exploration 2.0 (PE 2.0), is untenable for the outer Solar System, where cruise times of a decade or more make iterative missions infeasible. We propose Planetary Exploration 3.0 (PE 3.0): a paradigm in which unvisited worlds are explored by a single or a few missions with radically adaptive space systems. A PE 3.0 mission conducts both initial exploratory science and follow-on hypothesis-driven science based on its own in situ data returns, evolving spacecraft capabilities to work resiliently in previously unseen environments. The key enabler of PE 3.0 is software-defined space systems (SDSSs) - systems that can adapt their functions at all levels through software updates. This paper presents findings from a Keck Institute for Space Studies (KISS) workshop on PE 3.0, covering: (1) PE 3.0 systems engineering including science definition, architecture, design methods, and verification & validation; (2) software-defined space system technologies including reconfigurable hardware, multi-functionality, and modularity; (3) onboard intelligence including autonomous science, navigation, controls, and embodied AI; and (4) three PE 3.0 mission concepts: a Neptune/Triton smart flyby, an ocean world explorer, and an Oort cloud reconnaissance mission.


[38] 2604.20967

Clinical Evaluation of a Tongue-Controlled Wrist Abduction-Adduction Assistance in a 6-DoF Upper-Limb Exoskeleton for Individuals with ALS and SCI

Upper-limb exoskeletons (ULEs) have the potential to restore functional independence in individuals with severe motor impairments; however, the clinical relevance of wrist degrees of freedom (DoF), particularly abduction-adduction (Ab-Ad), remains insufficiently evaluated. This study investigates the functional and user-perceived impact of wrist Ab-Ad assistance during two activities of daily living (ADLs). Wrist Ab-Ad assistance in a tongue-controlled 6-DoF ULE, EXOTIC2, was evaluated in a within-subject study involving one individual with amyotrophic lateral sclerosis and five individuals with spinal cord injury. Participants performed drinking and scratch stick leveling tasks with EXOTIC2 under two conditions: with and without wrist Ab-Ad assistance. Outcome measures included task success, task completion time, kinematic measures, and a usability questionnaire capturing comfort, functional perception, and acceptance. Enabling wrist Ab-Ad improved task success rates across both ADLs, with consistent reductions in spillage (from 77.8% to 22.2%) and failed placements (from 66.7% to 16.7%). Participants utilized task-specific subsets of the available wrist range of motion, indicating that effective control within functional ranges was more critical than maximal joint excursion. Questionnaire responses indicated no increase in discomfort with the additional DoF and reflected perceived improvements in task performance. In conclusion, wrist Ab-Ad assistance enhances functional task performance in assistive exoskeleton use without compromising user comfort. However, its effectiveness depends on task context, control usability, and individual user strategies. This study provides clinically relevant, user-centered evidence supporting the inclusion of wrist Ab-Ad in ULEs, emphasizing the importance of balancing functional capability with usability in assistive device design.


[39] 2604.20980

The Riccati Characteristic Equation

The Riccati differential equation is examined in light of its connection to second order linear time varying systems. In that light it becomes the clear generalization for the characteristic equation of linear time invariant systems, and is called the Riccati Characteristic Equation (RCE). Consequently, the RCE becomes the unifying centerpiece for the study of linear systems. Its solutions are considered in complementary pairs that form a continuum based on a primitive pair. Pairs may always be found as purely real solutions, despite the fact that complex conjugate primitive solutions are shown to exist in many cases. Not only is the pairing unique, but the general form of solutions, shown here for the first time, is uniquely compact and encompasses all known solutions, while allowing for all initial conditions. Classical engineering mathematics examples are shown to conform to this approach, which provides new insights to all, especially Floquet theory.


[40] 2604.20990

A Survey of Legged Robotics in Non-Inertial Environments: Past, Present, and Future

Legged robots have demonstrated remarkable agility on rigid, stationary ground, but their locomotion reliability remains limited in non-inertial environments, where the supporting ground moves, tilts, or accelerates. Such conditions arise in ground transportation, maritime platforms, and aerospace settings, and they introduce persistent time-varying disturbances that break the stationary-ground assumptions underlying conventional legged locomotion. This survey reviews the state of the art in modeling, state estimation, and control for legged robots in non-inertial environments. We summarize representative application domains and motion characteristics, analyze the root causes of locomotion performance degradation, and review existing methods together with their key assumptions and limitations. We further identify open problems in robot-environment coupling, observability, robustness, and experimental validation, and discuss future directions in autonomy, system-level design, bio-inspired strategies, safety, and testing. The survey aims to clarify the technical foundations of this emerging area and support the development of reliable legged robots for real-world dynamic environments.


[41] 2604.21270

CLT-Optimal Parameter Error Bounds for Linear System Identification

There has been remarkable progress over the past decade in establishing finite-sample, non-asymptotic bounds on recovering unknown system parameters from observed system behavior. Surprisingly, however, we show that the current state-of-the-art bounds do not accurately capture the statistical complexity of system identification, even in the most fundamental setting of estimating a discrete-time linear dynamical system (LDS) via ordinary least-squares regression (OLS). Specifically, we utilize asymptotic normality to identify classes of problem instances for which current bounds overstate the squared parameter error, in both spectral and Frobenius norm, by a factor of the state-dimension of the system. Informed by this discrepancy, we then sharpen the OLS parameter error bounds via a novel second-order decomposition of the parameter error, where crucially the lower-order term is a matrix-valued martingale that we show correctly captures the CLT scaling. From our analysis we obtain finite-sample bounds for both (i) stable systems and (ii) the many-trajectories setting that match the instance-specific optimal rates up to constant factors in Frobenius norm, and polylogarithmic state-dimension factors in spectral norm.


[42] 2604.21565

Pulse Shaping for Superconducting Qubits

High-fidelity control of superconducting qubits requires carefully shaped microwave pulses that account for multiple error channels. In this work, we present a pedagogical introduction to pulse-shaping techniques for transmon qubits, aiming to provide a unified, accessible framework that integrates physical intuition for pulse design, analytical understanding of gate-level descriptions, and practical considerations of hardware. This article further aims to serve as a guide for students and early researchers entering superconducting quantum computing. We begin by examining simple pulse envelopes and their spectral properties, highlighting how finite bandwidth leads to leakage outside the computational subspace. These observations motivate the introduction of the derivative removal by adiabatic gate (DRAG) technique, which uses a quadrature component proportional to the pulse's time derivative to suppress off-resonant excitations. We analyze the single-qubit case using the Magnus expansion, which provides a clear understanding of the order-by-order introduction of error channels. We discuss the practical hardware realities of control pulse generation, focusing on arbitrary waveform generators (AWG), local oscillators (LO), and IQ mixing. Common imperfections are discussed in terms of their impact on the effective pulse shape and qubit Hamiltonian. Finally, we extend the discussion to two-qubit operations, focusing on the cross-resonance gate and the emergence of effective interactions.
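
The DRAG idea described above, a quadrature component proportional to the time derivative of the in-phase envelope, can be sketched for a Gaussian pulse. The parameter values, the sign and scaling convention of `beta`, and the function name `drag_pulse` are illustrative assumptions; in practice the DRAG coefficient is calibrated and conventions differ between references.

```python
import math


def drag_pulse(n_samples, duration, sigma, amp, beta):
    """Return (I, Q) sample lists for a Gaussian pulse with a DRAG quadrature.

    I(t) = amp * exp(-(t - t0)^2 / (2 sigma^2)), centered at t0 = duration / 2
    Q(t) = beta * dI/dt
    """
    if n_samples < 2:
        raise ValueError("need at least two samples")
    t0 = duration / 2.0
    dt = duration / (n_samples - 1)
    i_wave, q_wave = [], []
    for k in range(n_samples):
        t = k * dt
        envelope = amp * math.exp(-((t - t0) ** 2) / (2.0 * sigma ** 2))
        derivative = -(t - t0) / sigma ** 2 * envelope  # analytic dI/dt
        i_wave.append(envelope)
        q_wave.append(beta * derivative)
    return i_wave, q_wave


# Toy values in arbitrary units; beta = 0.5 is not a calibrated coefficient.
i_wave, q_wave = drag_pulse(n_samples=5, duration=4.0, sigma=1.0,
                            amp=1.0, beta=0.5)
print(i_wave)
print(q_wave)
```

The in-phase envelope is symmetric about the pulse center while the quadrature component is antisymmetric and vanishes at the peak. On hardware, the two lists would drive the I and Q inputs of the mixer.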


[43] 2604.21636

A microwave super-resolution imaging approach towards breast cancer margin mapping

Accurate characterisation of margins in excised breast cancer tumours is critical to the success of surgical interventions, yet margin status is typically confirmed post-operatively using histopathology. Here we present a new approach to intraoperative margin assessment based on microwave single pixel imaging, demonstrating tissue phantom hydration mapping across large areas (~10 cm x 10 cm) at ~1 mm resolution. By leveraging the photo-induced change in microwave transparency of a silicon modulator placed under the sample, we map the microwave reflectivity and identify positive margins with deeply sub-wavelength resolution. We test the discriminatory capabilities of our approach using gelatine-based tumour phantoms with variations in water density representative of the margin and cancerous tissues of a resected tumour. We demonstrate the capability to identify, locate and quantify inadequate margins up to the typically targeted minimum thickness of 2 mm. Furthermore, using numerical modelling, we show that our approach is expected to be resilient to patient-specific tissue differences. Our technique has potential for future deployment as a real-time intraoperative tissue margin analysis tool.


[44] 2604.21651

Dilated CNNs for Periodic Signal Processing: A Low-Complexity Approach

Denoising of periodic signals and accurate waveform estimation are core tasks across many signal processing domains, including speech, music, medical diagnostics, radio, and sonar. Although deep learning methods have recently shown performance improvements over classical approaches, they require substantial computational resources and are usually trained separately for each signal observation. This study proposes a computationally efficient method based on DCNN and Re-sampling, termed R-DCNN, designed for operation under strict power and resource constraints. The approach targets signals with varying fundamental frequencies and requires only a single observation for training. It generalizes to additional signals via a lightweight resampling step that aligns time scales in signals with different frequencies to re-use the same network weights. Despite its low computational complexity, R-DCNN achieves performance comparable to state-of-the-art classical methods, such as autoregressive (AR)-based techniques, as well as conventional DCNNs trained individually for each observation. This combination of efficiency and performance makes the proposed method particularly well suited for deployment in resource-constrained environments without sacrificing denoising or estimation accuracy.


[45] 2604.21905

Low-Rank Adaptation Redux for Large Models

Low-rank adaptation (LoRA) has emerged as the de facto standard for parameter-efficient fine-tuning (PEFT) of foundation models, enabling the adaptation of billion-parameter networks with minimal computational and memory overhead. Despite its empirical success and the rapid proliferation of variants, it remains unclear which architectural choices, optimization techniques, and deployment constraints should guide practical method selection. This overview revisits LoRA through the lens of signal processing (SP), bridging modern adapter designs with classical low-rank modeling tools and inverse problems, as well as highlighting how SP principles can inform principled advances of fine-tuning approaches. Rather than providing a comprehensive enumeration and empirical comparisons of LoRA variants, emphasis is placed on the technical mechanisms underpinning these approaches to justify their effectiveness. These advances are categorized into three complementary axes: architectural design, efficient optimization, and pertinent applications. The first axis builds on singular value decomposition (SVD)-based factorization, rank-augmentation constructions, and cross-layer tensorization, while the second axis deals with initialization, alternating solvers, gauge-invariant optimization, and parameterization-aware methods. Beyond fine-tuning, emerging applications of LoRA are surveyed across the entire lifecycle of large models, ranging from pre- and post-training to serving/deployment. Finally, open research directions are outlined at the confluence of SP and deep learning to catalyze a bidirectional frontier: classical SP tools provide a principled vocabulary for designing PEFT methods, while the unique challenges facing modern deep learning, especially the overwhelming scale and prohibitive overhead, also offer new research lines benefiting the SP community in return.
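
The basic low-rank update behind LoRA can be written in a few lines. This sketch assumes the common formulation in which a frozen weight matrix W is adapted as W + (alpha / r) * B A, with B of shape d x r and A of shape r x k; the helper names and toy matrices are hypothetical, and none of the variants covered by the overview are shown.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b)))
         for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_merge(w, b, a, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank."""
    r = len(a)  # number of rows of A equals the rank
    scale = alpha / r
    delta = matmul(b, a)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


# Frozen 2 x 3 weight and a rank-1 adapter (toy values).
w = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
b = [[1.0],
     [2.0]]
a = [[1.0, 0.0, -1.0]]
merged = lora_merge(w, b, a, alpha=2.0)

d, k, r = len(w), len(w[0]), len(a)
full_params = d * k          # parameters updated by full fine-tuning
lora_params = r * (d + k)    # parameters trained by the adapter
print(merged, full_params, lora_params)
```

For matrices this small the saving is negligible, but the adapter count grows as r(d + k) rather than dk, which is the source of the efficiency at large d and k.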


[46] 2410.18217

Accurate Analytical Modeling of Small-Size Rotary Transformers for Wound-Rotor Resolvers

Rotary transformers are commonly used in wound rotor resolvers to transfer excitation signals to the rotating winding without mechanical contact. In many analyses, the rotary transformer is modeled as an ideal transformer, where the voltage transfer ratio is assumed to be equal to the turns ratio. However, in miniature rotary transformers used in compact resolver systems, leakage inductance can become comparable to the magnetizing inductance due to reduced core dimensions and unavoidable air gaps, leading to deviations from the ideal voltage transfer behavior. This paper presents an accurate equivalent circuit model for miniature rotary transformers employed in wound rotor resolvers. The proposed model analytically derives the magnetizing and leakage inductances using a magnetic equivalent circuit that accounts for flux fringing and air gap effects. The model is validated through three dimensional finite element analysis and experimental measurements on a fabricated prototype under both no load and resolver excitation conditions. The results demonstrate improved prediction accuracy of the secondary voltage compared with conventional models, enabling more reliable characterization of excitation transfer in compact resolver systems.


[47] 2502.03484

Dementia classification from spontaneous speech using wrapper-based feature selection

Dementia encompasses a group of syndromes that impair cognitive functions such as memory, reasoning, and the ability to perform daily activities. As populations globally age, over 10 million new dementia diagnoses are reported annually. Currently, clinical diagnosis of dementia remains challenging due to overlapping symptoms, the need to exclude alternative conditions and the requirement for a comprehensive clinical evaluation and cognitive assessment. This underscores the growing need to develop feasible and accurate methods for detecting cognitive deficiencies. Recent advances in machine learning have highlighted spontaneous speech as a promising noninvasive, cost-effective, and scalable biomarker for dementia detection. In this study, spontaneous speech recordings from the ADReSS and Pitt Corpus datasets are analyzed, consisting of picture description tasks performed by cognitively healthy individuals and people with Alzheimer's disease. Unlike prior approaches that focus solely on speech-active segments, acoustic features are extracted from entire recordings using the openSMILE toolkit. This representation reduces the number of feature vectors and improves computational efficiency without compromising classification performance. Classification models with classifier-based wrapper feature selection are employed to estimate feature importance and identify diagnostically relevant acoustic characteristics. Among the evaluated models, the Extreme Minimal Learning Machine achieved competitive classification accuracy with substantially lower computational cost, reflecting an inherent property of the model formulation and learning procedure. Overall, the results demonstrate that the proposed framework is computationally efficient, interpretable, and well suited as a supportive tool for speech-based dementia assessment.


[48] 2509.01331

Comparison between Supervised and Unsupervised Learning in Deep Unfolded Sparse Signal Recovery

This paper investigates the impact of loss function selection in deep unfolding techniques for sparse signal recovery algorithms. Deep unfolding transforms iterative optimization algorithms into trainable lightweight neural networks by unfolding their iterations as network layers, with various loss functions employed for parameter learning depending on application contexts. We focus on deep unfolded versions of the fundamental iterative shrinkage thresholding algorithm (ISTA) and the iterative hard thresholding algorithm (IHT), comparing supervised learning using mean squared error with unsupervised learning using the objective function of the original optimization problem. Our simulation results reveal that the effect of the choice of loss function significantly depends on the convexity of the optimization problem. For convex $\ell_1$-regularized problems, supervised-ISTA achieves better final recovery accuracy but fails to minimize the original objective function, whereas we empirically observe that unsupervised-ISTA converges to nearly the same solution as conventional ISTA but with accelerated convergence. Conversely, for nonconvex $\ell_0$-regularized problems, both supervised-IHT and unsupervised-IHT converge to better local minima than the original IHT, showing similar performance under the training conditions regardless of the loss function employed. However, when the test conditions differ from the training conditions, unsupervised-IHT generalizes well whereas supervised-IHT tends to suffer from performance degradation, suggesting that unsupervised learning offers better robustness to distribution mismatch. These findings provide valuable insights into the design of effective deep unfolded networks for sparse signal recovery applications.
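
For orientation, the conventional (untrained) ISTA iteration that deep unfolding starts from can be sketched as follows. This is a textbook version with a fixed step size 1/L, not the paper's unfolded network; in an unfolded variant the per-layer step sizes and thresholds would become trainable. The function names and the toy problem are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_objective(A, y, x, lam):
    """0.5 * ||A x - y||^2 + lam * ||x||_1, the objective ISTA minimizes."""
    return 0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(x))

def ista(A, y, lam, n_iter=200):
    """Plain ISTA with step size 1/L, where L is the squared spectral norm of A."""
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        gradient = A.T @ (A @ x - y)
        x = soft_threshold(x - gradient / L, lam / L)
    return x

# Toy sparse recovery problem (sizes chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 60))
x_true = np.zeros(60)
x_true[[5, 17, 42]] = [1.0, -2.0, 1.5]
y = A @ x_true
x_hat = ista(A, y, lam=0.1)

# The two training losses compared in the abstract, evaluated on this estimate:
supervised_loss = np.mean((x_hat - x_true) ** 2)       # needs the ground truth x_true
unsupervised_loss = lasso_objective(A, y, x_hat, 0.1)  # needs only A and y
```

The supervised loss requires ground-truth signals, while the unsupervised loss reuses the original optimization objective, which is the distinction the comparison rests on.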


[49] 2509.13576

Cross-Distribution Diffusion Priors-Driven Iterative Reconstruction for Sparse-View CT

Sparse-View CT (SVCT) reconstruction enhances temporal resolution and reduces radiation dose, yet its clinical use is hindered by artifacts due to view reduction and domain shifts from scanner, protocol, or anatomical variations, leading to performance degradation in out-of-distribution (OOD) scenarios. In this work, we propose a Cross-Distribution Diffusion Priors-Driven Iterative Reconstruction (CDPIR) framework to tackle the OOD problem in SVCT. CDPIR integrates cross-distribution diffusion priors, derived from a Scalable Interpolant Transformer (SiT), with model-based iterative reconstruction methods. Specifically, we train a SiT backbone, an extension of the Diffusion Transformer (DiT) architecture, to establish a unified stochastic interpolant framework, leveraging Classifier-Free Guidance (CFG) across multiple datasets. By randomly dropping the conditioning with a null embedding during training, the model learns both domain-specific and domain-invariant priors, enhancing generalizability. During sampling, the globally sensitive transformer-based diffusion model exploits the cross-distribution prior within the unified stochastic interpolant framework, enabling flexible and stable control over multi-distribution-to-noise interpolation paths and decoupled sampling strategies, thereby improving adaptation to OOD reconstruction. By alternating between data fidelity and sampling updates, our model achieves state-of-the-art performance with superior detail preservation in SVCT reconstructions. Extensive experiments demonstrate that CDPIR significantly outperforms existing approaches, particularly under OOD conditions, highlighting its robustness and potential clinical value in challenging imaging scenarios.


[50] 2509.19318

Scensory: Real-Time Robotic Olfactory Perception for Joint Identification and Source Localization

While robotic perception has advanced rapidly in vision and touch, enabling robots to reason about indoor fungal contamination from weak, diffusion-dominated chemical signals remains an open challenge. We introduce Scensory, a learning-based robotic olfaction framework that simultaneously identifies fungal species and localizes their source from short time series measured by affordable, cross-sensitive VOC sensor arrays. Temporal VOC dynamics encode both chemical and spatial signatures, which we decode through neural networks trained on robot-automated data collection with spatial supervision. Across five fungal species, Scensory achieves up to 89.85% species accuracy and 87.31% source localization accuracy under ambient conditions with 3-7s sensor inputs. These results demonstrate real-time, spatially grounded perception from diffusion-dominated chemical signals, enabling scalable and low-cost source localization for robotic indoor environmental monitoring.


[51] 2510.20998

Is Repeater-Assisted Massive MIMO Compatible with Dynamic TDD?

We present a framework for joint amplification and phase shift optimization of the repeater gain in dynamic time-division duplex (TDD) repeater-assisted massive MIMO networks. Repeaters, being active scatterers with amplification and phase shift, enhance the received signal strengths for users. However, they inevitably also amplify undesired noise and interference signals, which become particularly prominent in dynamic TDD systems due to the concurrent downlink (DL) and uplink (UL) transmissions, introducing cross-link interference among access points and users operating in opposite transmit directions. This causes a non-trivial trade-off between amplification of desired and undesired signals. To underpin the conditions under which such a trade-off can improve performance, we first derive DL and UL spectral efficiencies (SEs), and then develop a repeater gain optimization algorithm for SE maximization. Numerically, we show that our proposed algorithm successfully calibrates the repeater gain to amplify the desired signal while limiting the interference.


[52] 2510.27487

Towards robust quantitative photoacoustic tomography via learned iterative methods

Photoacoustic tomography (PAT) is a medical imaging modality that can provide high-resolution tissue images based on the optical absorption. Classical reconstruction methods for quantifying the absorption coefficients rely on sufficient prior information to overcome noisy and imperfect measurements. As these methods utilize computationally expensive forward models, the computation becomes slow, limiting their potential for time-critical applications. As an alternative approach, deep learning-based reconstruction methods have been established for faster and more accurate reconstructions. However, most of these methods rely on having a large amount of training data, which is rarely available in practice. In this work, we adopt the model-based learned iterative approach for use in Quantitative PAT (QPAT), in which additional information from the model is iteratively provided to the updating networks, allowing better generalizability with scarce training data. We compare the performance of different learned updates based on gradient descent, Gauss-Newton, and Quasi-Newton methods. The learning tasks are formulated as greedy, requiring iterate-wise optimality, as well as end-to-end, where all networks are trained jointly. The implemented methods are tested with ideal simulated data as well as against a digital twin dataset that emulates scarce training data and high modeling error.


[53] 2511.18884

Robust Nonlinear Transform Coding: A Framework for Generalizable Joint Source-Channel Coding

This paper proposes robust nonlinear transform coding (Robust-NTC), a generalizable digital joint source-channel coding (JSCC) framework that couples variational latent modeling with channel-adaptive transmission. Unlike learning-based JSCC methods that implicitly absorb channel variations, Robust-NTC explicitly models element-wise latent distributions via a variational objective with a Gaussian proxy for quantization and channel noise, allowing encoder-decoder to capture latent uncertainty without channel-specific training. Using the learned statistics, Robust-NTC also facilitates rate-distortion optimization to adaptively select element-wise quantizers and bit depths according to online channel conditions. To support practical deployment, Robust-NTC is integrated into an orthogonal frequency-division multiplexing (OFDM) system, where a unified resource allocation framework jointly optimizes latent quantization, bit allocation, modulation order, and power allocation to minimize transmission latency while guaranteeing learned distortion targets. Simulation results demonstrate that for practical OFDM systems, Robust-NTC achieves superior rate-distortion efficiency and stable reconstruction fidelity compared to both a conventional separated coding scheme and digital JSCC baselines across various channel conditions.


[54] 2512.04914

Analytical and Cross-Sectional Clinical Validity of a Smartphone-Based U-Turn Test in Multiple Sclerosis

Background: Gait and balance impairment can profoundly impact people with multiple sclerosis (PwMS). Objectives: To evaluate the analytical and clinical validity of the U-Turn Test (UTT), a smartphone-based assessment of dynamic balance in PwMS. Methods: The GaitLab study (ISRCTN15993728) enrolled adult PwMS (EDSS 0.0-6.5). PwMS performed the UTT in a gait laboratory (supervised) using 6 smartphones at different wear locations and daily during a two-week remote period (unsupervised) using one smartphone (belt front). Median turn speed was computed per UTT. In the supervised setting, turn detection accuracy of smartphones was compared to motion capture (mocap) via F1 scores. Agreement between smartphone- and mocap-derived turn speed was assessed by Bland-Altman and ICC(3,1). In the unsupervised setting, test-retest reliability (ICC[2,1]) and correlations with Timed 25-Foot Walk (T25FW), EDSS, Ambulation Score, 12-item Multiple Sclerosis Walking Scale (MSWS-12), and Activities-specific Balance Confidence scale (ABC) were evaluated. Results: Ninety-six PwMS were included. Turn speed was comparable across supervised (1.44 rad/s) and unsupervised settings (1.47 rad/s). In the supervised setting, turn detection was highly accurate (F1 >95% across wear locations). Turn speed agreement with mocap was high (ICC[3,1]: 0.87-0.92), with minimal bias (-0.04 to 0.11 rad/s). Unsupervised test-retest reliability (ICC[2,1]) was >0.90 when aggregating >=2 tests. Turn speed correlated with T25FW (rho=-0.79), EDSS (rho=-0.75), Ambulation score (rho=-0.73), MSWS-12 (rho=-0.65), and ABC (rho=-0.61). Conclusion: The UTT accurately and reproducibly measures turn speed across wear locations and settings, providing complementary dynamic balance insights to clinical measures and showing potential for use in multiple sclerosis trials.


[55] 2512.08216

Tumor-anchored deep feature random forests for out-of-distribution detection in lung cancer segmentation

Accurate segmentation of lung tumors from 3D computed tomography (CT) scans is essential for automated treatment planning and response assessment. Despite self-supervised pretraining on numerous datasets, state-of-the-art transformer backbones remain susceptible to out-of-distribution (OOD) inputs, often producing confidently incorrect segmentations with potential for risk in clinical deployment. Hence, we introduce RF-Deep, a lightweight post-hoc random forests-based framework that leverages deep features trained with limited outlier exposure, requiring as few as 40 labeled scans (20 in-distribution and 20 OOD), to improve scan-level OOD detection. RF-Deep repurposes the hierarchical features from the pretrained-then-finetuned segmentation backbones, aggregating features from multiple regions-of-interest anchored to predicted tumor regions to capture OOD likelihood. We evaluated RF-Deep on 2,232 CT volumes spanning near-OOD (pulmonary embolism, COVID-19 negative) and far-OOD (kidney cancer, healthy pancreas) datasets. RF-Deep achieved AUROC >~93 on the challenging near-OOD datasets, where it outperformed the next best method by 4--7 percentage points, and produced near-perfect detection (AUROC >~99) on far-OOD datasets. The approach also showed transferability to two blinded validation datasets under the ensemble configuration (COVID-19 positive and breast cancer; AUROC >~94). RF-Deep maintained consistent performance across backbones of different depths and pretraining strategies, demonstrating applicability of post-hoc detectors as a safety filter for clinical deployment of tumor segmentation pipelines.


[56] 2512.16001

Concurrence: A dependence criterion for time series, applied to biological data

Measuring the statistical dependence between observed signals is a primary tool for scientific discovery. However, biological systems often exhibit complex non-linear interactions that currently cannot be captured without a priori knowledge or large datasets. We introduce a criterion for dependence, whereby two time series are deemed dependent if one can construct a classifier that distinguishes between temporally aligned vs. misaligned segments extracted from them. We show that this criterion, concurrence, is theoretically linked with dependence, and can become a standard approach for scientific analyses across disciplines, as it can expose relationships across a wide spectrum of signals (fMRI, physiological and behavioral data) without ad-hoc parameter tuning or large amounts of data.


[57] 2601.09317

Range-Doppler-Acceleration Estimation for Fast-Moving and Accelerating Targets

A central aspect of every pulsed radar signal processor is the target Range-Doppler estimation within a Coherent Processing Interval. Conventional methods typically rely on simplifying assumptions, such as linear target motion, narrowband operation, or constant velocity, to enable fast computation. However, these assumptions break down in scenarios involving quadratic range-time behavior, high radial velocities or accelerations, or wideband signals, leading to undesired effects such as intra-pulse Doppler shift/stretch and target migration across Range-Doppler cells. This paper presents a generalized waveform-independent Range-Doppler compression approach that compensates for these effects while maintaining minimal Signal-to-Noise-Ratio loss and practical computational efficiency. The performance limits of the proposed method are analyzed and expressed through a unified metric that depends on both scene and system parameters. Comparison with other approaches is presented, showing their estimation bias and performance degradation.


[58] 2601.09384

Uplink Multi-User MIMO Implementation in OpenAirInterface

Cell-Free Multiple-Input Multiple-Output (MIMO) and Open Radio Access Network (O-RAN) have been active research topics in the wireless communication community in recent years. As an open-source software implementation of the 3rd Generation Partnership Project (3GPP) 5th Generation (5G) protocol stack, OpenAirInterface (OAI) has become a valuable tool for deploying and testing new ideas in wireless communication systems. In this paper, we present our OAI-based real-time uplink Multi-User MIMO (MU-MIMO) testbed developed at Fraunhofer HHI. As a part of our Cell-Free MIMO testbed development, we built a 2x2 MU-MIMO system using general purpose computers and commercially available software defined radios (SDRs). Using a modified OAI next-Generation Node-B (gNB) and two unmodified OAI user equipment (UE), we show that it is feasible to use Sounding Reference Signal (SRS) channel estimates to compute uplink combiners. Our results verify that this method can be used to separate and decode signals from two users transmitting in non-orthogonal time-frequency resources. This work serves as an important verification step to build a complete Cell-Free MU-MIMO system that leverages time-division duplexing (TDD) reciprocity to perform downlink beamforming over multiple cells.


[59] 2601.12334

Worst-case Nonlinear Regression with Error Bounds

We propose an active-learning method for nonlinear minimax regression. Given a nonlinear function that can be arbitrarily evaluated over a compact set, we fit a surrogate model, such as a feedforward neural network, by minimizing the maximum absolute approximation error. To handle the nonsmoothness of this worst-case loss, we introduce a smooth $L_\infty$ approximation that enables efficient gradient-based training. The training set is iteratively enriched by querying points of largest error via global optimization. We also derive constant and input-dependent worst-case error bounds over the entire input domain. The approach is validated on approximations of nonlinear functions and nonconvex sets, uncertain models of nonlinear dynamics, and explicit model predictive control laws. A Python library is available at this https URL.
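
To make the idea of a smooth surrogate for the worst-case loss concrete, here is a minimal sketch using a log-sum-exp approximation of the maximum absolute error. This is one common smoothing, chosen here as an assumption; the abstract does not specify which smooth $L_\infty$ approximation the paper uses, and the name `smooth_max_abs` and the parameter `beta` are illustrative.

```python
import numpy as np

def smooth_max_abs(errors, beta=50.0):
    """Log-sum-exp surrogate for max_i |errors_i|.

    Since |e| = max(e, -e), the maximum absolute error is the maximum over the
    2n values (e_i, -e_i). Log-sum-exp upper-bounds that maximum and exceeds it
    by at most log(2n) / beta, so larger beta gives a tighter, less smooth fit.
    """
    z = beta * np.concatenate([errors, -errors])
    shift = z.max()  # subtract the maximum for numerical stability
    return (shift + np.log(np.sum(np.exp(z - shift)))) / beta

errors = np.array([0.1, -0.7, 0.3])
exact = np.max(np.abs(errors))
approx = smooth_max_abs(errors, beta=50.0)
gap_bound = np.log(2 * errors.size) / 50.0
```

Unlike the exact maximum, this surrogate is differentiable everywhere, which is what makes gradient-based training of the surrogate model practical.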


[60] 2602.02866

Estimation of Cell-to-Cell Variation and State of Health for Battery Modules with Parallel-Connected Cells

Estimating cell-to-cell variation (CtCV) and state of health (SoH) for battery modules composed of parallel-connected cells is challenging when only module-level signals are measurable and individual cell behaviors remain unobserved. Although progress has been made in SoH estimation, CtCV estimation remains unresolved in the literature. This paper proposes a unified framework that accurately estimates both CtCV and SoH for modules using only module-level information extracted from incremental capacity analysis (ICA) and differential voltage analysis (DVA). With the proposed framework, CtCV and SoH estimations can be decoupled into two separate tasks, allowing each to be solved with dedicated algorithms without mutual interference and providing greater design flexibility. The framework also exhibits strong versatility in accommodating different CtCV metrics, highlighting its general-purpose nature. Experimental validation on modules with three parallel-connected cells demonstrates that the proposed framework can systematically select optimal module-level features for CtCV and SoH estimations, deliver accurate CtCV and SoH estimates with high confidence and low computational complexity, remain effective across different C-rates, and be suitable for onboard implementation.


[61] 2602.08260

Towards Optimal Semantic Communications: Reconsidering the Role of Semantic Feature Channels

This paper investigates the optimization of transmitting the encoder outputs, termed semantic features (SFs), in semantic communication (SC). We begin by modeling the entire communication process from the encoder output to the decoder input, encompassing the physical channel and all transceiver operations, as the SF channel, thereby establishing an encoder-SF channel-decoder pipeline. In contrast to prior studies that assume a fixed SF channel, we note that the SF channel is configurable, as its characteristics are shaped by various transmission and reception strategies, such as power allocation. Based on this observation, we formulate the SF channel optimization problem under a mutual information constraint between the SFs and their reconstructions, and analytically derive the optimal SF channel under a linear encoder-decoder structure and Gaussian source assumption. Building on this analysis, we propose a joint optimization framework for the encoder-decoder and SF channel applicable to both analog and digital SC systems. To realize the optimized SF channel, we also propose a physical-layer calibration strategy that enables real-time power control and adaptation to varying channel conditions. Simulation results demonstrate that the proposed SF channel optimization achieves superior task performance under various communication environments.


[62] 2603.06545

LiveSense: A Real-Time Wi-Fi Sensing Platform for Range-Doppler on COTS Laptop

We present LiveSense, a cross-platform system that transforms a commercial off-the-shelf (COTS) Wi-Fi Network Interface Card (NIC) on a laptop into a centimeter-level Range-Doppler sensor while preserving simultaneous communication capability. The laptops are equipped with COTS Intel AX211 (Wi-Fi 6E) or Intel BE201 (Wi-Fi 7) NICs. LiveSense can (i) Extract fully-synchronized channel state information (CSI) at >= 40 Hz, (ii) Perform time-phase alignment and self-interference cancellation on-device, and (iii) Provide a real-time stream of range, Doppler, subcarrier magnitude/phase and annotated video frames to a Python/Qt Graphical User Interface (GUI). The demo will showcase the ability to detect (i) Distance and radial velocity of attendees within a few meters of the device, (ii) Micro-motion (respiration), and (iii) Hand-gesture ranging. To the best of our knowledge, this is the first-ever demo to obtain accurate range information of targets from commercial Wi-Fi, despite the limited 160 MHz bandwidth.


[63] 2603.10845

Human Presence Detection via Wi-Fi Range-Filtered Doppler Spectrum on Commodity Laptops

Human Presence Detection (HPD) is key to enabling intelligent power management and security features in everyday devices. In this paper we propose the first HPD solution that leverages monostatic Wi-Fi sensing and detects user position using only the built-in Wi-Fi hardware of a device, with no need for external devices, access points, or additional sensors. In contrast, existing HPD solutions for laptops require external dedicated sensors which add cost and complexity, or rely on camera-based approaches that introduce significant privacy concerns. We herewith introduce the Range-Filtered Doppler Spectrum (RF-DS), a novel Wi-Fi sensing technique for presence estimation that enables both range-selective and temporally windowed detection of user presence. By applying targeted range-area filtering in the Channel Impulse Response (CIR) domain before Doppler analysis, our method focuses processing on task-relevant spatial zones, significantly reducing computational complexity. In addition, the use of temporal windows in the spectrum domain provides greater estimator stability compared to conventional 2D Range-Doppler detectors. Furthermore, we propose an adaptive multi-rate processing framework that dynamically adjusts Channel State Information (CSI) sampling rates, operating at low frame rates (10 Hz) during idle periods and high rates (100 Hz) only when motion is detected. To our knowledge, this is the first low-complexity solution for occupancy detection using monostatic Wi-Fi sensing on a built-in Wi-Fi network interface controller (NIC) of a commercial off-the-shelf laptop that requires no external network infrastructure or specialized sensors. Our solution can scale across different environments and devices without calibration or retraining.
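
The general idea of filtering by range in the CIR domain before Doppler analysis can be sketched generically as below. This is not the authors' RF-DS implementation: the function name, the static-clutter removal by mean subtraction, the Hann window, and the synthetic single-path CSI are all assumptions made for illustration.

```python
import numpy as np

def range_filtered_doppler(csi, tap_lo, tap_hi):
    """Doppler power spectrum restricted to a band of delay (range) taps.

    csi: complex array of shape (n_frames, n_subcarriers).
    Returns one power value per Doppler bin, ordered by increasing frequency.
    """
    cir = np.fft.ifft(csi, axis=1)                      # subcarriers -> delay taps
    gated = cir[:, tap_lo:tap_hi]                       # keep only the range zone of interest
    gated = gated - gated.mean(axis=0, keepdims=True)   # suppress static reflections
    window = np.hanning(csi.shape[0])[:, None]          # temporal window over frames
    spectrum = np.fft.fftshift(np.fft.fft(gated * window, axis=0), axes=0)
    return np.sum(np.abs(spectrum) ** 2, axis=1)

# Synthetic check: one reflector at delay tap 3 with a 10 Hz Doppler shift.
frame_rate, n_frames, n_sub = 100.0, 100, 64
t = np.arange(n_frames)[:, None] / frame_rate
k = np.arange(n_sub)[None, :]
csi = np.exp(-2j * np.pi * k * 3 / n_sub) * np.exp(2j * np.pi * 10.0 * t)

doppler_axis = np.fft.fftshift(np.fft.fftfreq(n_frames, d=1.0 / frame_rate))
inside = range_filtered_doppler(csi, 0, 8)     # range gate containing the reflector
outside = range_filtered_doppler(csi, 10, 20)  # range gate that excludes it
peak_hz = doppler_axis[np.argmax(inside)]
```

Because only the gated taps enter the Doppler transform, the cost scales with the width of the range zone rather than with the full set of delay taps.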


[64] 2603.27427

Dissipativity-Based Distributed Control and Communication Topology Co-Design for Nonlinear DC Microgrids

This paper presents a dissipativity-based distributed droop-free control and communication topology co-design framework for voltage regulation and current sharing in DC microgrids (MGs), where constant-power loads (CPLs) and voltage-source converter (VSC) input saturation introduce significant nonlinearities. In particular, CPLs introduce an inherently destabilizing nonlinearity, while VSC input saturation imposes hard amplitude constraints on applicable control input at each distributed generator (DG), collectively making the DC MG control system design extremely challenging. To this end, the DC MG is modeled as a networked system of DGs, transmission lines, and loads coupled through a static interconnection matrix. Each DG is equipped with a local PI-based controller with an anti-windup compensator and a distributed consensus-based global controller, from which a nonlinear networked error dynamics model is derived. The CPL nonlinearity is characterized via sector-boundedness, with the S-procedure applied directly to yield tight LMI conditions, while the VSC input saturation is handled via a dead-zone decomposition and sector-boundedness. Both nonlinearities are simultaneously absorbed into the dissipativity analysis using the S-procedure. Subsequently, local controller gains and passivity indices, and distributed controller gains and the communication topology are co-designed by solving a sequence of local and global Linear Matrix Inequality (LMI) problems, enabling a one-shot co-design process that avoids iterative procedures. The effectiveness of the proposed framework is validated through simulation of an islanded DC MG under multiple operating scenarios, demonstrating robust performance superior to conventional control approaches.


[65] 2603.28758

Distributionally Robust Planning with $\mathcal{L}_1$ Adaptive Control

Safe operation of autonomous systems requires robustness to both model uncertainty and uncertainty in the environment. We propose DRP-$\mathcal{L}_1$AC, a hierarchical framework for stochastic nonlinear systems that integrates distributionally robust model predictive control (DR-MPC) with $\mathcal{L}_1$-adaptive control. The key idea is to use the $\mathcal{L}_1$-adaptive controller's online distributional certificates that bound the Wasserstein distance between nominal and true state distributions, thereby certifying the ambiguity sets used for planning without requiring distribution samples. Environmental uncertainty is captured via data-driven ambiguity sets constructed from finite samples. These are incorporated into a DR-MPC planner enforcing distributionally robust chance constraints over a receding horizon. Using Wasserstein duality, the resulting problem admits tractable reformulations and a sample-based implementation. We show theoretically and via numerical experimentation that our framework ensures certifiable safety in the presence of simultaneous system and environmental uncertainties.


[66] 2604.01120

Diff-VS: Efficient Audio-Aware Diffusion U-Net for Vocals Separation

While diffusion models are best known for their performance in generative tasks, they have also been successfully applied to many other tasks, including audio source separation. However, current generative approaches to music source separation often underperform on standard objective metrics. In this paper, we address this issue by introducing a novel generative vocal separation model based on the Elucidated Diffusion Model (EDM) framework. Our model processes complex short-time Fourier transform spectrograms and employs an improved U-Net architecture based on music-informed design choices. Our approach matches discriminative baselines on objective metrics and achieves perceptual quality comparable to state-of-the-art systems, as assessed by proxy subjective metrics. We hope these results encourage broader exploration of generative methods for music source separation.


[67] 2604.14524

Bridging Standardized Codebook and Site-Specific Beamforming: A Unified Limited-Feedback Framework

A site-specific Type-II codebook design is proposed for downlink massive multiple-input multiple-output (MIMO) limited-feedback beamforming. The key idea is to embed a learned site-specific propagation prior into the Type-II channel state information (CSI) feedback pipeline. Specifically, the base station (BS) uses a low-overhead reference signal received power (RSRP) fingerprint collected during synchronization signal block (SSB) probing to infer a user equipment (UE)-dependent dominant beam subspace before explicit CSI acquisition. The UE then estimates and feeds back only the low-dimensional effective channel coefficients within this inferred subspace, thereby avoiding full-dimensional online subspace discovery while retaining a rich multi-beam representation capability. To analyze the proposed design and compare it with standardized feedback mechanisms, a unified subspace-projection framework is developed by jointly characterizing CSI acquisition, UE-side compression, BS-side reconstruction, and effective spectral efficiency. Under this framework, Type-I, Type-II, port-selection feedback, and the proposed scheme are interpreted as different ways of inducing a feedback representation subspace. The probing codebook and the BS-side subspace inference network are then formulated as a coupled task-oriented design problem and are optimized end-to-end by maximizing the normalized CSI-capture efficiency. Extensive simulation results demonstrate that the proposed feedback scheme achieves Type-II-comparable CSI-capture capability with substantially lower online overhead and UE-side complexity, thereby improving the effective spectral efficiency.


[68] 2604.17169

Two-Tier High Altitude Platform Stations (HAPS) for Exploring Wireless Energy Harvesting

In sixth-generation (6G) cellular networks and beyond, aerial platforms, such as uncrewed aerial vehicles (UAVs) and high-altitude platform stations (HAPS), are anticipated to play a crucial role in enhancing connectivity, expanding network coverage, and supporting advanced communication services. However, the deployment of energy-efficient onboard communication systems is essential for their widespread adoption and effectiveness. The integration of energy harvesting (EH) into aerial platforms is envisioned to be pivotal in promoting both energy and cost efficiency. In this paper, we propose a new paradigm for aerial platforms in which they can collect energy from the transmitted signals of nearby aerial platforms. The paper employs a two-tier architecture with HAPS super-macro base stations (HAPS-SMBS) system: regular HAPS-SMBS nodes serve as base stations, while a "mother" HAPS-SMBS node acts as a manager to coordinate communications between regular HAPS-SMBS and the ground station, thus enabling wireless energy transfer. Specifically, we analyze the characteristics of EH-enabled HAPS-SMBS and compare their performance with those without EH. Additionally, we derive the optimal regular HAPS-SMBS positioning to mitigate signal attenuation and power loss. Subsequently, we formulate a joint optimization problem for regular HAPS-SMBS positioning and the EH factor. We solve the problem using the iterative distance and EH factor algorithm (IDFA) and employ $Q$-learning to verify its effectiveness. Our findings indicate that, compared to conventional EH systems, IDFA and $Q$-learning exhibit higher data rate performance. In contrast, $Q$-learning outperforms IDFA systems in linear models with intensive training in approximating optimal values. Furthermore, maximizing transmit power achieves higher gains than systems without EH.


[69] 2604.17647

Prosody as Supervision: Bridging the Non-Verbal--Verbal for Multilingual Speech Emotion Recognition

In this work, we introduce a paralinguistic supervision paradigm for low-resource multilingual speech emotion recognition (LRM-SER) that leverages non-verbal vocalizations to exploit prosody-centric emotion cues. Unlike conventional SER systems that rely heavily on labeled verbal speech and suffer from poor cross-lingual transfer, our approach reformulates LRM-SER as non-verbal-to-verbal transfer, where supervision from a labeled non-verbal source domain is adapted to unlabeled verbal speech across multiple target languages. To this end, we propose NOVA-ARC, a geometry-aware framework that models affective structure in the Poincaré ball, discretizes paralinguistic patterns via a hyperbolic vector-quantized prosody codebook, and captures emotion intensity through a hyperbolic emotion lens. For unsupervised adaptation, NOVA-ARC performs optimal-transport-based prototype alignment between source emotion prototypes and target utterances, inducing soft supervision for unlabeled speech while being stabilized through consistency regularization. Experiments show that NOVA-ARC delivers the strongest performance under both non-verbal-to-verbal adaptation and the complementary verbal-to-verbal transfer setting, consistently outperforming Euclidean counterparts and strong SSL baselines. To the best of our knowledge, this work is the first to move beyond verbal-speech-centric supervision by introducing a non-verbal-to-verbal transfer paradigm for SER.


[70] 2604.19935

A Hybrid Gauss Markov LSTM Mobility Model for Indoor OWC

Optical wireless communication (OWC) has emerged as a promising candidate for future high-capacity indoor wireless networks, driven by its large unregulated spectrum, high spatial reuse, and ability to support multi-gigabit data rates. However, OWC systems are highly sensitive to user mobility, as link performance depends strongly on the spatial alignment between transmitter and receiver. Accurate modelling of user position and device orientation is therefore essential for reliable channel estimation and system evaluation. To that effect, this paper proposes a hybrid Gauss--Markov and long short-term memory (GM--LSTM) mobility model for indoor OWC environments. The Gauss--Markov component captures the temporal correlation of user motion, while the LSTM learns residual behaviour to model non-linear movement patterns and orientation dynamics. The proposed model jointly predicts user position and device orientation, enabling improved representation of mobility in OWC channels. Performance is evaluated using prediction accuracy and per-user data rate evolution. Results show that the proposed hybrid GM--LSTM model outperforms conventional Random Waypoint and Gauss--Markov models, providing more accurate mobility prediction and more stable communication performance in dynamic indoor environments.
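
The Gauss--Markov component mentioned above can be illustrated with the textbook first-order Gauss--Markov recursion for a single speed variable. This is only a minimal sketch, not the paper's hybrid model: the LSTM residual and the orientation dynamics are omitted, and the function name and parameter values (alpha, mean_speed, sigma) are illustrative assumptions.

```python
import math
import random


def gauss_markov_track(n_steps, alpha=0.8, mean_speed=1.0, sigma=0.3, seed=0):
    """Simulate a 1-D Gauss-Markov speed process (textbook form).

    alpha in [0, 1] is the memory parameter: alpha = 1 keeps the speed
    constant, alpha = 0 draws memoryless Gaussian speeds around mean_speed.
    All parameter values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    speed = mean_speed
    speeds = [speed]
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        # Blend the previous speed, the long-run mean, and fresh Gaussian noise.
        speed = (alpha * speed
                 + (1.0 - alpha) * mean_speed
                 + math.sqrt(1.0 - alpha ** 2) * sigma * noise)
        speeds.append(speed)
    return speeds


if __name__ == "__main__":
    print(gauss_markov_track(5))
```

In a hybrid scheme of the kind the abstract describes, a learned model would presumably predict the residual between such a recursion and the observed trajectory; that part is not sketched here.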


[71] 2604.20459

Rank-Aware Link Adaptation for XR Tethering Groups with Realistic Tethering Link: A Multi-Offset OLLA Framework

We investigate higher-rank transmissions for multi-connected Extended Reality (XR) devices enabled through a tethering group (TGr), in which a nearby tethering User Equipment (UE) cooperates with an XR UE via a short-range tethering link (TL). In contrast to prior studies that are limited to rank-1 transmission and ideal tethering assumptions, we analyze TGr performance under higher-rank point-to-multipoint (PTM) transmission and realistic TL delays. A conventional single Outer Loop Link Adaptation (OLLA) offset results in inaccurate throughput prediction across ranks, leading to suboptimal rank selection. To address this limitation, we propose a multi-offset Outer Loop Link Adaptation (MO-OLLA) framework that introduces rank-dependent signal-to-interference-plus-noise ratio (SINR) correction to improve Link Adaptation (LA) accuracy. Furthermore, a Wireless Fidelity (WiFi)-based delay model is incorporated to characterize the impact of practical TL constraints, including limited bandwidth and achievable throughput, on XR capacity and cellular resource utilization, providing the first such analysis for higher-rank multi-connected XR devices. System-level simulations demonstrate that MO-OLLA provides up to 20% performance improvement over conventional OLLA for multi-connected XR UEs. Moreover, TGrs effectively exploit higher-rank transmission, achieving XR capacity gains of 180-200% over single-link XR UEs under ideal TL conditions. Critically, the gains of the TGr remain at 165-180% under realistic high-throughput TLs relative to single-link XR UEs, confirming the practical viability of TGr-based cooperation for XR capacity enhancements within existing cellular resources.
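
The idea of a rank-dependent SINR correction can be sketched by keeping one conventional OLLA offset per rank. The abstract does not give the paper's update rule, so the code below is an assumption: it applies the usual ACK/NACK step rule (step_up / step_down = BLER_target / (1 - BLER_target)) independently to each rank. The class name, step size, and sign convention (offset added to the reported SINR) are hypothetical choices.

```python
class MultiOffsetOlla:
    """One OLLA offset per transmission rank (illustrative sketch only)."""

    def __init__(self, ranks, bler_target=0.1, step_down_db=1.0):
        self.step_down_db = step_down_db
        # Standard OLLA ratio so the offset is stationary at the target BLER.
        self.step_up_db = step_down_db * bler_target / (1.0 - bler_target)
        self.offsets_db = {rank: 0.0 for rank in ranks}

    def update(self, rank, ack):
        """Adjust only the offset of the rank that was actually transmitted."""
        if ack:
            self.offsets_db[rank] += self.step_up_db
        else:
            self.offsets_db[rank] -= self.step_down_db

    def corrected_sinr_db(self, rank, reported_sinr_db):
        """Reported SINR plus this rank's offset, used for MCS/rank selection."""
        return reported_sinr_db + self.offsets_db[rank]


if __name__ == "__main__":
    olla = MultiOffsetOlla(ranks=[1, 2])
    olla.update(rank=2, ack=False)           # a NACK on rank 2 lowers only that offset
    print(olla.corrected_sinr_db(1, 10.0))   # rank 1 is unaffected
    print(olla.corrected_sinr_db(2, 10.0))
```

With a single shared offset, a NACK observed on rank 2 would also make the rank-1 prediction more conservative; separate offsets avoid that coupling, which is the intuition the abstract points to.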


[72] 2305.01626

Basic syntax from speech: Spontaneous concatenation in unsupervised deep neural networks

Computational models of syntax are predominantly text-based. Here we propose that the most basic first step in the evolution of syntax can be modeled directly from raw speech in a fully unsupervised way. We focus on one of the most ubiquitous and elementary suboperations of syntax -- concatenation. We introduce \textit{spontaneous concatenation}: a phenomenon where ciwGAN/fiwGAN models (based on convolutional neural networks) trained on acoustic recordings of individual words start generating outputs with two or even three words concatenated without ever accessing data with multiple words in the training data. We replicate this finding in several independently trained models with different hyperparameters and training data. Additionally, networks trained on two words learn to embed words into novel unobserved word combinations. We also show that the concatenated outputs contain precursors to compositionality. To our knowledge, this is a previously unreported property of CNNs trained in the ciwGAN/fiwGAN setting on raw speech and has implications both for our understanding of how these architectures learn as well as for modeling syntax and its evolution in the brain from raw acoustic inputs. We also propose and formalize a neural mechanism called \textit{disinhibition} that outlines a possible artificial and biological neural pathway towards concatenation and compositionality and suggests our modeling is useful for generating testable predictions for biological and artificial neural processing of spoken language.


[73] 2307.00385

Sulcal Pattern Matching with the Wasserstein Distance

We present a unified computational framework for modeling the sulcal patterns of the human brain obtained from magnetic resonance images. The Wasserstein distance is used to align the sulcal patterns nonlinearly. These patterns are topologically different across subjects, making the pattern matching a challenge. We work out the mathematical details and develop gradient descent algorithms for estimating the deformation field. We further quantify the image registration performance. This method is applied to identify the differences between male and female sulcal patterns.


[74] 2502.15793

Anomaly Detection in Smart Power Grids with Graph-Regularized MS-SVDD: a Multimodal Subspace Learning Approach

Anomaly detection in smart power grids is a critical challenge due to the complexity, heterogeneity, and dynamic nature of sensor data streams. Existing one-class classification methods, particularly Subspace Support Vector Data Description (SVDD), have been extended to multimodal scenarios but often fail to fully exploit the structural dependencies across modalities, limiting their robustness in real-world applications. In this paper, we address this gap by proposing a generalized Multimodal Subspace Support Vector Data Description (MS-SVDD) model with graph-embedded regularization. The method projects data from multiple modalities into a shared low-dimensional subspace while preserving modality-specific structure through Laplacian regularizers. Our approach is evaluated on a three-modality dataset derived from smart grid event time series, using a dedicated preprocessing pipeline for constructing one-class classification training samples. The results demonstrate that our graph-embedded MS-SVDD improves robustness of event detection compared to conventional approaches, highlighting the potential of integrating graph priors with multimodal subspace learning for advancing anomaly detection in critical infrastructure. More broadly, this work contributes to the wider field of AI by illustrating how relational and structural information can be systematically embedded into one-class models, enabling robust learning under complex, high-dimensional, and multimodal conditions.


[75] 2503.10475

Stratified Topological Autonomy for Long-Range Coordination (STALC)

In this paper, we present Stratified Topological Autonomy for Long-Range Coordination (STALC), a hierarchical planning approach for multi-robot coordination in real-world environments with significant inter-robot spatial and temporal dependencies. At its core, STALC consists of a multi-robot graph-based planner which combines a topological graph with a novel, computationally efficient mixed-integer programming formulation to generate highly-coupled multi-robot plans in seconds. To enable autonomous planning across different spatial and temporal scales, we construct our graphs so that they capture connectivity between free-space regions and other problem-specific features, such as traversability or risk. We then use receding-horizon planners to achieve local collision avoidance and formation control. To evaluate our approach, we consider a multi-robot reconnaissance scenario where robots must autonomously coordinate to navigate through an environment while minimizing the risk of detection by observers. Through simulation-based experiments, we show that our approach is able to scale to address complex multi-robot planning scenarios. Through hardware experiments, we demonstrate our ability to generate graphs from real-world data and successfully plan across the entire hierarchy to achieve shared objectives.


[76] 2505.22266

FGAS: Fixed Decoder Network-Based Audio Steganography with Adversarial Perturbation Generation

The rapid development of Artificial Intelligence Generated Content (AIGC) has made high-fidelity generated audio widely available across the Internet, driving the advancement of audio steganography. Benefiting from advances in deep learning, current audio steganography schemes are mainly based on encoder-decoder network architectures. While these methods guarantee a certain level of perceptual quality for stego audio, they typically face high computational cost and long implementation time, as well as poor anti-steganalysis performance. To address the aforementioned issues, we pioneer a Fixed Decoder Network-Based Audio Steganography with Adversarial Perturbation Generation (FGAS). Adversarial perturbations carrying a secret message are embedded into the cover audio to generate stego audio. The receiver only needs to share the structure and key of the fixed decoder network to accurately extract the secret message from the stego audio. In FGAS, we propose an Audio Adversarial Perturbation Generation (A2PG) strategy with an optional robust extension and design a lightweight fixed decoder. The fixed decoder guarantees reliable extraction of the hidden message, while adversarial perturbations are optimized to keep the stego audio perceptually and statistically close to the cover audio, thereby improving anti-steganalysis performance. The experimental results show that FGAS significantly improves stego audio quality, achieving an average PSNR gain of over 10 dB compared to SOTA methods. Furthermore, FGAS demonstrates strong robustness against common audio processing attacks. Moreover, FGAS exhibits superior anti-steganalysis performance across different relative payloads; under high-capacity embedding, it achieves a classification error rate about 2% higher, indicating stronger anti-steganalysis performance than current SOTA methods.


[77] 2509.22321

Distributed Associative Memory via Online Convex Optimization

An associative memory (AM) enables cue-response recall, and associative memorization has recently been noted to underlie the operation of modern neural architectures such as Transformers. This work addresses a distributed setting where agents maintain a local AM to recall their own associations as well as selective information from others. Specifically, we introduce a distributed online gradient descent method that optimizes local AMs at different agents through communication over routing trees. Our theoretical analysis establishes sublinear regret guarantees, and experiments demonstrate that the proposed protocol consistently outperforms existing online optimization baselines.
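
As a point of reference for the online-gradient-descent building block, the following sketch updates a single linear associative memory on one (key, value) pair using the squared recall error. It covers one agent only; the distributed protocol over routing trees described above is not reproduced, and the linear memory model, loss, function names, and learning rate are illustrative assumptions.

```python
def recall(memory, key):
    """Linear recall: multiply the memory matrix by the key vector."""
    return [sum(w * k for w, k in zip(row, key)) for row in memory]


def ogd_step(memory, key, value, lr=0.1):
    """One online gradient step on 0.5 * ||memory @ key - value||^2.

    The gradient with respect to the memory matrix is the outer product
    of the recall error and the key.
    """
    error = [p - v for p, v in zip(recall(memory, key), value)]
    return [[w - lr * e * k for w, k in zip(row, key)]
            for row, e in zip(memory, error)]


if __name__ == "__main__":
    memory = [[0.0, 0.0], [0.0, 0.0]]
    key, value = [1.0, 0.0], [2.0, -1.0]
    memory = ogd_step(memory, key, value, lr=1.0)
    print(recall(memory, key))
```

For a unit-norm key and a step size of 1, a single update stores the association exactly; smaller step sizes trade that off against stability when associations arrive as a stream.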


[78] 2510.23274

Privacy-Preserving Semantic Communication over Wiretap Channels with Learnable Differential Privacy

While semantic communication (SemCom) improves transmission efficiency by focusing on task-relevant information, it also raises critical privacy concerns. Many existing secure SemCom approaches rely on restrictive or impractical assumptions, such as favorable channel conditions for the legitimate user or prior knowledge of the eavesdropper's model. To address these limitations, this paper proposes a novel secure SemCom framework for image transmission over wiretap channels, leveraging differential privacy (DP) to provide approximate privacy guarantees. Specifically, our approach first extracts disentangled semantic representations from source images using a generative adversarial network (GAN) inversion method, and then selectively perturbs private semantic representations with approximate DP noise. Distinct from conventional DP-based protection methods, we introduce DP noise with a learnable pattern, instead of traditional white Gaussian or Laplace noise, achieved through adversarial training of neural networks (NNs). This design mitigates the inherent non-invertibility of DP while effectively protecting private information. Moreover, it enables explicitly controllable security levels by adjusting the privacy budget according to specific security requirements, which is not achieved in most existing secure SemCom approaches. Experimental results demonstrate that, compared with the previous DP-based method and direct transmission, the proposed method significantly degrades the reconstruction quality for the eavesdropper, while introducing only slight degradation in task performance. Under comparable security levels, our approach achieves an LPIPS advantage of 0.06-0.29 and an FPPSR advantage of 0.10-0.86 for the legitimate user compared with the previous DP-based method.


[79] 2601.20896

A Study of Data Selection Strategies for Pre-training Self-Supervised Speech Models

Self-supervised learning (SSL) has transformed speech processing, yet its reliance on massive pre-training datasets remains a bottleneck. While robustness is often attributed to scale and diversity, the role of the data distribution is less understood. We systematically examine how curated subsets of pre-training data influence Automatic Speech Recognition (ASR) performance. Surprisingly, optimizing for acoustic, speaker, or linguistic diversity yields no clear improvements over random sampling. Instead, we find that prioritizing the longest utterances achieves superior ASR results while using only half the original dataset, reducing pre-training time by 24% on a large corpus. These findings suggest that for pre-training speech SSL models, data length is a more critical factor than either data diversity or overall data quantity for performance and efficiency, offering a new perspective for data selection strategies in SSL speech processing.


[80] 2604.12067

Vectorized Gaussian Belief Propagation for Near Real-Time Fully-Distributed PMU-Based State Estimation

Electric power systems require accurate, scalable, distributed, and near real-time state estimation (SE) to support reliable monitoring and control under increasingly complex operating conditions. Limited monitoring capabilities can lead to inefficient operation and, in extreme cases, large-scale disturbances such as blackouts. To address these challenges, this paper proposes a vectorized Gaussian belief propagation (GBP) framework for phasor measurement unit-based SE, formulated over factor graphs and specifically designed to support distributed and near real-time monitoring. The proposed framework includes multivariate and fusion-based GBP formulations. The multivariate formulation jointly models related state variables and their measurement relationships, while the fusion-based formulation reduces factor graph complexity by combining multiple measurements associated with the same set of variables, resulting in a structure that more closely reflects the underlying electrical coupling of the power system. The resulting algorithms operate in a fully distributed manner at the bus level and achieve fast convergence and high estimation accuracy, often within a few iterations, as demonstrated by numerical results on systems ranging from 60 to 13659 buses, where the fusion-based formulation achieves single-digit millisecond iteration times on the largest test case.


[81] 2604.12671

Differentiating Physical and Psychological Stress Using Wearable Physiological Signals and Salivary Cortisol

Objective: This study aimed to assess how wearable physiological signals, alone and combined with salivary cortisol, distinguish physical and psychological stress and their recovery states. Methods: Six healthy adults completed three laboratory sessions on separate days: rest, physical stress (high-intensity cycling), or psychological stress (modified Trier Social Stress Test). Heart rate, heart rate variability, electrodermal activity, and wrist accelerometry were recorded continuously, and salivary cortisol was sampled at five time points. Features were extracted in non-overlapping 10-minute windows and labelled as rest, physical stress, physical recovery, psychological stress, or psychological recovery. A gradient boosting classifier was trained using wearable features alone and with five additional cortisol features per window. Performance was evaluated using leave-one-participant-out cross-validation. Results: Wearable-only classification achieved 77.8% overall accuracy, with high accuracy for physical stress and recovery but frequent misclassification of psychological stress and recovery (recall 50.0% and 54.2%). Including cortisol improved overall accuracy (94.4%), particularly for psychological states, increasing recall to 83.3% and 87.5%. Cortisol also reduced misclassification between psychological stress and rest. Conclusion: Wearable signals alone were insufficient to reliably distinguish psychological stress from rest and recovery. Integrating salivary cortisol improved classification of psychological stress and recovery and reduced confusion with rest, highlighting the value of endocrine context alongside wearable physiology. Significance: These findings support multimodal stress monitoring and motivate larger, ecologically valid studies and scalable alternatives to repeated cortisol sampling.
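
The leave-one-participant-out evaluation described above can be sketched as a plain index splitter: every window from one participant is held out in turn, so no participant contributes to both training and testing within a fold. The function name and the toy participant labels are hypothetical; the study's actual features and classifier are not reproduced.

```python
def leave_one_participant_out(participant_ids):
    """Yield (held_out, train_idx, test_idx) for each distinct participant.

    participant_ids[i] is the participant that produced window i.
    """
    for held_out in sorted(set(participant_ids)):
        train_idx = [i for i, p in enumerate(participant_ids) if p != held_out]
        test_idx = [i for i, p in enumerate(participant_ids) if p == held_out]
        yield held_out, train_idx, test_idx


if __name__ == "__main__":
    # Toy example: four windows from three hypothetical participants.
    windows = ["p1", "p1", "p2", "p3"]
    for held_out, train_idx, test_idx in leave_one_participant_out(windows):
        print(held_out, train_idx, test_idx)
```

Splitting by participant rather than by window matters here because windows from the same person are strongly correlated, and mixing them across train and test folds would inflate the reported accuracy.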


[82] 2604.13359

BioTrain: Sub-MB, Sub-50mW On-Device Fine-Tuning for Edge-AI on Biosignals

Biosignals exhibit substantial cross-subject and cross-session variability, inducing severe domain shifts that degrade post-deployment performance for small, edge-oriented AI models. On-device adaptation is therefore essential to both preserve user privacy and ensure system reliability. However, existing sub-100 mW MCU-based wearable platforms can only support shallow or sparse adaptation schemes due to the prohibitive memory footprint and computational cost of full backpropagation (BP). In this paper, we propose BioTrain, a framework enabling full-network fine-tuning of state-of-the-art biosignal models under milliwatt-scale power and sub-megabyte memory constraints. We validate BioTrain using both offline and on-device benchmarks on EEG and EOG datasets, covering Day-1 new-subject calibration and longitudinal adaptation to signal drift. Experimental results show that full-network fine-tuning achieves accuracy improvements of up to 35% over non-adapted baselines and outperforms last-layer updates by approximately 7% during new-subject calibration. On the GAP9 MCU platform, BioTrain enables efficient on-device training throughput of 17 samples/s for EEG and 85 samples/s for EOG models within a power envelope below 50 mW. In addition, BioTrain's efficient memory allocator and network topology optimization enable the use of a large batch size, reducing peak memory usage. For fully on-chip BP on GAP9, BioTrain reduces the memory footprint by 8.1x, from 5.4 MB to 0.67 MB, compared to conventional full-network fine-tuning using batch normalization with batch size 8.


[83] 2604.14844

Matched and Euclidean-Mismatched Decoding on Fourier-Curve Constellations with Tangent Noise

We study matched and Euclidean-mismatched decoding on finite Fourier-curve constellations with tangent-space artificial noise. Each hypothesis induces a Gaussian law with symbol-dependent rank-one covariance. We derive exact Euclidean pairwise errors for arbitrary pairs and an exact Gaussian-expectation representation for matched decoding on bilaterally tangent-orthogonal pairs. For uniform even constellations, the Euclidean side yields explicit distance spectra and symbol-error bounds across all offset classes; the matched side is exact on antipodal pairs and benchmarked numerically at the full-codebook level via Monte Carlo. By isolating the detection-theoretic consequence of tangent-space artificial noise, these results clarify analytically how noise fraction and constellation density enter the mismatch behavior; secrecy-rate implications require additional channel and adversary modeling.