New articles on Electrical Engineering and Systems Science


[1] 2604.14154

An Edge-Cloud Collaborative Architecture for Proactive Elderly Care: Real-Time Risk Assessment and Three-Level Emergency Response

The rapid aging of global populations has created an urgent need for intelligent healthcare monitoring systems to ensure the safety of elderly individuals living independently. Existing cloud-centric platforms face critical limitations, including high latency unsuitable for emergency response, privacy risks from continuous transmission of sensitive data, and limited, single-channel alert mechanisms lacking scalability and context awareness. This paper proposes an edge-cloud collaborative architecture that addresses these challenges through real-time multi-modal sensor fusion, a four-dimensional risk assessment model, and a three-level emergency response system. The framework adopts a five-layer design - device, edge, service, data, and application layers - enabling real-time risk evaluation with end-to-end alert latency under three seconds. At the edge, a weighted multi-modal fusion algorithm integrates data from five sensor types with confidence propagation. A unified risk score is generated by combining fall probability, physiological indicators, behavioral patterns, and sensor anomaly metrics. Based on dynamic thresholds, a three-tier notification system coordinates responses among family members, community doctors, and nearby volunteers. Experiments on CASAS, MIMIC-III, and SisFall datasets show that the approach achieves 91% activity recognition accuracy and an 84% anomaly detection F1-score, outperforming single-sensor methods. Deployment on Raspberry Pi 4 gateways demonstrates sub-100 ms inference latency while preserving privacy by keeping raw data local. This architecture advances practical, privacy-preserving, and responsive elderly care systems.
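The scoring and thresholding steps described above can be sketched as a weighted combination with tiered cutoffs; the weights, threshold values, and tier actions below are illustrative placeholders, not the paper's calibrated parameters:

```python
# Hypothetical sketch of a weighted multi-modal risk score with a
# three-level response mapping; all numbers are illustrative.

def risk_score(fall_prob, physio, behavior, sensor_anomaly,
               weights=(0.4, 0.3, 0.2, 0.1)):
    """Combine four normalized [0, 1] risk dimensions into one score."""
    dims = (fall_prob, physio, behavior, sensor_anomaly)
    return sum(w * d for w, d in zip(weights, dims))

def response_level(score, thresholds=(0.3, 0.6)):
    """Map a risk score to a three-level response tier."""
    low, high = thresholds
    if score >= high:
        return 3  # e.g. alert community doctor and nearby volunteers
    if score >= low:
        return 2  # e.g. notify family members
    return 1      # routine monitoring only

# Example: a high fall probability dominates the combined score.
s = risk_score(0.9, 0.5, 0.2, 0.1)
level = response_level(s)
```

In a real deployment the thresholds would be dynamic, as the abstract notes; a fixed pair is used here only to make the tier logic concrete.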


[2] 2604.14184

End-to-End Learning-based Operation of Integrated Energy Systems for Buildings and Data Centers

Buildings and data centers (DCs) are energy-intensive sectors that play a critical role in achieving low-carbon and sustainable energy transition targets. To this end, integrated energy systems (IESs) that incorporate diverse renewables and energy generation, conversion, and storage technologies to enable coordinated multi-energy supply have been widely investigated for both buildings and DCs. However, few works consider the two sectors jointly within an IES to exploit their substantial synergistic benefits. Meanwhile, the operational optimization of IESs remains challenging due to the difficulty of accurately predicting multi-energy demand and supply. To address these gaps, this paper investigates an IES for the coordinated multi-energy supply of buildings and DCs, where waste heat from the DCs is recovered and reused to enhance energy efficiency. Moreover, an end-to-end learning-based method is proposed for the operational optimization of the IES under uncertainty. Unlike conventional predict-then-optimize approaches, the proposed method integrates the training of prediction models for uncertain variables with the constrained optimization of the IES into a unified learning framework, guiding the training of the prediction models to improve operational performance rather than prediction accuracy, thereby mitigating the impact of prediction errors. Case studies based on real-world datasets show that the proposed method improves the operational performance of the IES by about 7-9% compared to existing predict-then-optimize methods. In addition, coordinating buildings and DCs within an IES yields substantial economic benefits. In particular, waste heat recovery from the DCs reduces the total energy cost of the IES by approximately 10%.
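A toy contrast between the two training philosophies: fitting a scalar "forecast" by MSE (predict-then-optimize) versus training it directly on an asymmetric operational cost (end-to-end). The newsvendor-style cost, demand samples, and learning rate are all hypothetical, chosen only to show why the two objectives select different solutions:

```python
# Illustrative only: a one-parameter "forecast" trained two ways.
demand = [2.0, 3.0, 4.0, 10.0]   # hypothetical daily demand samples
c_under, c_over = 4.0, 1.0       # unmet demand costs 4x oversupply

def op_cost(plan, y):
    """Asymmetric operational cost of dispatching `plan` against demand y."""
    return c_over * max(plan - y, 0.0) + c_under * max(y - plan, 0.0)

def avg_cost(plan):
    return sum(op_cost(plan, y) for y in demand) / len(demand)

# Predict-then-optimize: fit the forecast by MSE (-> the sample mean),
# then dispatch to the forecast.
pto_plan = sum(demand) / len(demand)

# End-to-end: train the same scalar directly on the downstream
# operational cost by subgradient descent.
e2e_plan, lr = pto_plan, 0.05
for _ in range(2000):
    grad = sum(c_over if e2e_plan > y else -c_under for y in demand) / len(demand)
    e2e_plan -= lr * grad

# The decision-focused plan hedges toward the expensive error side,
# achieving lower operational cost despite a worse MSE.
```

The same principle, scaled up to constrained IES dispatch with learned scenario models, is what the abstract's unified framework targets.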


[3] 2604.14185

On the Instantaneous Phase and Frequency Estimation of a Non-Stationary Multicomponent Signal. The JADE Algorithm

Many real-life signals, such as gravitational wave measurements, biomedical signals, or geophysical data, are strongly non-stationary but can be decomposed into mono-component signals that contain only one active frequency over time. This is made possible thanks to decomposition methods developed in recent years that can handle non-stationary signals. The problem now is how to compute, in an accurate and stable way, the instantaneous frequency, phase, and amplitude of such mono-component signals. Numerous approaches have been developed so far, but they can be unstable in the presence of noise and struggle to capture quick and intrawave changes in frequency. In this work, we present an alternative approach, called the JADE method, which is based on the Dynamic Time Warping algorithm and which we combine with the FIF algorithm to handle and study multicomponent non-stationary signals. We test the robustness of JADE to noise and run comparisons with classical methods used for instantaneous frequency, phase, and amplitude estimation.


[4] 2604.14186

HARNESS: Lightweight Distilled Arabic Speech Foundation Models

Large self-supervised learning (SSL) speech models achieve strong downstream performance, but their size limits deployment in resource-constrained settings. We present HArnESS, an Arabic-centric self-supervised speech model family trained from scratch with iterative self-distillation, together with lightweight student variants that offer strong accuracy-efficiency trade-offs on Automatic Speech Recognition (ASR), Dialect Identification (DID), and Speech Emotion Recognition (SER). Our approach begins with a large bilingual Arabic-English teacher and progressively distills its knowledge into compressed student models while preserving Arabic-relevant acoustic and paralinguistic representations. We further study PCA-based compression of the teacher supervision signal to better match the capacity of shallow and thin students. Compared with HuBERT and XLS-R, HArnESS consistently improves performance on Arabic downstream tasks, while the compressed models remain competitive under substantial structural reduction. These results position HArnESS as a practical and accessible Arabic-centric SSL foundation for real-world speech applications.


[5] 2604.14192

Closed-Form Analytical Solution for Effective Resistance in Finite 2D Anisotropic Resistor Grids via Jacobi Theta Functions

Computing the effective resistance between nodes in finite discrete resistor grids is a classical problem in circuit analysis with applications in VLSI power delivery network analysis, graph theory, and network science. Recent advances, particularly the infinity mirror technique, provide an elegant physical interpretation for boundary conditions in finite grids. Building upon this foundation, this paper presents a closed-form analytical expression that avoids numerical truncation or polynomial fitting. Our theoretical development proceeds in two steps. First, we derive an exact analytical primitive for the singular integral term $R_2$ within the integral operator $\Omega_\alpha$. Second, we transform the doubly infinite mirror series into a compact expression using the Jacobi theta function $\vartheta_1$. This transformation achieves machine precision with only a few terms. However, under high anisotropy, the pure analytical approximation exhibits a distinct "cross-shaped" residual error. To address this, we introduce a hybrid engineering remediation: a dynamic numerical cache that performs localized grid integration (LGI), combining $O(1)$ speed with exact near-field accuracy. Numerical experiments demonstrate mean relative errors below 0.04% compared to SPICE simulations, eliminating axis-localized error artifacts. To facilitate further research, the implementation of our proposed 2D resistor grid calculator is available at: this https URL.
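The rapid convergence behind the theta-function form can be seen directly from the standard series $\vartheta_1(z, q) = 2\sum_{n \ge 0} (-1)^n q^{(n+1/2)^2} \sin((2n+1)z)$ - a minimal sketch of the special function itself, not the paper's resistance formula; the sample arguments are arbitrary:

```python
import math

# Direct series for the Jacobi theta function theta_1(z, q). Because the
# exponent (n + 1/2)^2 grows quadratically, a handful of terms already
# reaches double precision for moderate nome q (illustrative sketch).

def theta1(z, q, terms=8):
    return 2.0 * sum(
        (-1) ** n * q ** ((n + 0.5) ** 2) * math.sin((2 * n + 1) * z)
        for n in range(terms)
    )

# Convergence check: adding many more terms changes nothing detectable
# at double precision for this nome.
val8 = theta1(0.7, 0.3, terms=8)
val40 = theta1(0.7, 0.3, terms=40)
```

This quadratic decay of the terms is what lets the paper compress the doubly infinite mirror series into a few theta-function evaluations at machine precision.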


[6] 2604.14203

Energetic Resilience under Temporal Logic Specifications

In environments with uncertainties or undesirable influences, control systems can require additional energy to achieve their task while remaining resilient to these influences. In this paper, we present an energetic resilience metric that quantifies the maximal additional energy used by a system under undesired effects, while satisfying complex specifications encoded through temporal logic. We prove that this metric satisfies properties that enable its computation even for compositions of these specifications, thus allowing considerations of sequential reachability and safety tasks. For specifications related to finite-horizon reachability and safety, we describe how synthesizing a control input and computing this metric reduces to solving efficient quadratic programs. Two case studies on a fighter-jet model and a planar mobile robot illustrate how the synthesized control inputs satisfy given specifications despite undesired and potentially adversarial effects. Further, we demonstrate how the energetic resilience metric varies with the initial state as well as the magnitude of undesired effects.


[7] 2604.14205

Consensus and Synchronization of Multi-agent Systems over Finite Fields -- Graph Topologies

This paper develops cooperative protocols for multi-agent systems whose agents have a finite state-space. Both scalar single-integrator consensus and synchronization of general LTI systems are considered. Finite state-space systems model agents with minimal memory capacity that process only a finite alphabet; such systems are remarkably resilient to communication noise. The crucial problem, however, is constructing an admissible communication topology, which is NP-hard. We address this by efficiently exploring the subsets of admissible matrices and propose two new algorithms to generate such topologies. Simulations validate the proposed approach.


[8] 2604.14207

Probabilistic Connectivity Analysis of Recursive Satellite Release for Formation Initialization

In the initial deployment of large-scale distributed space systems using small satellites, achieving a reliable transition to passively stable orbits while maintaining inter-satellite distances within effective control and communication ranges is crucial, particularly given the presence of deployment errors and uncontrolled coasting phases. This study presents a framework for designing formation initialization that provides probabilistic safety guarantees. The scope covers the initial deployment phase, from sequential release by a single carrier to commissioning, control activation, and transition to passive stabilization. Strict separation limits during initialization necessitate low release velocities to minimize relative drift before control activation. However, in the low-velocity regime, the allowable tolerances for release velocity and angular rate errors tighten significantly to satisfy distance constraints, making hardware requirements a critical bottleneck. To address this, we model the initialization sequence as a stochastic process and derive closed-form constraints on deployment errors and control activation intervals. These conditions ensure that inter-satellite distances remain within the allowable separation limit with a prescribed probability. Monte Carlo simulations, configured using the error bounds and intervals derived from the proposed constraints, demonstrate that inter-satellite distances are successfully maintained within the allowable range. The proposed framework enables the safe initialization of large-scale distributed space systems by translating strict separation constraints into quantifiable hardware requirements.


[9] 2604.14308

High Order Tuners for Adaptive Safety of Robotic Systems

The combination of control barrier functions (CBFs) and adaptive control -- a framework referred to as adaptive safety -- has proven to be a powerful paradigm for safety-critical control of nonlinear systems with parametric uncertainties. Yet the theoretical conditions for forward invariance within this framework are often quite conservative, and may require using large adaptation gains to achieve acceptable performance, an approach that is traditionally discouraged in adaptive control. This paper mitigates these issues via high-order tuners, a recent class of higher-order adaptation laws that leverage different adaptation gains at different orders of differentiation. We illustrate that these high-order tuners decouple adaptation gain conditions from those placed on the initial conditions of the system required for set invariance. We extend these results to robotic systems whose linear-in-the-parameters structure proves particularly useful for adaptive control. The efficacy of our results is illustrated via simulations.


[10] 2604.14354

Who is Speaking or Who is Depressed? A Controlled Study of Speaker Leakage in Speech-Based Depression Detection

This study investigates whether speech-based depression detection models learn depression-related acoustic biomarkers or instead rely on speaker identity cues. Using the DAIC-WOZ dataset, we propose a data-splitting strategy that controls speaker overlap between training and test sets while keeping the training size constant, and evaluate three models of varying complexity. Results show that speaker overlap significantly boosts performance, whereas accuracy drops sharply on unseen speakers. Even with a Domain-Adversarial Neural Network, a substantial performance gap remains. These findings indicate that depression-related features extracted by current speech models are highly entangled with speaker identity. Conventional evaluation protocols may therefore overestimate generalization and clinical utility, highlighting the need for strictly speaker-independent evaluation.
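A speaker-disjoint split of the kind the study advocates can be sketched in a few lines; the speaker IDs, test fraction, and toy data below are illustrative, not the paper's DAIC-WOZ protocol:

```python
import random

# Minimal sketch of a speaker-disjoint train/test split: no speaker
# appears on both sides, so the model cannot exploit identity leakage.

def speaker_disjoint_split(samples, test_frac=0.3, seed=0):
    """samples: list of (speaker_id, features) pairs."""
    speakers = sorted({spk for spk, _ in samples})
    rng = random.Random(seed)
    rng.shuffle(speakers)
    n_test = max(1, int(len(speakers) * test_frac))
    test_spk = set(speakers[:n_test])
    train = [s for s in samples if s[0] not in test_spk]
    test = [s for s in samples if s[0] in test_spk]
    return train, test

# Toy data: 5 hypothetical speakers with 4 clips each.
data = [(f"spk{i % 5}", [i]) for i in range(20)]
train, test = speaker_disjoint_split(data)
```

Splitting at the speaker level rather than the clip level is exactly the control that exposes the performance gap the abstract reports.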


[11] 2604.14372

AC-OPF Feasibility Analysis and Sensitivity-Guided Capacitor Placement in a High-PV Islanded Microgrid

This paper presents a comparative AC Optimal Power Flow study on a real-world, city-scale islanded microgrid with high solar PV penetration, implemented within a Digital Twin framework. Four objective-function cases - economic dispatch, voltage stress exposure via PV power factor variation, optimal load delivery, and capacitor-enhanced economic dispatch as a recovery option - are evaluated over a 47-hour time-series horizon on the same network under a shared loading scenario. Optimization sensitivities (OSQ and OSV) extracted from all cases are combined into a composite placement score used to rank candidate buses for shunt capacitor upgrades. A post-processing planning optimization balances capacitor upgrade cost against avoided value of lost load, enabling direct economic comparison of infrastructure investment versus reliability penalties. Results demonstrate that sensitivity-guided capacitor placement restores full load service across the horizon and provides targeted reactive support at a quantifiable cost trade-off against corrective load shedding.


[12] 2604.14410

Integrated Investment and Policy Planning for Power Systems via Differentiable Scenario Generation

We formulate a method to co-optimize power system capacity planning decisions and policy investments that shape electricity load patterns. To this end, we leverage a gradient-based solution technique that enables the efficient solution of operation-aware planning models. To compute gradients with respect to the conditions that define daily electricity demand profiles, we introduce and formalize the concept of differentiable scenario generation and show that generative machine learning models satisfy the mathematical requirements needed to compute consistent gradients. We demonstrate the feasibility of the proposed approach through numerical experiments using a diffusion model-based scenario generator and a stylized generation and capacity expansion planning model.


[13] 2604.14413

Comprehensive Review of Doppler Shift Localization Methods: Advances, Limitations, and Research Opportunities

Reliable geolocation of non-cooperative emitters in environments where Global Navigation Satellite Systems (GNSS) are unavailable or degraded is a key enabler for spectrum regulation, emergency response, autonomous mobility, and Integrated Sensing and Communication (ISAC) services in 5G/6G systems. Doppler-based techniques - from single-receiver Signal Doppler Frequency (SDF) fixes through multi-node Frequency Difference of Arrival (FDOA) and Direct Position Determination (DPD) to derivative-enhanced and learning-assisted hybrids - exploit radial-velocity-induced frequency shifts as a passive, high-resolution localization cue accessible with commodity software-defined radios, millimeter-wave access points, or acoustic sensors. This review consolidates over a decade of research across radio, acoustic, and satellite domains. It introduces a unifying taxonomy that divides the field into five technique families, outlining their evolution, measurement models, and estimator archetypes. It then compares algebraic, Bayesian, convex, and neural inference frameworks under realistic impairments such as oscillator drift, multipath, and asynchronous clocks, highlighting conditions where derivative Doppler metrics tighten the Cramer-Rao bound with minimal hardware cost. Environment-specific deployments are examined, from urban canyons and GNSS-denied tunnels to underwater, radar, UAV-swarm, and multi-orbit satellite scenarios, with prototype accuracies reaching meter scale using low size, weight, and power payloads. Finally, the survey distils design recommendations for mobile and tactical operations and identifies open research challenges in frequency-reference integrity, multipath-aware modelling, edge-constrained computation, and trajectory-aware sensing.


[14] 2604.14441

Batch Effects In Brain Foundation Model Embeddings

Foundation models show strong potential for large-scale, high-dimensional biomedical applications, yet their ability to capture relevant neurobiological characteristics remains underexplored. We systematically evaluate embeddings from two neuroimaging foundation models, BrainLM and SwiFT, across multi-site fMRI datasets using a comprehensive evaluation framework. Our results show that foundation model embeddings encode substantial batch-related variability, often dominating diagnosis-related information across heterogeneous datasets. We further investigate how harmonization, applied to reduce batch effects, influences these embeddings. In addition, we find that BrainLM tends to capture fine-grained regional activity, whereas SwiFT tends to represent interactions between regions, consistent with their respective model architectures. Our study highlights the importance of accounting for batch effects in foundation models and motivates future work on disentangling biologically meaningful signals from acquisition-related variability.


[15] 2604.14476

ProtoAoA: Few-Shot Angle-of-Arrival Estimation using Prototypical Networks

Angle-of-arrival (AoA) estimation is a crucial function in wireless communications used for localization, beam-forming, interference management, and other applications. Deep learning (DL) solutions have been proposed for AoA to mitigate limitations of traditional AoA estimation techniques such as sensitivity to noise and the inability to generalize across different array characteristics. A challenge, however, of DL-based approaches is their reliance on large data collection campaigns and model training. This paper proposes the application of Prototypical Networks (PNs) to address this challenge and utilizes a real-world dataset collected on a software defined radio (SDR) testbed to validate the effectiveness of the proposed solution. Prototypical Networks excel at extracting representative embeddings from unstructured input data, establishing class prototypes during training that can then be adapted to unseen classes from only a few shots. We demonstrate the efficacy of PNs for AoA classification using complex IQ samples, focusing on their ability to correctly classify new, unseen angles that the model was not trained on previously. Our results show that training our proposed ProtoAoA on only 23% of the AoA dataset classes can attain a mean absolute error (MAE) of 3 degrees with only 4 shots of training on the unseen angles - and an MAE of 2 degrees with 32 shots of training data. These results demonstrate that the developed prototypical network architecture requires remarkably few data samples to achieve reliable AoA estimation - and highlight its potential for other wireless applications where data availability is limited.
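The classification rule at the heart of a prototypical network is compact: prototypes are mean support embeddings, and queries go to the nearest prototype. The 2-D toy embeddings and class labels below are hypothetical stand-ins for the embeddings a trained network would extract from IQ samples:

```python
import math

# Bare-bones prototypical classification in embedding space.

def prototype(embeddings):
    """Class prototype = coordinate-wise mean of support embeddings."""
    dim = len(embeddings[0])
    return [sum(e[d] for e in embeddings) / len(embeddings) for d in range(dim)]

def classify(query, prototypes):
    """Assign the query to the label of the nearest prototype."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))

# Few-shot setup: 2 hypothetical "angle classes", 3 support shots each.
support = {
    "aoa_10deg": [[1.0, 0.1], [1.1, 0.0], [0.9, -0.1]],
    "aoa_60deg": [[-1.0, 1.0], [-0.9, 1.1], [-1.1, 0.9]],
}
protos = {label: prototype(shots) for label, shots in support.items()}
pred = classify([0.95, 0.05], protos)
```

Because new classes only require computing a fresh prototype from a few support shots, no retraining of the embedding network is needed - the property the abstract exploits for unseen angles.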


[16] 2604.14499

Quantification and Regulation of Energy Reserves for Distributed Frequency and Voltage Control of Grid-Forming Inverters

The introduction of Renewable Energy Sources (RES) and Distributed Energy Resources (DERs) has led to the formation of Microgrids (MGs) and Networks of MGs (NMGs). MGs and NMGs can operate in islanded mode, transforming the grid into a more distributed system. This has led to extensive studies in the literature on distributed hierarchical control strategies. Previous works have proposed distributed secondary-level frequency and voltage regulation control schemes for Battery Energy Storage System (BESS)-based Grid-Forming (GFM) inverters with State of Charge (SoC) balancing. However, links to tertiary-level control in terms of service-based reserves and local resource adequacy in MGs are largely unexplored. Therefore, this paper proposes a BESS energy reserves framework to quantify reserves for hierarchical control operation. Additionally, to partially regulate the proposed energy reserves, we propose a modified Distributed-Averaging Proportional-Integral (DAPI) controller with regulation energy reserve consensus. Controller Hardware-In-the-Loop (CHIL) simulation is performed on an MG topologically based on the IEEE 13-bus test feeder system in MATLAB/Simulink. Results for the proposed scheme illustrate effective frequency and voltage regulation along with improved power and energy sharing across droop-controlled and Virtual Synchronous Machine (VSM)-controlled inverters.


[17] 2604.14523

Quantifying and Improving the Accuracy of Electromagnetic Transient-Transient Stability Hybrid Simulation

The increasing penetration of inverter-based resources introduces new dynamic challenges to modern power grids, such as sub- and super-synchronous oscillations and other faster dynamics. These dynamics are typically fast in nature and are difficult to accurately model and analyze using standard transient stability (TS) methods, necessitating electromagnetic transient (EMT) analysis. However, EMT simulations are notoriously slow for large-scale grids due to both equation formulations and computational limitations. To overcome this challenge, EMT-TS hybrid simulation is often used, since it offers a balanced trade-off between accuracy and speed, making it feasible to perform EMT analysis on large systems. One open question about EMT-TS hybrid simulation is the accuracy of the EMT-TS boundary or interface. This paper introduces an error index to quantify EMT-TS hybrid interface errors, identifies conditions where the hybrid simulation approach may become inaccurate, and suggests EMT region expansions to improve the simulation accuracy. Additionally, a three-sequence hybrid interface model is proposed to mitigate inaccuracies caused by unbalanced conditions.


[18] 2604.14524

Bridging Standardized Codebook and Site-Specific Beamforming: A Unified Limited-Feedback Framework

A site-specific Type-II codebook design is proposed for downlink massive multiple-input multiple-output (MIMO) limited-feedback beamforming. The key idea is to embed a learned site-specific propagation prior into the Type-II channel state information (CSI) feedback pipeline. Specifically, the base station (BS) uses a low-overhead reference signal received power (RSRP) fingerprint collected during synchronization signal block (SSB) probing to infer a user equipment (UE)-dependent dominant beam subspace before explicit CSI acquisition. The UE then estimates and feeds back only the low-dimensional effective channel coefficients within this inferred subspace, thereby avoiding full-dimensional online subspace discovery while retaining a rich multi-beam representation capability. To analyze the proposed design and compare it with standardized feedback mechanisms, a unified subspace-projection framework is developed by jointly characterizing CSI acquisition, UE-side compression, BS-side reconstruction, and effective spectral efficiency. Under this framework, Type-I, Type-II, port-selection feedback, and the proposed scheme are interpreted as different ways of inducing a feedback representation subspace. The probing codebook and the BS-side subspace inference network are then formulated as a coupled task-oriented design problem and are optimized end-to-end by maximizing the normalized CSI-capture efficiency. Extensive simulation results demonstrate that the proposed feedback scheme achieves Type-II-comparable CSI-capture capability with substantially lower online overhead and UE-side complexity, thereby improving the effective spectral efficiency.


[19] 2604.14557

Beam Squinting Effects in Super Wideband Communication Systems

Beam squint, the frequency-dependent shift of the main beam, poses a major challenge for wideband antenna arrays. This paper focuses on the beam squint effects in super wideband (SW) systems, where high mutual coupling (MC) effects are present. These high MC effects complicate beamforming (BF) by creating frequency-dependent phase relationships that invalidate conventional approaches. To accurately model MC effects, this paper uses a circuit-theoretic framework for tightly coupled SW uniform linear arrays (ULAs). We derive closed-form expressions for the average received signal-to-noise ratio (SNR) with BF in conventional half-wavelength spaced, weakly coupled arrays and validate them. Extending our analysis to tightly coupled SW arrays, we demonstrate that, in contrast to conventional weakly coupled arrays, the effective true time delays exhibit a nonlinear dependence on frequency due to coupling-induced phase shifts. A comparative analysis reveals that strong MC in SW arrays significantly reduces squint in phase-controlled BF, extending the usable bandwidth considerably.
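For background, the classical squint relation for a phase-shifter-steered, weakly coupled ULA (the baseline the paper extends, not its coupled-array model) follows from matching phases set at the carrier $f_c$: the beam at frequency $f$ points toward $\sin\theta(f) = (f_c/f)\sin\theta_0$. A minimal sketch:

```python
import math

# Classical beam-squint relation for a half-wavelength ULA whose phase
# shifters are set for steering angle theta0 at carrier f_c. The example
# frequencies are arbitrary illustrative values.

def squinted_angle_deg(theta0_deg, f_c, f):
    s = (f_c / f) * math.sin(math.radians(theta0_deg))
    # Clamp: beyond |s| = 1 the main beam no longer forms at a real angle.
    return math.degrees(math.asin(max(-1.0, min(1.0, s))))

# A beam steered to 45 deg at a 6 GHz carrier squints toward broadside
# above the carrier and away from broadside below it.
hi = squinted_angle_deg(45.0, 6e9, 9e9)   # above carrier: angle shrinks
lo = squinted_angle_deg(45.0, 6e9, 5e9)   # below carrier: angle grows
```

The paper's finding is that strong mutual coupling makes the effective delays nonlinear in frequency, which - counterintuitively - reduces this squint relative to the weakly coupled relation sketched here.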


[20] 2604.14606

UniPASE: A Generative Model for Universal Speech Enhancement with High Fidelity and Low Hallucinations

Universal speech enhancement (USE) aims to restore speech signals from diverse distortions across multiple sampling rates. We propose UniPASE, an extension of the low-hallucination PASE framework tailored for USE. At its core is DeWavLM-Omni, a unified representation-level enhancement module fine-tuned from WavLM via knowledge distillation on a large-scale supervised multi-distortion dataset. This module directly converts degraded waveforms into clean and linguistically faithful phonetic representations, ensuring robust enhancement with minimal linguistic hallucination. Based on these enhanced phonetic representations, an Adapter generates enhanced acoustic representations containing rich acoustic details, which a neural Vocoder uses to reconstruct corresponding high-fidelity 16-kHz waveforms. A PostNet then converts the waveforms to 48 kHz before resampling them to their original rates, enabling seamless handling of inputs and outputs at multiple sampling rates. Experimental results on several evaluation datasets, covering sub-tasks and full tasks, demonstrate that UniPASE achieves superior or competitive performance compared with existing state-of-the-art models. The proposed model also serves as the backbone of our submission to the URGENT 2026 Challenge, which achieved 1st place in the objective evaluation. The source code and audio demos are available at this https URL.


[21] 2604.14666

Low-Complexity Soft-Feedback Detector for AFDM Systems

Affine frequency division multiplexing (AFDM), an emerging multi-carrier modulation scheme, has garnered significant attention due to its resilience to Doppler shifts and capability to achieve full diversity in doubly dispersive channels. However, existing data detection algorithms for AFDM systems face a significant trade-off between computational complexity and accuracy. In this paper, a novel low-complexity data detection scheme, termed the soft-feedback detector (SFD), is proposed. Particularly, building upon a maximum ratio combining (MRC) estimator framework, the SFD leverages the a priori symbol distribution to mitigate error propagation during iterative detection. Specifically, soft-decision feedback is incorporated as extrinsic information derived from the log-likelihood ratios of the transmitted symbols. As a result, the proposed detector significantly enhances detection accuracy while maintaining low computational complexity. Simulation results demonstrate that the SFD consistently outperforms benchmark decision-feedback detectors. In particular, compared with the conventional MRC detector, the proposed scheme achieves approximately a 3 dB signal-to-noise ratio (SNR) gain at a bit error rate (BER) of $10^{-3}$.
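The soft-feedback idea can be illustrated for BPSK, where the a priori LLR $L$ gives the soft symbol mean $\mathrm{E}[s] = \tanh(L/2)$ and variance $1 - \mathrm{E}[s]^2$. A minimal sketch of this building block (the paper applies it inside an MRC-based AFDM detector, which is not reproduced here):

```python
import math

# Soft symbol statistics from an LLR, for BPSK symbols s in {+1, -1}.
# Feeding back the mean/variance instead of a hard decision lets an
# iterative detector weight unreliable symbols less, limiting error
# propagation.

def soft_symbol(llr):
    mean = math.tanh(llr / 2.0)   # E[s] under the a priori distribution
    var = 1.0 - mean * mean        # Var[s]; large when the LLR is weak
    return mean, var

# A confident LLR yields a nearly hard decision; a weak LLR yields a
# cautious, near-zero feedback value with high variance.
m_strong, v_strong = soft_symbol(8.0)
m_weak, v_weak = soft_symbol(0.2)
```

The variance term is what distinguishes soft from hard decision feedback: it propagates symbol reliability, not just symbol sign, into the next detection iteration.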


[22] 2604.14678

Energy-based Regularization for Learning Residual Dynamics in Neural MPC for Omnidirectional Aerial Robots

Data-driven Model Predictive Control (MPC) has lately become a central research subject in control theory. The combination of an optimal control framework with deep learning paradigms opens up the possibility of accurately tracking control tasks without the need for complex analytical models. However, the system dynamics are often nuanced, and the neural model lacks the capacity to capture physical properties such as inertia and conservation of energy. In this work, we propose a novel energy-based regularization loss function applied to the training of a neural model that learns the residual dynamics of an omnidirectional aerial robot. Our energy-based regularization encourages the neural network to produce control corrections that stabilize the energy of the system. The residual dynamics are integrated into the MPC framework and improve the positional mean absolute error (MAE) over three real-world experiments by 23% compared to an analytical MPC. We also compare our method to a standard neural MPC implementation without regularization, achieving significantly increased flight stability due to the energy regularization, as well as up to 15% lower MAE. Our code is available at: this https URL.
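The general shape of such a regularizer can be sketched as a penalty on the power injected by the predicted residual correction; the energy model, weight, and plain-Python loss below are illustrative stand-ins, not the paper's actual loss function:

```python
# Hypothetical sketch of an energy-based regularized training loss for a
# residual-dynamics model. Positive v . a_residual means the predicted
# correction adds kinetic energy; penalizing it nudges the network
# toward energy-stabilizing corrections.

def regularized_loss(pred_residual, true_residual, velocity,
                     mass=1.0, lam=0.1):
    # Standard residual-dynamics fit term (mean squared error) ...
    n = len(pred_residual)
    mse = sum((p - t) ** 2 for p, t in zip(pred_residual, true_residual)) / n
    # ... plus a hinge penalty on the power injected by the predicted
    # residual acceleration along the current velocity.
    injected_power = mass * sum(v * a for v, a in zip(velocity, pred_residual))
    return mse + lam * max(injected_power, 0.0)

# Two residual predictions with identical fit quality: one injects
# energy along the velocity, the other dissipates it.
inject = regularized_loss([1.0, 0.0], [1.0, 0.0], [1.0, 0.0])
dissip = regularized_loss([-1.0, 0.0], [-1.0, 0.0], [1.0, 0.0])
```

In an actual training pipeline this would be a differentiable tensor expression over batches; the scalar version above only shows how the energy term breaks ties between otherwise equally accurate predictions.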


[23] 2604.14689

ISAC with Backscattering RFID Tags: Beamforming and Codebook Design

This paper explores an integrated sensing and communication (ISAC) system with backscattering RFID tags. In this setup, an access point employs communication beams to serve communication users while leveraging a sensing beam to interrogate RFID tags. Under the total transmit power constraint of the system, our objective is to design a joint sensing and communication beamforming codebook by considering the tag interrogation and communication requirements. To lay a foundation for the codebook design problem, we first study the beamforming design problem in a single-tag scenario and investigate two approaches: (i) a zero-forcing approach with optimized sensing/communication power allocation, for which a closed-form solution is derived under a dominant sensitivity condition, and (ii) a joint sensing and communication beamforming design obtained by transmit power minimization. Then, we investigate the codebook design problem in a multi-tag scenario. To resolve this, we propose a sector-based joint sensing and communication beamforming codebook that scans the region of interest. For each sector, semidefinite relaxation and generalized Benders decomposition are employed to handle the resulting optimization. The simulation results show that the proposed joint beamforming designs can effectively mitigate the mutual interference between sensing and communication functionalities, thus enhancing the interrogation range of the tags with minimized transmit power. Also, the efficacy of the proposed sector-based codebook design has been demonstrated in terms of interrogation success rate, offering a promising approach for the ISAC-backscattering systems.


[24] 2604.14713

Optimal Robust Adaptive Beamforming for a General-Rank Signal Model via Equivalence of Maximin and Minimax SINR Problems

The globally optimal robust adaptive beamforming (RAB) solution is studied for worst-case signal-to-interference-plus-noise ratio (SINR) maximization (the maximin SINR problem) under convex and closed uncertainty sets for the desired signal covariance and interference-plus-noise covariance (INC) matrices, considering a general-rank signal model. First, the corresponding minimax SINR problem is reformulated as a convex optimization problem. In particular, this problem becomes a semidefinite programming (SDP) problem when the uncertainty sets can be represented by finitely many linear matrix inequality constraints. It is then shown that, for a general-rank signal model, the maximin and minimax SINR problems are equivalent when the uncertainty sets are convex and closed, in the sense that they share the same optimal value and the same set of optimal solutions. The requirement of closedness is weaker than the compactness assumption previously used to establish the equivalence between minimax and maximin SINR problems for the rank-one signal model, a state-of-the-art result reported approximately two decades ago. Consequently, an optimal solution to the minimax SINR problem is also globally optimal for the maximin SINR problem, and this solution can be obtained by solving the equivalent SDP of the minimax problem in a single step. In contrast, existing iterative approximation algorithms for the maximin SINR problem yield only locally optimal solutions. Simulation results demonstrate that these approximation algorithms return suboptimal values that can be strictly smaller than the optimal value of the minimax problem, and that the beamformer output SINR obtained via the minimax formulation is higher than that achieved by beamformers derived from the maximin problem using approximation algorithms.


[25] 2604.14714

Temporal Logic Resilience for Continuous-time Systems

In this paper, we present a novel framework for quantifying a lower bound on resilience in continuous-time (non)linear systems subject to external disturbances while ensuring satisfaction of signal temporal logic specifications. Unlike robustness, which evaluates how well a system satisfies a specification under a given disturbance, resilience measures the maximum disturbance a system can tolerate from a given initial state while maintaining specification satisfaction. We first derive bounds on the perturbed trajectories and then use them to formulate a computational method based on scenario optimization to efficiently compute the maximum admissible disturbance. We validate our approach through case studies, including a DC motor, temperature regulation, a nonlinear numerical example, and a vehicle collision avoidance case.
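The scenario-optimization idea can be sketched for a toy system. The scalar dynamics, the "always stay in a bounded set" specification, and the bisection over the disturbance magnitude below are all illustrative assumptions, not the paper's trajectory bounds or STL machinery:

```python
import random

# Toy sketch of the scenario-optimization idea (not the paper's method):
# estimate the largest disturbance magnitude d such that the stable scalar
# system x' = -x + d*w, with per-scenario samples |w| <= 1, keeps the
# STL-style invariant |x(t)| <= 1 from x0 = 0 over the horizon.

def satisfies_spec(d, w_seq, x0=0.0, dt=0.01):
    """Simulate the perturbed system and check the invariant |x| <= 1."""
    x = x0
    for w in w_seq:
        x += dt * (-x + d * w)      # forward-Euler step
        if abs(x) > 1.0:
            return False
    return True

def max_admissible_disturbance(n_scenarios=100, steps=300, tol=1e-3):
    """Bisect on d, accepting d only if all sampled scenarios satisfy it."""
    rng = random.Random(0)
    scenarios = [[rng.uniform(-1, 1) for _ in range(steps)]
                 for _ in range(n_scenarios)]
    lo, hi = 0.0, 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if all(satisfies_spec(mid, w) for w in scenarios):
            lo = mid                # every sampled scenario satisfied: raise d
        else:
            hi = mid
    return lo

d_star = max_admissible_disturbance()
print(round(d_star, 3))
```

The returned value is a scenario-based estimate; the paper additionally derives trajectory bounds so that the guarantee holds with a quantified confidence rather than only on the sampled scenarios.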


[26] 2604.14736

Optimization for Pinching Antennas System With Multiple Carriers and Rate Splitting Multiple Access

To meet the urgent demands for spectral efficiency and multi-user access in high-frequency application scenarios for sixth-generation wireless communication, this paper investigates a rate splitting multiple access (RSMA) system assisted by pinching antennas (PAs) with multiple waveguides and multiple carriers, aiming to maximize the overall system sum rate. To address the high sensitivity of high-frequency signals to PA movement in overloaded scenarios, a two-stage PA position optimization method based on both path loss and phase shift error minimization is proposed under the RSMA framework. Specifically, the first stage performs coarse adjustment by minimizing large-scale path loss. Then, based on the derivation of a closed-form solution for the ideal phase shift in a single-user single-carrier case, the fine-grained positions of the PAs are optimized via a one-dimensional line search to minimize the composite phase shift error across all users and carriers. To meet the quality of service requirements, the Lagrange dual method is employed to obtain closed-form beamforming vectors after the PA positions are determined. Simulation results demonstrate that the proposed scheme achieves a significant improvement in sum rate and confirm that RSMA exhibits stronger robustness to inaccurate PA positions, caused by both discrete-position channel estimation and physical hardware, than other multiple-access techniques in PA-assisted systems. Furthermore, the results validate that fine-grained PA position adjustment is particularly crucial in high-frequency bands.
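The two-stage placement idea (coarse path-loss minimization, then a fine 1-D line search for phase alignment) can be sketched for a single user on a single carrier. The geometry, wavelength, and phase model below are illustrative assumptions, not the paper's multi-user, multi-carrier objective:

```python
import math

# Toy sketch of the two-stage PA placement (single user, single carrier,
# hypothetical geometry; not the paper's composite objective). The
# waveguide runs along y = 0; the PA position is its coordinate x.

WAVELEN = 0.01                 # hypothetical ~30 GHz wavelength [m]
USER = (3.07, 2.0)             # hypothetical user position [m]

def path_len(x):
    """Free-space distance from the PA at (x, 0) to the user."""
    return math.hypot(x - USER[0], USER[1])

def phase_error(x):
    """Distance of total phase (in-guide + free-space) from a 2*pi multiple."""
    phase = 2 * math.pi * (x + path_len(x)) / WAVELEN
    return abs(math.remainder(phase, 2 * math.pi))

# Stage 1: coarse sweep minimizing path loss (monotone in path length).
coarse = min((i * 0.1 for i in range(61)), key=path_len)

# Stage 2: fine 1-D line search near the coarse point for phase alignment.
cands = [coarse - 0.05 + i * 1e-4 for i in range(1001)]
fine = min(cands, key=phase_error)

print(round(coarse, 2), phase_error(fine) < phase_error(coarse))
```

Because the phase varies over a full cycle within a fraction of the coarse grid spacing, the fine search can always improve phase alignment while barely affecting path loss, which is the intuition behind separating the two stages.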


[27] 2604.14754

Utilizing Improper Gaussian Signaling for Downlink Rate-Splitting Multiple Access with Imperfect Successive Interference Cancellation

To mitigate the residual interference from imperfect successive interference cancellation (SIC) in Rate-Splitting Multiple Access (RSMA), this paper incorporates improper Gaussian signaling (IGS) into the downlink RSMA framework. Unlike existing RSMA--IGS works that embed impropriety within IQ-imbalanced frameworks, we show that IGS alone effectively counters SIC-induced residual interference. For a basic SISO setup with IGS on the common stream and PGS on private streams, we establish three key results: the optimal impropriety degree for private rate maximization attains its maximum; closed-form optimal solutions with rigorous monotonicity conditions are derived for common rate maximization; and a soft actor-critic (SAC) algorithm is developed for the non-convex sum rate problem. Numerical results show that IGS consistently outperforms PGS, with the gain widening as SIC imperfection increases.


[28] 2604.14766

Temporal Cross-Modal Knowledge-Distillation-Based Transfer-Learning for Gas Turbine Vibration Fault Detection

Preventing machine failure is inherently superior to reactive remediation, particularly for critical assets like gas turbines, where early fault detection (FD) is a cornerstone of industrial sustainability. However, modern deep learning-based FD models often face a significant trade-off between architectural complexity and real-time operational constraints, and are further hindered by a lack of temporal context within restricted vibration signal windows. To address these challenges, this study proposes a Temporal Cross-Modal Knowledge-Distillation Transfer-Learning (TCMKDTL) framework. The framework employs a "privileged" teacher model trained on expansive temporal windows incorporating both past and future signal context to distill latent feature-based knowledge into a compact student model. To mitigate issues of data scarcity and domain shift, the framework leverages robust pre-training on benchmark datasets (such as CWRU) followed by adaptation to target industrial data. Extensive evaluation using experimental and industrial gas turbine (MGT-40) datasets demonstrates that TCMKDTL achieves superior feature separability and diagnostic accuracy compared to conventional pre-trained architectures. Ultimately, this approach enables high-performance, unsupervised anomaly detection suitable for deployment on resource-constrained industrial hardware.


[29] 2604.14774

Co-Design of Cryptographic Parameters and Delay-Aware Feedback Gain for Encrypted Control Systems

Encrypted control employs homomorphic encryption (HE) to protect both the computation and communication stages, making it a promising approach for secure networked control systems. Most existing results pre-design a controller in the plaintext domain and then implement it over encrypted data. However, this can be problematic because HE induces non-negligible communication and computation delays, which typically increase with the security level, potentially degrading control performance and even destabilizing the closed-loop system. To address this issue, we propose a co-design framework for cryptographic parameters and delay-aware feedback gain. We characterize the encryption-induced delay as a function of the cryptographic parameters and derive a sufficient condition for the existence of a stabilizing delay-aware feedback gain, expressed as a finite set of linear matrix inequalities. This leads to a tractable outer-inner design procedure that searches over cryptographic parameters that satisfy a desired security level and, for each such parameter, seeks a stabilizing feedback gain.
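The outer-inner structure of the co-design can be sketched on a scalar plant. The security-level-to-delay mapping, the candidate gains, and the simulation-based stability test below are all illustrative stand-ins for the paper's LMI conditions:

```python
# Toy sketch of the outer-inner co-design loop (illustrative numbers, not
# the paper's LMI conditions): an outer search over cryptographic
# parameter levels, each inducing a larger encryption delay tau, and an
# inner check that some gain k still stabilizes the scalar plant
# x' = a*x + u with delayed input u(t) = -k*x(t - tau).

def stable_under_delay(a, k, tau, dt=0.001, horizon=20.0):
    """Simulate the delayed closed loop and test whether x decays."""
    steps = int(horizon / dt)
    delay_steps = max(1, int(tau / dt))
    hist = [1.0] * (delay_steps + 1)          # x history, x(0) = 1
    for _ in range(steps):
        x = hist[-1]
        x_delayed = hist[-1 - delay_steps]
        hist.append(x + dt * (a * x - k * x_delayed))
    return abs(hist[-1]) < 1e-3

def co_design(a=0.5, levels=((80, 0.05), (128, 0.20), (192, 0.8))):
    """Keep the strongest security level that still admits a stabilizing gain."""
    best = None
    for bits, tau in levels:                  # outer loop: level -> delay
        for k in (0.7, 1.0, 1.5, 2.0):        # inner loop: candidate gains
            if stable_under_delay(a, k, tau):
                best = (bits, tau, k)
                break
    return best

best = co_design()
print(best)
```

The paper replaces the brute-force inner check with a finite set of linear matrix inequalities, which makes the inner feasibility test tractable for general multivariable plants.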


[30] 2604.14787

Towards Trustworthy 6G Network Digital Twins: A Framework for Validating Counterfactual What-If Analysis in Edge Computing Resources

Network Digital Twins (NDTs) enable safe what-if analysis for 6G cloud-edge infrastructures, but adoption is often limited by fragmented workflows from telemetry to validation. We present a data-driven NDT framework that extends 6G-TWIN with a scalable pipeline for cloud-edge telemetry aggregation and semantic alignment into unified data models. Our contributions include: (i) scalable cloud-edge telemetry collection, (ii) regime-aware feature engineering capturing the network's scaling behavior, and (iii) a validation methodology based on Sign Agreement and Directional Sensitivity. Evaluated on a Kubernetes-managed cluster, the framework extrapolates performance to unseen high-load regimes. Results show both Deep Neural Network (DNN) and XGBoost achieve high regression accuracy (R2 > 0.99), while the XGBoost model delivers superior directional reliability (Sa > 0.90), making the NDT a trustworthy tool for proactive resource scaling in out-of-distribution scenarios.


[31] 2604.14800

Generative Modeling of Complex-Valued Brain MRI Data

Objective. Standard Magnetic Resonance Imaging (MRI) reconstruction pipelines discard phase information captured during acquisition, despite evidence that it encodes tissue properties relevant to tumor diagnosis. Current machine learning approaches inherit this limitation by operating exclusively on reconstructed magnitude images. The aim of this study is to build a generative framework which is capable of jointly modeling magnitude and phase information of complex-valued MRI scans. Approach. The proposed generative framework combines a conditional variational autoencoder, which compresses complex-valued MRI scans into compact latent representations while preserving phase coherence, with a flow-matching-based generative model. Synthetic sample quality is assessed via a real-versus-synthetic classifier and by training downstream classifiers on synthetic data for abnormal tissue detection. Main results. The autoencoder preserves phase coherence above 0.997. Real-versus-synthetic classification yields low AUROC values between 0.50 and 0.66 across all acquisition sequences, indicating generated samples are nearly indistinguishable from real data. In downstream normal-versus-abnormal classification, classifiers trained entirely on synthetic data achieve an AUROC of 0.880, surpassing the real-data baseline of 0.842 on a publicly available dataset (fastMRI). This advantage persists on an independent external test set from a different institution with biopsy-confirmed labels. Significance. The proposed framework demonstrates the feasibility of jointly modeling magnitude and phase information for normal and abnormal complex-valued brain MRI data. Beyond synthetic data generation, it establishes a foundation for the usage of complete brain MRI information in future diagnostic applications and enables systematic investigation of how magnitude and phase jointly encode pathology-specific features.


[32] 2604.14818

CBF-based Probabilistic Safe Navigation under Unknown Nonlinear Obstacle Dynamics

Safe navigation for an ego vehicle in uncertain environments characterized by dynamic obstacles with unknown nonlinear dynamics is a challenging problem of significant practical interest. Existing approaches in the literature either lack formal safety guarantees, require full model knowledge, or fail to account for the risk associated with the vehicle's exact body geometry and the temporal evolution of uncertainty between sampling instants. In this paper, we propose a data-driven observer for the unknown obstacle dynamics that generates an alpha-confidence set flow, which is exactly transformed into a Control Barrier Function (CBF) to enforce (1-alpha)-probability safety. The proposed framework accommodates nonlinear ego vehicle dynamics of arbitrary relative degree, as demonstrated through case studies involving first- and second-order dynamics of an unmanned surface vehicle.
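The CBF enforcement step can be illustrated with the standard textbook construction. The single-integrator dynamics, disk obstacle, and closed-form projection below are generic illustrations, not the paper's confidence-set CBF for unknown obstacle dynamics:

```python
# Toy sketch of a CBF safety filter (generic textbook construction, not
# the paper's alpha-confidence-set CBF): a planar single integrator
# p' = u must avoid a disk of radius R around obstacle position c.
# With barrier h(p) = ||p - c||^2 - R^2, safety requires
#   dh/dt = 2*(p - c) . u >= -alpha * h.
# The QP with one affine constraint has the closed-form projection below.

def cbf_filter(p, c, u_nom, R=1.0, alpha=1.0):
    dx, dy = p[0] - c[0], p[1] - c[1]
    h = dx * dx + dy * dy - R * R
    a = (2 * dx, 2 * dy)                      # constraint gradient: a.u >= -alpha*h
    lhs = a[0] * u_nom[0] + a[1] * u_nom[1]
    if lhs >= -alpha * h:
        return u_nom                          # nominal input already safe
    # Closed-form QP solution: project u_nom onto the constraint boundary.
    lam = (-alpha * h - lhs) / (a[0] ** 2 + a[1] ** 2)
    return (u_nom[0] + lam * a[0], u_nom[1] + lam * a[1])

# Heading straight at the obstacle from (2, 0): the filter softens u.
u_safe = cbf_filter(p=(2.0, 0.0), c=(0.0, 0.0), u_nom=(-5.0, 0.0))
print(u_safe)
```

In the paper, the static disk is replaced by the time-varying alpha-confidence set produced by the data-driven observer, so the same filtering step yields (1-alpha)-probability safety.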


[33] 2604.14841

Generalizability of Learning-based Occupancy Detection in Residential Buildings

This paper investigates non-intrusive occupancy detection methods for residential buildings using environmental sensor data from the KTH Live-In Lab in Stockholm, Sweden. Three machine learning approaches, namely, logistic regression (LR), support vector machines (SVM), and long short-term memory (LSTM) network enhanced with an attention mechanism, are evaluated in terms of predictive performance and computational complexity. The analysis considers the trade-off between sensor availability (investment cost) and prediction accuracy in real applications, as well as the models' cross-apartment generalizability. Hyperparameters for both the SVM and LSTM models are optimized using Bayesian optimization. All three models are evaluated on data collected from apartments not used during training, and on data generated from a calibrated digital model of the testbed. Results show that all models achieve comparable performance on the same-apartment test data (accuracy of approximately 0.83, F1 score of approximately 0.86). When assessed on cross-apartment data, the LSTM model demonstrates the strongest generalization capability (accuracy of 0.84, F1 score of 0.85), while LR provides a competitive, low-complexity alternative for applications that do not require cross-apartment generalization.


[34] 2604.14842

Simplification Ad Absurdum? Revisiting Gas Flow Modeling for Integrated Energy System Planning

This paper analyzes the implications of simplified pipeline gas flow models for integrated energy system planning. A case study of an integrated power-hydrogen expansion planning problem shows that simplifying pressure-flow relationships and gas dynamics can lead to expansion plans that incur substantial regret when evaluated under a more realistic dynamic gas flow model -- due to suboptimal system expansion, operation, and non-supplied hydrogen. Numerical experiments show that planning under the highly simplified transport and transport-linepack models -- commonly used in expansion studies -- can result in regret exceeding several thousand percent and yield expansion plans that lack robustness across demand levels. Planning under steady-state conditions partially mitigates these effects, but still leaves significant cost-reduction potential untapped compared to dynamic planning due to neglected linepack flexibility. Developing efficient solution algorithms for the dynamic model is a promising direction for future research.


[35] 2604.14869

An Open-Source Hardware-Aware Sub-THz Radio-Stripe Simulator

Sub-Terahertz radio-stripe and distributed MIMO architectures promise extreme spatial reuse and multi-GHz bandwidths, but the cascaded fiber front-haul and RF hardware impairments strongly shape end-to-end performance. This paper presents an open-source, configuration-driven simulator that models the full waveform-level signal chain from CP-OFDM baseband generation in the central unit, through measurement-parameterized polymer microwave fiber and coupler links, to booster/active Radio Units (RUs) with configurable nonlinearity, noise, in-phase and quadrature imbalance, and oscillator phase noise and carrier frequency offset. Wireless propagation is supported via lightweight deterministic and stochastic per-subcarrier channel models as well as site-specific ray-tracing datasets generated with a companion Sionna ray-tracer module. The simulator exports intermediate waveforms and system metrics (e.g., normalised mean square error, signal-to-noise-and-distortion ratio, bit error rate) to enable reproducible studies of impairment accumulation, calibration, and algorithmic choices such as RU selection and beam management.


[36] 2604.14887

Impact of deployment on energy efficiency of sub-THz transmission

Sub-THz bands promise high bandwidth and data rates, and in recent years device technology has made large progress, providing a multitude of transceiver, power amplifier (PA), and phased-array devices supporting frequency bands above 100 GHz. The more painful aspect of sub-THz transmission is the increased power consumption, caused by the large data rates and the related data conversion and processing effort, and, on the analog side, the low achievable PA efficiency and the reduced achievable output power. When planning a deployment of sub-THz communication systems, the target coverage and throughput can be achieved with a variety of scenarios, which differ in the locations and number of base stations and in system architecture. Although leading to similar performance, these scenarios differ significantly in overall power consumption. Using an accurate power consumption model, which also includes baseband (BB) processing functionality, and system-level simulations for different hybrid beamforming and MIMO schemes, the related variations in power consumption for a given performance are evaluated. This paper shows the critical design aspects for energy-efficient sub-THz deployments by highlighting the sub-THz-specific trade-offs between different numbers of base stations with different transmit powers, as well as varying numbers of BB units and RF chains.
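The kind of deployment-level comparison described can be sketched with a simple additive power model. All component values and the two deployments below are hypothetical, not the paper's measured parameters:

```python
# Toy sketch of a deployment-level power model (all numbers illustrative,
# not the paper's measured values): total consumption as a sum of PA,
# RF-chain, and baseband terms, compared across two deployments that are
# assumed to deliver the same coverage and throughput.

def deployment_power(n_bs, p_tx_w, pa_eff, n_rf_chains,
                     p_rf_chain_w=2.0, p_bb_w=15.0):
    """Total power of a deployment: PA input power plus RF/BB overhead."""
    p_pa = p_tx_w / pa_eff                       # PA input power per BS
    per_bs = p_pa + n_rf_chains * p_rf_chain_w + p_bb_w
    return n_bs * per_bs

# Many low-power sites vs. few high-power sites (same hypothetical service).
dense = deployment_power(n_bs=12, p_tx_w=0.1, pa_eff=0.08, n_rf_chains=4)
sparse = deployment_power(n_bs=4, p_tx_w=1.0, pa_eff=0.08, n_rf_chains=16)
print(dense, sparse)
```

Even this crude model shows the trade-off the abstract highlights: with low PA efficiency, per-site overhead (BB units, RF chains) can dominate, so the site count and architecture matter as much as transmit power.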


[37] 2604.14905

Data-driven Linear Quadratic Integral Control: A Convex Formulation and Policy Gradient Approach

This paper studies the data-driven synthesis of linear quadratic integral (LQI) controllers for continuous-time systems. The objective is to achieve optimal state-feedback control with integral action for reference tracking using only measured data. To this end, we derive a data-driven closed-loop parameterization of the augmented dynamics that incorporates the integral state while relying solely on input-state-output measurements of the underlying system. Based on this parameterization, a data-driven convex optimization problem is formulated whose solution yields the optimal linear quadratic regulator (LQR) feedback gain for the augmented system without explicit knowledge of the system matrices. In addition, a policy gradient flow is derived to compute the optimal controller within the space of stabilizing gains. The proposed approach enables data-driven optimal tracking control while avoiding explicit state augmentation in the data collection phase. The effectiveness of the method is demonstrated through a numerical example involving a distributed generation unit (DGU) in a DC microgrid.


[38] 2604.14919

A Numerical and Experimental Evaluation of Microbubble Communication Using OpenFOAM

Reliable communication in confined environments, such as blood vessels or industrial pipelines, remains challenging due to signal attenuation and limited sensor accessibility. Therefore, this work investigates microbubbles as robust information carriers within the Internet of Bio-Nano Things (IoBNT) paradigm, leveraging their established use as ultrasound contrast agents. It presents a combined experimental and numerical analysis characterizing microbubble transport under varying flow conditions relevant to biomedical and industrial applications. Experiments with SonoVue microbubbles in a recirculating water channel validate an OpenFOAM-based Computational Fluid Dynamics (CFD) simulation using the incompressibleDenseParticleFluid solver. Key cases examine water vs. blood-like media and high vs. physiological flow velocities, analyzing the relative influence of fluid properties and advection on microbubble dynamics. Recirculation effects are considered in relation to in vivo circulation timescales.


[39] 2604.14960

Modelling and identification of diffusively coupled linear networks with additional directed links

Dynamic networks consist of interconnected dynamical systems. The subsystems can be viewed as transformations of input signals into output signals, where signals flow from one system into another through interconnections. The signal flows represent directions of information flow, thus a dynamic network can be visualised by a directed graph. In contrast, natural and physical laws only impose relations between system variables, while variables are shared among systems via interconnections. Sharing is independent of direction, and therefore a dynamic network originating from physics can be visualised by an undirected graph. Typically, dynamic networks are considered to have either directed or undirected interconnections. For both situations, network models, analytic tools, and identification algorithms have been developed. However, dynamic networks can also have both directed and undirected interconnections, for example, in physical networks equipped with digital controllers. In this work, we present mixed linear dynamic networks that contain both undirected and directed interconnections, where the nature of the interconnecting dynamics needs to be incorporated into the modelling framework, identifiability analysis, and identification procedure. For these mixed networks, we derive dynamic network models; formulate conditions for consistent identification of all dynamics in the network; and develop a tractable identification algorithm that delivers consistent estimates.


[40] 2604.14977

Minimal Input Cardinality Disturbance Decoupling of Coupled Oscillators via Output Feedback with Application to Power Networks

In this paper, we identify the smallest set of control input nodes and an associated output feedback law that achieves complete disturbance decoupling for a class of coupled oscillator networks. The focus is specifically on systems linearized around a stable phase-locked synchronized state. The proposed theoretical framework is applied to the linearized swing dynamics of power grids operating near synchronization. In this context, the disturbance decoupling problem corresponds to isolating subsets of nodes from exogenous disturbances by means of batteries that can both add or withdraw active power. Numerical simulations carried out on the IEEE New England 39-bus system show that the proposed methodology not only yields a minimal actuator placement ensuring effective disturbance rejection, but also preserves the internal stability of the closed-loop system.


[41] 2604.14994

Degradation-aware Predictive Energy Management for Fuel Cell-Battery Ship Power System with Data-driven Load Forecasting

Hydrogen-based zero-emission ships are a key element in the decarbonization of the maritime sector. To strengthen their economic competitiveness, it is essential to drive their costs to a minimum. Current literature mainly focuses on fuel consumption minimization, but there is a lack of explicit consideration of costs arising from cell degradation and of optimization-based approaches that leverage information on future load trajectories. This work aims at minimizing the operational cost of fuel cell-battery hybrid shipboard power systems, accounting for hydrogen consumption and cell degradation as the main cost drivers. A degradation-aware predictive energy management strategy utilizing data-driven load forecasting is designed and showcased using the example of a virtually retrofitted harbor tug. This work shows that real onboard measurements of the vessel can be utilized to make accurate load predictions over 15 min. Results indicate that the degradation-aware, predictive control simultaneously reduces the hydrogen consumption by up to 5.8% and the cell degradation by up to 36.4% with an aged fuel cell system when compared to a filter-based benchmark applied to real operating data of the harbor tug. With an increased prediction horizon of 1 h, further significant reductions of 3.8% and 14.0% could be shown.
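The cost trade-off between hydrogen consumption and transient-driven degradation can be sketched with a toy horizon cost. Prices, the linearized consumption map, and the degradation proxy below are illustrative assumptions, not the paper's models:

```python
# Toy sketch of the degradation-aware cost trade-off (illustrative prices
# and degradation model, not the paper's): operating cost over a horizon
# combines hydrogen consumption with a load-transient degradation term,
# so letting the battery absorb transients can reduce total cost.

H2_PRICE = 8.0          # EUR per kg, hypothetical
DEG_PRICE = 0.02        # EUR per unit of the wear proxy, hypothetical

def h2_mass(p_fc_kw, dt_h=0.25):
    """Linearized hydrogen consumption over one 15-min slot [kg]."""
    return 0.06 * p_fc_kw * dt_h

def degradation(p_prev_kw, p_kw):
    """Transient-driven wear proxy: magnitude of the power step."""
    return abs(p_kw - p_prev_kw)

def horizon_cost(fc_profile_kw):
    cost, p_prev = 0.0, fc_profile_kw[0]
    for p in fc_profile_kw:
        cost += H2_PRICE * h2_mass(p) + DEG_PRICE * degradation(p_prev, p)
        p_prev = p
    return cost

load = [100, 300, 120, 280]                      # kW demand per 15-min slot
follow_load = horizon_cost(load)                 # fuel cell tracks the load
smoothed = horizon_cost([200, 200, 200, 200])    # battery absorbs transients
print(round(follow_load, 2), round(smoothed, 2))
```

Both profiles deliver the same fuel-cell energy here, but the smoothed one avoids all transient wear, which is the intuition behind the predictive, degradation-aware split between fuel cell and battery.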


[42] 2604.15004

On-Line Policy Iteration with Trajectory-Driven Policy Generation

We consider deterministic finite-horizon optimal control problems with a fixed initial state. We introduce an on-line policy iteration method, which starting from a given policy, however obtained, generates a sequence of cost improving policies and corresponding trajectories. Each policy produces a trajectory, which is used in turn to generate data for training the next policy. The method is motivated by problems that are repeatedly solved starting from the same initial state, including discrete optimization and path planning for repetitive tasks. For such problems, the method is fast enough to be used on-line. Under a natural consistency condition, we show that the sequence of costs of the generated policies is monotonically improving for the given initial state (but not necessarily for other states). We illustrate our results with computational studies from combinatorial optimization and 3-dimensional path planning for drones in the presence of obstacles. We also discuss briefly a stochastic counterpart of our algorithm. Our proposed framework combines elements of rollout and policy iteration with flexible trajectory-based policy representations, and applies to problems involving a single as well as multiple decision makers. It also provides a principled way to train neural network-based policies using trajectory data, while preserving monotonic cost improvement.
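The cost-improvement mechanism behind rollout-style policy generation can be shown on a tiny deterministic problem. The problem instance and policies below are a generic textbook construction, not the authors' trajectory-trained neural policies:

```python
# Toy sketch of the rollout/policy-improvement idea (generic construction,
# not the paper's algorithm): starting from a base policy, one lookahead
# step using the base policy's cost-to-go yields a policy whose trajectory
# cost from the fixed initial state is no worse.

HORIZON, TARGET = 8, 3
CONTROLS = (-1, +1)

def stage_cost(x):
    return abs(x - TARGET)

def trajectory_cost(x, policy, k=0):
    """Cost of following `policy` from state x at stage k to the horizon."""
    total = 0
    for step in range(k, HORIZON):
        x += policy(step, x)
        total += stage_cost(x)
    return total

def base_policy(k, x):
    return +1                        # naive base policy: always move right

def rollout_policy(k, x):
    """One-step lookahead with the base policy as the cost-to-go heuristic."""
    return min(CONTROLS,
               key=lambda u: stage_cost(x + u)
               + trajectory_cost(x + u, base_policy, k + 1))

x0 = 0
print(trajectory_cost(x0, base_policy), trajectory_cost(x0, rollout_policy))
```

The monotone-improvement guarantee holds at the fixed initial state x0, mirroring the paper's consistency condition; in their method the rollout trajectories additionally serve as training data for the next policy.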


[43] 2604.15028

Nonlinear backstepping with saturation for low-thrust station-keeping of libration point orbits

This paper presents a novel nonlinear backstepping control law for continuous, low-thrust station-keeping in the Earth-Moon system. Quasi-periodic libration point orbits are targeted under a high-fidelity model of the dynamics. Almost global uniform exponential stability guarantees are attained, as shown through Lyapunov's stability theory. Saturation of the actuators is formally included in the controller design, such that these guarantees hold even in the event of saturation. The relationship between saturation threshold, control gains, and deviation is studied and an optimal procedure for gain selection is discussed. The control solution is tested numerically through a Monte Carlo analysis over representative application cases, subject to operational errors, constraints, and external perturbations. Station-keeping under actuation saturation is validated considering a conservative threshold for typical electric propulsion systems.


[44] 2604.15055

Enhancing time-frequency resolution with optimal transport and barycentric fusion of multiple spectrograms

Time-frequency representations, such as the short-time Fourier transform (STFT), are fundamental tools for analyzing non-stationary signals. However, their ability to achieve sharp localization in both time and frequency is inherently limited by the Gabor-Heisenberg uncertainty principle. In this paper, we address this limitation by introducing a method to generate super-resolution spectrograms through the fusion of two or more spectrograms with varying resolutions. Specifically, we compute the super-resolution spectrogram as the barycenter of input spectrograms using optimal transport (OT) divergences. Unlike existing fusion approaches, our method does not require the input spectrograms to share the same time-frequency grid. Instead, the input spectrograms can be computed using any STFT parameters, and the resulting super-resolution spectrogram can be defined on an arbitrary user-specified grid. We explore various OT divergences based on different transportation costs. Notably, we introduce a novel transportation cost that preserves time-frequency geometry while significantly reducing computational complexity compared to standard Wasserstein barycenters. We adopt the unbalanced OT framework and derive a new block majorization-minimization algorithm for efficient barycenter computation. We validate the proposed method on controlled synthetic signals and recorded speech using both quantitative and qualitative evaluations. The results show that our approach combines the best localization properties of the input spectrograms and outperforms an unsupervised state-of-the-art fusion method.
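A minimal 1-D analogue conveys why an OT barycenter behaves differently from pointwise averaging. The quantile-averaging construction below is the standard closed form for 1-D Wasserstein-2 barycenters of sample distributions; it is a simplification, not the paper's unbalanced-OT algorithm on spectrograms:

```python
# Toy sketch of barycentric fusion for 1-D distributions (not the paper's
# unbalanced-OT block majorization-minimization algorithm): in 1-D, the
# Wasserstein-2 barycenter can be computed by averaging quantile functions.

def quantile(samples, q):
    """Empirical quantile of a sorted sample list (nearest-rank style)."""
    idx = min(int(q * len(samples)), len(samples) - 1)
    return samples[idx]

def w2_barycenter(samples_a, samples_b, n=100, weight=0.5):
    a, b = sorted(samples_a), sorted(samples_b)
    qs = [(i + 0.5) / n for i in range(n)]
    return [weight * quantile(a, q) + (1 - weight) * quantile(b, q)
            for q in qs]

# Two sharp peaks at 0 and 10: the OT barycenter is a single peak at 5
# (mass is transported), unlike the Euclidean average, which would keep
# two half-height peaks.
bary = w2_barycenter([0.0] * 50, [10.0] * 50)
print(min(bary), max(bary))
```

This mass-transport behavior is what lets the spectrogram barycenter combine sharp time localization from one input with sharp frequency localization from another instead of blurring them together.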


[45] 2604.15083

A Novel 6G Dynamic Channel Map Based on a Hybrid Channel Model

In the sixth generation (6G) wireless communication networks, the device density, antenna number, and the complexity of communication scenarios will significantly increase, which brings great challenges for system design and network optimization. By obtaining channel information in advance, the channel map has become a promising solution to these challenges in the 6G era. However, conventional channel maps cannot be updated in time as the physical environment changes. To solve this problem, a novel dynamic channel map (DCM) is proposed in this work. For DCM construction, we further present a ray tracing (RT) and geometric stochastic hybrid channel model (RT-GSHCM), which pre-constructs the DCM offline by RT and updates it online by a geometry-based stochastic channel model (GBSM). In this way, the DCM can provide time-varying channel information and channel properties while maintaining accuracy. Next, a channel measurement campaign is conducted, and the measurement results are compared with the RT-GSHCM, RT, and GBSM. The comparison results validate the accuracy of the DCM. Meanwhile, the time cost of DCM updates is compared with that of conventional channel maps, illustrating the time-efficiency of the DCM. Finally, important statistical channel properties of the RT-GSHCM are further derived, analyzed, and compared under different configurations of interaction objects in the physical environment.


[46] 2604.15139

Ternary Noise Modulation

By exploiting noise as an information-bearing resource, noise-driven communication offers a promising framework for low-complexity and secure wireless system design. In this letter, the scheme of ternary noise modulation (T-NoiseMod) is proposed for noise-based wireless communication scenarios, where information is encoded into the statistical characteristics of artificial noise. Unlike conventional binary NoiseMod, which employs two variance levels, the proposed scheme introduces a third transmission state: intentional silence. By pairing two consecutive noise blocks, the signaling scheme is expanded to eight valid state combinations, enabling the transmission of three information bits per signaling interval. In the proposed scheme, a two-stage receiver is developed, consisting of mean-based silent-state detection followed by variance-based low/high classification. An analytical expression for the bit error probability (BEP) is derived for Rayleigh fading. Computer simulation results closely match the theoretical results and show the effects of key system parameters. Furthermore, comparisons with binary NoiseMod demonstrate the inherent trade-off between reliability and rate.
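The two-stage detection logic can be sketched over an idealized noiseless channel. Block length, variance levels, and thresholds are illustrative; the stage-1 test here uses the mean absolute value as a simplification of the paper's mean-based silence detection, and no fading or receiver noise is modeled:

```python
import random

# Toy sketch of a two-stage T-NoiseMod receiver (illustrative thresholds,
# no fading or receiver noise): each block is silence ('S'), low-variance
# noise ('L'), or high-variance noise ('H').

BLOCK_LEN, VAR_LO, VAR_HI = 200, 1.0, 4.0

def transmit(symbol, rng):
    """Map a ternary symbol to a noise block."""
    if symbol == "S":
        return [0.0] * BLOCK_LEN                 # intentional silence
    sigma = VAR_LO ** 0.5 if symbol == "L" else VAR_HI ** 0.5
    return [rng.gauss(0.0, sigma) for _ in range(BLOCK_LEN)]

def detect(block):
    mean_abs = sum(abs(x) for x in block) / len(block)
    if mean_abs < 0.1:                           # stage 1: silence test
        return "S"
    var = sum(x * x for x in block) / len(block)
    return "L" if var < (VAR_LO + VAR_HI) / 2 else "H"   # stage 2

rng = random.Random(1)
symbols = [rng.choice("SLH") for _ in range(300)]
decoded = [detect(transmit(s, rng)) for s in symbols]
errors = sum(a != b for a, b in zip(symbols, decoded))
print(errors / len(symbols))
```

Pairing two such blocks and discarding the all-silent combination yields the eight valid state pairs (three bits) described in the abstract; under fading and receiver noise, the thresholds would have to be set from the BEP analysis instead of being fixed constants.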


[47] 2604.15223

Eccentricity Confound in EEG-based Visual Attention Decoding from Gaze-Fixated Neural Tracking of Motion in Natural Videos

Objective. Decoding visual attention from brain signals during naturalistic video viewing has emerged as a new direction in brain-computer interface research. Current methods assume that stronger coupling between object motion and neural activity indicates higher attention, but this can be confounded by eye movement artifacts and stimulus properties. This study investigates how visual eccentricity (the distance between a visual object and the fixation point) affects neural responses when eye movement artifacts are controlled. Approach. EEG signals were recorded across three tasks that manipulated object eccentricity and attention conditions while participants maintained gaze fixation. Correlation analysis and match-mismatch decoding were performed to quantify the neural tracking of object motion. Main results. The analysis supports three conclusions: (1) neural tracking of object motion in natural videos works under gaze fixation; (2) the strength of neural tracking under gaze fixation is predictive of attention; and (3) there exists a significant eccentricity confound in the EEG responses, with poorer neural tracking of motion at larger eccentricities. Significance. These results provide critical evidence that findings from previous free-viewing studies reflect genuine neural processing rather than mere oculomotor artifacts. However, the identified eccentricity effect highlights a major limitation for current decoding approaches that assume coupling strength reflects attention levels alone.


[48] 2604.15232

Physical Layer Security Performance of Pinching-Antenna Systems With In-Waveguide Attenuation

Pinching antenna (PA) systems have recently gained significant attention. While their physical-layer security (PLS) is being explored, most studies rely on idealized lossless models, ignoring practical waveguide attenuation. In this paper, we investigate the PLS performance of PA systems under a more realistic attenuation-incorporated waveguide model. Specifically, we consider a PA system-based secure communication scenario consisting of a base station (BS), a legitimate user, and a passive eavesdropper. We derive closed-form upper and lower bounds on both the secrecy outage probability (SOP) and ergodic secrecy capacity (ESC). The results indicate that the PA system outperforms conventional fixed-antenna systems.


[49] 2604.15238

A Nonlinear Separation Principle: Applications to Neural Networks, Control and Learning

This paper investigates continuous-time and discrete-time firing-rate and Hopfield recurrent neural networks (RNNs), with applications in nonlinear control design and implicit deep learning. First, we introduce a nonlinear separation principle that guarantees global exponential stability for the interconnection of a contracting state-feedback controller and a contracting observer, alongside parametric extensions for robustness and equilibrium tracking. Second, we derive sharp linear matrix inequality (LMI) conditions that guarantee the contractivity of both firing-rate and Hopfield neural network architectures. We establish structural relationships among these certificates, demonstrating that continuous-time models with monotone non-decreasing activations maximize the admissible weight space, and extend these stability guarantees to interconnected systems and Graph RNNs. Third, we combine our separation principle and LMI framework to solve the output reference tracking problem for RNN-modeled plants. We provide LMI synthesis methods for feedback controllers and observers, and rigorously design a low-gain integral controller to eliminate steady-state error. Finally, we derive an exact, unconstrained algebraic parameterization of our contraction LMIs to design highly expressive implicit neural networks, achieving competitive accuracy and parameter efficiency on standard image classification benchmarks.


[50] 2604.15252

Tube-Based Robust Data-Driven Predictive Control

This paper presents a tractable tube-based robust data-driven predictive control scheme that uses only a single finite noisy input-state trajectory of an unknown discrete-time linear time-invariant (LTI) system. A simplex constraint is imposed on the Hankel coefficient vector, yielding explicit polyhedral bounds on the prediction mismatch induced by bounded measurement noise. Using certified initial and terminal robust positively invariant (RPI) sets, we derive a tube-tightened formulation whose online optimization problem is a strictly convex quadratic program (QP). The resulting controller guarantees recursive feasibility, robust satisfaction of input and state constraints, and practical input-to-state stability of the closed loop with respect to measurement noise. Numerical examples illustrate the effectiveness, robustness, and closed-loop performance of the proposed method.


[51] 2604.14152

From Black Box to Glass Box: Cross-Model ASR Disagreement to Prioritize Review in Ambient AI Scribe Documentation

Ambient AI "scribe" systems promise to reduce clinical documentation burden, but automatic speech recognition (ASR) errors can remain unnoticed without careful review, and high-quality human reference transcripts are often unavailable for calibrating uncertainty. We investigate whether cross-model disagreement among heterogeneous ASR systems can act as a reference-free uncertainty signal to prioritize human verification in medical transcription workflows. Using 50 publicly available medical education audio clips (8 h 14 min), we transcribed each clip with eight ASR systems spanning commercial APIs and open-source engines. We aligned multi-model outputs, built consensus pseudo-references, and quantified token-level agreement using a majority-strength metric; we further characterized disagreements by type (content vs. punctuation/formatting) and assessed per-model agreement via leave-one-model-out (jackknife) consensus scoring. Inter-model reliability was low (ICC[2,1] = 0.131), indicating heterogeneous failure modes across systems. Across 76,398 evaluated token positions, 72.1% showed near-unanimous agreement (7-8 models), while 2.5% fell into high-risk bands (0-3 models), with high-risk mass varying from 0.7% to 11.4% across accent groups. Low-agreement regions were enriched for content disagreements, with the content fraction increasing from 53.9% to 73.9% across quintiles of high-risk mass. These results suggest that cross-model disagreement provides a sparse, localizable signal that can surface potentially unreliable transcript spans without human-verified references, enabling targeted review; clinical accuracy of flagged regions remains to be established.
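The majority-strength idea from the abstract can be sketched as follows, assuming the hypotheses have already been aligned position-by-position (in practice, multi-sequence alignment of ASR outputs is the hard part). The token examples and system count are invented for illustration; the 0-3 band matches the abstract's high-risk definition.

```python
from collections import Counter

def majority_strength(aligned_tokens):
    """Per-position agreement: number of systems voting for the most
    common token at each aligned position. `aligned_tokens` is a list of
    positions, each holding one token per ASR system (pre-aligned)."""
    return [Counter(position).most_common(1)[0][1]
            for position in aligned_tokens]

# Toy example with 8 hypothetical systems at 3 token positions.
positions = [
    ["patient"] * 8,                                       # unanimous
    ["metformin"] * 5 + ["met", "forming", "metFormin"],   # majority 5/8
    ["500", "five", "5", "500", "fifty", "500", "50", "five"],  # 3/8
]
print(majority_strength(positions))  # → [8, 5, 3]
```

Positions scoring 0-3 would fall into the abstract's high-risk band and be flagged for targeted human review.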


[52] 2604.14193

QualiaNet: An Experience-Before-Inference Network

Human 3D vision involves two distinct stages: an Experience Module, where stereo depth is extracted relative to fixation, and an Inference Module, where this experience is interpreted to estimate 3D scene properties. Paradoxically, although our experience of stereo vision does not provide us with distance information, it does affect our inferences about visual scale. We propose that the Inference Module exploits a natural scene statistic: near scenes produce vivid disparity gradients, while far scenes appear comparatively flat. QualiaNet implements this two-stage architecture computationally: disparity maps simulating human stereo experience are passed to a CNN trained to estimate distance. The network can recover distance from disparity gradients alone, validating this approach.


[53] 2604.14204

Disentangled Dual-Branch Graph Learning for Conversational Emotion Recognition

Multimodal emotion recognition in conversations aims to infer utterance-level emotions by jointly modeling textual, acoustic, and visual cues within context. Despite recent progress, key challenges remain, including redundant cross-modal information, imperfect semantic alignment, and insufficient modeling of high-order speaker interactions. To address these issues, we propose a framework that combines dual-space feature disentanglement with dual-branch graph learning. A shared encoder and modality-specific encoders are used to separate modality-invariant and modality-specific representations. The invariant features are modeled by a Fourier graph neural network to capture global consistency and complementary patterns, with a frequency-domain contrastive objective to enhance discriminability. In parallel, a speaker-aware hypergraph is constructed over modality-specific features to model high-order interactions, along with a speaker-consistency constraint to maintain coherent semantics. Finally, the two branches are fused for utterance-level emotion prediction. Experiments on IEMOCAP and MELD demonstrate that the proposed method achieves superior performance over strong baselines, validating its effectiveness.


[54] 2604.14229

Magnitude Is All You Need? Rethinking Phase in Quantum Encoding of Complex SAR Data

Synthetic Aperture Radar (SAR) data is inherently complex-valued, while quantum machine learning (QML) models naturally operate in complex Hilbert spaces. This apparent alignment suggests that incorporating both magnitude and phase information into quantum encoding should improve performance in SAR Automatic Target Recognition (ATR). In this work, we systematically evaluate this assumption by comparing five quantum encoding strategies: magnitude-only, joint complex, I/Q-based, preprocessed phase, and pure quantum, under a unified experimental framework on the MSTAR benchmark dataset. Contrary to expectation, we observe a consistent pattern: in hybrid quantum-classical architectures, magnitude-only encoding outperforms all complex-valued strategies, achieving 99.57% accuracy on a 3-class task and 71.19% on an 8-class task, while phase-aware methods provide negligible (~0%) or negative improvements. In contrast, in purely quantum architectures with only 184-224 trainable parameters and no classical components, phase information becomes essential, contributing up to 21.65% improvement in accuracy. These results reveal that the utility of phase information is not inherent to the data, but depends critically on the model architecture. Hybrid models rely on classical components that compensate for missing phase information, whereas purely quantum models require phase to construct discriminative representations. Our findings provide practical design guidelines for encoding complex-valued data in QML and highlight the importance of encoding-architecture co-design in the NISQ era.


[55] 2604.14259

Continual Learning for fMRI-Based Brain Disorder Diagnosis via Functional Connectivity Matrices Generative Replay

Functional magnetic resonance imaging (fMRI) is widely used for studying and diagnosing brain disorders, with functional connectivity (FC) matrices providing powerful representations of large-scale neural interactions. However, existing diagnostic models are trained either on a single site or under full multi-site access, making them unsuitable for real-world scenarios where clinical data arrive sequentially from different institutions. This results in limited generalization and severe catastrophic forgetting. This paper presents the first continual learning framework specifically designed for fMRI-based diagnosis across heterogeneous clinical sites. Our framework introduces a structure-aware variational autoencoder that synthesizes realistic FC matrices for both patient and control groups. Built on this generative backbone, we develop a multi-level knowledge distillation strategy that aligns predictions and graph representations between new-site data and replayed samples. To further enhance efficiency, we incorporate a hierarchical contextual bandit scheme for adaptive replay sampling. Experiments on multi-site datasets for major depressive disorder (MDD), schizophrenia (SZ), and autism spectrum disorder (ASD) show that the proposed generative model enhances data augmentation quality, and the overall continual learning framework substantially outperforms existing methods in mitigating catastrophic forgetting. Our code is available at this https URL.


[56] 2604.14309

Aerial Multi-Functional RIS in Fluid Antennas-Aided Full-Duplex Networks: A Self-Optimized Hybrid Deep Reinforcement Learning Approach

To address the high data traffic demands of sixth-generation (6G) networks, this paper proposes a novel architecture that integrates autonomous aerial vehicles (AAVs) and multi-functional reconfigurable intelligent surfaces (MF-RISs) as AM-RIS in fluid antenna (FA)-assisted full-duplex (FD) networks. The AM-RIS provides hybrid functionalities, including signal reflection, amplification, and energy harvesting (EH), potentially improving both signal coverage and sustainability. Meanwhile, the FA facilitates fine-grained spatial adaptability at the FD-enabled base station (BS), which complements residual self-interference (SI) suppression. We aim to maximize the overall energy efficiency (EE) by jointly optimizing the transmit DL beamforming at the BS, the UL user power, the configuration of the AM-RIS, and the positions of the FA and AM-RIS. Owing to the hybrid continuous-discrete parameters and high dimensionality of the intractable problem, we conceive a self-optimized multi-agent hybrid deep reinforcement learning (DRL) framework (SOHRL), which integrates multi-agent deep Q-networks (DQN) and multi-agent proximal policy optimization (PPO) to handle discrete and continuous actions, respectively. To enhance self-adaptability, an attention-driven state representation and meta-level hyperparameter optimization are incorporated, enabling the agents to autonomously adjust their learning hyperparameters. Simulation results validate the effectiveness of the proposed AM-RIS-enabled FA-aided FD networks empowered by the SOHRL algorithm. The results reveal that SOHRL outperforms benchmarks, including the variant without the attention mechanism as well as conventional hybrid, multi-agent, and standalone DRL schemes. Moreover, AM-RIS in FD achieves the highest EE compared to half-duplex operation, conventional rigid antenna arrays, partial EH, and conventional RIS without amplification, highlighting its potential as a compelling solution for EE-aware wireless networks.


[57] 2604.14360

Digital Guardians: The Past and The Future of Cyber-Physical Resilience

Resilience in cyber-physical systems (CPS) is the fundamental ability to maintain safety and critical functionality despite adverse "perturbations," which include security attacks, environmental disruptions, and hardware or software failures. This survey provides a comprehensive review of CPS resilience, framing the field through five interconnected themes that are required in an integrated whole to achieve real-world resilience. The article first posits that resilience is a system-wide property emerging from interactions between hardware, software, and human users. Second, it addresses the challenges of learning-enabled CPS, which often operate in data-scarce environments characterized by imbalanced or noisy data, requiring innovative solutions like synthetic data generation and foundation model adaptation. Third, the survey examines proactive measures for resilience, which include distinctive aspects of verification, testing, and redundancy. Fourth, it explores recovery mechanisms, moving beyond traditional fault models to design "just good enough" recovery strategies that prioritize safety-critical functions during perturbations. Finally, it highlights the central role of the human, focusing on the different levels of human intervention, the necessity of trust calibration, and the requirement for explainable AI to support human-CPS teaming. These themes are illustrated through representative application domains, primarily Connected and Autonomous Transportation Systems (CATS) and Medical CPS (MCPS). By integrating the five interconnected themes, this survey provides a systematic roadmap for achieving resilient CPS in increasingly complex and adversarial environments.


[58] 2604.14392

Spatiotemporal Analysis of VIIRS Satellite Observations and Network Traffic During the 2025 Manitoba Wildfires

Climate change has intensified extreme weather and wildfire conditions globally. Canada experienced record-breaking wildfires in 2023 and 2025, burning millions of hectares and severely impacting the Prairie provinces, with Manitoba facing its worst season in 30 years. These events highlight the urgent need to understand and mitigate escalating fire risks. While existing research largely focuses on wildfire management approaches, few studies have explored the relationship between user network traffic and wildfire activity, despite the potential of such correlations to provide valuable spatiotemporal insights into wildfire dynamics. This paper investigates the relationship between wildfire intensity and network performance during the 2025 Manitoba wildfire season, using Visible Infrared Imaging Radiometer Suite (VIIRS) satellite-derived Fire Radiative Power data and large-scale Speedtest measurements. We found statistically significant correlations between wildfire intensity and several network performance metrics in both the province-wide and region-wide case studies, as measured by Spearman's correlation coefficients ($\rho$) and corresponding p-values. Throughput-related metrics showed inverse correlations with wildfire intensity (e.g., download speed: $\rho = -0.214$, $p = 0.004$), whereas latency-related metrics showed positive correlations (e.g., round-trip time latency: $\rho = 0.162$, $p = 0.0308$). The findings suggest satellite fire indicators and network performance metrics together can reveal vulnerabilities during extreme environmental events and support disaster response and recovery efforts.
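As a minimal sketch of the Spearman analysis described above, the following computes a rank correlation on synthetic daily series. The data, effect size, and units are invented, not the paper's measurements, and the simple rank computation ignores ties (fine for continuous measurements).

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks.
    No tie handling in this sketch -- adequate for continuous data."""
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return float(np.corrcoef(rx, ry)[0, 1])

rng = np.random.default_rng(7)
days = 180
# Synthetic daily series (illustrative only, not the paper's data):
# fire radiative power and a download speed that degrades with it.
frp = rng.gamma(shape=2.0, scale=50.0, size=days)        # MW, right-skewed
download = 120.0 - 0.2 * frp + rng.normal(0, 10, days)   # Mbps

print(f"Spearman rho = {spearman_rho(frp, download):.3f}")
```

Because Spearman's rho only uses ranks, it is robust to the heavy right skew typical of fire-radiative-power distributions, which is presumably why the paper prefers it over Pearson correlation.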


[59] 2604.14399

SpaceMind: A Modular and Self-Evolving Embodied Vision-Language Agent Framework for Autonomous On-orbit Servicing

Autonomous on-orbit servicing demands embodied agents that perceive through visual sensors, reason about 3D spatial situations, and execute multi-phase tasks over extended horizons. We present SpaceMind, a modular and self-evolving vision-language model (VLM) agent framework that decomposes knowledge, tools, and reasoning into three independently extensible dimensions: skill modules with dynamic routing, Model Context Protocol (MCP) tools with configurable profiles, and injectable reasoning-mode skills. An MCP-Redis interface layer enables the same codebase to operate across simulation and physical hardware without modification, and a Skill Self-Evolution mechanism distills operational experience into persistent skill files without model fine-tuning. We validate SpaceMind through 192 closed-loop runs across five satellites, three task types, and two environments, a UE5 simulation and a physical laboratory, deliberately including degraded conditions to stress-test robustness. Under nominal conditions all modes achieve 90--100% navigation success; under degradation, the Prospective mode uniquely succeeds in search-and-approach tasks where other modes fail. A self-evolution study shows that the agent recovers from failure in four of six groups from a single failed episode, including complete failure to 100% success and inspection scores improving from 12 to 59 out of 100. Real-world validation confirms zero-code-modification transfer to a physical robot with 100% rendezvous success. Code: this https URL


[60] 2604.14428

Distributed Learning of Quantum State Tomography Robust to Readout Errors

Scalable estimation of quantum states with readout errors is a central challenge in large multiqubit systems. Existing overlapping-tomography methods improve scalability by working with local subsystems, but they usually assume known or separately calibrated measurements. At the same time, readout-estimation methods model measurement errors without enforcing consistency among overlapping regional states. In this context, the present paper introduces a unified framework for joint regional quantum state tomography with readout errors. A multiqubit system is partitioned into overlapping regions, each region is assigned a local density operator and a local confusion matrix, and neighboring regions are coupled through reduced-state consistency on shared subsystems. This leads to a structured bilinear optimization problem. To solve it, a distributed alternating method is developed in which the state-update step is handled by the alternating direction method of multipliers (ADMM), while the confusion-matrix updates are carried out locally in parallel. Analytical guarantees are also established, including a sufficient condition for local identifiability, local quadratic growth of the population misfit, and convergence of the inner state-update procedure. Simulations on Ring, Ladder, Torus, and Hub graph geometries show that joint estimation improves state recovery over fixed-readout reconstruction, recovers a substantial portion of oracle performance, and reveals a clear tradeoff between state estimation performance, communication, and computation.


[61] 2604.14527

Design and Validation of a Low-Cost Smartphone Based Fluorescence Detection Platform Compared with Conventional Microplate Readers

A low-cost, fluorescence-based optical system is developed for detecting the presence of certain microorganisms and molecules within a diluted sample. A specifically designed device setup compatible with conventional 96-well plates creates an environment in which a smartphone camera can be used as the optical detector. In comparison with conventional microplate readers such as the Perkin Elmer Victor machine, the device presented in this paper is not equipped with expensive elements such as an excitation filter, barrier filter, or photomultiplier; instead, a phone camera is all that is needed to detect fluorescence within the sample. The strategy is to determine the relationship between the image color of the sample in RGB color space and the molar concentration of the fluorescent specimen in that sample. This manuscript is a preprint version of work related to a publication in IEEE. The final version may differ from this manuscript.


[62] 2604.14548

VoxSafeBench: Not Just What Is Said, but Who, How, and Where

As speech language models (SLMs) transition from personal devices into shared, multi-user environments, their responses must account for far more than the words alone. Who is speaking, how they sound, and where the conversation takes place can each turn an otherwise benign request into one that is unsafe, unfair, or privacy-violating. Existing benchmarks, however, largely focus on basic audio comprehension, study individual risks in isolation, or conflate content that is inherently harmful with content that only becomes problematic due to its acoustic context. We introduce VoxSafeBench, among the first benchmarks to jointly evaluate social alignment in SLMs across three dimensions: safety, fairness, and privacy. VoxSafeBench adopts a Two-Tier design: Tier1 evaluates content-centric risks using matched text and audio inputs, while Tier2 targets audio-conditioned risks in which the transcript is benign but the appropriate response hinges on the speaker, paralinguistic cues, or the surrounding environment. To validate Tier2, we include intermediate perception probes and confirm that frontier SLMs can successfully detect these acoustic cues yet still fail to act on them appropriately. Across 22 tasks with bilingual coverage, we find that safeguards appearing robust on text often degrade in speech: safety awareness drops for speaker- and scene-conditioned risks, fairness erodes when demographic differences are conveyed vocally, and privacy protections falter when contextual cues arrive acoustically. Together, these results expose a pervasive speech grounding gap: current SLMs frequently recognize the relevant social norm in text but fail to apply it when the decisive cue must be grounded in speech. Code and data are publicly available at: this https URL


[63] 2604.14565

Model-Based Reinforcement Learning Exploits Passive Body Dynamics for High-Performance Biped Robot Locomotion

Embodiment is a significant keyword in recent machine learning research. This study focused on the passive nature of a biped robot's body to generate walking and running locomotion using model-based deep reinforcement learning. We constructed two models in a simulator: one with passive elements (e.g., springs) and the other, similar to general humanoids, without passive elements. Training of the model with passive elements was strongly affected by the attractor of the system. As a result, although the trajectories quickly converged to limit cycles, it took a long time to obtain large rewards. However, thanks to the attractor-driven learning, the acquired locomotion was robust and energy-efficient. The results revealed that robots with passive elements can efficiently acquire high-performance locomotion by exploiting stable limit cycles generated through dynamic interaction between the body and the ground. This study demonstrates the importance of implementing passive properties in the body for future embodied AI.


[64] 2604.14566

Physics-Informed Machine Learning for Pouch Cell Temperature Estimation

Accurate temperature estimation of pouch cells with indirect liquid cooling is essential for optimizing battery thermal management systems for transportation electrification. However, it is challenging due to the computational expense of finite element simulations and the limitations of data-driven models. This paper presents a physics-informed machine learning (PIML) framework for the efficient and reliable estimation of steady-state temperature profiles. The PIML approach integrates the governing heat transfer equations directly into the neural network's loss function, enabling high-fidelity predictions with significantly faster convergence than purely data-driven methods. The framework is evaluated on a dataset of varying cooling channel geometries. Results demonstrate that the PIML model converges more rapidly and achieves markedly higher accuracy, with a 49.1% reduction in mean squared error over the data-driven model. Validation against independent test cases further confirms its superior performance, particularly in regions away from the cooling channels. These findings underscore the potential of PIML for surrogate modeling and design optimization in battery systems.
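The core PIML idea above, augmenting the data misfit with a governing-equation residual in the loss, can be sketched with a 1-D steady-state heat equation with a constant source as a stand-in for the paper's pouch-cell model. All symbols, values, and the finite-difference residual are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def laplacian_1d(T, dx):
    """Second-order central-difference Laplacian on interior points."""
    return (T[2:] - 2 * T[1:-1] + T[:-2]) / dx**2

def piml_loss(T_pred, T_data, q, k, dx, lam=1.0):
    """Composite physics-informed loss: data misfit plus the residual of
    the steady-state heat equation k * T'' + q = 0 (illustrative)."""
    data_term = np.mean((T_pred - T_data) ** 2)
    residual = k * laplacian_1d(T_pred, dx) + q[1:-1]
    return data_term + lam * np.mean(residual ** 2)

# Exact solution for constant q with T(0) = T(1) = 0 is
# T(x) = q x (1 - x) / (2 k), so both loss terms vanish there.
x = np.linspace(0.0, 1.0, 101)
dx = x[1] - x[0]
k = 1.0
q = np.full_like(x, 2.0)
T_exact = q * x * (1 - x) / (2 * k)
loss = piml_loss(T_exact, T_exact, q, k, dx)
print(f"loss at the exact solution: {loss:.2e}")
```

In a PIML training loop this composite loss replaces the purely data-driven objective, so the network is penalized wherever its temperature field violates the heat-transfer physics, even at points with no labeled data.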


[65] 2604.14603

A Synonymous Variational Perspective on the Rate-Distortion-Perception Tradeoff

The fundamental limit of natural signal compression has traditionally been characterized by classical rate-distortion (RD) theory through the tradeoff between coding rate and reconstruction distortion, while the rate-distortion-perception (RDP) framework introduces a divergence-based measure of perceptual quality as a modeling principle rather than a theoretically derived one, leaving its theoretical origin unclear. In this paper, motivated by a synonymity-based semantic information perspective, we reformulate perceptual reconstruction as recovering any admissible sample within an ideal synonymous set (synset) associated with the source, rather than the source sample itself, and correspondingly establish a synonymous source coding architecture. On this basis, we develop a synonymous variational inference (SVI) analysis framework with a synonymous variational lower bound (SVLBO) for tractable analysis of synset-oriented compression. Within this framework, we establish a synonymity-perception consistency principle, showing that optimal identification of semantic information is theoretically consistent with perceptual optimization. Based on this derivation, we prove a synonymous RDP tradeoff for the proposed synonymous source coding. These analytical results show that the distributional divergence term arises naturally from the synset-based reconstruction objective, clarify its compatibility with existing RDP formulations and classical RD theory, and suggest the potential advantages of synonymous source coding.


[66] 2604.14619

The Acoustic Camouflage Phenomenon: Re-evaluating Speech Features for Financial Risk Prediction

In computational paralinguistics, detecting cognitive load and deception from speech signals is a heavily researched domain. Recent efforts have attempted to apply these acoustic frameworks to corporate earnings calls to predict catastrophic stock market volatility. In this study, we empirically investigate the limits of acoustic feature extraction (pitch, jitter, and hesitation) when applied to highly trained speakers in in-the-wild teleconference environments. Utilizing a two-stream late-fusion architecture, we contrast an acoustic-based stream with a baseline Natural Language Processing (NLP) stream. The isolated NLP model achieved a recall of 66.25% for tail-risk downside events. Surprisingly, integrating acoustic features via late fusion significantly degraded performance, reducing recall to 47.08%. We identify this degradation as Acoustic Camouflage, where media-trained vocal regulation introduces contradictory noise that disrupts multimodal meta-learners. We present these findings as a boundary condition for speech processing applications in high-stakes financial forecasting.


[67] 2604.14724

HAMSA: Scanning-Free Vision State Space Models via SpectralPulseNet

Vision State Space Models (SSMs) like Vim, VMamba, and SiMBA rely on complex scanning strategies to adapt sequential SSMs to process 2D images, introducing computational overhead and architectural complexity. We propose HAMSA, a scanning-free SSM operating directly in the spectral domain. HAMSA introduces three key innovations: (1) simplified kernel parameterization, with a single Gaussian-initialized complex kernel replacing traditional (A, B, C) matrices and eliminating discretization instabilities; (2) SpectralPulseNet (SPN), an input-dependent frequency gating mechanism enabling adaptive spectral modulation; and (3) the Spectral Adaptive Gating Unit (SAGU), magnitude-based gating for stable gradient flow in the frequency domain. By leveraging FFT-based convolution, HAMSA eliminates sequential scanning while achieving O(L log L) complexity with superior simplicity and efficiency. On ImageNet-1K, HAMSA reaches 85.7% top-1 accuracy (state-of-the-art among SSMs), with 2.2x faster inference than transformers (4.2ms vs 9.2ms for DeiT-S) and a 1.4-1.9x speedup over scanning-based SSMs, while using less memory (2.1GB vs 3.2-4.5GB) and energy (12.5J vs 18-25J). HAMSA demonstrates strong generalization across transfer learning and dense prediction tasks.
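The O(L log L) scanning-free mixing claimed above rests on FFT-based circular convolution: multiplying spectra replaces any sequential scan over the token sequence. The sketch below verifies the FFT path against a direct O(L^2) reference; the Gaussian kernel initialization mirrors the abstract, while the sequence length and everything else are simplified assumptions, not HAMSA's actual architecture.

```python
import numpy as np

def fft_circular_conv(x, k):
    """Circular convolution via FFT in O(L log L): the mechanism that
    lets a spectral model mix all tokens without a sequential scan."""
    L = len(x)
    return np.fft.ifft(np.fft.fft(x) * np.fft.fft(k, n=L)).real

def direct_circular_conv(x, k):
    """Reference O(L^2) circular convolution for checking the FFT path."""
    L = len(x)
    kp = np.zeros(L)
    kp[:len(k)] = k      # zero-pad the kernel to the sequence length
    return np.array([sum(x[(n - m) % L] * kp[m] for m in range(L))
                     for n in range(L)])

rng = np.random.default_rng(3)
x = rng.normal(size=64)                           # a toy token sequence
kernel = np.exp(-0.5 * (np.arange(9) - 4) ** 2)   # Gaussian-initialized
print(np.allclose(fft_circular_conv(x, kernel),
                  direct_circular_conv(x, kernel)))  # → True
```

Because the kernel lives entirely in the spectral domain, input-dependent gating (as in SPN/SAGU) amounts to modulating the frequency response before the inverse FFT, with no scan order to choose.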


[68] 2604.14751

Exploiting Correlations in Federated Learning: Opportunities and Practical Limitations

The communication bottleneck in federated learning (FL) has spurred extensive research into techniques to reduce the volume of data exchanged between client devices and the central parameter server. In this paper, we systematically classify gradient and model compression schemes into three categories based on the type of correlations they exploit: structural, temporal, and spatial. We examine the sources of such correlations, propose quantitative metrics for measuring their magnitude, and reinterpret existing compression methods through this unified correlation-based framework. Our experimental studies demonstrate that the degrees of structural, temporal, and spatial correlations vary significantly depending on task complexity, model architecture, and algorithmic configurations. These findings suggest that algorithm designers should carefully evaluate correlation assumptions under specific deployment scenarios rather than assuming that they are always present. Motivated by these findings, we propose two adaptive compression designs that actively switch between different compression modes based on the measured correlation strength, and we evaluate their performance gains relative to conventional non-adaptive approaches. In summary, our unified taxonomy provides a clean and principled foundation for developing more effective and application-specific compression techniques for FL systems.
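One way to instantiate a temporal-correlation metric of the kind the paper proposes is the mean cosine similarity between a client's consecutive model updates; the paper's exact metrics may differ, and the synthetic updates below are purely illustrative. A slowly drifting gradient direction (exploitable by temporal compression) scores near 1, while i.i.d. updates score near 0.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two update vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def temporal_correlation(updates):
    """Mean cosine similarity between consecutive rounds' updates from
    one client -- one possible temporal-correlation metric (assumed)."""
    return float(np.mean([cosine(updates[t], updates[t + 1])
                          for t in range(len(updates) - 1)]))

rng = np.random.default_rng(1)
d, rounds = 1000, 20

# A slowly drifting direction yields high temporal correlation ...
base = rng.normal(size=d)
smooth = [base + 0.1 * rng.normal(size=d) for _ in range(rounds)]
# ... while i.i.d. updates yield correlation near zero.
noisy = [rng.normal(size=d) for _ in range(rounds)]

print(f"drifting: {temporal_correlation(smooth):.2f}, "
      f"i.i.d.: {temporal_correlation(noisy):.2f}")
```

In the adaptive designs the paper motivates, a measured value like this could trigger a switch between, say, delta-encoding against the previous round (high temporal correlation) and standalone quantization (low correlation).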


[69] 2604.14844

Matched and Euclidean-Mismatched Decoding on Fourier-Curve Constellations with Tangent Noise

We study matched and Euclidean-mismatched decoding on finite Fourier-curve constellations with tangent-space artificial noise. Each hypothesis induces a Gaussian law with symbol-dependent rank-one covariance. We derive exact Euclidean pairwise errors for arbitrary pairs and an exact Gaussian-expectation representation for matched decoding on bilaterally tangent-orthogonal pairs. For uniform even constellations, the Euclidean side yields explicit distance spectra and symbol-error bounds across all offset classes; the matched side is exact on antipodal pairs and benchmarked numerically at the full-codebook level via Monte Carlo. By isolating the detection-theoretic consequence of tangent-space artificial noise, these results clarify analytically how noise fraction and constellation density enter the mismatch behavior; secrecy-rate implications require additional channel and adversary modeling.


[70] 2604.14854

Towards Optimal Passive Feedback Control of LTI Systems under LQR Performance

We study state-feedback design for continuous-time LTI systems with a control input and an external input-output pair. Our objective is to determine feedback gains that render the closed-loop system (strictly) passive with respect to the external port while minimizing the standard LQR cost in the disturbance-free case. The resulting constrained optimization problem is intractable due to bilinear matrix inequalities. We analyze the set of passivating gains, showing it is unbounded, possibly nonconvex, path-connected, and contractible. We propose an indirect approach, in which the set of passivating feedback gains is inner-approximated by a compact, convex polytope. A projected gradient flow is employed to compute a gain within this polytope that minimizes the LQR cost. Numerical examples illustrate the effectiveness of the method.


[71] 2604.14861

Affine-coupled Distributed Optimization via Distributed Proximal Jacobian ADMM with Quantized Communication

This paper investigates distributed resource allocation optimization over directed graphs with limited communication bandwidth. We develop a novel distributed algorithm that integrates the centralized Proximal Jacobian Alternating Direction Method of Multipliers (PJ-ADMM) with a finite-level quantized consensus scheme, enabling nodes to cooperatively solve the optimization in a distributed fashion. Under the assumption of convex objective functions, we establish that the proposed algorithm achieves sublinear convergence to a neighborhood of the optimal solution, with the convergence accuracy explicitly bounded by the quantization level. Numerical experiments validate that the algorithm achieves competitive performance compared to existing approaches while exhibiting communication efficiency.
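
The finite-level quantizer at the heart of such schemes is simple to state; the sketch below (with hypothetical parameter names) shows why convergence accuracy is bounded by the quantization level, since the per-message error never exceeds half the quantization step:

```python
import numpy as np

def quantize(x, levels=16, lo=-1.0, hi=1.0):
    """Finite-level uniform quantizer: clips x to [lo, hi] and rounds to one
    of `levels` evenly spaced points. The step (hi - lo) / (levels - 1)
    bounds the per-message error, mirroring how the algorithm's accuracy
    scales with the quantization level."""
    step = (hi - lo) / (levels - 1)
    q = np.round((np.clip(x, lo, hi) - lo) / step)
    return lo + q * step

# worst-case error of a uniform quantizer is half the step size
x = np.random.uniform(-1, 1, 1000)
err = np.abs(quantize(x) - x).max()
```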


[72] 2604.14879

SOLIS: Physics-Informed Learning of Interpretable Neural Surrogates for Nonlinear Systems

Nonlinear system identification must balance physical interpretability with model flexibility. Classical methods yield structured, control-relevant models but rely on rigid parametric forms that often miss complex nonlinearities, whereas Neural ODEs are expressive yet largely black-box. Physics-Informed Neural Networks (PINNs) sit between these extremes, but inverse PINNs typically assume a known governing equation with fixed coefficients, leading to identifiability failures when the true dynamics are unknown or state-dependent. We propose \textbf{SOLIS}, which models unknown dynamics via a \emph{state-conditioned second-order surrogate model} and recasts identification as learning a Quasi-Linear Parameter-Varying (Quasi-LPV) representation, recovering interpretable natural frequency, damping, and gain without presupposing a global equation. SOLIS decouples trajectory reconstruction from parameter estimation and stabilizes training with a cyclic curriculum and \textbf{Local Physics Hints}: windowed ridge-regression anchors that mitigate optimization collapse. Experiments on benchmarks show accurate parameter-manifold recovery and coherent physical rollouts from sparse data, including regimes where standard inverse methods fail.


[73] 2604.14880

xFODE+: Explainable Type-2 Fuzzy Additive ODEs for Uncertainty Quantification

Recent advances in Deep Learning (DL) have boosted data-driven System Identification (SysID), but reliable use requires Uncertainty Quantification (UQ) alongside accurate predictions. Although UQ-capable models such as Fuzzy ODE (FODE) can produce Prediction Intervals (PIs), they offer limited interpretability. We introduce Explainable Type-2 Fuzzy Additive ODEs for UQ (xFODE+), an interpretable SysID model which produces PIs alongside point predictions while retaining physically meaningful incremental states. xFODE+ implements each fuzzy additive model with Interval Type-2 Fuzzy Logic Systems (IT2-FLSs) and constrains membership functions to the activation of two neighboring rules, limiting overlap and keeping inference locally transparent. The type-reduced sets produced by the IT2-FLSs are aggregated to construct the state update together with the PIs. The model is trained in a DL framework via a composite loss that jointly optimizes prediction accuracy and PI quality. Results on benchmark SysID datasets show that xFODE+ matches FODE in PI quality and achieves comparable accuracy, while providing interpretability.


[74] 2604.14897

Mix-CALADIN: A Distributed Algorithm for Consensus Mixed-Integer Optimization

This paper addresses distributed consensus optimization problems with mixed-integer variables, with a specific focus on Boolean variables. We introduce a novel distributed algorithm that extends the Consensus Augmented Lagrangian Alternating Direction Inexact Newton (CALADIN) framework by incorporating specialized techniques for handling Boolean variables without relying on local mixed-integer solvers. Under the mild assumption of Lipschitz continuity of the objective functions, we establish rigorous convergence guarantees for both convex and nonconvex mixed-integer programming problems. Numerical experiments demonstrate that the proposed algorithm achieves competitive performance compared to existing approaches while providing rigorous convergence guarantees.


[75] 2604.14908

Multi-User mmWave Beam and Rate Adaptation via Combinatorial Satisficing Bandits

We study downlink beam and rate adaptation in a multi-user mmWave MISO system where multiple base stations (BSs), each using analog beamforming from finite codebooks, serve multiple single-antenna user equipments (UEs) with a unique beam per UE and discrete data transmission rates. BSs learn about transmission success based on ACK/NACK feedback. To encode service goals, we introduce a satisficing throughput threshold $\tau_r$ and cast joint beam and rate adaptation as a combinatorial semi-bandit over beam-rate tuples. Within this framework, we propose SAT-CTS, a lightweight, threshold-aware policy that blends conservative confidence estimates with posterior sampling, steering learning toward meeting $\tau_r$ rather than merely maximizing throughput. Our main theoretical contribution provides the first finite-time regret bounds for combinatorial semi-bandits with a satisficing objective: when $\tau_r$ is realizable, we upper bound the cumulative satisficing regret to the target with a time-independent constant, and when $\tau_r$ is non-realizable, we show that SAT-CTS incurs only a finite expected transient outside committed CTS rounds, after which its regret is governed by the sum of the regret contributions of restarted CTS rounds, yielding an $O((\log T)^2)$ standard regret bound. On the practical side, we evaluate performance via cumulative satisficing regret to $\tau_r$ alongside standard regret and fairness. Experiments with time-varying sparse multipath channels show that SAT-CTS consistently reduces satisficing regret and maintains competitive standard regret, while achieving favorable average throughput and fairness across users, indicating that feedback-efficient learning can equitably allocate beams and rates to meet QoS targets without channel state knowledge.
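
A toy single-cell sketch of the satisficing idea (not the paper's SAT-CTS policy): Thompson sampling over Bernoulli "beam-rate" arms that commits as soon as an arm's pessimistic estimate clears the threshold, rather than chasing the maximizer. The commitment rule below (posterior mean minus one posterior standard deviation) is a hypothetical stand-in for the paper's conservative confidence estimates:

```python
import numpy as np

rng = np.random.default_rng(0)

def satisficing_ts(means, tau, horizon=2000):
    """Threshold-aware Thompson sampling over Bernoulli arms: maintain Beta
    posteriors; once an arm's pessimistic estimate clears tau, commit to it
    and stop exploring."""
    n = len(means)
    a, b = np.ones(n), np.ones(n)
    committed = None
    pulls = np.zeros(n, dtype=int)
    for _ in range(horizon):
        if committed is None:
            mean = a / (a + b)
            std = np.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
            ok = np.flatnonzero(mean - std >= tau)
            if ok.size:
                committed = int(ok[np.argmax(mean[ok])])
        arm = committed if committed is not None else int(np.argmax(rng.beta(a, b)))
        reward = rng.random() < means[arm]   # ACK/NACK-style binary feedback
        a[arm] += reward
        b[arm] += 1 - reward
        pulls[arm] += 1
    return committed, pulls

committed, pulls = satisficing_ts([0.3, 0.7, 0.9], tau=0.6)
```

Once committed, the policy no longer pays exploration cost, which is the mechanism behind bounded satisficing regret in the realizable case.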


[76] 2604.14987

AI-Enabled Covert Channel Detection in RF Receiver Architectures

Covert channels (CCs) in wireless chips pose a serious security threat, as they enable the exfiltration of sensitive information from the chip to an external attacker. In this work, we propose an AI-based defense mechanism deployed at the RF receiver, where the model directly monitors raw I/Q samples to detect, in real time, the presence of a CC embedded within an otherwise nominal signal. We first compact a state-of-the-art convolutional neural network (CNN), achieving an 80% reduction in parameters, which is an essential requirement for efficient edge deployment. When evaluated on the open-source hardware Trojan (HT)-based CC dataset, the compacted CNN attains an average accuracy of 90.28% for CC detection and 86.50% for identifying the underlying HT, with results averaged across SNR values above 1 dB. For practical communication scenarios where SNR > 20 dB, the model achieves over 97% accuracy for both tasks. These results correspond to a minimal performance degradation of less than 2% compared to the baseline model. The compacted CNN is further benchmarked against alternative classifiers, demonstrating an excellent accuracy-model size trade-off. Finally, we design a lightweight CNN hardware accelerator and demonstrate it on an FPGA, achieving very low resource utilization and an efficiency of 107 GOPs/W. As the first AI hardware accelerator proposed specifically for CC detection, it is compared against state-of-the-art AI accelerators for RF signal classification tasks such as modulation recognition, showing superior performance.


[77] 2604.15074

Trajectory Planning for a Multi-UAV Rigid-Payload Cascaded Transportation System Based on Enhanced Tube-RRT*

This paper presents a two-stage trajectory planning framework for a multi-UAV rigid-payload cascaded transportation system, aiming to address planning challenges in densely cluttered environments. In Stage I, an Enhanced Tube-RRT* algorithm is developed by integrating active hybrid sampling and an adaptive expansion strategy, enabling rapid generation of a safe and feasible virtual tube in environments with dense obstacles. Moreover, a trajectory smoothness cost is explicitly incorporated into the edge cost to reduce excessive turns and thereby mitigate cable-induced oscillations. Simulation results demonstrate that the proposed Enhanced Tube-RRT* achieves a higher success rate and effective sampling rate than mixed-sampling Tube-RRT* (STube-RRT*) and adaptive-extension Tube-RRT* (AETube-RRT*), while producing a shorter optimal path with a smaller cumulative turning angle. In Stage II, a convex quadratic program is formulated by considering payload translational and rotational dynamics, cable tension constraints, and collision-safety constraints, yielding a smooth, collision-free desired payload trajectory. Finally, a centralized geometric control scheme is applied to the cascaded system to validate the effectiveness and feasibility of the proposed planning framework, offering a practical solution for payload attitude maneuvering in densely cluttered environments.


[78] 2604.15255

Democratization of Real-time Multi-Spectral Photoacoustic Imaging: Open-Sourced System Architecture for OPOTEK Phocus & Verasonics Vantage Combination

Real-time multi-spectral photoacoustic imaging (RT-mPAI) often suffers from synchronization instabilities when interfacing fast-tuning lasers with data acquisition platforms executing on non-real-time operating systems. To overcome this, we establish an open-source hardware-software architecture tailored for the widely adopted combination of the OPOTEK Phocus lasers and Verasonics Vantage systems. By employing an independent micro-controller for deterministic laser trigger counting alongside a decoupled client-server data streaming framework, the proposed system circumvents OS-induced timing deviations and local storage bottlenecks. By open-sourcing this pipeline and cultivating a collaborative environment to share both code and ideas, we aim to lower the technical and cost barriers for RT-mPAI, thereby democratizing access to stable RT-mPAI research and, more ambitiously, fostering a vibrant open-source community.


[79] 2604.15278

A Manual Bar-by-Bar Tempo Measurement Protocol for Polyphonic Chamber Music Recordings: Design, Validation, and Application to Beethoven's Piano and Cello Sonatas

Empirical performance analysis depends on the accurate extraction of tempo data from recordings, yet standard computational tools, designed for monophonic audio or modern studio conditions, fail systematically when applied to historical polyphonic chamber music. This paper documents the failure of automated beat-detection software on duo recordings of Beethoven's five piano and cello sonatas (Op.~5 Nos.~1 and~2; Op.~69; Op.~102 Nos.~1 and~2), and presents a formalised manual alternative: a cumulative lap-timer protocol that yields bar-level beats-per-minute data with millisecond resolution. The protocol, developed in cross-disciplinary collaboration with an engineer specialising in VLSI design, rests on a cumulative timestamp architecture that prevents error accumulation, permits internal self-validation, and captures expressive timing phenomena (rubato, fermatas, accelerandi, ritardandi) that automated tools systematically suppress or misread. The mathematical derivation of the BPM formula, the spreadsheet data structure, and the error characterisation are presented in full. Applied to over one hundred movement-level recordings spanning 1930--2012, the protocol generated a dataset subsequently visualised through tempographs, histograms with spline-smoothed probability density functions, ridgeline plots, and combination charts. The paper argues that manual annotation is not a methodological retreat but a principled response to the intrinsic limitations of computational tools when faced with the specific challenges of polyphonic historical recordings. The complete dataset and analysis code are publicly available.
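
The cumulative-timestamp design can be stated in a few lines: because every lap time is measured from a single start point, one mis-timed tap perturbs only the two adjacent bar durations and errors do not accumulate across the movement. A sketch of the stated bar-level BPM formula:

```python
def bar_bpm(cum_times, beats_per_bar):
    """Bar-level BPM from cumulative lap-timer timestamps (in seconds):
    BPM_i = 60 * beats_per_bar / (t[i+1] - t[i]).
    Each timestamp is cumulative from one start point, so an error in any
    single tap shifts two adjacent bar durations without accumulating."""
    return [60.0 * beats_per_bar / (t1 - t0)
            for t0, t1 in zip(cum_times[:-1], cum_times[1:])]

# e.g. a 4/4 bar lasting exactly 2 s corresponds to 120 BPM
tempi = bar_bpm([0.0, 2.0, 4.0, 6.5], beats_per_bar=4)
```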


[80] 2410.05882

Frame forecasting in cine MRI using the PCA respiratory motion model: comparing recurrent neural networks trained online and transformers

Respiratory motion complicates accurate irradiation of thoraco-abdominal tumors during radiotherapy, as treatment-system latency entails target-location uncertainties. This work addresses frame forecasting in chest and liver cine MRI to compensate for such delays. We investigate RNNs trained with online learning algorithms, enabling adaptation to changing respiratory patterns via on-the-fly parameter updates, and transformers, increasingly common in time-series forecasting for their ability to capture long-term dependencies. Experiments used 12 sagittal thoracic and upper-abdominal cine-MRI sequences from ETH Zürich and OvGU; the OvGU data exhibited higher motion variability, noise, and lower contrast. PCA decomposes the Lucas-Kanade optical-flow field into static deformation modes and low-dimensional, time-dependent weights. We compare various methods for forecasting these weights: linear filters, population and sequence-specific transformer encoders, and RNNs trained with real-time recurrent learning (RTRL), unbiased online recurrent optimization, decoupled neural interfaces, and sparse one-step approximation (SnAp-1). Predicted displacements were used to warp the reference frame and generate future images. Prediction accuracy decreased with the horizon h. Linear regression performed best at short horizons (1.3mm geometrical error at h=0.32s, ETH Zürich dataset), while RTRL and SnAp-1 outperformed the other algorithms at medium-to-long horizons, with geometrical errors below 1.4mm and 2.8mm on the sequences from ETH Zürich and OvGU, respectively. The sequence-specific transformer was competitive for low-to-medium horizons, but transformers remained overall limited by data scarcity and domain shift between datasets. Predicted frames visually resembled the ground truth, with notable errors occurring near the diaphragm at end-inspiration and regions affected by out-of-plane motion.
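
A compact sketch of the decomposition-and-forecast pipeline (illustrative; the paper fits the modes to Lucas-Kanade optical-flow fields and compares many forecasters, of which least-squares linear regression is the simplest baseline):

```python
import numpy as np

def pca_forecast(X, n_modes=2, order=4):
    """PCA + linear one-step forecast. Rows of X are per-frame displacement
    snapshots. The SVD yields static deformation modes and low-dimensional
    time-dependent weights; each weight series is forecast one step ahead
    by a least-squares autoregressive filter of the given order."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    modes = Vt[:n_modes]                      # static modes, shape (n_modes, d)
    w = (X - mean) @ modes.T                  # time-dependent weights, (T, n_modes)
    w_next = np.empty(n_modes)
    for k in range(n_modes):
        past = np.stack([w[i:i + order, k] for i in range(len(w) - order)])
        coef, *_ = np.linalg.lstsq(past, w[order:, k], rcond=None)
        w_next[k] = w[-order:, k] @ coef      # one-step-ahead weight
    return mean + w_next @ modes              # predicted displacement snapshot
```

The predicted displacement field would then warp the reference frame to synthesize the future image, as in the paper.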


[81] 2504.02380

Beyond Asymptotics: Targeted exploration with finite-sample guarantees

In this paper, we introduce a targeted exploration strategy for the non-asymptotic, finite-time case. The proposed strategy is applicable to uncertain linear time-invariant systems subject to sub-Gaussian disturbances. As the main result, the proposed approach provides a priori guarantees, ensuring that the optimized exploration inputs achieve a desired accuracy of the model parameters. The technical derivation of the strategy (i) leverages existing non-asymptotic identification bounds with self-normalized martingales, (ii) utilizes spectral lines to predict the effect of sinusoidal excitation, and (iii) effectively accounts for spectral transient error and parametric uncertainty. A numerical example illustrates how the finite exploration time influences the required exploration energy.


[82] 2506.09685

Bridging Continuous-time LQR and Reinforcement Learning via Gradient Flow of the Bellman Error

In this paper, we present a novel method for computing the optimal feedback gain of the infinite-horizon Linear Quadratic Regulator (LQR) problem via an ordinary differential equation. We introduce a novel continuous-time Bellman error, derived from the Hamilton-Jacobi-Bellman (HJB) equation, which quantifies the suboptimality of stabilizing policies and is parametrized in terms of the feedback gain. We analyze its properties, including its effective domain, smoothness, and coerciveness, and show the existence of a unique stationary point within the stability region. Furthermore, we derive a closed-form gradient expression of the Bellman error that induces a gradient flow. This flow converges to the optimal feedback gain and generates a unique trajectory that comprises exclusively stabilizing feedback policies. Additionally, this work advances interesting connections between LQR theory and Reinforcement Learning (RL) by redefining the suboptimality of the Algebraic Riccati Equation (ARE) as a Bellman error, adapting a state-independent formulation, and leveraging Lyapunov equations to overcome the infinite-horizon challenge. We validate our method in simulation and compare it to the state of the art.
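
The flavor of such a gradient flow can be sketched with the classical closed-form LQR policy gradient, computable from two Lyapunov solves per step. This descends the LQR cost directly; the paper's contribution is a flow on its Bellman-error functional, which is not reproduced here. Assuming SciPy:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

def lqr_gradient_flow(A, B, Q, R, K0, step=1e-2, iters=5000):
    """Euler-discretized gradient flow on the LQR cost J(K) for u = -K x,
    using the classical closed-form policy gradient
        grad J(K) = 2 (R K - B^T P) Sigma,
    where P and Sigma solve the two closed-loop Lyapunov equations below.
    For a sufficiently small step, iterates started from a stabilizing K0
    remain stabilizing (cost descent keeps them away from the boundary)."""
    K = K0.copy()
    n = A.shape[0]
    for _ in range(iters):
        Acl = A - B @ K
        P = solve_continuous_lyapunov(Acl.T, -(Q + K.T @ R @ K))    # value
        Sigma = solve_continuous_lyapunov(Acl, -np.eye(n))          # covariance
        K = K - step * 2.0 * (R @ K - B.T @ P) @ Sigma
    return K

# double integrator: the flow recovers the Riccati-optimal gain
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)
K = lqr_gradient_flow(A, B, Q, R, K0=np.array([[1.0, 1.0]]))
Kstar = np.linalg.solve(R, B.T @ solve_continuous_are(A, B, Q, R))
```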


[83] 2506.13408

HELENA: High-Efficiency Learning-based channel Estimation using dual Neural Attention

Accurate channel estimation is critical for high-performance Orthogonal Frequency-Division Multiplexing systems such as 5G New Radio, particularly under low signal-to-noise ratio and stringent latency constraints. This letter presents HELENA, a compact deep learning model that combines a lightweight convolutional backbone with two efficient attention mechanisms: patch-wise multi-head self-attention for capturing global dependencies and a squeeze-and-excitation block for local feature refinement. Compared to CEViT, a state-of-the-art vision transformer-based estimator, HELENA reduces inference time by 45.0\% (0.175\,ms vs.\ 0.318\,ms), achieves comparable accuracy ($-16.78$\,dB vs.\ $-17.30$\,dB), and requires $8\times$ fewer parameters (0.11M vs.\ 0.88M), demonstrating its suitability for low-latency, real-time deployment.
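
The squeeze-and-excitation mechanism is small enough to sketch in NumPy (illustrative only; HELENA's layer sizes, weights, and reduction ratio are not specified here, and `r` below is a hypothetical reduction factor):

```python
import numpy as np

rng = np.random.default_rng(0)

def squeeze_excite(x, w1, w2):
    """Squeeze-and-excitation on a (C, H, W) feature map: global average pool
    per channel ('squeeze'), a two-layer bottleneck with ReLU then sigmoid
    ('excite'), and channel-wise rescaling of the input."""
    z = x.mean(axis=(1, 2))                   # squeeze: (C,)
    s = np.maximum(w1 @ z, 0.0)               # bottleneck + ReLU: (C // r,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ s)))    # sigmoid gate: (C,)
    return x * gate[:, None, None]            # channel-wise reweighting

C, r = 8, 2
x = rng.standard_normal((C, 6, 6))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
y = squeeze_excite(x, w1, w2)
```

Because the gate lies in (0, 1), the block can only attenuate channels, acting as a learned local-feature reweighting.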


[84] 2506.14844

Improving Prostate Gland Segmentation Using Transformer based Architectures

Inter-reader variability and cross-site domain shift challenge the automatic segmentation of prostate anatomy from T2-weighted MRI images. This study investigates whether transformer models can retain precision amid such heterogeneity. We compare the performance of UNETR and SwinUNETR in prostate gland segmentation against our previous 3D UNet model [1], based on 546 T2-weighted MRI volumes annotated by two independent experts. Three training strategies were analyzed: a single-cohort dataset, a 5-fold cross-validated mixed cohort, and a gland-size-based dataset. Hyperparameters were tuned by Optuna. The test set, from an independent population of readers, served as the evaluation endpoint (Dice Similarity Coefficient). In single-reader training, SwinUNETR achieved an average Dice score of 0.816 for Reader #1 and 0.860 for Reader #2, while UNETR scored 0.800 and 0.833 for Readers #1 and #2, respectively, compared to the baseline UNet's 0.825 for Reader #1 and 0.851 for Reader #2. In cross-validated mixed training, SwinUNETR had an average Dice score of 0.8583 for Reader #1 and 0.867 for Reader #2. For the gland-size-based dataset, SwinUNETR achieved an average Dice score of 0.902 for the Reader #1 subset and 0.894 for Reader #2, using the five-fold mixed training strategy (Reader #1, n=53; Reader #2, n=87) on the larger gland-size subsets, where UNETR performed poorly. Our findings demonstrate that global and shifted-window self-attention effectively reduces label noise and sensitivity to class imbalance, improving the Dice score over CNNs by up to five points while maintaining computational efficiency. This contributes to the high robustness of SwinUNETR for clinical deployment.


[85] 2506.23334

Federated Breast Cancer Detection Enhanced by Synthetic Ultrasound Image Augmentation

Federated learning enables collaborative training of deep learning models across institutions without sharing sensitive patient data. However, its performance is often limited by small datasets and non-independent and identically distributed (non-IID) data, which can impair model generalization. In this work, we propose a generative model-based data augmentation framework for breast ultrasound classification. It leverages synthetic images generated by deep convolutional generative adversarial networks and a class-conditioned denoising diffusion probabilistic model. Experiments on three publicly available datasets (BUSI, BUS-BRA, and UDIAT) demonstrated that incorporating a suitable number of synthetic images improved the average AUC from 0.9206 to 0.9362 for FedAvg and from 0.9429 to 0.9574 for FedProx. We also observed that excessive use of synthetic data reduced performance, highlighting the importance of balancing real and synthetic samples. Our results underscore the potential of generative model-based augmentation to enhance federated breast ultrasound image classification.


[86] 2507.21300

Simultaneous improvement of control and estimation for battery management systems

Standard battery management systems treat the control and state estimation problems as decoupled objectives, relying on certainty equivalence controllers that are blind to the varying observability induced by nonlinear open-circuit voltage models. In this paper, we show that for a broad class of objectives, including the peak shaving and valley filling scenarios common in grid-connected energy storage, the expected cost of a stochastic battery system can be exactly parametrized by the conditional mean and covariance of the state of charge. This reformulation reveals a direct coupling between the control input and estimation quality, a coupling that certainty equivalence controllers ignore, and motivates a dual-control approach in which the controller actively reduces estimation uncertainty by driving the state to high observability regions without compromising the control objective. We derive a deterministic surrogate to this stochastic cost and pose the dual-control problem as a computationally tractable model predictive control problem. We validate our approach on a nine-battery system tracking a time-varying power/demand reference trajectory. We report simultaneous improvements in control cost (up to 20\% reduction) and state estimation error (up to 30\% reduction). The estimation improvement is reported across different state estimators: extended Kalman filter, unscented Kalman filter, and a moving horizon estimator, confirming that the estimation improvement of our approach is not restricted to a specific state observer.


[87] 2508.16601

On the Unification of Deterministic and Stochastic Electromagnetic Information Theory via Symplectic Geometry

This paper unifies deterministic and stochastic Electromagnetic Information Theory (EIT) through symplectic geometry. For spatially incoherent sources, both formulations yield identical eigenvalues and spatial Degrees of Freedom (NDF). This equivalence is shown to be a structural necessity: the radiometric étendue, the Hamiltonian phase-space volume, and the NDF are the same symplectic invariant of the source--observer configuration. Liouville's theorem guarantees conservation of the NDF under lossless propagation; Gromov's Non-Squeezing Theorem establishes an irreducible minimum phase-space cell, setting a fundamental geometric bound on resolving power. The physical manifestation of this symplectic structure is the formation of \textit{Spatial Information Flows} (SIFs) -- level-set curves of high mutual information which, for convex sources with rotational symmetry, coincide with the optimal sampling curves of Bucci et al. Spatial information in electromagnetic fields is therefore governed by the geometry of the source--observer configuration, providing the foundation for a geometric theory of electromagnetic information.


[88] 2509.02571

Gaussian Process Regression of Steering Vectors With Physics-Aware Deep Composite Kernels for Augmented Listening

This paper investigates continuous representations of steering vectors over frequency and microphone/source positions for augmented listening (e.g., spatial filtering and binaural rendering), enabling user-parameterized control of the reproduced sound field. Steering vectors have typically been used for representing the spatial response of a microphone array as a function of the look-up direction. The basic algebraic representation of these quantities assuming an idealized environment cannot deal with the scattering effect of the sound field. One may thus collect a discrete set of real steering vectors measured in dedicated facilities and super-resolve (i.e., upsample) them. Recently, physics-aware deep learning methods have been effectively used for this purpose. Such deterministic super-resolution, however, suffers from the overfitting problem due to the non-uniform uncertainty over the measurement space. To solve this problem, we integrate an expressive representation based on the neural field (NF) into the principled probabilistic framework based on the Gaussian process (GP). Specifically, we propose a physics-aware composite kernel that models the directional incoming waves and the subsequent scattering effect. Our comprehensive comparative experiment showed the effectiveness of the proposed method under data insufficiency conditions. In downstream tasks such as speech enhancement and binaural rendering using the simulated data of the SPEAR challenge, the oracle performances were attained with roughly ten times fewer measurements.


[89] 2509.23391

Optimizing the Network Topology of a Linear Reservoir Computer

Machine learning has become a fundamental approach for modeling, prediction, and control, enabling systems to learn from data and perform complex tasks. Reservoir computing is a machine learning tool that leverages high-dimensional dynamical systems to efficiently process temporal data for prediction and observation tasks. Traditionally, the connectivity of the network that underlies a reservoir computer (RC) is generated randomly, lacking a principled design. Here, we focus on optimizing the connectivity of a linear RC to improve its performance and interpretability, which we achieve by decoupling the RC dynamics into a number of independent modes. We then proceed to optimize each one of these modes to perform a given task, which corresponds to selecting an optimal RC connectivity in terms of a given set of eigenvalues of the RC adjacency matrix. Simulations on networks of varying sizes show that the optimized RC significantly outperforms randomly constructed reservoirs in both training and testing phases and often surpasses nonlinear reservoirs of comparable size. This approach provides both practical performance advantages and theoretical guidelines for designing efficient, task-specific, and analytically transparent RC architectures.
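
The modal decoupling that enables this analysis is elementary for a linear reservoir: diagonalizing the adjacency matrix turns the vector recursion into independent scalar modes, one per eigenvalue. A sketch (illustrative; the paper's optimization over eigenvalue placement is not reproduced):

```python
import numpy as np

rng = np.random.default_rng(0)

def modal_rc(A, w_in, u):
    """Simulate the linear reservoir x[t+1] = A x[t] + w_in u[t] in modal
    coordinates: with A = V diag(lam) V^{-1}, each mode evolves independently
    as z_k[t+1] = lam_k z_k[t] + b_k u[t], so behavior can be analyzed and
    optimized one eigenvalue at a time."""
    lam, V = np.linalg.eig(A)
    b = np.linalg.solve(V, w_in)          # input weights in modal coordinates
    z = np.zeros(len(lam), dtype=complex)
    traj = []
    for ut in u:
        z = lam * z + b * ut
        traj.append(V @ z)                # map back to physical coordinates
    # imaginary parts cancel up to roundoff since A and u are real
    return np.array(traj).real

# random stable reservoir, spectral radius scaled to 0.9
A = rng.standard_normal((5, 5))
A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))
w_in = rng.standard_normal(5)
u = rng.standard_normal(40)
states = modal_rc(A, w_in, u)
```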


[90] 2509.25515

Spatiotemporal Forecasting of Incidents and Congestion with Implications for Sustainable Traffic Control

Urban traffic anomalies, such as collisions and disruptions, threaten the safety, efficiency, and sustainability of transportation systems. In this paper, we present a simulation-based framework for modeling, detecting, and predicting such anomalies in urban networks. Using the Simulation of Urban MObility (SUMO) platform, we generate reproducible rear-end and intersection crash scenarios with matched baselines, enabling controlled experimentation and comparative evaluation. We record vehicle-level travel time, speed, and emissions for both edge- and network-level analysis. Building on this dataset, we develop a hybrid forecasting architecture that combines bidirectional long short-term memory networks with a diffusion convolutional recurrent neural network to capture temporal dynamics and spatial dependencies. Our simulation studies on the Broadway corridor in New York City demonstrate the framework's ability to reproduce consistent incident conditions, quantify their effects, and provide accurate multi-horizon traffic forecasts. Our results highlight the value of combining controlled anomaly generation with deep predictive models to support reproducible evaluation and sustainable traffic management.


[91] 2511.04777

OPF-Based Optimal Power System Network Restoration Considering Frequency Dynamics

Due to recent blackout and system-split incidents in power grids worldwide, as well as increased system complexity in view of the energy transition, there has been increasing interest in re-evaluating existing Power System Restoration (PSR) plans. In restoration scenarios, due to low island inertia, it is necessary to ensure not only the static, but also the dynamic stability of the system. In this paper, we pose and solve a formulation of the optimal PSR problem including frequency dynamics. We validate the switching constraints for global optimality within a static version of the formulation using a brute-force tree search method. We apply the dynamic problem formulation to the IEEE 9-Bus model, and show that the optimal switching sequence using the static formulation would violate dynamic constraints, illustrating the importance of dynamic considerations in PSR planning.


[92] 2512.08887

A Fast Broadband Beamspace Transformation

We present a new computationally efficient method for multi-beamforming in the broadband setting. Our "fast beamspace transformation" forms $B$ beams from $M$ sensor outputs using a number of operations per sample that scales linearly (to within logarithmic factors) with $M$ when $B\sim M$. While the narrowband version of this transformation can be performed efficiently with a spatial fast Fourier transform, the broadband setting requires coherent processing of multiple array snapshots simultaneously. Our algorithm works by taking $N$ samples off of each of $M$ sensors and encoding the sensor outputs into a set of coefficients using a special non-uniform spaced Fourier transform. From these coefficients, each beam is formed by solving a small system of equations that has Toeplitz structure. The total runtime complexity is $\mathcal{O}(M\log N+B\log N)$ operations per sample, exhibiting essentially the same scaling as in the narrowband case and vastly outperforming broadband beamformers based on delay and sum whose computations scale as $\mathcal{O}(MB)$. Alongside a careful mathematical formulation and analysis of our fast broadband beamspace transform, we provide a host of numerical experiments demonstrating the algorithm's favorable computational scaling and high accuracy. Finally, we demonstrate how tasks such as interpolating to ``off-grid'' angles and nulling an interferer are more computationally efficient when performed directly in beamspace.
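
In the narrowband case, the beamspace transform of a uniform linear array with DFT steering weights is literally a spatial FFT, which is the O(M log M) baseline the paper extends to the broadband setting. A sketch of that equivalence:

```python
import numpy as np

M = 16
n = np.arange(M)
snapshot = np.exp(2j * np.pi * 0.2 * n)    # plane wave at spatial frequency 0.2

# explicit delay-and-sum with DFT steering weights: O(M * B) multiplies
W = np.exp(-2j * np.pi * np.outer(n, n) / M)
beams_das = W @ snapshot

# the same B = M beams via a spatial FFT: O(M log M)
beams_fft = np.fft.fft(snapshot)
```

The broadband transform cannot be a single spatial FFT, since it must coherently combine many temporal snapshots per beam, which is what the paper's non-uniform Fourier encoding and Toeplitz solves accomplish.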


[93] 2512.15207

Remote Magnetic Levitation Using Reduced Attitude Control and Parametric Field Models

Electromagnetic navigation systems (eMNS) are increasingly used in minimally invasive procedures such as endovascular interventions and targeted drug delivery due to their ability to generate fast and precise magnetic fields. In this paper, we utilize the OctoMag and a custom 13-coil eMNS to achieve remote levitation and control of multiple rigid bodies across large air gaps, showcasing the dynamic capabilities of such systems. A compact parametric analytical model maps coil currents to the forces and torques acting on the levitating object, eliminating the need for computationally expensive simulations or lookup tables and establishing a levitator- and platform-agnostic control framework. Translational motion is stabilized using linear quadratic regulators. A nonlinear time-invariant controller regulates the reduced attitude, accounting for the inherent uncontrollability of rotations about the dipole axis and stabilizing the full five-degree-of-freedom controllable pose subspace. We analyze key design limitations and evaluate the approach through trajectory tracking experiments across different objects and actuation platforms. Notably, our proposed controller demonstrates superiority over an equivalent baseline PID formulation, reliably tracking large spatial angles up to 65$^\circ$. This work demonstrates the dynamic capabilities and potential of feedback control in electromagnetic navigation, which is likely to open up new medical applications.


[94] 2601.21039

Mean-Field Learning for Storage Aggregation

Distributed energy storage devices can be aggregated to provide operational flexibility for power systems. This requires representing a massive device population as a single, tractable surrogate that is computationally efficient and accurate. However, surrogate identification is challenging due to heterogeneity, nonconvexity, and high dimensionality of storage devices. To address these challenges, this paper develops a mean-field learning framework for storage aggregation. We interpret aggregation as the average behavior of a large storage population and show that, as the population grows, aggregate performance converges to a unique, convex mean-field limit, enabling tractable population-level modeling. This convexity further yields a price-responsive characterization of aggregate storage behavior and allows us to bound the mean-field approximation error. We construct a convex surrogate model with physically interpretable parameters that approximates the aggregate behavior of large storage populations and can be embedded directly into power system operations. Surrogate parameter identification is formulated as an optimization problem using historical price-response data, and we adopt a gradient-based algorithm for efficient learning. Case studies validate the theoretical findings and demonstrate the effectiveness of the proposed framework in approximation accuracy and data efficiency.
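The gradient-based identification step can be caricatured at toy scale: below, a hypothetical affine price-response map u = a*p + b (standing in for the paper's convex surrogate, which it is not) is fitted to synthetic price-response data by plain gradient descent on the squared loss. All numbers are invented.

```python
import numpy as np

# Synthetic "historical" data: aggregate storage power vs. price.
p = np.linspace(10.0, 50.0, 20)          # prices
u = -0.8 * p + 30.0                      # observed aggregate response (toy)

# Standardize the regressor so a fixed step size behaves well.
pn = (p - p.mean()) / p.std()
a, b = 0.0, 0.0                          # surrogate parameters to learn
lr = 0.1
for _ in range(500):
    pred = a * pn + b
    err = pred - u
    a -= lr * 2 * np.mean(err * pn)      # d(loss)/da
    b -= lr * 2 * np.mean(err)           # d(loss)/db

pred = a * pn + b                        # fitted price response
```

In the paper the surrogate is a constrained convex storage model rather than a line, but the learning loop (predict, compare to historical response, step down the gradient) has this shape.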


[95] 2602.15174

Large elements and advanced beamformers for increased field of view in 2-D ultrasound matrix arrays

Three-dimensional (3D) ultrasound promises various medical applications for abdominal, obstetrics, and breast imaging. However, ultrasound matrix arrays have extremely high element counts limiting their field of view (FOV). Current reduced element count architectures, such as row-column arrays, diverging lenses, or sparse arrays, suffer from limited resolution and high side- and grating-lobe levels. This work seeks to demonstrate an increased field of view using a reduced element count array design. The approach is to increase the element size and use advanced beamformers to maintain image quality. The delay and sum (DAS), Null Subtraction Imaging (NSI), directional coherence factor (DCF), and Minimum Variance (MV) beamformers were compared. k-Wave simulations of the 3D point-spread functions (PSF) of NSI, DCF, and MV display reduced side lobes and narrowed main lobes compared to DAS. Experiments were conducted using a multiplexed 1024-element matrix array on a Verasonics 256 system. Elements were electronically coupled to imitate a larger pitch and element size. Then, a virtual large aperture was created by using a positioning system to collect data in sections with the matrix array. Resolution and contrast were also assessed on a rabbit liver in vivo. Resolution was maintained using coupling numbers up to four, doubling the FOV while reducing the element count. The NSI and DCF beamformers demonstrated the best resolution performance in simulations, in a phantom with the virtual aperture, and in vivo on a rabbit liver. Our results demonstrate how larger matrix arrays could be constructed with larger elements, with resolution maintained by advanced beamformers.
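The DAS baseline the advanced beamformers are compared against is simple enough to sketch in a few lines. This toy (one receive-only point scatterer, a 16-element linear array, invented geometry and pulse) only illustrates why coherent delay alignment concentrates energy at the focus:

```python
import numpy as np

c, fs = 1540.0, 20e6                         # sound speed (m/s), sample rate (Hz)
t = np.arange(0, 40e-6, 1 / fs)
xs = (np.arange(16) - 7.5) * 0.3e-3          # 16 elements, 0.3 mm pitch

def delays(px, pz):                          # receive delays to point (px, pz)
    return np.hypot(xs - px, pz) / c

# Echo from a point scatterer at (0, 20 mm): each channel records a pulse
# whose arrival time grows with the element-to-scatterer distance.
rf = np.array([np.exp(-((t - 5e-6 - d) * 4e6) ** 2) for d in delays(0.0, 20e-3)])

def das(px, pz):
    """Delay-and-sum: undo the focusing delays, then sum channels coherently."""
    aligned = [np.interp(t, t - d, ch) for d, ch in zip(delays(px, pz), rf)]
    return np.max(np.abs(np.sum(aligned, axis=0)))

on_focus = das(0.0, 20e-3)                   # beam steered onto the scatterer
off_focus = das(10e-3, 20e-3)                # beam steered 10 mm to the side
```

NSI, DCF, and MV modify the summation (null subtraction, coherence weighting, adaptive apodization) to sharpen exactly this on/off-focus contrast when element size grows.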


[96] 2602.20218

Robust Glioblastoma Segmentation and Volumetry Without T2-FLAIR: External Validation of Targeted Dropout Training

Objectives: To externally validate targeted T2 fluid-attenuated inversion recovery (T2-FLAIR) dropout for robust automated glioblastoma segmentation and whole-tumor volumetry without T2-FLAIR, while preserving performance when the full MRI protocol is available. Methods: In this retrospective multi-dataset study, 3D nnU-Net models were developed on BraTS 2021 (n=848) and externally validated on an independent University of Pennsylvania glioblastoma cohort (n=403). Models were trained with or without targeted T2-FLAIR dropout, zeroing the T2-FLAIR channel during training. Testing used prespecified T2-FLAIR-present and T2-FLAIR-absent scenarios; the absent scenario was simulated by zeroing the T2-FLAIR channel at inference. The primary endpoint was per-patient overall region-wise Dice similarity coefficient (DSC). Secondary endpoints were region-specific DSC, 95th percentile Hausdorff distance, and Bland-Altman whole-tumor volume bias. Results: In external validation, performance was preserved with the full MRI protocol: overall median DSC was 94.8% (interquartile range [IQR] 90.0%-97.1%) with dropout and 95.0% (IQR 90.3%-97.1%) without dropout. In the T2-FLAIR-absent scenario, targeted dropout improved overall median DSC from 81.0% (IQR 75.1%-86.4%) to 93.4% (IQR 89.1%-96.2%). Whole-tumor DSC improved from 60.4% to 92.6%, whole-tumor 95th percentile Hausdorff distance from 17.24 mm to 2.45 mm, and whole-tumor volume bias from -45.6 mL to 0.83 mL. Conclusions: In an independent external test cohort, targeted T2-FLAIR dropout preserved glioblastoma segmentation performance with the full MRI protocol and substantially reduced whole-tumor segmentation error and volumetric bias when T2-FLAIR was absent. These findings support targeted sequence dropout as a practical robustness strategy for automated glioblastoma analysis in retrospective and heterogeneous clinical workflows.
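The "targeted dropout" mechanism itself is a one-liner at heart: zero the T2-FLAIR channel with some probability during training so the network learns to cope without it. A minimal sketch (channel index and probability are assumptions of this note, not taken from the paper):

```python
import numpy as np

def targeted_dropout(x, channel=3, p=0.5, rng=None):
    """Zero one input modality with probability p during training.

    x: one multi-modal volume, shape (channels, depth, height, width).
    Using channel 3 to stand in for T2-FLAIR is an assumption here.
    """
    rng = rng or np.random.default_rng()
    out = x.copy()
    if rng.random() < p:
        out[channel] = 0.0
    return out

vol = np.ones((4, 2, 2, 2))
dropped = targeted_dropout(vol, p=1.0)   # simulates the T2-FLAIR-absent scenario
kept = targeted_dropout(vol, p=0.0)      # full protocol available
```

At inference, the absent-sequence scenario is exactly the `p=1.0` case: the channel is zeroed and the network must rely on the remaining modalities.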


[97] 2603.15045

LLMs and Speech: Integration vs. Combination

In this work, we study how to best utilize pre-trained LLMs for automatic speech recognition. Specifically, we compare the tight integration of an acoustic model (AM) with the LLM ("speech LLM") to the traditional way of combining AM and LLM via shallow fusion. For tight integration, we provide ablations on the effect of different label units, fine-tuning strategies, LLM sizes and pre-training data, attention interfaces, encoder downsampling, text prompts, and length normalization. Additionally, we investigate joint recognition with a CTC model to mitigate hallucinations of speech LLMs and present effective optimizations for this joint recognition. For shallow fusion, we investigate the effect of fine-tuning the LLM on the transcriptions using different label units, and we compare rescoring AM hypotheses to single-pass recognition with label-wise or delayed fusion of AM and LLM scores. We train on Librispeech and Loquacious and evaluate our models on the HuggingFace ASR leaderboard.


[98] 2604.13004

Inexpensive Optical Projection Tomography on a Mobile Phone Platform

This work presents an inexpensive optical projection tomography (OPT) system built on a mobile phone platform for three-dimensional optical microscopy. The system uses an iPhone camera together with a low-cost commercial microscope lens attachment, a stepper motor for sample rotation, LED illumination, and custom 3D-printed components, with a total component cost of approximately 50 US dollars excluding the phone. To support system evaluation, we also developed a low-cost method for fabricating a zebrafish phantom by embedding fixed larvae in UV-cured resin. Camera calibration was performed using a checkerboard target, and effective magnification was estimated with images of a 1951 Air Force resolution target. Projection images acquired during sample rotation were converted to attenuation images and corrected for field nonuniformity. Each slice was reconstructed with filtered backprojection and the resulting slices were stacked into a 3D volume. The completed system achieved a resolution of 3.91 $\mu m$ and produced volumetric reconstructions in which anatomical features of the zebrafish phantom, including the spine, were clearly visible. These results demonstrate that mobile-phone-based OPT can provide accessible, portable, and low-cost 3D microscopy, with potential utility for education, field work, and resource-limited settings.
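The filtered-backprojection reconstruction step generalizes well beyond this system; here is a deliberately minimal parallel-beam sketch in NumPy (an ideal ramp filter and nearest-axis interpolation, not the authors' pipeline), verified on a point object at the rotation center:

```python
import numpy as np

def fbp(sinogram, thetas):
    """Minimal parallel-beam filtered backprojection (illustrative sketch)."""
    n = sinogram.shape[1]
    ramp = np.abs(np.fft.fftfreq(n))                 # ideal ramp filter
    filtered = np.real(np.fft.ifft(np.fft.fft(sinogram, axis=1) * ramp, axis=1))
    xs = np.arange(n) - n // 2
    X, Y = np.meshgrid(xs, xs)
    img = np.zeros((n, n))
    for proj, th in zip(filtered, thetas):
        s = X * np.cos(th) + Y * np.sin(th)          # detector coordinate per pixel
        img += np.interp(s, xs, proj)                # smear the projection back
    return img

# Point object at the rotation center: every projection is a central spike,
# so the reconstruction should peak at the central pixel.
n, n_ang = 64, 90
sino = np.zeros((n_ang, n))
sino[:, n // 2] = 1.0
rec = fbp(sino, np.linspace(0, np.pi, n_ang, endpoint=False))
peak = np.unravel_index(rec.argmax(), rec.shape)
```

Real OPT reconstructions additionally require the attenuation conversion and field-nonuniformity correction described in the abstract before this step is applied slice by slice.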


[99] 2604.13926

Importance of Aggregated DER Installed Capacity in Distribution Networks

The increasing penetration of Distributed Energy Resources (DERs), particularly electric vehicles, heat pumps, and photovoltaic systems, is fundamentally changing power flows in Low-Voltage (LV) distribution networks. Despite this transition, Distribution System Operators (DSOs) often lack reliable and up-to-date knowledge of the DER capacity connected downstream of LV substations. Limited observability, incomplete topology information, and restricted access to customer-level data make it difficult to maintain accurate DER registries, creating uncertainty in both operational and planning processes. This paper presents aggregated DER installed capacity, estimated at LV aggregation points, as a practical and scalable approach to improving DER awareness without requiring customer-level monitoring. We define the problem of estimating DER installed capacities from commonly available substation and feeder measurements. By linking these estimates to operational and planning needs, we discuss how knowledge of aggregated DER installed capacity enhances DER-aware forecasting, congestion management, flexibility quantification, hosting capacity assessment, and monitoring of DER adoption.


[100] 2410.08329

Survey of Deep Learning and Physics-Based Approaches in Computational Wave Imaging

Computational wave imaging (CWI) extracts hidden structure and physical properties of a volume of material by analyzing wave signals that traverse that volume. Applications include seismic exploration of the Earth's subsurface, acoustic imaging and non-destructive testing in material science, and ultrasound computed tomography in medicine. Current approaches for solving CWI problems can be divided into two categories: those rooted in traditional physics, and those based on deep learning. Physics-based methods stand out for their ability to provide high-resolution and quantitatively accurate estimates of acoustic properties within the medium. However, they can be computationally intensive and are susceptible to ill-posedness and nonconvexity typical of CWI problems. Machine learning-based computational methods have recently emerged, offering a different perspective to address these challenges. Diverse scientific communities have independently pursued the integration of deep learning in CWI. This review discusses how contemporary scientific machine-learning (ML) techniques, and deep neural networks in particular, have been developed to enhance and integrate with traditional physics-based methods for solving CWI problems. We present a structured framework that consolidates existing research spanning multiple domains, including computational imaging, wave physics, and data science. This study concludes with important lessons learned from existing ML-based methods and identifies technical hurdles and emerging trends through a systematic analysis of the extensive literature on this topic.


[101] 2410.11126

Label-free subcellular 3D imaging of oocytes and embryos via reflection matrix microscopy

Non-invasive morphological assessment is the cornerstone of oocyte and embryo selection in assisted reproductive technology, yet clinical practice remains limited by two-dimensional, qualitative microscopy. While three-dimensional (3D) fluorescence imaging provides cellular insights, its inherent phototoxicity precludes routine clinical use. Conversely, existing label-free modalities fail to resolve subcellular structures in thick specimens due to two distinct physical barriers: large-scale refractive index heterogeneities, such as the cumulus cells surrounding oocytes, that induce severe aberrations; and short-scale fluctuations, primarily from cytoplasmic lipids, that generate a multiple scattering ``fog''. Here, we report an ultra-fast Reflection Matrix Imaging (RMI) platform designed to overcome these depth and resolution limits. By capturing the back-scattered electromagnetic field for a set of plane-wave illuminations at multiple wavelengths, we record a multi-spectral reflection matrix. From this matrix, we leverage digital adaptive focusing algorithms to computationally compensate for sample-induced aberrations while realigning forward multiple scattering trajectories with the single-scattering contribution. This approach enables label-free 3D visualization of oocytes and blastocysts with an unprecedented subcellular resolution of 300 nm throughout the entire specimen volume. We demonstrate the reliable identification of germinal vesicles and nuclear status in stages previously inaccessible to conventional optics, including imaging through dense cumulus cells. Our method provides a powerful, non-invasive tool for objective grading across all pre-implantation stages, potentially transforming decision-making in clinical IVF.


[102] 2506.00433

Latent Wavelet Diffusion For Ultra-High-Resolution Image Synthesis

High-resolution image synthesis remains a core challenge in generative modeling, particularly in balancing computational efficiency with the preservation of fine-grained visual detail. We present Latent Wavelet Diffusion (LWD), a lightweight training framework that significantly improves detail and texture fidelity in ultra-high-resolution (2K-4K) image synthesis. LWD introduces a novel, frequency-aware masking strategy derived from wavelet energy maps, which dynamically focuses the training process on detail-rich regions of the latent space. This is complemented by a scale-consistent VAE objective to ensure high spectral fidelity. The primary advantage of our approach is its efficiency: LWD requires no architectural modifications and adds zero additional cost during inference, making it a practical solution for scaling existing models. Across multiple strong baselines, LWD consistently improves perceptual quality and FID scores, demonstrating the power of signal-driven supervision as a principled and efficient path toward high-resolution generative modeling. The code is available at this https URL


[103] 2507.11812

A Multimodal Data Fusion Generative Adversarial Network for Real Time Underwater Sound Speed Field Construction

Sound speed profiles (SSPs) are essential underwater parameters that affect the propagation mode of underwater signals and have a critical impact on the energy efficiency of underwater acoustic communication and the accuracy of underwater acoustic positioning. Traditionally, SSPs can be obtained by matched field processing (MFP), compressive sensing (CS), and deep learning (DL) methods. However, existing methods mainly rely on on-site underwater sonar observation data, which imposes strict requirements on the deployment of sonar observation systems. To achieve high-precision estimation of sound velocity distribution in a given sea area without on-site underwater data measurement, we propose a multi-modal data-fusion generative adversarial network model with residual attention block (MDF-RAGAN) for SSP construction. To improve the model's ability to capture global spatial feature correlations, we embed attention mechanisms and use residual modules to deeply capture small disturbances in the deep-ocean sound velocity distribution caused by changes in sea surface temperature (SST). Experimental results on a real open dataset show that the proposed model outperforms other state-of-the-art methods, achieving an error of less than 0.3 m/s. Specifically, MDF-RAGAN not only outperforms convolutional neural network (CNN) and spatial interpolation (SITP) methods by nearly a factor of two, but also achieves about a 65.8\% root mean square error (RMSE) reduction compared to the mean profile, which fully reflects the enhancement of overall profile matching by multi-source fusion and cross-modal attention.


[104] 2509.09027

Regularization in Data-driven Predictive Control: A Convex Relaxation Perspective

This paper explores the role of regularization in data-driven predictive control (DDPC) through the lens of convex relaxation. Using a bi-level optimization framework, we model system identification as an inner problem and predictive control as an outer problem. Within this framework, we show that several regularized DDPC formulations, including l1-norm penalties, projection-based regularizers, and a newly introduced causality-based regularizer, can be viewed as convex relaxations of their respective bi-level problems. This perspective clarifies the conceptual links between direct and indirect data-driven control and highlights how regularization implicitly enforces system identification. We further propose an optimality-based variant, A-DDPC, which approximately solves the inner problem with all identification constraints via an iterative algorithm. Numerical experiments demonstrate that A-DDPC outperforms existing regularized DDPC by reducing both bias and variance errors. These results indicate that further benefits may be obtained by applying system identification techniques to pre-process the trajectory library in nonlinear settings. Overall, our analysis contributes to a unified convex relaxation view of regularization in DDPC and sheds light on its strong empirical performance beyond linear time-invariant systems.
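To make the "regularized DDPC" object concrete, here is a toy DeePC-style prediction in the spirit of the formulations the paper analyzes (this sketch uses a scalar LTI plant, Hankel matrices from one recorded trajectory, and a small l2 penalty on the combination vector g; all numbers are invented):

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = 0.9, 1.0                      # toy scalar LTI plant, output y = state x

# Offline: one persistently exciting input/output trajectory.
T = 40
u_d = rng.standard_normal(T)
y_d = np.zeros(T)
x = 0.0
for t in range(T):
    y_d[t] = x
    x = a * x + b * u_d[t]

def hankel(w, L):                    # depth-L Hankel matrix of a scalar signal
    return np.array([w[i:i + L] for i in range(len(w) - L + 1)]).T

T_ini, N = 2, 3
U, Y = hankel(u_d, T_ini + N), hankel(y_d, T_ini + N)
U_p, U_f = U[:T_ini], U[T_ini:]
Y_p, Y_f = Y[:T_ini], Y[T_ini:]

# Online: recent data pins the initial condition; u_f is the planned input.
u_ini, y_ini = np.array([0.5, -0.2]), np.array([1.0, 1.4])
u_f = np.array([0.1, 0.0, 0.3])

M = np.vstack([U_p, Y_p, U_f])
rhs = np.concatenate([u_ini, y_ini, u_f])
lam = 1e-8                           # small l2 regularizer on g
g = np.linalg.solve(M.T @ M + lam * np.eye(M.shape[1]), M.T @ rhs)
y_pred = Y_f @ g                     # data-driven output prediction
```

In the bi-level view of the paper, penalties such as this l2 term (or the l1, projection, and causality regularizers it studies) act as convex relaxations of the implicit inner identification problem.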


[105] 2509.15946

Differentiable Acoustic Radiance Transfer

Geometric acoustics is an efficient framework for room acoustics modeling, governed by the canonical time-dependent rendering equation. Acoustic radiance transfer (ART) solves the equation by discretization, modeling time- and direction-dependent energy exchange between surface patches with flexible material properties. We introduce DART, an efficient, differentiable implementation of ART that enables gradient-based optimization of material properties. We evaluate DART on a simpler variant of acoustic field learning that aims to predict energy responses for novel source-receiver configurations. Experimental results demonstrate that DART generalizes better under sparse measurement scenarios than existing signal processing and neural network baselines, while maintaining simplicity and full interpretability. We open-source our implementation.
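The core idea of differentiating an energy model with respect to material properties can be shown at toy scale. This sketch is an assumption-laden caricature, not DART: received energy is modeled as E0*(1 - alpha)^k after k reflections from a surface of absorption alpha, and alpha is recovered from a "measurement" by gradient descent.

```python
import numpy as np

# Toy energy model: E = E0 * (1 - alpha)^k after k surface reflections.
E0, k, alpha_true = 1.0, 3, 0.35
E_meas = E0 * (1 - alpha_true) ** k     # synthetic measurement

alpha = 0.1                             # initial guess for the absorption
lr = 0.05
for _ in range(2000):
    E_pred = E0 * (1 - alpha) ** k
    # Gradient of (E_pred - E_meas)^2 w.r.t. alpha, by the chain rule:
    grad = 2 * (E_pred - E_meas) * (-k * E0 * (1 - alpha) ** (k - 1))
    alpha -= lr * grad
```

DART performs the analogous computation through the full time- and direction-dependent patch-to-patch energy exchange, so many material parameters are optimized jointly from measured energy responses.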


[106] 2509.22378

Zero-Effort Image-to-Music Generation: An Interpretable RAG-based VLM Approach

Recently, Image-to-Music (I2M) generation has garnered significant attention, with potential applications in fields such as gaming, advertising, and multi-modal art creation. However, due to the ambiguous and subjective nature of I2M tasks, most end-to-end methods lack interpretability, leaving users puzzled about the generation results. Even methods based on emotion mapping face controversy, as emotion represents only a singular aspect of art. Additionally, most learning-based methods require substantial computational resources and large datasets for training, hindering accessibility for common users. To address these challenges, we propose the first Vision Language Model (VLM)-based I2M framework that offers high interpretability and low computational cost. Specifically, we utilize ABC notation to bridge the text and music modalities, enabling the VLM to generate music using natural language. We then apply multi-modal Retrieval-Augmented Generation (RAG) and self-refinement techniques to allow the VLM to produce high-quality music without external training. Furthermore, we leverage the generated motivations in text and the attention maps from the VLM to provide explanations for the generated results in both text and image modalities. To validate our method, we conduct both human studies and machine evaluations, where our method outperforms others in terms of music quality and music-image consistency, indicating promising results. Our code is available at this https URL .


[107] 2510.14664

SpeechLLM-as-Judges: Towards General and Interpretable Speech Quality Evaluation

Generative speech technologies are progressing rapidly, but evaluating the perceptual quality of synthetic speech remains a core challenge. Existing methods typically rely on scalar scores or binary decisions, which lack interpretability and generalization across tasks and languages. We present SpeechLLM-as-Judges, a new paradigm for enabling large language models (LLMs) to conduct structured and explanation-based speech quality evaluation. To support this direction, we introduce SpeechEval, a large-scale dataset containing 32,207 multilingual speech clips and 128,754 annotations spanning four tasks: quality assessment, pairwise comparison, improvement suggestion, and deepfake detection. Based on this resource, we develop SQ-LLM, a speech-quality-aware LLM trained with chain-of-thought reasoning and reward optimization to improve capability. Experimental results show that SQ-LLM delivers strong performance across tasks and languages, revealing the potential of this paradigm for advancing speech quality evaluation. The relevant code, models, and data are publicly available at this https URL.


[108] 2510.19781

Nodal Capacity Expansion Planning with Flexible Large-Scale Load Siting

We propose explicitly incorporating large-scale load siting into a stochastic nodal power system capacity expansion planning model that concurrently co-optimizes generation, transmission, and storage expansion. The potential operational flexibility of some of these large loads is also taken into account by considering them as consisting of a set of tranches with different reliability requirements, which are modeled as a constraint on expected served energy across operational scenarios. We implement our model as a two-stage stochastic mixed-integer optimization problem with cross-scenario expectation constraints. To overcome the challenge of scalability, we build upon existing work to implement this model on a high performance computing platform and exploit scenario parallelization using an augmented Progressive Hedging Algorithm. The algorithm is implemented using the bounding features of mpisppy, which have been shown to provide satisfactory provable optimality gaps despite the absence of theoretical guarantees of convergence. We test our approach to assess the value of this proactive planning framework on total system cost and reliability metrics using realistic test cases geographically assigned to San Diego and South Carolina, with datacenter and direct air capture facilities as large loads.
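Progressive Hedging itself fits in a dozen lines on a toy problem. This sketch (two equally likely scenarios, quadratic scenario costs with closed-form subproblems, all values invented) shows the mechanism the paper parallelizes across scenarios: solve per-scenario problems with a penalty toward the average, then update dual weights until the scenarios agree on one first-stage decision.

```python
import numpy as np

# Two equally likely scenarios with cost f_s(x) = (x - a_s)^2.
a = np.array([1.0, 3.0])
rho = 1.0

x = a.copy()                     # start from the scenario-wise optima
xbar = x.mean()
w = rho * (x - xbar)             # dual weights enforcing nonanticipativity

for _ in range(200):
    # Each scenario subproblem has a closed form here:
    #   argmin_x (x - a_s)^2 + w_s*x + (rho/2)*(x - xbar)^2
    x = (2 * a - w + rho * xbar) / (2 + rho)
    xbar = x.mean()
    w = w + rho * (x - xbar)
```

At convergence all scenarios hold the same first-stage decision (here the expected-cost minimizer, the mean of a), which is the nonanticipativity property the bounding features of mpisppy certify at scale.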


[109] 2511.09363

BarrierBench: Evaluating Large Language Models for Safety Verification in Dynamical Systems

Safety verification of dynamical systems via barrier certificates is essential for ensuring correctness in autonomous applications. Synthesizing these certificates involves discovering mathematical functions, with current methods suffering from poor scalability, dependence on carefully designed templates, and exhaustive or incremental function-space searches. They also demand substantial manual expertise--selecting templates, solvers, and hyperparameters, and designing sampling strategies--requiring both theoretical and practical knowledge traditionally shared through linguistic reasoning rather than formalized methods. This motivates a key question: can such expert reasoning be captured and operationalized by language models? We address this by introducing an LLM-based agentic framework for barrier certificate synthesis. The framework uses natural language reasoning to propose, refine, and validate candidate certificates, integrating LLM-driven template discovery with SMT-based verification, and supporting barrier-controller co-synthesis to ensure consistency between safety certificates and controllers. To evaluate this capability, we introduce BarrierBench, a benchmark of 100 dynamical systems spanning linear, nonlinear, discrete-time, and continuous-time settings. Our experiments assess not only the effectiveness of LLM-guided barrier synthesis but also the utility of retrieval-augmented generation and agentic coordination strategies in improving its reliability and performance. Across these tasks, the framework achieves more than 90% success in generating valid certificates. By releasing BarrierBench and the accompanying toolchain, we aim to establish a community testbed for advancing the integration of language-based reasoning with formal verification in dynamical systems. The benchmark is publicly available at this https URL
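The symbolic check an SMT backend performs on a candidate certificate can be illustrated with SymPy on a one-dimensional toy (the system, barrier, and sets below are inventions of this note): compute the Lie derivative of the candidate along the dynamics and attempt to refute the decrease condition over the reals.

```python
import sympy as sp

x = sp.symbols('x', real=True)
f = -x                          # toy dynamics dx/dt = -x
B = x**2 - 1                    # candidate barrier: negative on the initial
                                # set |x| <= 1/2, positive on the unsafe set |x| >= 2

lie = sp.diff(B, x) * f         # dB/dt along trajectories (= -2*x**2)

# Refutation-style check: is there any real x where dB/dt > 0?
violations = sp.solveset(lie > 0, x, domain=sp.S.Reals)

# Boundary values of B on the illustrative initial and unsafe sets.
B_init_edge = B.subs(x, sp.Rational(1, 2))   # should be negative
B_unsafe_edge = B.subs(x, 2)                 # should be positive
```

An empty violation set means B never increases along trajectories, so sublevel sets of B are invariant and trajectories from the initial set cannot reach the unsafe set; the framework's SMT stage plays this role for the LLM-proposed candidates.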


[110] 2511.19204

Reference-Free Sampling-Based Model Predictive Control

We present a sampling-based model predictive control (MPC) framework that enables emergent locomotion without relying on handcrafted gait patterns or predefined contact sequences. Our method discovers diverse motion patterns, ranging from trotting to galloping, robust standing policies, jumping, and handstand balancing, purely through the optimization of high-level objectives. Building on model predictive path integral (MPPI) control, we propose a cubic Hermite spline parameterization that operates on position and velocity control points. Our approach enables contact-making and contact-breaking strategies that adapt automatically to task requirements, requiring only a limited number of sampled trajectories. This sample efficiency enables real-time control on standard CPU hardware, eliminating the GPU acceleration typically required by other state-of-the-art MPPI methods. We validate our approach on the Go2 quadrupedal robot, demonstrating a range of emergent gaits and basic jumping capabilities. In simulation, we further showcase more complex behaviors, such as backflips, dynamic handstand balancing, and locomotion on a humanoid, all without requiring reference tracking or offline pre-training.
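The cubic Hermite parameterization is standard enough to sketch: a segment is defined by endpoint positions and velocities, evaluated through the four Hermite basis polynomials. The control-point values below are illustrative, not the paper's.

```python
import numpy as np

def hermite(p0, v0, p1, v1, s):
    """Cubic Hermite segment with positions p0, p1 and velocities v0, v1 at s=0, 1."""
    h00 = 2 * s**3 - 3 * s**2 + 1    # weights p0
    h10 = s**3 - 2 * s**2 + s        # weights v0
    h01 = -2 * s**3 + 3 * s**2       # weights p1
    h11 = s**3 - s**2                # weights v1
    return h00 * p0 + h10 * v0 + h01 * p1 + h11 * v1

s = np.linspace(0.0, 1.0, 101)
traj = hermite(0.0, 1.0, 1.0, 0.0, s)   # move 0 -> 1, start moving, end at rest
```

Because each sampled trajectory is specified by a handful of (position, velocity) control points rather than per-timestep inputs, the MPPI sampler searches a much lower-dimensional, automatically smooth space, which is what makes the small sample budgets in the paper plausible.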


[111] 2601.09240

DeTracker: Motion-decoupled Vehicle Detection and Tracking in Unstabilized Satellite Videos

Satellite videos provide continuous observations of surface dynamics but pose significant challenges for multi-object tracking (MOT), especially under unstabilized conditions where platform jitter and the weak appearance of tiny objects jointly degrade tracking performance. To address this problem, we propose DeTracker, a joint-detection-and-tracking framework tailored for unstabilized satellite videos. DeTracker introduces a task-driven Global-Local Motion Decoupling (GLMD) module to address the motion imbalance between dominant platform motion and weak target motion. It suppresses background-dominated motion via global semantic alignment at the feature level and captures target-specific motion through local refinement, improving trajectory stability and identity consistency. In addition, a Temporal Dependency Feature Pyramid (TDFP) module is developed to perform cross-frame temporal feature fusion, enhancing the continuity and discriminability of tiny-object representations. We further construct a new benchmark dataset, SDM-Car-SU, which simulates multi-directional and multi-speed platform motions to enable systematic evaluation of tracking robustness under varying motion perturbations. Extensive experiments on both simulated and real unstabilized satellite videos demonstrate that DeTracker significantly outperforms existing methods, achieving 61.1% MOTA on SDM-Car-SU and 45.3% MOTA on real satellite video data. The code and dataset will be publicly available at this https URL.


[112] 2601.14053

LLMOrbit: A Circular Taxonomy of Large Language Models -From Scaling Walls to Agentic AI Systems

The field of artificial intelligence has undergone a revolution from foundational Transformer architectures to reasoning-capable systems approaching human-level performance. We present LLMOrbit, a comprehensive circular taxonomy navigating the landscape of large language models spanning 2019-2025. This survey examines over 50 models across 15 organizations through eight interconnected orbital dimensions, documenting architectural innovations, training methodologies, and efficiency patterns defining modern LLMs, generative AI, and agentic systems. We identify three critical crises: (1) data scarcity (9-27T tokens depleted by 2026-2028), (2) exponential cost growth ($3M to $300M+ in 5 years), and (3) unsustainable energy consumption (22x increase), establishing the scaling wall limiting brute-force approaches. Our analysis reveals six paradigms breaking this wall: (1) test-time compute (o1, DeepSeek-R1 achieve GPT-4 performance with 10x inference compute), (2) quantization (4-8x compression), (3) distributed edge computing (10x cost reduction), (4) model merging, (5) efficient training (ORPO reduces memory 50%), and (6) small specialized models (Phi-4 14B matches larger models). Three paradigm shifts emerge: (1) post-training gains (RLHF, GRPO, pure RL contribute substantially, DeepSeek-R1 achieving 79.8% MATH), (2) efficiency revolution (MoE routing 18x efficiency, Multi-head Latent Attention 8x KV cache compression enables GPT-4-level performance at $<$$0.30/M tokens), and (3) democratization (open-source Llama 3 88.6% MMLU surpasses GPT-4 86.4%). We provide insights into techniques (RLHF, PPO, DPO, GRPO, ORPO), trace evolution from passive generation to tool-using agents (ReAct, RAG, multi-agent systems), and analyze post-training innovations.


[113] 2602.14394

Increasing ultrasound field-of-view with reduced element count arrays containing large elements

Several applications of medical ultrasound can benefit from a larger field of view (FOV). This study is aimed at increasing the FOV of linear array probes by increasing the element width. Coupled elements were used to imitate a larger element width. Through Fourier analysis, theoretical pressure amplitudes, and bandwidth estimates, coupled elements are shown to be close approximations of large elements. The effects of coupling on resolution, contrast, and speckle signal-to-noise ratio are investigated through phantom images and in vivo images of a rabbit tumor reconstructed with plane-wave compounding. Furthermore, a positioning system was used to acquire data from a virtual large aperture with a 120 mm FOV and 128 elements, collected in sections with a single probe. The Null Subtraction Imaging (NSI), Sign Coherence Factor (SCF), and Minimum Variance (MV) beamformers are compared for regaining resolution lost to an increased F-number. The NSI beamformer decreased full-width at half-maximum (FWHM) estimates of wire targets by 79% with a coupling factor of 2 compared to uncoupled DAS. The MV beamformer was best for maintaining speckle statistics while improving resolution. Our results demonstrate how increased element width can increase FOV with no increase in element count.
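The FWHM resolution metric used to compare beamformers is easy to compute from a sampled point-spread profile; a minimal sketch (tested on a Gaussian of known width, which is not data from the paper):

```python
import numpy as np

def fwhm(x, y):
    """Full-width at half-maximum of a single-peaked sampled profile."""
    half = y.max() / 2.0
    above = np.where(y >= half)[0]
    i0, i1 = above[0], above[-1]
    # Linearly interpolate the two half-maximum crossings.
    left = np.interp(half, [y[i0 - 1], y[i0]], [x[i0 - 1], x[i0]])
    right = np.interp(half, [y[i1 + 1], y[i1]], [x[i1 + 1], x[i1]])
    return right - left

# A Gaussian profile with sigma = 1 has FWHM = 2*sqrt(2*ln 2) ~ 2.3548.
x = np.linspace(-5.0, 5.0, 1001)
w = fwhm(x, np.exp(-x**2 / 2.0))
```

Applied to lateral cuts through wire-target images, this is the quantity the 79% NSI improvement refers to.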


[114] 2602.20328

GSNR: Graph Smooth Null-Space Representation for Inverse Problems

Inverse problems in imaging are ill-posed, leading to infinitely many solutions consistent with the measurements due to the non-trivial null-space of the sensing matrix. Common image priors promote solutions on the general image manifold, such as sparsity, smoothness, or score function. However, as these priors do not constrain the null-space component, they can bias the reconstruction. Thus, we aim to incorporate meaningful null-space information in the reconstruction framework. Inspired by smooth image representation on graphs, we propose Graph-Smooth Null-Space Representation (GSNR), a mechanism that imposes structure only into the invisible component. Particularly, given a graph Laplacian, we construct a null-restricted Laplacian that encodes similarity between neighboring pixels in the null-space signal, and we design a low-dimensional projection matrix from the $p$-smoothest spectral graph modes (lowest graph frequencies). This approach has strong theoretical and practical implications: i) improved convergence via a null-only graph regularizer, ii) better coverage, how much null-space variance is captured by $p$ modes, and iii) high predictability, how well these modes can be inferred from the measurements. GSNR is incorporated into well-known inverse problem solvers, e.g., PnP, DIP, and diffusion solvers, in four scenarios: image deblurring, compressed sensing, demosaicing, and image super-resolution, providing consistent improvement of up to 4.3 dB over baseline formulations and up to 1 dB compared with end-to-end learned models in terms of PSNR.