Beam alignment is a key challenge in directional mmWave and THz systems, where narrow beams require accurate yet low-overhead training. Existing learning-based approaches typically predict a single beam and do not quantify uncertainty, limiting adaptive beam sweeping. We recast beam alignment as a generative task and propose a conditional diffusion model that learns a probabilistic beam prior from compact geometric and multipath features. The learned priors guide top-$k$ sweeps and capture the SNR loss induced by limited probing. Using a ray-traced DeepMIMO scenario with an 8-beam DFT codebook, our best conditional diffusion model achieves strong ranking performance (Hit@1 $\approx 0.61$, Hit@3 $\approx 0.90$, Hit@5 $\approx 0.97$) while preserving SNR at small sweep budgets. Compared with a deterministic classifier baseline, diffusion improves Hit@1 by about 180\%. Results further highlight the importance of informative conditioning and the ability of diffusion sampling to flexibly trade accuracy for computational efficiency. The proposed diffusion framework achieves substantial improvements in small-$k$ Hit rates, translating into reduced beam training overhead and enabling low-latency, energy-efficient beam alignment for mmWave and THz systems while preserving received SNR.
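As a minimal illustration of how a learned beam prior can drive a top-$k$ sweep, the Python sketch below scores a hypothetical 8-beam probability vector (standing in for the conditional diffusion model's output) against placeholder per-beam SNRs; all numbers are synthetic and only the sweep-and-evaluate logic reflects the described procedure.

```python
import numpy as np

# Hypothetical illustration of prior-guided top-k beam sweeping with an
# 8-beam DFT codebook. `beam_probs` stands in for the per-beam probability
# mass obtained from the conditional diffusion model; values are made up.
rng = np.random.default_rng(0)
beam_probs = np.array([0.02, 0.05, 0.55, 0.20, 0.08, 0.05, 0.03, 0.02])
beam_snrs_db = rng.normal(10.0, 3.0, size=8)   # placeholder per-beam SNRs

def topk_sweep(probs, snrs_db, k):
    """Sweep only the k most probable beams and report the SNR loss
    relative to an exhaustive sweep over the full codebook."""
    candidates = np.argsort(probs)[::-1][:k]
    achieved = snrs_db[candidates].max()
    oracle = snrs_db.max()
    hit = int(np.argmax(snrs_db) in candidates)
    return hit, oracle - achieved

for k in (1, 3, 5):
    hit, loss = topk_sweep(beam_probs, beam_snrs_db, k)
    print(f"k={k}: hit={hit}, SNR loss={loss:.2f} dB")
```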
Multi-modal image registration plays a critical role in precision medicine but faces challenges from non-linear intensity relationships and local optima. While deep learning models enable rapid inference, they often suffer from generalization collapse on unseen modalities. To address this, we propose Search-MIND, a training-free, iterative optimization framework for instance-specific registration. Our pipeline adopts a coarse-to-fine strategy: a hierarchical coarse alignment stage followed by deformable refinement. We introduce two novel loss functions: Variance-Weighted Mutual Information (VWMI), which prioritizes informative tissue regions to shield global alignment from background noise and uniform regions, and Search-MIND (S-MIND), which broadens the convergence basin of structural descriptors by considering a larger local search range. Evaluations on the CARE Liver 2025 and CHAOS Challenge datasets show that Search-MIND consistently outperforms classical baselines such as ANTs and foundation model-based approaches such as DINO-reg, offering superior stability across diverse modalities.
This paper investigates a planar tracking problem between a leader agent and a follower agent. We propose a novel feedback speed control law, paired with a constant bearing steering strategy, to maintain an abreast formation between the two agents. We prove that the proposed control yields asymptotic stability of the closed-loop system when the steering of the leader is known. For the case when the leader's steering is unavailable to the follower, we show that the system is still input-to-state stable with respect to the leader's steering viewed as an input. Furthermore, we demonstrate that if the leader's steering is periodic, the follower will asymptotically converge to a periodic orbit with the same period. We validate these results through numerical simulations and experimental implementations on mobile robots. Finally, we demonstrate the scalability of the proposed approach by extending the two-agent control law to an N-agent chain network, illustrating its implications for directional information propagation in biological and engineered flocks.
Effective startup control is critical for the safe and reliable operation of Dual Active Bridge (DAB) converters. Unlike traditional soft-start techniques that rely solely on phase-shift control or fixed dead-time settings, the proposed approach gradually reduces the dead time from a value close to one switching period to the hardware-defined minimum. This enables a smooth buildup of the secondary-side voltage while effectively minimizing voltage overshoot and suppressing inrush current during startup. As a result, the leakage inductor current rises in a controlled manner, ensuring safe and predictable startup behavior. Simulation results demonstrate that conventional startup methods lead to severe voltage overshoot and high inrush currents, whereas the proposed method achieves a gradual voltage rise with well-regulated current profiles. Experimental validation using a 15 kW hardware platform confirms the effectiveness and robustness of the approach under different operating conditions. The proposed technique is simple, hardware-friendly, easily implementable on standard microcontrollers, and applicable to nth-order DAB architectures, making it a versatile solution for enhancing the reliability and safety of DAB converters in practical applications.
The increasing use of LLM-based agents to support decision-making and control across diverse domains motivates the need for systematic deconfliction of their proposed actions. We present a deconfliction framework for coordinating multiple agents that formally encapsulate individual applications, each proposing potentially conflicting actions over shared resources. Conflicts are resolved through three deconfliction modes: bilateral negotiation, structured mediation, and procedural (deterministic) deconfliction. We define design principles for large language model-based client agents, including a chain-of-thought style reasoning process, and introduce an iterative weighted-consensus mechanism that does not require the applications themselves to solve optimization problems. The framework is domain agnostic and supports both numeric and non-numeric decisions. Its performance is demonstrated on a power distribution use case with conflicting advanced distribution management system applications for cost optimization and resilience, coordinating diesel generators and battery energy storage systems.
With advancements in multimodal communication technologies, remote learning environments such as distance universities are becoming increasingly common. Remote learning typically happens asynchronously. As a consequence, unlike face-to-face in-person classroom teaching, it lacks the emotional cues that make learning a pleasant experience. Motivated by advances made in the paralinguistic speech processing community on emotion prediction, in this paper we explore the use of speech for sensing students' emotions by building upon speech-based self-control tasks developed to aid effective remote learning. More precisely, we investigate: (a) whether speech acquired through self-control tasks exhibits perceptible variation along the valence, arousal, and dominance dimensions, and (b) whether those dimensional emotion variations can be automatically predicted. We address these two research questions by developing a dataset containing spontaneous monologue speech acquired as open responses to self-control tasks and by carrying out subjective listener evaluations and automatic dimensional emotion prediction studies on that dataset. Our investigations indicate that speech-based self-control tasks can be a means to sense student emotion in remote learning environments. This opens potential avenues to seamlessly integrate paralinguistic speech processing technologies into the remote learning loop for enhancing learning experiences through instructional design and feedback generation.
Implicit neural representations (INRs) provide a parameter-efficient and fully differentiable image model for CT reconstruction. However, optimizing INRs for CT reconstruction using standard auto-differentiation techniques can be prohibitively GPU memory-intensive, especially in 3D imaging, due to the large number of INR evaluations needed to simulate ray projections. To address this issue, we propose a memory-efficient stochastic gradient approximation based on decomposing the gradient into a Jacobian-vector product that is amenable to stochastic subsampling. This approximation allows the user to trade off GPU memory usage against gradient approximation accuracy. Our experiments on synthetic 2D data demonstrate that the proposed gradient approximation uses far less GPU memory than standard INR training, while yielding reconstructions that are comparable in convergence behavior and mean squared error. Finally, we demonstrate that the proposed approach allows for memory-efficient 3D cone beam CT reconstruction in a sparse-view setting.
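The abstract does not spell out the gradient decomposition, so the sketch below uses a toy linear forward operator to illustrate only the underlying idea: the projection-loss gradient is a sum over ray sample points, and a rescaled random subset of those terms gives an unbiased, low-memory estimate. The operator A, data y, and sizes are synthetic placeholders, not the INR model from the paper.

```python
import numpy as np

# Toy illustration of stochastic gradient subsampling: the full gradient of a
# least-squares projection loss decomposes into a sum over sample points, and
# drawing a random subset with rescaling gives an unbiased estimate.
rng = np.random.default_rng(1)
n_points, n_params = 10_000, 64
A = rng.normal(size=(n_points, n_params)) / np.sqrt(n_points)  # forward operator
y = rng.normal(size=n_points)                                   # measurements
theta = np.zeros(n_params)

def full_gradient(theta):
    residual = A @ theta - y
    return A.T @ residual                      # sum over all sample points

def subsampled_gradient(theta, batch):
    idx = rng.choice(n_points, size=batch, replace=False)
    residual = A[idx] @ theta - y[idx]
    return (n_points / batch) * (A[idx].T @ residual)   # unbiased rescaling

g_full = full_gradient(theta)
g_sub = np.mean([subsampled_gradient(theta, 512) for _ in range(200)], axis=0)
print(np.linalg.norm(g_sub - g_full) / np.linalg.norm(g_full))  # small relative error
```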
Frequency dynamics in power systems reflect active power imbalance in real time, thereby providing an instantaneous signal to inform electricity pricing. However, existing real-time markets operate on much slower timescales and fail to exploit this signal. In this letter, we develop integrated market--frequency dynamics that enable online pricing directly from frequency measurements. Representing the real-time market as a dynamic price-discovery process, and integrating this process with the grid frequency dynamics, we derive an explicit price formation mechanism from frequency measurements. This mechanism manifests as a distributed PID-like controller for each generator, where frequency response is driven and remunerated by electricity prices derived solely from local frequency measurements.
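A minimal sketch of the described price formation is given below, assuming illustrative controller gains, a 60 Hz nominal frequency, and a placeholder baseline price: the local price is driven by proportional, integral, and derivative terms of the measured frequency deviation, mirroring the distributed PID-like structure.

```python
# Minimal sketch of PID-like price formation driven by local frequency
# deviation. Gains, baseline price, and setpoint are illustrative values,
# not parameters from the paper.
f_nominal = 60.0              # Hz
kp, ki, kd = 50.0, 5.0, 1.0   # illustrative PID gains ($/MWh per Hz, etc.)
dt = 0.1                      # measurement interval in seconds

integral, prev_dev = 0.0, 0.0

def price_update(f_measured):
    """Return an electricity price derived solely from local frequency."""
    global integral, prev_dev
    dev = f_nominal - f_measured          # positive when frequency is low
    integral += dev * dt
    derivative = (dev - prev_dev) / dt
    prev_dev = dev
    base_price = 30.0                     # illustrative baseline $/MWh
    return base_price + kp * dev + ki * integral + kd * derivative

for f in (60.00, 59.98, 59.95, 59.97, 60.00):
    print(f"f={f:.2f} Hz -> price={price_update(f):.2f} $/MWh")
```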
In this paper, we propose a distributed optimization-learning framework for terahertz (THz) cell-free integrated sensing and communication (CF-ISAC) systems, termed Distributed Optimization-Learning with Graph Transformers (DOLG). We first formulate a highly non-convex joint scheduling and signal design problem for THz CF-ISAC systems, jointly optimizing access point (AP)-user equipment (UE) association and beamforming under signal-to-interference-plus-noise-ratio-based communication constraints and Cramér-Rao-bound-based sensing constraints, together with line-of-sight-driven visibility rules and per-AP power constraints. We also develop an optimization-based benchmark utilizing a tractable relaxed reformulation. Building upon this optimization structure, we redesign a graph transformer network (GTN) as an optimization-aware representation module that encodes cross-field wavefront geometry, blockage visibility, and sensing relevance in a permutation-equivariant manner. The proposed DOLG framework amortizes the iterative optimization procedure into a scalable GTN-conditioned distributed multi-agent reinforcement learning policy through centralized training and decentralized execution, while preserving per-AP power constraints via structure-preserving projections. Simulation results demonstrate that the proposed DOLG framework achieves stable convergence and effectively balances the communication-sensing tradeoff. From the system-level perspective, it outperforms multicell and non-joint design baselines. Furthermore, it surpasses conventional optimization-based and heuristic approaches in terms of both ISAC performance and computational scalability.
This paper introduces an LLM agent that automates power grid static analysis by converting natural language into MATPOWER scripts. The framework utilizes DeepSeek-OCR to build an enhanced vector database from MATPOWER manuals. To ensure reliability, it incorporates a three-tier error-correction system: a static pre-check, a dynamic feedback loop, and a semantic validator. Operating via the Model Context Protocol, the tool enables asynchronous execution and automatic debugging in MATLAB. Experimental results demonstrate that the system achieves an 82.38% accuracy in code fidelity, effectively eliminating hallucinations even in complex analysis tasks.
We present a metasurface imaging system capable of simultaneously capturing two images at close range (1-2~cm) and an additional image at long range (about 40~cm) on a shared photosensor. The close-range image pair focuses at 1.4~cm and 2.0~cm, respectively, forming a focal stack that enables passive ranging with an accuracy of $\pm$1~mm over the 12~mm to 20~mm range through a computationally efficient depth-from-defocus algorithm for a simplified scenario. The entire system is compact, with a total track length of 15~mm, making it suitable for seamless integration into edge platforms for defense and other resource-constrained applications.
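The paper's depth-from-defocus algorithm is not detailed in the abstract; the toy sketch below shows one simple heuristic consistent with a two-plane focal stack, blending the two focal depths (14 mm and 20 mm) by the relative sharpness of the same patch in each image. The Laplacian sharpness measure and the blending rule are assumptions for illustration only.

```python
import numpy as np

# Toy depth-from-defocus sketch for a two-plane focal stack. The blending
# rule is an illustrative heuristic, not the paper's algorithm.
FOCUS_NEAR_MM, FOCUS_FAR_MM = 14.0, 20.0

def sharpness(patch):
    """Laplacian-energy sharpness measure on a small image patch."""
    lap = (-4 * patch[1:-1, 1:-1] + patch[:-2, 1:-1] + patch[2:, 1:-1]
           + patch[1:-1, :-2] + patch[1:-1, 2:])
    return float(np.mean(lap ** 2))

def estimate_depth_mm(patch_near, patch_far):
    """Blend the two focal depths by the relative sharpness of one patch."""
    s_near, s_far = sharpness(patch_near), sharpness(patch_far)
    w = s_near / (s_near + s_far + 1e-12)
    return w * FOCUS_NEAR_MM + (1 - w) * FOCUS_FAR_MM

rng = np.random.default_rng(2)
sharp = rng.normal(size=(32, 32))               # stand-in "in focus" texture
blurred = 0.25 * (sharp + np.roll(sharp, 1, 0) + np.roll(sharp, 1, 1)
                  + np.roll(np.roll(sharp, 1, 0), 1, 1))
print(f"estimated depth: {estimate_depth_mm(sharp, blurred):.1f} mm")
```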
Popular Bayes filters often apply linearization techniques, such as Taylor expansion or stochastic linear regression, to enable the use of the Kalman filter structure, but this can lead to large errors in strongly nonlinear systems. The recently proposed NANO filter addresses this issue by interpreting the prediction and update steps of Bayesian filtering as two distinct optimization problems and solving them through moment matching and natural gradient descent, thereby avoiding model linearization errors. However, the natural gradient update in NANO can occasionally diverge because the posterior covariance in its iteration may lose positive definiteness. Our analysis shows that the inverse posterior covariance is the sum of the inverse prior covariance and the expected Hessian of the log-likelihood function, and that the indefiniteness of the latter term is the root cause of update failure. To address this issue, we propose two remedies. The first approximates the log-likelihood Hessian using the Gauss-Newton method, representing it as the self-adjoint product of the Jacobian of the normalized measurement residual, which is guaranteed to be positive semi-definite. The second reformulates the covariance update as an exponential-form update of the Cholesky factor and reconstructs the covariance via its Gram matrix, which ensures positive definiteness. Experiments on three classical nonlinear systems demonstrate that the proposed NANO filter with guaranteed positive definiteness outperforms popular members of the Kalman filter family and the original NANO filter.
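A small numerical sketch of the first remedy is given below, assuming a Gaussian measurement model with a toy range-bearing function h: whitening the residual by the noise covariance and forming the self-adjoint product of its Jacobian yields a positive semi-definite Hessian surrogate, which is the property the remedy relies on.

```python
import numpy as np

# Sketch of the Gauss-Newton positive semi-definite Hessian surrogate.
# The measurement model h and noise covariance R are illustrative placeholders.
def h(x):                       # toy nonlinear range-bearing measurement
    return np.array([np.sqrt(x[0]**2 + x[1]**2), np.arctan2(x[1], x[0])])

def jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f at x."""
    m, n = f(x).size, x.size
    J = np.zeros((m, n))
    for i in range(n):
        dx = np.zeros(n); dx[i] = eps
        J[:, i] = (f(x + dx) - f(x - dx)) / (2 * eps)
    return J

x = np.array([3.0, 4.0])
R = np.diag([0.1, 0.01])                       # measurement noise covariance
L = np.linalg.cholesky(np.linalg.inv(R))       # whitening (residual normalization)
Jn = L.T @ jacobian(h, x)                      # Jacobian of the normalized residual
H_gn = Jn.T @ Jn                               # self-adjoint product: PSD by construction
print(np.linalg.eigvalsh(H_gn))                # nonnegative up to numerical precision
```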
Integrated sensing and communication (ISAC) is widely regarded as one of the key enabling technologies for future sixth-generation (6G) wireless communication systems. In this work, we investigate a bistatic ISAC system in the presence of a disco reconfigurable intelligent surface (DRIS), whose random and time-varying reflection coefficients emulate a "disco ball." The introduction of the DRIS breaks the underlying assumption in existing ISAC systems that the sensing and communication channels remain static or quasi-static within the channel coherence time. We first develop a bistatic system model incorporating the DRIS and characterize all involved wireless channels. Then, an ISAC waveform design that balances sensing and communication performance is proposed by formulating a Pareto optimization problem, where the trade-off is controlled through a tunable factor. Communication and sensing performance in the bistatic ISAC system are quantified by the signal-to-interference-plus-noise ratio (SINR) and the Cramér-Rao lower bound (CRLB), respectively. To quantify the impact of the DRIS on the bistatic ISAC system, we derive the statistical characteristics of DRIS-induced active channel aging (ACA) channels for communications and the cascaded DRIS-based sensing channel. Then, we establish a theoretical lower bound on the SINR and closed-form CRLB expressions in the presence of a DRIS. The analysis reveals several distinctive properties of the DRIS in bistatic ISAC systems. In particular, the DRIS degrades communication performance significantly due to the introduction of ACA interference. In contrast, with respect to sensing performance, the DRIS decreases the estimation accuracy of the angle of departure (AoD) while concurrently enhancing that of the angle of arrival (AoA). Numerical results validate the derived theoretical analysis and confirm these DRIS-induced behaviors.
Grid-forming (GFM) inverters are expected to be prevalent in future inverter-dominated grids. In such grids, time-domain protection schemes, for example those based on instantaneous incremental quantities (IQs), are being advocated as potential solutions to the challenges faced by traditional phasor-based protection schemes, due to their ability to process nonlinear data. However, IQ-based protection uses the superposition principle; thus, linearity is still assumed in its application, while GFM inverters are nonlinear sources during faults. This paper proposes an analytical model to study the impact of GFM inverters on the relay-measured IQs. The model is validated with PSCAD/EMTDC simulations and is used to investigate the interoperability of time-domain IQ-based distance protection with GFM inverters employing different current limiters. Results show that time-domain IQ-based distance protection offers superior dependability for close-in faults compared to quadrilateral distance protection with GFM inverters, and it can remain secure for external faults where quadrilateral distance protection overreaches; however, its settings are hard to tune in a way that generalizes across sources and fault types. Taking the observed interoperability issues into account, a trip criterion for dependable and secure time-domain IQ-based distance protection is proposed, which facilitates easy-to-tune and general settings for applications with GFM inverters.
As an effective approach to understanding the human-centric physical world, Wearable Artificial Intelligence (AI), which leverages multimodal wearable sensors to understand human physiology and behavior, has attracted increasing attention in recent years. However, existing sensor models remain largely siloed by modality and task, lacking a unified paradigm for integrating diverse wearable modalities, training strategies, and achieving robust generalization in real-world applications. Motivated by the success of multimodal foundation models, which learn transferable representations from massive multimodal data, we argue that Large Sensor Models (LSMs), defined as foundation models trained on large-scale and multimodal wearable data, offer a promising pathway toward a more general and scalable framework for wearable AI. In this position paper, we formalize the data substrate underlying LSMs, analyze the unique challenges of large-scale wearable sensing, and articulate two directions: (i) LSMs without language capability and (ii) LSMs with language capability. We further discuss representative application areas that can be unlocked by such models. Through this paper, we encourage the community to explore LSMs as a foundational approach for the next generation of human-centric AI systems.
In this paper, we propose a digital control approach for multi-input multi-output negative imaginary (NI) systems using discrete-time hybrid integrator-gain systems (HIGS) controllers. We show the NI property of the bimodal and trimodal discrete-time HIGS, as well as of their parallel combinations, which are referred to as multi-HIGS. We also demonstrate that linear NI systems can be asymptotically stabilized using discrete-time HIGS in digital control. We apply discrete-time bimodal and trimodal multi-HIGS controllers to a two-input two-output dual-stage force sensor with lightly damped resonant modes. To validate the theoretical findings, the closed-loop performance is evaluated in both the time and frequency domains. Experimental results show that the discrete-time multi-HIGS effectively suppresses resonances while preserving favorable phase characteristics, which highlights its potential as a robust nonlinear NI controller for the digital control of NI systems.
A hybrid physical and geometrical optics method is proposed to model subsurface imaging with mmWave FMCW radar. Modeling the wave propagation in subsurface imaging can improve the interpretation of acquired data and imaging results. Full-wave simulation is commonly used to simulate wave propagation; however, at high frequencies such as mmWave, it is difficult to apply because it demands large computational resources and long run times. In this paper, physical and geometrical optics are hybridized to simulate wave propagation in subsurface imaging scenarios. In the proposed method, the physical optics method is used to calculate the reflection from the object, and the geometrical optics method is used to calculate the transmission of the wave through the object. By combining the results from physical and geometrical optics, the wave propagation in subsurface imaging scenarios is simulated. Synthetic-aperture radar imaging is applied to the simulated data, and the image is successfully reconstructed. Furthermore, an experimental setup is developed, and a comparison between simulation and experiment is carried out. The results demonstrate that the proposed simulation method can model subsurface imaging with mmWave FMCW radar.
The development of 6G networks brings an increasing variety of data services, which motivates a hybrid computation paradigm that coordinates over-the-air computation (AirComp) and edge computing for diverse and effective data processing. In this paper, we address this emerging issue of hybrid data computation from an energy-efficiency perspective, where the coexistence of both computation types induces resource competition and interference and thus complicates network management. Accordingly, we formulate the problem of minimizing the overall energy consumption of data transmission and computation, subject to offloading capacity and aggregation accuracy constraints. We then propose a block coordinate descent framework that decomposes the problem into subproblems of user scheduling, power control, and transceiver scaling, which are solved and iterated towards a coordinated hybrid computation solution. Simulation results confirm that our coordinated approach achieves significant energy savings compared to baseline strategies, demonstrating its effectiveness in creating a well-coordinated and sustainable hybrid computing environment.
The joint design of analog beamforming and power allocation is investigated for a single radio-frequency chain multiuser time-division multiple access system under a max-min signal-to-noise ratio (SNR) criterion. A hardware-efficient phased-array architecture is considered, where the beamforming vector is shared by all users and is subject to constant-modulus constraints. For any fixed analog beamformer, the optimal power allocation is first derived in closed form, by which the original problem is reduced to phase-shift optimization only. Then, globally optimal branch-and-bound (BB) algorithms are developed for discrete and continuous phase shifts. Numerical results show that the proposed BB algorithms achieve the global optimum and provide reliable benchmarks for evaluating the performance gap of low-complexity alternating-optimization methods.
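For a fixed constant-modulus beamformer, the closed-form max-min power allocation can be illustrated as follows, assuming a sum-power constraint shared across the TDMA users (the exact constraint structure is an assumption of this sketch): equalizing the users' SNRs by making each power inversely proportional to its effective channel gain.

```python
import numpy as np

# Illustrative closed-form max-min power allocation for a fixed analog
# beamformer under an assumed sum-power constraint. Equalizing the users'
# SNRs is optimal in this setting: p_k is proportional to 1 / g_k.
rng = np.random.default_rng(3)
num_users, P_total, noise = 4, 1.0, 1e-2
h = (rng.normal(size=(num_users, 8)) + 1j * rng.normal(size=(num_users, 8))) / np.sqrt(2)
w = np.exp(1j * rng.uniform(0, 2 * np.pi, 8)) / np.sqrt(8)  # constant-modulus beamformer

g = np.abs(h.conj() @ w) ** 2 / noise          # effective per-user channel gains
p = (P_total / np.sum(1.0 / g)) / g            # equalize SNRs: p_k * g_k is constant
snr = p * g
print(p.sum(), snr)                            # power budget met, identical SNRs
```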
Near-field extremely large multiple-input multiple-output (XL-MIMO) breaks the assumptions that make classical super-resolution effective: the receiver acquires only a limited set of compressed pilot observations, while each propagation path is jointly determined by angle and distance under a spherical-wave model. This invalidates the far-field Vandermonde structure exploited by conventional methods, and many existing near-field formulations are not truly gridless: they discretize range and angle and thus inherit mismatch, coherence, and resolution loss. This paper develops a continuous 2D super-resolution framework for hybrid near-field measurements that avoids range and angle gridding. The key idea is to reparameterize distance through inverse range, which reveals a compact spectral structure for the near-field spherical-wave manifold. Building on this observation, we introduce a panelized weighted fitting strategy that converts the range-dependent Fresnel terms into a stable transform-domain representation, resulting in a lifted model in which each continuous range-angle pair is embedded as a structured rank-one atom and the measurement model remains linear under hybrid combining. Recovery is then posed as a 2D atomic norm minimization, with path localization certified through a dual polynomial over the transformed domain. Numerical experiments show exact support recovery in the noiseless setting using only a few compressed hybrid measurements. These results establish the proposed inverse-range atomic norm viewpoint as a new gridless foundation for near-field sensing and channel estimation in hybrid XL-MIMO and integrated sensing and communication systems.
The development of medium-voltage direct current (MVDC) cable systems for wide-body all-electric aircraft (AEA) requires insulation technologies capable of operating reliably under reduced-pressure environments. Conventional underground cable insulation, designed for atmospheric conditions, exhibits degraded partial discharge (PD) and dielectric performance at low pressure, limiting its applicability to aerospace systems. This work presents a controlled experimental comparison between a conventional single-layer extruded insulation system and a micro-multilayer multifunctional electrical insulation (MMEI) architecture, in which all cable components are kept identical except for the insulation. The MMEI system is implemented with only 10% of the baseline insulation thickness to evaluate the effectiveness of insulation architecture in enhancing performance. PD characteristics and dielectric strength are experimentally evaluated under DC voltage at atmospheric pressure and 18.8 kPa. Results show that the MMEI-based cable exhibits higher PD inception voltage (PDIV) and maintains a detectable PD extinction voltage (PDEV) under reduced pressure, unlike the conventional cable. Furthermore, despite its significantly reduced thickness, the MMEI system demonstrates a substantial increase in dielectric breakdown strength, withstanding voltages exceeding 20 kV compared to below 5 kV for the conventional design under low-pressure conditions. These findings demonstrate that insulation architecture, rather than thickness alone, governs performance in MVDC aerospace cables. The results highlight the potential of MMEI systems to enable lighter, more compact, and higher-performance cable designs for future electrified aviation platforms.
Segmented pinching antenna-assisted integrated sensing and communication (ISAC) systems enable flexible spatial resource utilization by allowing different waveguide segments to be dynamically configured for transmission and reception. However, the resulting design requires the joint optimization of antenna deployment, segment partitioning, and beamforming under coupled communication and sensing constraints. In this paper, we propose a general learning framework for segmented pinching antenna-assisted ISAC systems. Specifically, a channel state information (CSI)-induced self-graph is constructed to capture the scenario-dependent interactions among communication users and sensing targets. Based on the learned graph representation, a large language model (LLM) backbone with low-rank adaptation (LoRA) is employed, followed by two task-specific output heads for antenna deployment and beamforming prediction, respectively. Simulation results show that the proposed framework achieves a favorable tradeoff between communication rate and sensing accuracy.
This paper develops an entropy-based stability and robustness framework for nonlinear hypergraph dynamics with conservation and flow balance. We consider generator-form systems on the simplex whose state-dependent transition rates capture higher-order (tensor) interactions among nodes. Under a tensor generalized detailed-balance (TGDB) condition, we show that the system admits a unique equilibrium and an entropy Lyapunov function ensuring global asymptotic stability. The Jacobian restricted to the tangent subspace of the simplex is Hurwitz, and its spectral gap determines the exponential convergence rate. Building on this structure, we derive first-order sensitivity bounds of the equilibrium under perturbations of the coupling tensor and establish a local input-to-state stability (ISS) estimate with respect to external inputs. The results reveal a quantitative link between the spectral gap and the system's robustness margin: larger spectral gaps imply smaller equilibrium shifts and faster recovery under structural or parametric perturbations. Numerical experiments on tensor-coupled flow models confirm the theoretical predictions and illustrate how the proposed entropy-dissipating framework unifies stability and robustness analysis for conservative higher-order network systems.
Electroencephalography provides a non-invasive and cost-effective approach for analyzing neural patterns associated with alcohol dependence. However, reported classification performance in EEG-based alcoholism studies varies considerably, often due to differences in validation strategies rather than intrinsic model capability. This study presents a validation-aware machine learning framework to assess the impact of evaluation methodology on classification performance. A balanced multi-channel EEG dataset of 300 trials (150 alcoholic, 150 control) was analyzed using a structured feature representation combining statistical descriptors and spectral band interactions. Five classifiers, including support vector machines (linear and radial basis function kernels), random forest, k-nearest neighbors, and AdaBoost, were evaluated under standard and nested cross-validation protocols. Results show that conventional validation with global hyperparameter tuning introduces optimistic bias. In particular, SVM with radial basis function kernel exhibited a performance decrease of approximately 5\% under nested cross-validation, indicating overestimation. Ensemble-based methods showed more stable generalization, with AdaBoost achieving the highest performance, reaching 78.3\% accuracy ($\pm$4.25), an AUC of 0.868, and balanced sensitivity (78.67\%) and specificity (81.33\%). These findings highlight that validation strategy is a primary determinant of perceived model performance. Statistical analysis using McNemar's test further shows that most performance differences between models are not statistically significant, emphasizing careful interpretation of classification results. The proposed framework provides a reproducible and robust basis for evaluating machine learning models in biomedical signal analysis.
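The contrast between conventional and nested validation can be reproduced with a short scikit-learn sketch; the synthetic data and parameter grid below are placeholders, not the EEG features or exact grids from the study. The inner loop selects hyperparameters, while the outer loop estimates generalization on folds never seen during tuning.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_score, StratifiedKFold
from sklearn.svm import SVC

# Minimal nested cross-validation sketch: hyperparameters are tuned only on
# inner folds, so the outer score is free of tuning-induced optimism.
X, y = make_classification(n_samples=300, n_features=40, random_state=0)

inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.001]}

tuned_svm = GridSearchCV(SVC(kernel="rbf"), grid, cv=inner)      # inner tuning
nested_scores = cross_val_score(tuned_svm, X, y, cv=outer)       # outer estimate
print(f"nested CV accuracy: {nested_scores.mean():.3f} +/- {nested_scores.std():.3f}")
```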
Orthogonal frequency-division multiplexing (OFDM) is a dominant waveform in modern wireless systems, yet its high peak-to-average power ratio (PAPR) and limited adaptability hinder efficient support for integrated communication and sensing. This paper proposes deep block-unitary precoded OFDM (DBU-OFDM), a structure-preserving learning framework that enables trainable waveform adaptation while preserving the DFT-based signal structure, pilot/null resource protection, and compatibility with low-complexity frequency-domain equalization. The proposed design restricts learning to a block-unitary transformation over data subcarriers and preserves pilot and null resources for structural compatibility. The transform is parameterized by recursive Householder reflections, ensuring strict unitarity as well as differentiable, numerically stable, and complexity-controllable implementation. Results show that DBU-OFDM achieves PAPR tails close to block-pilot DFT-s-OFDM while retaining comb-type pilots, improves communication reliability in frequency-selective fading via frequency-domain diversity, and enhances range and velocity estimation in direct sensing, especially in dimension-limited settings. Over-the-air USRP experiments and FPGA prototyping further verify its practical feasibility, demonstrating low error vector magnitude (EVM), clear PAPR reduction in real transmission, and hardware throughput up to 200~MS/s with microsecond-level latency. DBU-OFDM therefore offers a practical intermediate solution between conventional model-based OFDM waveforms and unconstrained neural transceivers for next-generation integrated communication and sensing systems.
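A minimal sketch of the Householder parameterization follows, assuming illustrative dimensions: composing unit-norm reflection vectors yields a transform that is unitary by construction, so the reflection vectors themselves can serve as the trainable parameters.

```python
import numpy as np

# Sketch of a block-unitary transform built from a product of Householder
# reflections: each factor is exactly unitary, hence so is their product.
# The block size is illustrative, not the paper's configuration.
def householder(v):
    v = v / np.linalg.norm(v)
    return np.eye(v.size, dtype=complex) - 2.0 * np.outer(v, v.conj())

def block_unitary(vectors):
    U = np.eye(vectors.shape[1], dtype=complex)
    for v in vectors:                   # recursive composition of reflections
        U = householder(v) @ U
    return U

rng = np.random.default_rng(4)
n_sub = 16                                              # data subcarriers per block
params = rng.normal(size=(n_sub, n_sub)) + 1j * rng.normal(size=(n_sub, n_sub))
U = block_unitary(params)
print(np.allclose(U.conj().T @ U, np.eye(n_sub)))       # strict unitarity holds
```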
In this work, we propose an interpretable, robust, and lightweight machine learning method for automatic modulation classification (AMC) under dynamic and noisy channel conditions. It is called green automatic modulation classification (GAMC) and targets edge artificial intelligence (AI) with low computational complexity and a small model size. GAMC operates in four stages. First, raw received I/Q signals are transformed into multi-domain representations, including constellation diagrams and spatio-temporal graphs. Second, we extract a comprehensive set of statistical and topological features from the time-series signals, constellation diagrams, and graphs. Third, a supervised feature learning process leverages label guidance to project high-dimensional features into robust, discriminative low-dimensional ones. Finally, a context-aware signal-to-noise ratio (SNR) soft routing mechanism ensembles predictions from downstream classifiers. Experimental results show that GAMC effectively mitigates domain shifts caused by high noise. It strikes a good balance between accuracy and efficiency, reducing the number of model parameters by $50\%$, operating at $3\%$ to $42\%$ of the computational cost of lightweight deep learning models, and maintaining higher accuracy across various SNRs.
Physical Layer Authentication (PLA) exploits the spatial uniqueness of wireless channel characteristics in order to authenticate devices without recourse to higher-layer cryptographic protocols, which remain vulnerable to key compromise. This paper reports a comprehensive PLA system constructed on 5G New Radio (NR) Sounding Reference Signals (SRS) extracted from a real OpenAirInterface (OAI) testbed operating in band n78 (3.5 GHz) with 40 MHz bandwidth and 30 kHz subcarrier spacing. The proposed approach extracts a 2,531-dimensional feature vector per SRS probe, combining per-subcarrier channel state information (1,248 amplitude and 1,247 differential-phase coefficients), power delay profile taps, delay spread, Doppler statistics, and nonlinear dynamics indicators. A deep one-dimensional Residual Network (1D-ResNet) augmented with Squeeze-and-Excitation (SE) attention blocks is employed to classify each probe as either legitimate or spoofed. Evaluation is conducted on 20,317 over-the-air SRS probes acquired across four measurement sessions using a USRP B210 software-defined radio as the legitimate device and a commercial mobile handset as the attacker. Under a strict chronological train/validation/test split that eliminates temporal leakage, an Equal Error Rate (EER) of 3.92% is attained, with AUC = 0.962 on the held-out test set, and an authentication latency of less than 0.1 ms per probe, which is compatible with 5G Ultra-Reliable Low-Latency Communications (URLLC) requirements.
Integrated sensing and communication (ISAC) requires spatial architectures that can flexibly balance data transmission and environment sensing. Segmented pinching antenna-assisted ISAC provides such flexibility by allowing different waveguide segments to be dynamically configured for transmission and reception. However, its design involves the joint optimization of antenna deployment, segment partitioning, and beamforming under coupled communication and sensing constraints, which becomes particularly challenging when the numbers of communication users and sensing targets vary across scenarios. To endow the system with stronger adaptability to changing user and target configurations, we propose a general learning framework for segmented pinching antenna-assisted ISAC systems. Specifically, a channel state information (CSI)-induced self-graph is constructed to produce permutation-invariant representations of user-target interactions, and the resulting features are processed by a large language model (LLM) backbone with two task-specific heads for jointly predicting antenna deployment, segment partitioning, and ISAC beamforming. In addition, a user count transfer mechanism is developed to examine whether the learned deployment policy is site-specific and reusable under changed user configurations. Simulation results show that the proposed framework achieves higher communication rates while maintaining reliable sensing accuracy. Moreover, the learned deployment policy remains highly stable when transferred to other user counts, which reduces the training cost from full model retraining to beamforming head adaptation.
As a key enabler for sixth-generation (6G) wireless communications, reconfigurable intelligent surfaces (RISs) provide the flexibility to control signal strength. Nevertheless, optimizing hundreds of elements is computationally expensive. To overcome this challenge, we present a quantum framework (QGCN) to jointly optimize the physical and electromagnetic response of a double-sided RIS design that incorporates discrete phase shifts and inter-element coupling. The core contribution is the adaptive activation or deactivation of elements, enabling a virtual spacing mechanism using PIN diode switches. We then solve a multi-objective problem that maximizes the minimum user data rate subject to constraints on aperture length and mutual coupling between active elements. Experimental results on IBM Quantum's 127-qubit ibm_kyiv superconducting processor demonstrate that the proposed QGCN algorithm reduces both per-iteration computational complexity and memory requirements compared to existing approaches. Also, the QGCN outperforms classical graph neural networks (GNNs) on an equivalent graph topology by an additional $+$0.38 bps/Hz, and this advantage grows with increasing array size.
With the rapid growth of Multi-access Edge Computing (MEC), secure and efficient computation offloading from user equipment (UEs) to edge access points (APs) is critical. However, DISCO intelligent reflective surface-based fully-passive jammers (DIRS-based FPJs) use random time-varying phase shifts to launch DISCO jamming attacks, disrupting offloading performance. This paper leverages an aerial intelligent reflective surface (AIRS) to enable secure computation offloading against DISCO jamming by jointly optimizing offloading ratios, AIRS phase shifts, and deployment. A two-timescale (2Ts) framework is proposed to address the optimization challenge caused by the distinct update frequencies of different strategies. Specifically, AIRS deployment is adjusted on a long timescale to boost antijamming capability due to the impracticality of frequent physical adjustment, while offloading ratios and phase shifts are optimized on a short timescale to adapt to DIRS-jammed dynamic channel conditions. We propose a dual-agent deep reinforcement learning (DRL)-based AIRS deployment-aided secure computation offloading (DDADSO) scheme to maximize the secure offloading utility under DISCO jamming. Simulation results verify that the proposed DDADSO scheme outperforms benchmark schemes, demonstrating the effectiveness of AIRS deployment in improving offloading performance against DISCO jamming attacks.
Coordinated operation of alkaline water electrolysis (AWE) systems with multiple electrolyzers under fluctuating renewable power input is challenging due to varying power availability and dynamic safety constraints. Moreover, the conventional separation between optimization and control may result in inconsistent decisions across timescales. To address these issues, this paper proposes a two-layer coordinated operation method integrating feedback optimization (FO) with a projection-based safety layer. The FO layer generates real-time reference inputs to improve renewable energy utilization, while the safety layer corrects these inputs to ensure compliance with operational and safety constraints. To explicitly address the safety constraints arising from the inertial dynamics of AWE systems, discrete-time control barrier function theory is incorporated into the safety layer, thereby enhancing safety assurance and online computational tractability. Theoretical analysis establishes the feasibility and effectiveness of the proposed method. Case studies based on annual wind generation data show that the proposed method achieves high energy utilization, maintains safe operation, and demonstrates online applicability, scalability, and robustness.
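The projection-based safety layer can be illustrated on a toy scalar system, assuming a single affine discrete-time CBF constraint (the AWE dynamics and numbers below are placeholders rather than the paper's model): projecting the FO reference input then reduces to a closed-form clip.

```python
# Minimal sketch of a projection-based safety layer with a discrete-time
# control barrier function (CBF). The scalar dynamics and values below are
# illustrative stand-ins for the AWE system.
a, b = 0.98, 1.0           # x_{k+1} = a*x_k + b*u  (toy inertial dynamics)
x_max, gamma = 80.0, 0.2   # safety bound and CBF decay rate

def h(x):                  # barrier: positive inside the safe set
    return x_max - x

def safety_project(x, u_ref):
    """Closest input to u_ref satisfying h(x_next) >= (1 - gamma) * h(x).

    For a single affine scalar constraint the projection is a simple clip."""
    # Constraint: x_max - (a*x + b*u) >= (1 - gamma) * (x_max - x)
    u_upper = (x_max - a * x - (1 - gamma) * h(x)) / b
    return min(u_ref, u_upper)

x, u_fo = 79.0, 5.0                       # state near the bound, aggressive FO input
u_safe = safety_project(x, u_fo)
x_next = a * x + b * u_safe
print(u_safe, h(x_next), (1 - gamma) * h(x))   # corrected input keeps the CBF condition
```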
This paper studies the problem of distributed state estimation of linear time-invariant (LTI) systems under event-triggered communication. For event-triggering mechanisms, the existence of positive minimum inter-event times (MIETs) is an essential property for ensuring practicality. It is widely recognized that dynamic event-triggering mechanisms can effectively reduce redundant communication. However, for distributed observers, it remains unclear whether dynamic event-triggering mechanisms can ensure positive MIETs. This paper proposes a dynamic event-triggered distributed observer. By introducing new comparison functions, it is proven that the dynamic event-triggered distributed observer can guarantee strictly positive MIETs and ensure the exponential convergence of the estimation error. Moreover, most existing works on event-triggered distributed observers only consider node-based event-triggering mechanisms, while both node-based and edge-based dynamic event-triggering mechanisms are constructed in this paper. Numerical examples are provided to illustrate the effectiveness of the proposed results.
Recent progress in brain-guided image generation has improved the quality of fMRI-based reconstructions; however, fundamental challenges remain in preserving object-level structure and semantic fidelity. Many existing approaches overlook the spatial arrangement of salient objects, leading to conceptually inconsistent outputs. We propose a saliency-driven decoding framework that employs graph-informed saliency priors to translate structural cues from brain signals into spatial masks. These masks, together with semantic information extracted from embeddings, condition a diffusion model to guide image regeneration, helping preserve object conformity while maintaining natural scene composition. In contrast to pipelines that invoke multiple diffusion stages, our approach relies on a single frozen model, offering a more lightweight yet effective design. Experiments show that this strategy improves both conceptual alignment and structural similarity to the original stimuli, while also introducing a new direction for efficient, interpretable, and structurally grounded brain decoding.
Understanding the optimization landscape of linear quadratic regulation (LQR) problems is fundamental to the design of efficient reinforcement learning solutions. Recent work has made significant progress in characterizing the landscape of static output-feedback control and linear quadratic Gaussian (LQG) control. For LQG, much of the analysis leverages the separation principle, which allows the controller and estimator to be designed independently. However, this simplification breaks down when the gradients with respect to the estimator and controller parameters are inherently coupled, leading to a more intricate analysis. This paper investigates the optimization landscape of observer-based dynamic output-feedback control of LQR problems. We derive the optimal observer-controller pair in settings where transient quadratic performance cannot be neglected. Our analysis reveals that, in general, the combination of the standard LQR controller and the observer that minimizes the trace of the accumulated estimation error covariance does not correspond to a stationary point of the overall closed-loop performance objective. Moreover, we derive a pair of discrete-time Sylvester equations with symmetric structure, both involving the same set of matrix elements, that characterize the stationary point of the observer-based dynamic LQR problem. These equations offer analytical insight into the structure of the optimality conditions and provide a foundation for developing numerical policy gradient methods aimed at learning complex controllers that rely on reconstructed state information.
This paper investigates a fluid reconfigurable intelligent surface (FRIS)-assisted Rydberg Atomic REceiver (RARE) architecture under magnitude-only heterodyne readout. We show that, unlike conventional coherent systems, the optimal propagation environment is fundamentally governed by the receiver's nonlinear measurement structure. In particular, under the strong-reference regime, symbol detection is limited by residual quadrature leakage after reference alignment, motivating a receiver-induced channel shaping approach rather than conventional channel-centric optimization. Based on this insight, we formulate a signal-independent leakage minimization problem that jointly optimizes the FRIS port set, finite-resolution phase shifts, and the transmit beamformer, resulting in a nonconvex mixed discrete-continuous design. To address this, we develop an alternating-optimization (AO) framework comprising: (i) a closed-form eigenvector solution for widely-linear beamforming, (ii) cross-entropy method (CEM)-based combinatorial port selection, and (iii) coordinate-descent (CD) phase refinement with guaranteed monotonic descent. Simulation results demonstrate fast convergence and consistent bit-error-rate (BER) gains across various modulation orders and receiver dimensions. Moreover, the proposed FRIS-enabled design achieves near-exhaustive performance with significantly reduced complexity and consistently outperforms conventional RIS schemes with fixed elements, highlighting the effectiveness of spatial reconfiguration in suppressing quadrature leakage and the additional spatial degree-of-freedom (DoF) enabled by FRIS for reliable atomic-MIMO detection.
In this paper, we consider the notions of effort and resilience of a dynamical control system, where resilience is defined as the maximum disturbance the system can withstand while satisfying given finite temporal logic specifications. Given a dynamical system and a specification, the objective is to synthesize a controller such that the system satisfies the specification while maximizing its resilience, taking into account input constraints. In addition, we introduce a new metric, called the effort metric, which characterizes the minimal input bound necessary to satisfy a given specification for a perturbed system. The problem for both metrics is formulated as a robust optimization program whose objective is to compute the maximum resilience for the system with input constraints, or the minimal effort, while simultaneously synthesizing the corresponding controller parameters. Moreover, we study the trade-off between resilience and effort, where we seek to maximize resilience and minimize the control effort. For linear systems and linear controllers, exact solutions are provided for the class of time-varying polytopic specifications for the closed-loop and open-loop systems. For the case of nonlinear systems, nonlinear controllers, and more general specifications, we leverage tools from the scenario optimization approach, offering a probabilistic guarantee of the solution as well as computational feasibility. Different case studies are presented to illustrate the theoretical results.
Elastomers are central to vision-based tactile sensors (VBTSs), where they transduce external contact into observable deformation. Different VBTS architectures, however, require distinct optical and mechanical properties, particularly transparency and hardness. Conventional elastomer design relies on a forward, trial-and-error optimisation process from material preparation to property evaluation, which is inefficient and offers limited property scalability and target tunability. In this work, we present i-Tac, an inverse design pipeline for tailoring 3D-printed tactile elastomers with target optical and mechanical properties. Inspired by the composite structure of the human dermis, i-Tac exploits multi-material PolyJet additive manufacturing with three complementary resins. A mixture design methodology is employed to characterise the printed elastomers and establish response surface models (ReSMs) that map material compositions to functional properties, thereby defining a scalable property space. Based on user-defined targets, a desirability-function-based multi-objective optimisation is then performed to identify feasible composition regions and derive an optimal operating window for fabrication. This enables elastomers with desired properties to be manufactured in a single iteration, thereby achieving efficient target tunability. Experimental results validate the proposed i-Tac framework in terms of both property scalability and inverse design performance, showing that i-Tac can effectively tailor elastomer transparency and hardness while reducing the iterative burden of conventional forward design. By fabricating physical sensor samples from both commercial and custom designs, the proposed framework further demonstrates the potential of inverse-designed, monolithically manufactured elastomers for customisable VBTS fabrication.
Digital Subtraction Angiography (DSA) is a clinically significant imaging technique and the gold standard for diagnosing cerebrovascular disease. However, artifacts caused by the motion of high-attenuation tissues such as bones, teeth, and catheters seriously reduce the visibility of blood vessels. This paper presents a novel Vascular Consistency Constrained DSA Imaging Model (VCC-DSA) for robust motion suppression and precise vascular imaging with the following designs: 1) We design a Learning-based Subtraction Mapping Paradigm that resolves the ill-posed problem of existing learning-based methods and enhances the stability of the algorithm. 2) Our model employs Residual Dense Blocks and a details-shortcut to improve performance on complex structures, such as moving bones overlapping with blood vessels, and on small features, such as peripheral vessels. 3) An innovative Vascular Consistency Strategy is proposed to extract the intrinsic consistency among the various relative motions in mask-live image pairs, so that the model spontaneously distils the vascular structure revealed by contrast-agent development, robustly suppresses motion artifacts, and naturally alleviates the strict matching requirements on the data. 4) We design a Mixup-based Data Self-evolution Strategy for intra-dataset self-enhancement within the training loop, so that the training data is dynamically optimized to help the model better learn vascular features while excluding irrelevant structures in the live/mask images and even inevitable artifacts or fake structures in the labels. To further evaluate practical value, a prospective general-anesthesia animal experiment is conducted in addition to the assessment on human clinical data. Compared with other methods, our model improves PSNR and SSIM by 73.4% and 8.56%, respectively.
The segmentation of 2D vascular structures via deep learning holds significant clinical value but is hindered by the scarcity of annotated data, severely limiting its widespread application. Developing a universal few-shot vascular segmentation model is highly desirable, yet remains challenging due to the need for extensive training and the inherent complexities of vascular imaging. In this work, we propose UniVG (Generative Data-engine Foundation Model for Universal Few-shot 2D Vascular Image Segmentation), a novel approach that learns the compositionality of vascular images and constructs a generative foundation model for robust vascular segmentation. UniVG enables the synthesis and learning of diverse and realistic vascular images through two key innovations: 1) Compositional learning for flexible and diverse vascular synthesis: it decomposes and recombines vascular structures with varying morphological features and diverse foreground-background configurations to generate richly diverse synthetic image-label pairs. 2) Few-shot generative adaptation for transferable segmentation: it fine-tunes pre-trained models with minimal annotated data to bridge the gap between synthetic and real vascular domains, synthesizing authentic and diverse vessel images for downstream few-shot vascular segmentation learning. To support our approach, we develop UniVG-58K, a large dataset comprising 58,689 vascular images across five imaging modalities, facilitating robust large-scale generative pre-training. Extensive experiments on 11 vessel segmentation tasks across 5 modalities (with only 5 labeled images per task) demonstrate that UniVG achieves performance comparable to fully supervised models, significantly reducing data collection and annotation costs. All code and datasets will be made publicly available at this https URL.
Consider a non-uniform Euler-Bernoulli beam with a tip-mass at one end and a cantilever joint at the other end. The cantilever joint is not fixed and can itself be moved along an axis perpendicular to the beam. The position of the cantilever joint is the control input to the beam. The dynamics of the beam is governed by a coupled PDE-ODE model with boundary input. On a natural state-space, there exists a unique state trajectory for this beam model for every initial state and each twice continuously differentiable control input which is compatible with the initial state. In this paper, we study the motion planning problem of transferring the beam model from an initial state to a final state over a prescribed time-interval and then employ the results obtained to establish the approximate controllability of this model. We address these problems by extending and applying the generating functions approach to flatness-based control to the beam model. We prove that the transfer described above is feasible if the initial and final states belong to a certain set, which also contains the steady-states of the beam model. We then establish that this set contains all the eigenfunctions of the beam model, which form a Riesz basis for the state-space, and thereby conclude the approximate controllability of the beam model over all time intervals. We illustrate our theoretical results on motion planning using simulations and experiments.
In the field of medical image segmentation, the scarcity of labeled data poses a major challenge for existing models to accurately perceive target regions. Compared with manual annotation, gaze data is easier and cheaper to obtain. As a classical semi-supervised learning framework, mean-teacher can effectively use a large number of unlabeled medical images for stable training through self-teaching and collaborative optimization. Building on the mean-teacher framework and incorporating gaze data, our study aims to address two crucial issues in semi-supervised medical image segmentation: 1) expanding the scale and diversity of the dataset with limited labeled data; 2) enhancing the network's perception ability. We propose the Human Gaze-based Dual Teacher Guidance Learning model (HG-DTGL). In this model, human gaze serves as an additional hidden `teacher' in the mean-teacher architecture. We introduce GazeMix to generate reliable mixed data that expands the diversity and scale of the dataset, and a Multi-scale Gaze Perception (MGP) module to extract multi-scale perceptual cues within the network. A Gaze Loss is designed to align the model's perception with human gaze. Through extensive experiments, we verify HG-DTGL on multiple datasets of different modalities and achieve superior performance on a total of ten different organs/tissues. This demonstrates that our method has strong generalization ability for medical images of different modalities, and shows the great application potential of gaze data in semi-supervised medical image segmentation.
Artificial intelligence (AI) is driving rapid growth in electricity demand, yet the grid-facing power dynamics of AI data centers remain poorly understood. Here we show that, in shared-GPU systems, the composition of batch and inference workloads decouples aggregate power variability from short-horizon ramping. As the inference share rises, variability becomes U-shaped, whereas ramping becomes hump-shaped, particularly under higher loading. The magnitude and turning points of these patterns also depend on system loading. Using a trace-calibrated framework linking workload arrivals, queueing, scheduling, and GPU power, we show that the underlying mechanism is asymmetric. At intermediate workload mixes, queued batch jobs fill capacity left idle by fluctuating inference demand, reducing aggregate power variability. However, short-horizon ramping remains elevated because inference-side fluctuations propagate more directly into realized power. AI data centers should therefore be understood as dynamic systems whose workload composition shapes their grid impact.
Energy infrastructure planning under uncertainty has become increasingly complex as electrification, interdependence between energy carriers, decarbonization, and extreme weather events reshape long-term investment decisions. This paper surveys recent advances at the intersection of generation and transmission expansion, and optimization under uncertainty, with a focus on stochastic programming, robust optimization, and distributionally robust optimization. We then categorize modeling needs along the axes of modeling fidelity, uncertainty characterization, and solution methods to identify dominant modeling features and trace research gaps. We further examine emerging directions at the interface of optimization and machine learning, including surrogate modeling, learning uncertainty sets, probabilistic forecasting, and synthetic scenarios, and discuss how these tools can be embedded within infrastructure planning models.
Brain organoid interfaces that seek neuromodulator readout benefit from chemical receivers with molecular specificity and tolerance to drift. This paper presents a receiver-centric theoretical study of a control-referenced tri-channel organic electrochemical transistor (OECT) receiver with dopamine- and serotonin-selective pixels alongside a hydrogel-matched control pixel. The Ag/AgCl electrode provides the electrochemical gate reference, whereas the control pixel is used only as a matched reference for common-mode drift and other low-frequency baseline fluctuations during amplitude decisions. We couple finite-duration release, restricted diffusion with clearance, aptamer binding, OECT transduction, and correlated thermal, flicker, and drift noise, and we evaluate MoSK, CSK-4, and a 2-bit Hybrid detector on the same front-end by Monte Carlo simulation. At $r=45$ micrometers, control referencing mainly benefits the Hybrid amplitude branch, reducing Hybrid SER from $3.71\times 10^{-2}$ to $1.09\times 10^{-2}$ at $N_m=1.40\times 10^4$ molecules/symbol while barely changing the MoSK component. In calibrated no-ISI front-end benchmarks, Hybrid+CTRL reaches an LoD of 11866 molecules/symbol at 45 micrometers and remains below CSK-4+CTRL over much of the medium-to-long-distance range studied. The reported SER and LoD values are scenario-based receiver forecasts, whereas the more transferable result is the regime-dependent rule for when matched control referencing benefits Hybrid amplitude decoding.
We propose CisLunarSense, an opportunistic integrated sensing and communication (ISAC) framework that exploits the Lunar Gateway's Ka-band relay for monostatic debris detection, addressing the absence of cislunar space situational awareness infrastructure beyond the reach of ground-based radars. Using NASA/ESA-documented system parameters with author-selected sensing settings and a CR3BP-based 9:2 near-rectilinear halo orbit model, we derive the orbit-phase-dependent Cramér--Rao bound under OFDM inter-carrier interference, quantify a 36~dB cislunar sensing advantage over a ground-based Ka-band reference, and design a velocity-adaptive processor with mode switching at 337~m/s. Gateway operational debris ($v_\mathrm{rel} < 50$~m/s) is detectable within 700~km with over 30~minutes of warning; external threats ($v_\mathrm{rel}$ up to 500~m/s) remain detectable within 400--630~km. An orbit-phase-adaptive allocation reduces the sensing duty cycle from 60\% to 19\%, increasing relay throughput from 44 to 90~Mbps. A closed-form sensing outage probability for $K$-CPI non-coherent integration under Swerling~I fluctuation shows that the 10\%-outage detection range reaches 91\% of the deterministic maximum at the nominal operating point $K = 16$.
System identification remains an intriguing challenge for lithium-ion batteries, as many models are nonlinear, exhibit multi-physics coupling, and involve a large number of parameters. In this paper, we address this challenge using the ensemble Kalman inversion (EnKI) method for battery system identification. EnKI performs maximum a posteriori parameter estimation through successive local Gaussian approximations, enabling an iterative and incremental search for unknown parameters. The search combines Monte Carlo sampling with Kalman-type updates to evolve an ensemble of samples, thereby offering empirical stability and the ability to handle strongly nonlinear models. We validate the proposed approach on two equivalent circuit models with coupled electro-thermal dynamics, through both simulation and experiments. The results demonstrate that the proposed approach achieves accurate parameter estimation with rapid iterative convergence, and it shows strong potential for application to other battery models.
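As a concrete illustration of the ensemble Kalman inversion update described above, the following is a minimal Python/NumPy sketch. The first-order RC voltage response, its parameters, and all numerical settings are illustrative assumptions, not the coupled electro-thermal equivalent circuit models or data used in the paper.

```python
import numpy as np

def enki_step(theta, forward, y, gamma, rng):
    """One ensemble Kalman inversion (EnKI) update.

    theta  : (J, p) ensemble of parameter samples
    forward: maps a (p,) parameter vector to a (d,) model output
    y      : (d,) measured data
    gamma  : (d, d) observation noise covariance
    """
    J = theta.shape[0]
    G = np.array([forward(t) for t in theta])          # (J, d) model outputs
    t_mean, g_mean = theta.mean(0), G.mean(0)
    dT, dG = theta - t_mean, G - g_mean
    C_tg = dT.T @ dG / (J - 1)                         # parameter-output cross-covariance
    C_gg = dG.T @ dG / (J - 1)                         # output covariance
    K = C_tg @ np.linalg.inv(C_gg + gamma)             # Kalman-type gain
    noise = rng.multivariate_normal(np.zeros(len(y)), gamma, size=J)
    return theta + (y + noise - G) @ K.T               # updated ensemble

# Toy example (hypothetical): identify gain and time constant of a relaxation curve.
rng = np.random.default_rng(0)
t = np.linspace(0, 10, 50)
def forward(p):                                        # p = [R0, tau]
    return p[0] * np.exp(-t / p[1])
true = np.array([0.05, 3.0])
y = forward(true) + 1e-4 * rng.standard_normal(t.size)
theta = rng.uniform([0.01, 1.0], [0.1, 6.0], size=(100, 2))
gamma = 1e-8 * np.eye(t.size)
for _ in range(20):
    theta = enki_step(theta, forward, y, gamma, rng)
print(theta.mean(0))   # ensemble mean should move toward [0.05, 3.0]
```

The iterative Monte Carlo ensemble combined with the Kalman-type gain is what gives the method its derivative-free, incremental search behavior on nonlinear models.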
Rain attenuates Ku-band satellite signals by up to 20~dB, encoding precipitation information along the Earth-space slant path. This paper derives the Bayesian Cramér-Rao bound (BCRB) for rain rate estimation from LEO broadband OFDM downlinks. Using corrected ITU-R P.838-3 coefficients, the standard CRB yields a minimum detectable rain rate $R_{\min} \approx 4.3$~mm/h for a single link at the $38^\circ$ reference elevation. We derive the prior Fisher information in closed form for log-normal rain ($c_v = 1.05$, from 186{,}292 samples) and show that a single-snapshot BCRB reduces $R_{\min}$ to 1.1~mm/h; exploiting temporal correlation ($\rho = 0.95$) over a 30-min window further tightens it to 0.95~mm/h, while multi-link fusion across $N = 215$ links lowers the operating-point RMSE \emph{lower bound} at $R = 20$~mm/h to approximately 0.07~mm/h. Building on these bounds, we formulate a weather-adaptive pilot allocation that minimizes the BCRB subject to a hard spectral-efficiency constraint, characterize its three-regime structure (full-sensing, throughput-tracking, outage), and pair it with a CUSUM rain onset detector achieving sub-10-min delay for $R \geq 20$~mm/h. A closed-form analysis of dynamic LEO slant geometry identifies a sensing-optimal elevation at the P.618-validity floor of $15^\circ$ that yields a $1.58\times$ geometric improvement over the $38^\circ$ baseline, exposing a structural anti-correlation between sensing- and communication-optimal elevations along an orbital pass. Validation against 9.4~million radar samples from 215 Ku-band GEO satellite links ($r = 0.72$, RMSE $= 1.24$~dB) and 113 rain gauges confirms the underlying attenuation model; the bounds transfer to LEO constellations under matched OFDM signal parameters, with dedicated LEO validation left for future work.
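The role of the prior in the bound above can be illustrated with the generic scalar relation BCRB $= 1/(J_{\text{data}} + J_{\text{prior}})$. The sketch below uses back-of-envelope placeholder values for the Fisher information terms, chosen only to reproduce the order of magnitude of the quoted single-link bounds; they are not the paper's calibrated quantities.

```python
import numpy as np

def bcrb(J_data, J_prior=0.0):
    """Bayesian CRB (variance lower bound) for a scalar parameter:
    inverse of the summed data and prior Fisher information."""
    return 1.0 / (J_data + J_prior)

J_data = 0.054          # hypothetical Fisher information from one link snapshot
J_prior = 0.80          # hypothetical prior information from a rain-rate climatology

print(np.sqrt(bcrb(J_data)))            # standard CRB-based RMSE bound (no prior)
print(np.sqrt(bcrb(J_data, J_prior)))   # BCRB-based bound, tightened by the prior

# With N independent links the data term grows roughly as N * J_data:
N = 215
print(np.sqrt(bcrb(N * J_data, J_prior)))
```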
Power system simulation workflows remain expert-intensive. Engineers must translate study intents into code or API calls, execute analyses, and interpret outputs. To automate this workflow, this paper presents PFAgent, a tractable and self-evolving power-flow agent for interactive grid analysis. PFAgent integrates four key capabilities: i) a tractable and interactive architecture for intent parsing, knowledge retrieval, tool execution, and structured reporting; ii) a self-evolution mechanism combining verification-driven refinement and human-in-the-loop feedback; iii) an AI-assisted evaluation and debugging loop that leverages conversational context, generated code, and execution errors for iterative fixing; and iv) an evaluation framework covering task success, convergence validity, numerical consistency, and explanation quality. Verification on IEEE benchmark systems shows that PFAgent can automate case change, analyze voltage violations, perform N-1 contingency analysis, generate plots and concise summaries, and return reproducible results with transparent execution logs. The proposed framework highlights a shift from conventional simulation tools to interactive, tractable, and self-evolving agents for power system analysis.
The decoupling of multivariate functions is a powerful modeling paradigm for learning multivariate input-output relations from data. For the single-layer case, established CPD-based methods are available, but the multi-layer case has remained largely unexplored. This work introduces a tensor-based framework for multi-layer decoupling, based on ParaTuck-type tensor decompositions and constrained optimization. We provide theoretical justification for the considered tensor decompositions and parameterizations. Furthermore, we formulate a structured coupled matrix-tensor factorization that incorporates both Jacobian and function evaluations, together with a bilevel optimization approach for adaptively balancing first- and zeroth-order information. The feasibility of the proposed methodology is illustrated on synthetic systems, a nonlinear system identification benchmark, and neural network compression.
Wireless goal-oriented semantic communication (GSC) has emerged as a promising paradigm by directly optimizing task performance. However, existing GSC frameworks typically operate on entire images and rely on labeled data for classification tasks, which can limit their compression efficiency and increase the risk of overfitting. This paper proposes a novel semi-supervised wireless GSC framework for the unlabeled image foreground classification task. In our proposed framework, a foreground-aware masked autoencoder (MAE) is developed to prioritize semantically important foreground objects, thereby reducing transmission overhead. To enable accurate reconstruction and classification under a limited data size, we further propose a semi-supervised autoencoder (SSAE) that decodes the semantic latent tensor and refines image details by leveraging three complementary information sources, followed by fine-tuning a pre-trained image classification model. The entire pipeline, from foreground masking to classification, is trained in a semi-supervised manner to significantly reduce the need for manual labeling. Simulation results validate that the proposed GSC framework achieves over 90% image classification accuracy while reducing the original image data size by 95%, and demonstrate its strong potential for practical tasks in resource-constrained wireless scenarios.
This paper investigates the joint optimization of beamforming and antenna positions in fluid antenna system (FAS)-aided anti-jamming communications. We consider a multi-user multiple-input multiple-output downlink scenario in which multiple malicious jammers exist and the jammer channel state information is imperfect. The goal is to maximize the worst-case sum-rate under quality-of-service and transmit power constraints. To achieve this, we develop two distinct optimization frameworks for continuous and discrete antenna position designs, respectively. For the continuous design, we propose an alternating optimization (AO) framework that integrates successive convex approximation and majorization minimization (MM) to handle the highly non-convex problem. For the discrete design, based on the minimum mean squared error criterion and MM, we reformulate the problem as a sparse recovery task and propose a low-complexity algorithm combining block coordinate descent and simultaneous orthogonal matching pursuit, which enables joint design rather than AO. Through systematic comparison, we uncover a practical phenomenon: the discrete joint design yields superior sum-rate performance compared to the AO-based continuous counterpart under identical conditions. This superiority stems from the sparse recovery formulation, which effectively circumvents severe local optima. Our findings challenge the conventional view that continuous optimization is inherently superior, and reveal that discretization combined with sparse recovery can offer a more effective paradigm for exploiting spatial degrees-of-freedom in FAS-aided anti-jamming communications.
Robust radio signal recognition is fundamental to spectrum management, electromagnetic space security, and intelligent wireless applications, yet existing deep-learning methods rely heavily on large labeled datasets and struggle to capture the multi-domain characteristics inherent in real-world signals. To address these limitations, we propose an unsupervised equivalent contrastive learning method that leverages four information-lossless equivalent transformations, spanning the time, instantaneous, frequency, and time-frequency domains, to construct multi-view and semantically consistent representations of each signal. An equivalent contrastive learning strategy then aligns these complementary views to learn discriminative and transferable embeddings without requiring labeled data. Once pre-training is completed, the resulting model can be directly fine-tuned on downstream tasks using only raw signal samples, without reapplying any equivalent transformations, which reduces computational overhead and simplifies deployment. Extensive experiments on four public datasets demonstrate that the proposed method consistently outperforms state-of-the-art contrastive baselines under linear evaluation, few-shot semi-supervised learning, and cross-domain transfer settings. Notably, the learned representations yield substantial gains in few-shot regimes and challenging channel conditions, confirming the effectiveness of multi-domain equivalent modeling in enhancing robustness and generalization. This work establishes a principled pathway for exploiting massive unlabeled radio data and provides a foundation for future self-supervised learning frameworks in wireless systems.
Meta-backscatter systems that utilize meta-material sensors are a promising enabler for future environmental sensing, offering distinct advantages such as low cost, zero-power consumption, and robustness. Specifically, the electromagnetic response of the sensor, typically characterized by a frequency-selective absorption profile, is affected by the environmental conditions, allowing these conditions to be estimated from the reflected signal. However, it remains unclear what estimation accuracy is fundamentally achievable. Motivated by this gap, we quantify this accuracy limit using the Bayesian Cramér-Rao bound (BCRB), which provides a lower bound on the mean-squared error of the environmental condition estimate. Establishing this limit is challenging because the electromagnetic response of the sensor is distorted by channel fading, while channel estimation is infeasible since the sensors cannot be configured to predefined states to generate training data. To address this challenge, we consider the joint BCRB of the channel coefficient and the environmental condition in a multicarrier framework. The BCRB of the environmental condition is then obtained by selecting the corresponding element from the joint BCRB. An analysis of the derived BCRB reveals the impact of the absorption peak shape and the number of subcarriers. The derivation and analysis of the BCRB are verified through simulations.
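The step of selecting the environmental-condition element from the joint BCRB can be sketched as follows. The joint Bayesian Fisher information values below are hypothetical placeholders, used only to illustrate how coupling with the unknown channel coefficient inflates the bound relative to a known-channel analysis.

```python
import numpy as np

# Joint Bayesian FIM over (channel coefficient h, environmental condition s).
# The entries are arbitrary positive-definite placeholders, not derived values.
J_bayes = np.array([[12.0, 1.5],    # row/col 0: channel coefficient h
                    [ 1.5, 0.9]])   # row/col 1: environmental condition s

bcrb = np.linalg.inv(J_bayes)       # joint Bayesian CRB matrix
bound_s = bcrb[1, 1]                # select the element for the environmental condition
print(bound_s)

# For comparison: pretending the channel were known collapses the bound to
# 1 / J_ss, which is optimistic whenever the off-diagonal coupling is nonzero.
print(1.0 / J_bayes[1, 1])
```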
Semantic communication has been increasingly integrated into edge computing systems for reconstruction tasks, owing to its advantages in source compression, robustness to channel noise, and task execution efficiency. However, the black-box nature of neural-network (NN)-based semantic codecs, together with the noisy transmission of semantic features, makes it difficult to allocate transmission resources and guarantee reconstruction quality for multiple users. In this paper, we propose a reliable online resource allocation framework for a semantic-driven multi-user edge computing system, where multiple users encode source information into semantic features and offload reconstruction to an edge server. We formulate a multi-user resource optimization problem whose objective jointly accounts for system-wide reconstruction performance and transmission latency, under constraints that guarantee each user's minimum reconstruction quality. To solve this problem, we develop a Bayesian optimization (BO)-based online algorithm that enables flexible control of the user-side semantic compression ratio (CR) and allocation of transmission rates. The edge server jointly determines each user's CR and transmission rate by exploiting Gaussian-process (GP) models that capture the relationship between reconstruction performance, signal-to-noise ratio (SNR), and CR, and by employing an acquisition function to select CRs that satisfy the performance quality constraints while maximizing the objective. Simulation results on high-resolution video-frame reconstruction datasets demonstrate that the proposed method selects near-optimal CRs via the GP surrogate and acquisition function, achieving a 98.03% constraint-satisfaction rate and reducing transmission latency by more than 45% compared with fixed-CR schemes.
We study the nonlinear inverse problem arising in Temporal CT, a multi-source computed-tomography architecture in which $N_S = 3$ simultaneously active X-ray sources produce $M = 5$ mixed Poisson intensity measurements of $K = 3$ unknown line-integral attenuations per projection bundle. The forward model is a sum of exponentials and creates two distinct sources of performance loss: an irreducible aggregation loss fixed by the measurement geometry, and a reducible algorithmic inefficiency that improved estimators can close. We derive closed-form Cramér-Rao bounds and inflation factors for this problem; at unequal attenuation the inflation ratios vary and can be considerably larger. We introduce SNN1, a near-optimal classical per-bundle algorithm that brings endpoint paths to within 1-2% of their CRBs, and evaluate a physics-motivated residual neural network across three datasets ordered by increasing sinogram structure: RND (synthetic), SGS (analytical chest phantom), and PIS (patient-image-derived). On SGS the NN beats SNN1 at high attenuation by 33-67% but cannot cross the equal-dose single-source floor; on PIS the evaluation ratio drops below 1.0 at bin 6 and reaches 0.096 at bin 9, confirming that the anatomical prior learned from this patient is concentrated enough to dominate the collapsed Fisher information at high attenuation -- a characterization of prior informativeness, not a claim of clinical generalizability beyond the single patient studied. A cross evaluation (SGS-trained model on the PIS test set) shows that a concentrated wrong prior is catastrophically worse than a broad wrong prior, underscoring prior diversity as a critical requirement for any future multi-patient deployment. Quantitative sinogram correlation analysis motivates a companion strip-processing architecture that exploits inter-bundle structure inaccessible to the per-bundle algorithms of this paper (Thread 1).
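For readers unfamiliar with the per-bundle bound, the following is a minimal sketch of the Poisson Cramér-Rao machinery for a sum-of-exponentials forward model. The source intensities and the source-to-measurement incidence pattern are illustrative assumptions, not the actual Temporal CT geometry of the paper.

```python
import numpy as np

# Poisson measurements whose means mix exponentially attenuated sources:
# Fisher information J = sum_m (1/lam_m) * grad(lam_m) grad(lam_m)^T, CRB = J^{-1}.
I0 = np.array([1e4, 1e4, 1e4])                # hypothetical per-source photon budgets
A = np.array([[1, 0, 0],                      # A[m, s] = 1 if source s contributes
              [0, 1, 0],                      #   to measurement m (illustrative mix)
              [0, 0, 1],
              [1, 1, 0],
              [0, 1, 1]], dtype=float)

def crb(theta):
    J = np.zeros((3, 3))
    for m in range(A.shape[0]):
        lam = np.sum(A[m] * I0 * np.exp(-theta))          # mean photon count
        grad = -A[m] * I0 * np.exp(-theta)                 # d lam / d theta
        J += np.outer(grad, grad) / lam                    # Poisson Fisher contribution
    return np.linalg.inv(J)

theta = np.array([2.0, 2.0, 2.0])             # equal attenuation case
print(np.sqrt(np.diag(crb(theta))))           # per-path CRB standard deviations
```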
Rapid growth in artificial intelligence (AI) workloads is driving up data center power densities, increasing the need for advanced thermal management. Direct-to-chip liquid cooling can remove heat efficiently at the source, but many cold plate channel layouts remain heuristic and are not optimized for the strongly non-uniform temperature distribution of modern heterogeneous packages. This work presents a generative design framework for synthesizing cooling channel geometries for the NVIDIA GB200 Grace Blackwell Superchip. A physics-based finite-difference thermal model provides rapid steady-state temperature predictions and supplies spatial thermal feedback to a constrained reaction-diffusion process that generates novel channel topologies while enforcing inlet/outlet and component constraints. By iterating channel generation and thermal evaluation in a closed loop, the method naturally redistributes cooling capacity toward high-power regions and suppresses hot-spot formation. Compared with a baseline parallel channel design, the resulting channels achieve a reduction of more than 5 degrees Celsius in average temperature and more than 35 degrees Celsius in maximum temperature. Overall, the results demonstrate that coupling generative algorithms with lightweight physics-based modeling can significantly enhance direct-to-chip liquid cooling performance, supporting more sustainable scaling of AI computing.
Traditional Active Noise Control (ANC) systems are mostly based on FxLMS algorithms, but such algorithms rely on linear assumptions and are often limited in handling broadband non-stationary noise or nonlinear acoustic paths. Moreover, traditional methods attempt to cancel all incoming signals, so noise reduction often inadvertently damages the speech signal and hinders normal communication. To tackle these issues, this study proposes a speech-preserving deep learning ANC system that achieves stable noise reduction while effectively retaining speech in complex acoustic environments. We build an end-to-end control architecture whose core adopts a Convolutional Recurrent Network (CRN). The structure uses a long short-term memory (LSTM) network to capture the temporal characteristics of acoustic signals and, combined with complex spectral mapping (CSM), effectively addresses nonlinear distortion. To retain useful speech while removing noise, we also design a dedicated speech-retention loss function, which guides the model to selectively preserve the target speech while suppressing environmental noise by identifying characteristic spectral structures. In addition, to verify the system's effectiveness in realistic scenes, we use the Image Source Method (ISM) to build a high-fidelity acoustic simulation environment that reproduces real reverberation effects. Experimental results demonstrate that the proposed Deep ANC system achieves significantly better noise reduction than the traditional FxLMS algorithm, especially for non-stationary noises such as crowd babble, while PESQ- and STOI-based evaluations confirm that the system preserves both the naturalness and intelligibility of the target speech.
Vehicular formation control is an important component of intelligent transportation systems (ITSs). In practical implementations, the controller design needs to satisfy multiple state constraints, including inter-vehicle spacing and vehicle speed. When system states approach the constraint boundaries, control singularity and excessive control effort may arise, which limits the practical applicability of existing methods. To address this problem, this paper investigates a class of nonlinear vehicular formation systems for autonomous vehicles (AVs) with uncertain dynamics and develops a switched event-triggered control framework. A smooth nonlinear mapping is first introduced to transform the constrained state space into an unconstrained one, thereby avoiding singularity near the constraint boundaries. A radial basis function neural network (RBFNN) is then employed to approximate the unknown nonlinear dynamics online, based on which an adaptive controller is constructed via the backstepping technique. In addition, a switched event-triggered mechanism (SETM) is designed to increase the control update frequency during the transient stage and reduce the communication burden during the steady-state stage. Lyapunov-based analysis proves that all signals in the closed-loop system remain uniformly bounded and that Zeno behavior is excluded. Simulation results verify that the proposed method achieves stable platoon formation under prescribed state constraints while significantly reducing communication updates.
This paper proposes a two-stage optimization framework to evaluate whether cost-optimal electric vehicle (EV) charging infrastructure translates into effective operation under distribution grid constraints. The proposed approach explicitly links infrastructure planning with grid-constrained charging operation through a consistent optimal power flow (OPF) formulation applied in both stages. The framework is formulated as a mixed-integer program (MIP) and evaluated across different fleet sizes, demonstrating its scalability and applicability to realistic planning scenarios. The model incorporates heterogeneous charging technologies, including fast and slow chargers with both single-port and multi-port configurations. The results show a fundamental trade-off between cost optimality and service performance. Infrastructure configurations that minimize capital investment tend to spatially concentrate charging resources, resulting in lower achieved state-of-charge (SOC) and higher unmet energy demand. In contrast, uniformly distributed deployments of the same infrastructure significantly improve the spatial availability of charging and operational performance, reducing energy shortfall by up to 74%. Our findings reveal that cost-optimal planning alone is insufficient to guarantee satisfactory system performance. Effective EV charging infrastructure design must jointly consider cost optimality, spatial distribution of charging resources, and grid constraints. Sensitivity analysis with respect to battery capacity further highlights the nonlinear scaling of infrastructure requirements.
Large, spatially flexible electricity consumers such as data centers can reallocate demand across locations, influencing dispatch and prices in wholesale electricity markets. While flexible load is often assumed to improve system efficiency, this intuition typically relies on price-taking behavior. We study price-anticipatory spatial load shifting by modeling a large flexible consumer as a Stackelberg leader interacting with DC optimal power flow (DC-OPF) based market clearing. We show that decentralized, cost-minimizing load shifting need not align with system operating cost minimization, and that misalignment arises at boundaries between DC-OPF operating regimes, where small changes in load can induce discrete changes in marginal generators or congestion patterns. We evaluate strategic load shifting on the 73-bus RTS-GMLC test system, where findings indicate reductions in system operating cost in most hours, but misalignment in a subset of cases that are driven by redispatch at merit-order discontinuities. We find that these outcomes are primarily redistributive relative to a price-taking benchmark, reducing generator profits while lowering electricity procurement costs for both flexible and inflexible consumers, even in cases where total system operating costs increase.
The dynamic competition against intelligent jammer systems presents a significant challenge to modern radar. Traditional active anti-jamming strategy learning methods often suffer from low sample efficiency and fail to fully exploit the structures of the adversary jammer. To reveal the inherent structure, this paper adopts an Online Convex Optimization (OCO) framework to capture the competition between a frequency agile radar and a digital radio frequency memory (DRFM)-based intelligent jammer. Recognizing that conventional OCO algorithms also suffer from suboptimal sample efficiency, two refined algorithms are developed that incorporate unbiased gradient estimators specifically tailored to the unique characteristics of DRFM-based jammers. Our theoretical analysis of the regret bound indicates significant improvements in long-term performance compared to standard OCO. The simulation results consistently show that our algorithms outperform traditional OCO and reinforcement learning baselines, achieving faster convergence and better anti-jamming performance.
Precision contouring control is crucial in industrial machining processes, particularly for applications such as laser and water jet cutting, where contouring accuracy directly determines product quality. This paper presents a novel control strategy for biaxial machines featuring position-dependent flexibility and input delays, ensuring that the end-effector accurately traverses the desired contour within specified contouring error bounds and system constraints. To capture the rotation dynamics of systems with mechanical vibration, we introduce a high-fidelity model and explicitly consider the input delay with augmented system states. The controller design is based on a model predictive control scheme that enforces the system states to stay within robust control invariant sets defined by the reference model and switched linear time-invariant control-oriented models. The proposed algorithm is not restricted to a specific shape of the curve being traversed. The effectiveness of the proposed control algorithm is demonstrated in an experimental environment with discretization and input delay. The results show that the proposed method achieves a bounded contouring error under these performance-degrading conditions with low commissioning effort.
This paper develops a direct data-driven framework for infinite networks with unknown nonlinear polynomial subsystems, enabling the synthesis of controllers that ensure the entire network is uniformly globally asymptotically stable (UGAS). To address scalability challenges arising from high dimensionality, we develop a data-driven approach to construct an input-to-state stable (ISS) Lyapunov function and its corresponding controller for each unknown subsystem using only a single set of noise-corrupted input-state trajectories collected from that subsystem. Once each subsystem admits a data-driven ISS Lyapunov function, we leverage a compositional small-gain framework for infinite-dimensional spaces to construct a global control Lyapunov function and its associated controller, thereby ensuring UGAS of the entire infinite network. The effectiveness of the proposed data-driven approach is demonstrated through three case studies, including infinite networks of spacecraft, Lorenz chaotic systems, and an academic example with a state-dependent control input matrix.
Extremely large-scale multiple-input multiple-output (XL-MIMO) is a key enabler for sixth-generation (6G) communications. However, near-field channel estimation is particularly challenging due to spherical-wave propagation and spatial non-stationarity. To tackle this challenge, we propose a structured sparse Bayesian learning framework with adaptive dictionary updating for near-field non-stationary channel estimation. Specifically, the proposed method iteratively updates the distance parameters within an adaptive dictionary, thereby enhancing the representation capability without increasing the dictionary size. Moreover, we develop a hierarchical prior model that jointly captures polar-domain sparsity and structured dependency, enabling efficient Bayesian inference. Simulation results demonstrate that the proposed approach outperforms existing polar-domain dictionary-based methods while achieving low dictionary overhead.
Task-oriented semantic communication emerges as a crucial paradigm for next-generation wireless networks, aiming to efficiently transmit task-relevant information while reducing interference and redundancy across multiple users. Existing information bottleneck (IB)-based frameworks predominantly focus on single-user scenarios, neglecting cross-user semantic interference in distributed semantic communications. To overcome this limitation, we propose a task-oriented orthogonalised information bottleneck (TOIB) approach, explicitly designed for distributed semantic communication systems. By introducing task-conditioned latent variables, TOIB adaptively balances semantic sufficiency, semantic compression, and inter-user semantic orthogonality. Extensive simulations conducted on classification tasks demonstrate that TOIB consistently achieves superior classification accuracy across various signal-to-noise ratio (SNR) regimes compared to traditional IB and deep joint source-channel coding (JSCC) methods. Specifically, the proposed method significantly enhances robustness under harsh low-SNR conditions and effectively suppresses cross-user semantic interference, as validated by cross-decoding accuracy metrics.
Considerable efforts have been made to analyze the small-signal stability of doubly fed induction generator (DFIG) systems. However, commercial confidentiality and frequency coupling make the DFIG system a grey-box multiple-input-multiple-output (MIMO) system whose stability analysis is highly challenging. This paper proposes an argument-principle-based stability assessment method to analyze the stability of the grey-box DFIG system. The frequency sweeping technique is first used to acquire the MIMO model of the black-box device, as well as the determinant of the system's return difference matrix. A stability criterion based on the determinant trajectory is then presented. This criterion applies to the stability analysis of grey-box MIMO systems without detailed system models. Further, a critical-pole estimation method based on trajectory information is developed to assess the dominant mode of the target system. Simulation and hardware-in-the-loop experiment results demonstrate the effectiveness of the proposed method. Finally, some concerns about the method, such as model selection, estimation errors, and application potential, are thoroughly analyzed and clarified.
An important design principle for biological oscillators divides them into two classes: fixed frequency with variable amplitude, and fixed amplitude with variable frequency. Because of the interplay of nonlinearity and feedback, both positive and negative, investigations of this design principle have primarily relied on numerical simulations of ordinary differential equations. To enhance the qualitative and quantitative characterization, we adapted and developed a block diagram modeling framework. We showed how the observed amplitude-frequency characteristics can be obtained from the block diagram models, and we obtained constraints on the positive and negative feedback strengths required for oscillations to exist. These results should contribute to a systems and control perspective on oscillations in biology and related contexts.
This paper experimentally investigates geometry-based multi-antenna RF wireless power transfer (WPT) using a large-scale distributed indoor transmit array measuring 8 m by 4 m. Geometry-based beamforming uses known transmitter and receiver positions to perform phase-only precoding, avoiding the need for explicit channel estimation or feedback. The experiments use a ceiling-mounted array of 41 phase-synchronized transmit antennas operating at 920 MHz. Geometry-based beamforming is compared with channel state information (CSI)-based beamforming. The spatial power delivery is evaluated through two-dimensional scans over an area of 1.25 m by 1.25 m. The harvested DC power is measured using an RF-to-DC energy profiler. Under line-of-sight (LoS) conditions, geometry-based beamforming achieves a power gain of 18.75 dB, which is within 0.82 dB of CSI-based beamforming. In obstructed LoS scenarios with reflections, the gain decreases to 16.7 dB, while CSI-based beamforming achieves 20.53 dB, resulting in a performance gap of 3.83 dB. These results quantify the trade-off between reduced system overhead and robustness to multipath propagation in geometry-driven WPT, and represent an initial step toward geometry-based wireless power transfer enabled by digital twins.
Multichannel speech enhancement is widely used as a front-end in microphone array processing systems. While most existing approaches produce a single enhanced signal, direction-preserving multiple-input multiple-output (MIMO) methods instead aim to provide enhanced multichannel signals that retain directional properties, enabling downstream applications such as beamforming, binaural rendering, and direction-of-arrival estimation. In this work, we propose a fully blind, direction-preserving MIMO speech enhancement method based on neural estimation of the spatial noise covariance matrix. A lightweight OnlineSpatialNet estimates a scale-normalized Cholesky factor of the frequency-domain noise covariance, which is combined with a direction-preserving MIMO Wiener filter to enhance speech while preserving the spatial characteristics of both target and residual noise. In contrast to prior approaches relying on oracle information or mask-based covariance estimation for single-output systems, the proposed method directly targets accurate multichannel covariance estimation with low computational complexity. Experimental results show improved speech enhancement, covariance estimation capability, and performance in downstream tasks over a mask-based baseline, approaching oracle performance with significantly fewer parameters and computational cost.
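A per-frequency sketch of one standard multichannel Wiener filter built from a Cholesky-parameterized noise covariance is given below; the paper's exact direction-preserving filter, normalization, and network-based covariance estimator may differ, and the data here are random placeholders standing in for a real STFT and a real estimator.

```python
import numpy as np

def mimo_wiener(X, L_noise, eps=1e-6):
    """Apply a per-frequency multichannel Wiener filter.

    X       : (F, M, T) complex STFT of the M-channel mixture
    L_noise : (F, M, M) lower-triangular Cholesky factors of the estimated
              noise covariance (e.g., predicted by a network)
    Returns an (F, M, T) multichannel output, so spatial cues are retained.
    """
    F, M, T = X.shape
    Y = np.empty_like(X)
    for f in range(F):
        Phi_x = X[f] @ X[f].conj().T / T                 # mixture covariance
        Phi_n = L_noise[f] @ L_noise[f].conj().T         # noise covariance from Cholesky
        Phi_s = Phi_x - Phi_n                            # crude speech covariance estimate
        W = Phi_s @ np.linalg.inv(Phi_s + Phi_n + eps * np.eye(M))
        Y[f] = W @ X[f]                                  # filter all channels jointly
    return Y

# Toy usage with random placeholder data.
rng = np.random.default_rng(0)
F, M, T = 4, 3, 100
X = rng.standard_normal((F, M, T)) + 1j * rng.standard_normal((F, M, T))
L = np.tile(0.1 * np.eye(M), (F, 1, 1)).astype(complex)
print(mimo_wiener(X, L).shape)
```

Because the output keeps all M channels rather than collapsing to a single reference channel, downstream beamforming or direction-of-arrival estimation can still operate on the enhanced signals.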
Retrieving procedure-oriented evidence from materials science papers is difficult because key synthesis details are often scattered across long, context-heavy documents and are not well captured by paragraph-only dense retrieval. We present RECIPER, a dual-view retrieval pipeline that indexes both paragraph-level context and compact large language model-extracted procedural summaries, then combines the two candidate streams with lightweight lexical reranking. Across four dense retrieval backbones, RECIPER consistently improves early-rank retrieval over paragraph-only dense retrieval, achieving average gains of +3.73 in Recall@1, +2.85 in nDCG@10, and +3.13 in MRR. With BGE-large-en-v1.5, it reaches 86.82%, 97.07%, and 97.85% on Recall@1, Recall@5, and Recall@10, respectively. We further observe improved downstream question answering under automatic metrics, suggesting that procedural summaries can serve as a useful complementary retrieval signal for procedure-oriented materials question answering. Code and data are available at this https URL.
RF sensing exploits phase-sensitive measurements of stray electromagnetic (EM) fields from wireless devices across various frequency bands to detect EM blockage and to reconstruct and map the surrounding environment in 2D/3D. Although blockage effects caused by objects or human motion are well studied in ISM bands and at frequencies up to 60~GHz, there is a significant lack of research for frequencies above 100~GHz. This paper proposes a unified signal processing framework for RF sensing in the sub-THz D-band (105--175~GHz), explicitly integrating EM blockage and scattering as a single process through the birth-death dynamics of multipath components (MPCs). The framework extracts, associates, and classifies MPCs from angle-delay measurements using statistically grounded detection and classification, enabling human-scale sensing from a single radio link. The modeling and classification of MPCs, along with large-scale EM parameters, are demonstrated through an indoor measurement campaign using multiple test targets. Experimental results show that newly formed, attenuated, and suppressed MPCs can be reliably identified with millimeter-scale delay resolution. Static object localization achieves average positioning errors of 8--20~cm depending on range and material, while passive human localization yields errors of 12--17~cm at 0.5~m and 26--30~cm at 2~m. The proposed framework demonstrates that accurate sensing and localization are feasible at sub-THz frequencies using a single link.
Channel gain maps (CGMs) enable propagation-aware services in edge-intelligent wireless communication networks, while diffusion-based CGM construction is memory intensive for on-device training or adaptation. This letter proposes InvDiff-CGM, an invertible diffusion framework that constructs CGMs from sparse measurements and environmental priors. By adopting invertible architectures in both the diffusion process and the U-Net noise estimator, InvDiff-CGM achieves near-constant training memory consumption. A prior-informed multi-scale injector further integrates environmental priors with sparse measurements to improve physical consistency and detail preservation. Experiments on RadioMap3DSeer show about an 85\% reduction in peak training memory and a PSNR of 38.02~dB, outperforming representative recent baselines. This validates the practicality of InvDiff-CGM for high-fidelity CGM construction under edge resource constraints.
Speech recognition systems often struggle with data domains that were not included in training. To address this, unsupervised domain adaptation has been explored with ensemble and multi-stage teacher-student training methods that reduce the word error rate (WER). Despite these improvements, the error rate remains much higher than that achieved with supervised in-domain training. This work proposes a more efficient strategy that simultaneously updates the ensemble of teacher models along with the single student model, eliminating the need for sequential model training. The joint update improves the word error rate of the student model while also benefiting the progressively enhanced teacher models. Experiments are conducted with three labelled source datasets, namely AMI, WSJ, and LS360, and one unlabeled target domain, Switchboard. The results show that the proposed method improves the WER by 4.6% on the Switchboard eval00 test set, outperforming multi-stage and iterative training methods.
Speaker-Attributed Automatic Speech Recognition (SAA) enhances traditional ASR systems by incorporating relative speaker identity tags directly into the transcript (e.g., [Speaker 1]:, [Speaker 2]:). In this work, we extend the capabilities of Granite-speech, a state-of-the-art speech-aware Large Language Model (LLM) originally trained for transcription and translation. We demonstrate that it can be effectively adapted for SAA with only minimal architectural changes. Our core contribution is the introduction of speaker cluster identification tags (e.g., [Speaker 1 cluster 42]:) which are jointly trained with SAA to significantly improve accuracy. To address limitations in training data, we propose a data augmentation method that uses artificially concatenated multi-speaker conversations. Our approach is evaluated across multiple benchmarks and shows superior performance compared to conventional pipelines that sequentially perform speaker diarization followed by ASR.
Every physical structure and piece of equipment has natural frequencies, and this characteristic must be known and fully understood. If we fail to predict, measure, and address potential natural frequency concerns, the life span of our equipment can be significantly reduced, or the equipment may fail immediately when put into service. Two main methodologies are used to study natural frequencies: computer simulations and physical tests performed on the equipment. In this paper, we focus on testing natural frequencies and discuss how we measure excitation, the form of excitation used, the type of data that can be exported, and what can be done with that data. These principles can be applied to any type of machinery or object where vibration is a concern. For our purposes, we focus primarily on rotating machinery, such as generators, gearboxes, and motors.
A mutual coupling-aware beamforming design for continuous aperture array (CAPA)-aided multi-user systems is investigated. First, a transmit coupling kernel is characterized to explicitly capture the mutual coupling effects inherent in CAPAs, based on which a mutual coupling-aware sum-rate maximization functional optimization problem is formulated. To address this problem, a kernel approximation (KA)-based weighted minimum mean-squared error (WMMSE) algorithm is developed. The optimal beamforming condition is derived within the WMMSE framework using the calculus of variations, while KA is employed to obtain a closed-form beamforming solution via wavenumber-domain Fourier transforms and Gauss-Legendre quadrature. Furthermore, the proposed framework is extended to CAPA-to-CAPA multiple-input multiple-output (MIMO) systems. Finally, numerical results demonstrate that: 1) the proposed algorithm achieves improved performance compared to benchmark schemes; 2) the modeled coupling effects are physically rational, where the performance of spatially discrete arrays converges to that of CAPAs; and 3) CAPA-to-CAPA MIMO systems can achieve higher degrees of freedom when the transceivers are placed in close proximity.
Spectrum sharing and dynamic spectrum reuse are becoming increasingly critical in modern wireless networks to address spectrum scarcity. However, these techniques inevitably increase Cross-Technology Interference (CTI). In this context, the Open Radio Access Network (O-RAN), as a modern and disaggregated network architecture, necessitates accurate, low-latency, and computationally efficient CTI classification and mitigation to support real-time control and maintain Quality of Service (QoS). Unfortunately, existing solutions predominantly rely on high-complexity, monolithic deep learning-based approaches that, while achieving high classification accuracy, incur significant latency and computational overhead. This paper exploits the O-RAN functional split to leverage multi-domain raw signal representations (time, frequency, and Channel State Information (CSI)) obtained directly from the same data stream. Each domain is processed locally, naturally interleaving CTI processing within the distributed, disaggregated O-RAN architecture. This distributed strategy enables a cost-aware, multi-domain fusion architecture that balances classification accuracy with computational overhead and latency. Our proposed multi-domain distributed architecture achieves a 400 $\mu s$ inference latency on standard CPUs. Compared to a state-of-the-art monolithic frequency-domain classifier, this represents an average 9x reduction in latency and an 11-fold decrease in computational cost, while sacrificing only 4% in classification performance and maintaining >90% accuracy in high-interference conditions.
The rapid growth of the low-altitude economy (LAE) is making aerial systems an important part of future digital infrastructure. Although major advances have been achieved in unmanned aerial vehicle (UAV) platforms, communications, and autonomous control, environmental perception remains a key bottleneck to reliable and scalable LAE operations. Existing sensing modalities, such as optical, LiDAR, and millimeter-wave radar, are limited by visibility, sensing range, and environmental conditions, resulting in fragmented situational awareness. This article argues that addressing these limitations requires a shift from platform-centric sensing to a shared, environment-aware sensing infrastructure. In this context, synthetic aperture radar (SAR) offers a distinct advantage by enabling all-weather, wide-area perception. We show that SAR can support UAV operations through global environmental awareness, enhance task-level sensing, and enable cooperative sensing across satellites, high-altitude platforms, UAVs, and ground systems. Building on this perspective, we present a system-level view of SAR-enabled LAE, highlighting key transformations from fragmented to infrastructure-centric sensing, from reactive to predictive operation, and from device-centric to environment-aware networking. We further discuss enabling architectures, including multi-platform sensing hierarchies, integration with integrated sensing and communication (ISAC), and the role of artificial intelligence and digital twins, along with the key challenges toward real-world deployment. By positioning SAR as a shared sensing foundation rather than a standalone modality, this article provides new insights into the design of scalable, reliable, and intelligent LAE systems.
We propose a simple yet effective divide-and-discard (DD) approach to guaranteed state estimation for nonlinear discrete-time systems. Our method iteratively subdivides interval enclosures of the state and propagates them forward in time using a mean-value enclosure. The central idea is to rely on repeated refinement of simple sets rather than on more complex set representations, yielding an observer that is straightforward to implement and easy to integrate into existing frameworks. Our divide-and-discard strategy exploits that many sets can be discarded early and limits the number of maintained sets, resulting in low computational cost with complexity that scales only quadratically in the state dimension. The proposed method is evaluated on nonlinear benchmark problems previously used to compare guaranteed observers, where it outperforms state-of-the-art approaches in terms of both computational efficiency and enclosure tightness.
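A minimal one-dimensional sketch of the divide-and-discard idea is shown below, with illustrative scalar dynamics and a deliberately crude mean-value enclosure; the paper's observer operates on multi-dimensional interval boxes and uses a more refined enclosure and discarding strategy.

```python
import numpy as np

# Keep a list of intervals, split each one, discard pieces inconsistent with a
# bounded-error measurement, propagate survivors with a mean-value enclosure,
# and cap the number of maintained intervals. Dynamics below are illustrative.

def f(x):
    return x - 0.1 * x**3                            # example discrete dynamics

def df_enclosure(lo, hi):                            # interval of f'(x) = 1 - 0.3 x^2
    sq_hi = max(lo * lo, hi * hi)
    sq_lo = 0.0 if lo <= 0.0 <= hi else min(lo * lo, hi * hi)
    return 1.0 - 0.3 * sq_hi, 1.0 - 0.3 * sq_lo

def mean_value_step(lo, hi):
    c = 0.5 * (lo + hi)
    dlo, dhi = df_enclosure(lo, hi)
    spread = max(abs(dlo), abs(dhi)) * 0.5 * (hi - lo)   # |f'([x])| * radius
    return f(c) - spread, f(c) + spread

def step(intervals, y, meas_err, max_sets=64):
    kept = []
    for lo, hi in intervals:
        mid = 0.5 * (lo + hi)
        for a, b in ((lo, mid), (mid, hi)):              # divide
            if b < y - meas_err or a > y + meas_err:     # discard: no data overlap
                continue
            a, b = max(a, y - meas_err), min(b, y + meas_err)
            kept.append(mean_value_step(a, b))           # propagate survivors
    kept.sort()
    return kept[:max_sets]                               # heuristic cap on set count

# Usage: track a trajectory from an uncertain initial interval.
rng = np.random.default_rng(0)
x_true, sets = 1.2, [(-2.0, 2.0)]
for _ in range(10):
    y = x_true + rng.uniform(-0.05, 0.05)                # bounded-error measurement
    sets = step(sets, y, 0.05)
    x_true = f(x_true)
print(sets[:3])
```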
State estimation constitutes a core task in monitoring, supervision, and control of dynamic systems. This paper proposes a data-driven framework for the design of state observers for descriptor systems. Necessary and sufficient conditions for the existence of a standard state observer are derived purely from data under mild assumptions. When the system is subject to unknown inputs, we further extend the framework to the data-driven design method for full-order unknown input observer (UIO). Notably, for both the standard state observer and the UIO, we establish the mathematical equivalence between the proposed data-driven existence conditions and classical model-based ones. Moreover, the data-driven approach is applied to the design of extended state observers, enabling simultaneous estimation of system states and disturbances via system augmentation. Numerical simulations validate the effectiveness of the proposed methods.
We address density control problems for large-scale multi-agent systems in leader-follower settings, where a group of controllable leaders must steer a population of followers toward a desired spatial distribution. Unlike prior work, we explicitly account for follower-follower interactions, capturing realistic behaviors such as flocking and collision avoidance. Within a macroscopic framework based on partial differential equations governing the density dynamics, we derive (i) necessary and sufficient feasibility conditions linking the target distribution to interaction strength, diffusion, and leader mass, and (ii) a feedback control law guaranteeing local stability with an explicit estimate of the basin of attraction. Our analysis reveals sharp feasibility thresholds, phase transitions beyond which no control effort can achieve the desired configuration. Numerical simulations in one- and two-dimensional domains validate the theoretical results at the macroscopic level, and agent-based simulations on finite populations confirm the practical deployability of the proposed framework.
Optimization using network traffic flow models requires computing gradients of objective functions with respect to model parameters. Conventional approaches rely on numerical differentiation or derivative-free methods that do not scale well with the parameter dimension, or on adjoint methods that require manual derivation for each specific model. This study proposes a novel end-to-end differentiable network traffic flow simulator based on the Link Transmission Model (LTM), incorporating general node models and a dynamic user optimum (DUO) route choice model. We observe that the LTM operates on continuous aggregate state variables (cumulative vehicle counts) through piecewise-linear $\min$/$\max$ operations, which admit subgradients almost everywhere and thus require no smooth relaxation for automatic differentiation (AD). We incorporate the DUO route choice model and its logit extension to explicitly capture the endogenous dynamic route choice of travelers while preserving differentiability, because the diverge ratios are continuous functions of per-destination vehicle counts. The resulting simulator computes exact gradients via reverse-mode AD in a single backward pass, regardless of the parameter dimension. To demonstrate the capability of the proposed model, we solve a dynamic congestion toll optimization problem on the Chicago-Sketch dataset with around 2,500 links, 1 million vehicles, and 15,000 decision variables. The proposed model derives a high-quality solution in 10,000 iterations taking about 2 hours, meaning that one simulation run plus gradient computation takes about 0.8 seconds. The simulator, implemented in Python and JAX, is released as open-source software named UNsim (this https URL).
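The claim that the LTM update is AD-friendly can be illustrated with a single-link sketch in JAX, where a bottleneck capacity is differentiated through the piecewise-linear sending/receiving flow updates; the lags, storage, and demand values are illustrative, and the sketch omits node models and route choice entirely.

```python
import jax
import jax.numpy as jnp

# The link update is built from piecewise-linear min operations on cumulative
# counts, so reverse-mode AD gives (sub)gradients of an objective w.r.t.
# parameters such as a bottleneck capacity. Numbers are illustrative only.

def simulate(capacity, steps=60, dt=1.0):
    ff_lag, bw_lag = 3, 5            # free-flow and backward-wave lags (in steps)
    jam_storage = 80.0               # jam accumulation of the link (K * L)
    inflow_demand = 4.0              # upstream demand per step
    N_up = jnp.zeros(steps + 1)      # cumulative vehicles entering the link
    N_dn = jnp.zeros(steps + 1)      # cumulative vehicles leaving the link
    for t in range(steps):
        up_lagged = N_up[max(t - ff_lag, 0)]
        dn_lagged = N_dn[max(t - bw_lag, 0)]
        sending = jnp.minimum(up_lagged - N_dn[t], capacity * dt)
        receiving = jnp.minimum(dn_lagged + jam_storage - N_up[t], capacity * dt)
        inflow = jnp.minimum(inflow_demand * dt, receiving)
        N_up = N_up.at[t + 1].set(N_up[t] + inflow)
        N_dn = N_dn.at[t + 1].set(N_dn[t] + sending)
    # total time spent on the link = accumulation summed over time
    return jnp.sum(N_up[1:] - N_dn[1:]) * dt

total_delay = simulate(3.0)
grad_wrt_capacity = jax.grad(simulate)(3.0)   # one reverse-mode backward pass
print(total_delay, grad_wrt_capacity)
```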
The integration of first-principles models with learning-based components, i.e., model augmentation, has gained increasing attention, as it offers higher model accuracy and faster convergence properties compared to black-box approaches, while generating physically interpretable models. Recently, a unified formulation has been proposed that generalizes existing model augmentation structures, utilizing linear fractional representations (LFRs). However, several potential benefits of the approach remain underexplored. In this work, we address three key limitations. First, the added flexibility of LFRs also introduces possible algebraic loops, i.e., a problem of well-posedness. To address this challenge, we propose a constraint-free direct parametrization of the model structure with a well-posedness guarantee. Second, we introduce a constraint-free parametrization that ensures stability of the overall model augmentation structure via contraction. Third, we adopt an efficient identification pipeline capable of handling non-smooth cost functions, such as group-lasso regularization, which facilitates automatic model order selection and discovery of the required augmentation configuration. These contributions are demonstrated on various simulation and benchmark identification examples.
Integrated sensing and communication (ISAC) systems rely on communication waveforms to perform sensing tasks, making their sensing performance strongly dependent on the level of communication symbol knowledge available to the sensing receivers. However, the existing literature fails to capture this dependency, often relying on full symbol knowledge assumptions. In this paper, we present a Cramér-Rao bound (CRB) analysis of a bistatic ISAC network with heterogeneous uplink and downlink illumination and structured clutter. We consider different symbol knowledge regimes by modeling unknown communication symbols as nuisance parameters. Assuming a temporal evolution of the communication channel, we derive a correlation-aware channel estimator and an expression for the UEs' uplink spectral efficiency (SE). Numerical results show the CRB degradation induced by clutter and symbol uncertainty and how this can affect resource allocation policies. We also show the performance gain of our channel estimator relative to conventional block fading architectures.
Our objective is to study the performance and robustness of the model-free strategy for controlling the oxygen stoichiometry of a fuel cell air supply system with a proton exchange membrane. After reviewing the literature on modeling and control of this process, the model-free approach appears to be a good candidate because, on the one hand, it allows straightforward real-time adaptation to track operating points and, on the other hand, it requires a low computational burden, which is attractive for industrial applications. Numerical simulations for two scenarios (constant and variable oxygen stoichiometry) with two current profiles reveal satisfactory performance of the model-free control law. The robustness is addressed by considering significant variations in the parameters of the proton exchange membrane air supply system.
Reinforcement learning (RL) can be a powerful alternative to classical control methods when standard model-based control is insufficient, e.g., when deriving a suitable model is intractable or impossible. In many cases, however, the choice between model-based and RL-based control is not obvious. Due to the high computational costs of training RL agents, RL-based control should be limited to cases where it is expected to yield superior results compared to model-based control. To the best of our knowledge, there exists no approach to quantify the benefit of RL-based control that does not require RL training. In this work, we present a computationally efficient, purely simulation-based litmus test predicting whether RL-based control is superior to model-based control. Our test evaluates the suitability of the given model for model-based control by analyzing the impact of model uncertainties on the control problem. For this, we use reachset-conformant model identification combined with simulation-based analysis. This is followed by a learnability evaluation of the uncertainties based on correlation analysis. This two-part analysis enables an informed decision on the suitability of RL for a control problem without training an RL agent. We apply our test to several benchmarks, demonstrating its applicability to a wide range of control problems and highlight the potential to save computational resources.
Many wireless systems divide the baseband processing between two locations, interconnected by a fronthaul. This paper examines the impact of fronthaul quantization on multiple-input multiple-output (MIMO) systems. Starting from a Bussgang-based analysis of quantized single-input single-output (SISO) channels, we extend the framework to MIMO and derive a capacity lower bound under fronthaul quantization, where the receive combining is performed before the quantization. To maximize the sum rate, we propose a joint bit and power allocation (JBP-Alloc) scheme that efficiently distributes fronthaul bits and transmit power across active data streams. Asymptotic analysis shows that uniform bit allocation becomes optimal at high SNR. Numerical results confirm that JBP-Alloc outperforms uniform allocation and quantization-unaware water-filling, and achieves the same performance as Greedy bit allocation but with substantially lower computational complexity.
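The greedy allocation referred to above can be sketched as follows; the Bussgang-style distortion factor and the per-stream SNR values are illustrative assumptions rather than the paper's exact model or the proposed JBP-Alloc procedure.

```python
import numpy as np

# Greedy fronthaul bit allocation across data streams: each extra bit roughly
# quarters the quantization distortion, and bits go to the stream with the
# largest marginal sum-rate gain. The distortion factor is an illustrative
# scalar-quantizer approximation.

def stream_rate(snr, bits):
    rho = (np.pi * np.sqrt(3) / 2) * 2.0 ** (-2 * bits)   # distortion factor
    sinr = (1 - rho) * snr / (1 + rho * snr)               # Bussgang-style effective SINR
    return np.log2(1 + max(sinr, 0.0))

def greedy_bits(snrs, total_bits):
    bits = np.zeros(len(snrs), dtype=int)
    for _ in range(total_bits):
        gains = [stream_rate(s, b + 1) - stream_rate(s, b)
                 for s, b in zip(snrs, bits)]
        bits[int(np.argmax(gains))] += 1                    # best marginal gain wins
    return bits

snrs = [30.0, 10.0, 3.0, 1.0]          # hypothetical per-stream pre-quantization SNRs
alloc = greedy_bits(snrs, total_bits=16)
print(alloc)
print(sum(stream_rate(s, b) for s, b in zip(snrs, alloc)))  # resulting sum rate
```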
This work explores controllability and the control effort required for lithium-ion batteries. Battery packs have become a critical technology in both personal and professional applications as a means to store large amounts of energy. Management of cells in a pack becomes increasingly difficult though, with charging and discharging operations requiring more complex strategies due to parameter variations between the cells. There are numerous studies which develop effective estimation and control schemes to reduce the impact of the imbalances present in battery packs, but the receptiveness of the individual cells to these schemes is much less explored. This paper performs a nonlinear controllability analysis for experimentally parameterized cells. A connection is shown between the condition number of a battery's controllability matrix and the amount of control effort that battery will require. This reveals that if a cell's dynamics are poorly mathematically conditioned, it will require more time or higher power to control than one that is not. The controllability condition number of each cell's model is then determined both with new and aged parameters, and a sensitivity analysis shows that the cells' conditioning is equally impacted by all parameters. This offers insight into the increased control effort required for a battery as it ages and the culprit of said increase. The results of this analysis are then used to determine the best conditioned assemblies for a batch of cells with a mix of new and second-life parameters.
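A minimal sketch of the controllability-matrix conditioning check is shown below for a generic two-state equivalent-circuit cell model; the parameter values are illustrative, not the experimentally identified ones used in the paper.

```python
import numpy as np

def controllability_condition(A, B):
    """Condition number of the controllability matrix [B, AB, ..., A^(n-1)B]."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.linalg.cond(np.hstack(blocks))

# Illustrative discrete-time model: SOC and one RC-branch voltage, current input.
dt, Q, R1, C1 = 1.0, 3600.0 * 2.5, 0.015, 2000.0
A = np.array([[1.0, 0.0],
              [0.0, np.exp(-dt / (R1 * C1))]])
B = np.array([[-dt / Q],
              [R1 * (1 - np.exp(-dt / (R1 * C1)))]])

kappa = controllability_condition(A, B)
print(kappa)   # larger condition numbers indicate a cell that is harder to steer
```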
This paper investigates the impact of practical features of the emerging antenna array technology of Dynamic Metasurface Antennas (DMAs) when used for wideband sensing. By adopting a realistic DMA response model capturing frequency-selective magnetic polarizability, finite resonant frequency tuning, and waveguide phase and leakage effects, we first present a compact observation model for user localization and the sensing of multiple scattering points through DMA-based analog combining of Orthogonal Frequency Division Multiplexing (OFDM) pilots transmitted in the uplink direction. Building on this model, we derive the Fisher Information Matrix (FIM), the equivalent FIM, and the corresponding Cramér-Rao Bounds (CRBs) for the estimation of the relevant spatiotemporal parameters. The analysis reveals that frequency selectivity reduces the effective information bandwidth and distorts the DMA-based reception manifold, while waveguide attenuation decreases both the coherent combining gain and the effective aperture, thereby degrading estimation accuracy. Numerical results validate the analysis and confirm the resulting inflation in the delay, angle, and position error bounds.
This paper proposes a novel Distributed Unknown Input Observer (DUIO) framework for state estimation in large-scale systems subject to local unknown inputs. We consider systems where outputs are measured by a network of spatially distributed sensors and inputs are introduced through multiple dispersed channels. In this framework, each local node utilizes only its local input and output measurements to estimate the maximal locally reconstructible state. Subsequently, nodes collaboratively reconstruct the whole system state via a distributed optimization algorithm that fuses these partial estimates. We provide a rigorous analysis showing that the estimation error is bounded, with the error bound explicitly dependent on the number of communication iterations per time step and on the strong convexity constant determined by the system parameters. Furthermore, to counteract the curvature anisotropy induced by poorly conditioned system geometry, we embed a normalization step into the distributed optimization procedure. Simulation results demonstrate the effectiveness of the proposed framework and the performance improvements yielded by the normalization procedure.
Evaluating the emotional intelligence (EI) of audio language models (ALMs) is critical. However, existing benchmarks mostly rely on synthesized speech, are limited to single-turn interactions, and depend heavily on open-ended scoring. This paper proposes HumDial-EIBench, a comprehensive benchmark for evaluating ALMs' EI. Using real-recorded human dialogues from the ICASSP 2026 HumDial Challenge, it reformulates emotional tracking and causal reasoning into multiple-choice questions with adversarial distractors, mitigating subjective scoring bias for cognitive tasks. It retains the generation of empathetic responses and introduces an acoustic-semantic conflict task to assess robustness against contradictory multimodal signals. Evaluations of eight ALMs reveal that most models struggle with multi-turn emotional tracking and implicit causal reasoning. Furthermore, all models exhibit decoupled textual and acoustic empathy, alongside a severe text-dominance bias during cross-modal conflicts.
The enhanced Gaussian noise (EGN) model is widely used for estimating the nonlinear interference (NLI) power accumulated in coherent fiber-optic transmission systems. Given a fixed fiber link, under the assumption that transmitted symbols are independently and identically distributed (i.i.d.), the EGN model establishes that the NLI power depends on time-invariant signal statistics, i.e., the second-, fourth-, and sixth-order moments of the symbols, which are determined by the modulation format and its probability distribution. However, recent advances in coded modulation have sought to mitigate NLI by introducing controlled temporal correlations among transmitted symbols, thereby violating the i.i.d. assumption underlying the EGN model. Among these correlations, symbol energy correlations are believed to exert the most significant influence on NLI. This work presents a rigorous mathematical derivation of a memory extension of the EGN model that explicitly accounts for symbol energy correlations, referred to as the MEGN model. The proposed MEGN model is validated through both numerical simulations and transmission experiments. Normalized average NLI power estimations with less than 5% errors across a wide range of symbol rates and transmission distances are reported. The model also provides a theoretical framework for analyzing and optimizing optical transmission systems employing temporally correlated modulation schemes.
Cell-free massive multiple-input multiple-output is a potential candidate for future networks with pervasive connectivity by utilizing coherent joint transmission and distributed antenna arrays. This paper studies the exploitation of full-duplex communication for a distributed antenna array. Specifically, we derive a closed-form expression for the uplink and downlink ergodic spectral efficiency (SE) of a network where the APs can flexibly operate in either full-duplex or half-duplex mode with linear processing and Rayleigh fading channels. A long-term total SE maximization problem is formulated subject to a network operation model and individual SE requirements under a limited power budget. Due to the intrinsic nonconvexity and to infeasible circumstances in which some UEs cannot achieve their rate requirements, we adapt differential evolution to design a low-complexity algorithm that attains a good power allocation and network operation mode in polynomial time. Numerical results demonstrate the effectiveness of our system design and proposed algorithm over state-of-the-art benchmarks, providing satisfactory service to the majority of UEs, although a few may remain unscheduled under harsh conditions.
In this paper we address the problem of detecting differences or anomalies in a dynamical system, based on historical data of nominal operations. This problem encompasses quality control, where newly manufactured systems are tested against desired nominal operations, and the detection of changes in the dynamics due to degradation or repairs. We propose a model-free approach based on Gaussian processes (GPs). The idea is to train offline a GP based on nominal data, which is then deployed online to detect whether measurements of the system state are compatible with nominal operations or whether they deviate. Detecting this deviation is made more challenging by the presence of process and measurement noise, which might obfuscate deviations in the dynamics. Detection is then based on a threshold that ensures a specific false positive rate. We showcase the promising performance of the proposed method on two systems, and highlight several interesting future research questions.
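A minimal sketch of this idea is to fit a GP to nominal one-step transitions and calibrate a residual threshold to a target false-positive rate on held-out nominal data. The one-dimensional dynamics, noise levels, and use of scikit-learn's GP are assumptions for illustration, not the paper's setup.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Learn the nominal one-step map x_{k+1} = f(x_k) from noisy nominal data.
rng = np.random.default_rng(0)
f = lambda x: 0.9 * x + 0.5 * np.sin(x)            # assumed nominal dynamics
x = rng.uniform(-3, 3, 200)
y = f(x) + 0.05 * rng.standard_normal(200)         # process/measurement noise

gp = GaussianProcessRegressor(RBF() + WhiteKernel(), normalize_y=True).fit(x[:, None], y)

# Calibrate a standardized-residual threshold for a 1% false-positive rate
# on fresh nominal data.
xv = rng.uniform(-3, 3, 500)
mu, sd = gp.predict(xv[:, None], return_std=True)
scores = np.abs(f(xv) + 0.05 * rng.standard_normal(500) - mu) / sd
tau = np.quantile(scores, 0.99)

def is_anomalous(x_k, x_next):
    m, s = gp.predict(np.atleast_2d(x_k), return_std=True)
    return abs(x_next - m[0]) / s[0] > tau

print(is_anomalous(1.0, f(1.0)), is_anomalous(1.0, f(1.0) + 1.0))
```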
Industrial control applications require detecting system anomalies as accurately and quickly as possible to enable prompt maintenance. In this context, it is common to consider several possible plant models, each linked to a different anomaly. The log-likelihood ratio method can then be used to identify the most accurate model and thereby classify which anomaly, if any, has occurred. Although the method has been applied to a wide variety of systems, there is no formal analysis of what makes anomalies more or less prone to detection. In this paper, we investigate a real-time anomaly detector based on the log-likelihood ratio and provide a theoretical characterization of its error rate when it is applied to linear Gaussian systems. We showcase the performance of this algorithm and the characterization obtained, and demonstrate how the latter can be leveraged for observer design.
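The detector itself is easy to sketch. The toy below replaces the full linear Gaussian dynamics with i.i.d. Gaussian output models (illustrative means and variances only) and classifies a data window by the largest log-likelihood, which is equivalent to pairwise log-likelihood-ratio tests.

```python
import numpy as np
from scipy.stats import norm

# Illustrative candidate output models, one per hypothesized anomaly; the paper
# treats full linear Gaussian dynamics, so these i.i.d. models are a simplification.
models = {"nominal": (0.0, 1.0), "bias fault": (0.8, 1.0), "noise fault": (0.0, 2.0)}

def classify(window):
    """Return the most likely model for a window of measurements."""
    ll = {name: norm.logpdf(window, loc=m, scale=s).sum() for name, (m, s) in models.items()}
    return max(ll, key=ll.get), ll

rng = np.random.default_rng(1)
y = 0.8 + rng.standard_normal(200)   # data generated under the "bias fault" model
print(classify(y)[0])
```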
Frequency-selective wireless power transfer provides a feasible route to enable independent actuation and control of multiple untethered robots in a common workspace; however, its scalability remains unquantified, particularly the maximum number of resonators that can be reliably addressed within a given frequency bandwidth. To address this, we formulate the relationship between resonator quality factor (Q-factor) and the number of individually addressable inductor-capacitor (LC) resonant energy harvesters within a fixed radio-frequency (RF) spectrum, and we convert selectively activated harvested energy into mechanical motion. We theoretically prove and experimentally demonstrate that scalability depends primarily on the Q-factor. For this proof-of-concept study, we define the effective series resistance as a function of frequency, allocating bandwidths to discrete actuators. We provide design equations for scaling untethered magnetic actuation with Q-factor optimization. Resonator networks spanning bandwidths from 100 kHz to 1 MHz were analyzed to quantify how increasing the number of resonators affects independent addressability. We validated the approach experimentally by fabricating three centimeter-scale untethered actuators that selectively trigger the motion of mechanical beams at 734 kHz, 785 kHz, and 855 kHz. We also characterized the generated mechanical force and the activation bandwidth of each actuator, confirming that no unintended cross-triggering occurred.
This paper studies cyber attacks against informativity-based analysis in data-driven control. Focusing on strong observability, we consider an adversary who post-processes finite time-series data by an invertible linear transformation acting on the data matrices. We show that such transformations are capable of embedding malicious states into the invariant subspace explained by the transformed dataset. We provide a constructive attack method and derive feasibility conditions that characterize when such transformations exist. Moreover, we formulate an optimization problem to obtain the minimum-norm attack that quantifies the smallest data distortion required to destroy informativity. Numerical examples demonstrate that small and structured transformations can invalidate informativity certificates.
Accurate material recognition is a fundamental capability for intelligent perception systems to interact safely and effectively with the physical world. For instance, distinguishing visually similar objects like glass and plastic cups is critical for safety but challenging for vision-based methods due to specular reflections, transparency, and visual deception. While millimeter-wave (mmWave) radar offers robust material sensing regardless of lighting, existing camera-radar fusion methods are limited to closed-set categories and lack semantic interpretability. In this paper, we introduce VLMaterial, a training-free framework that fuses vision-language models (VLMs) with domain-specific radar knowledge for physics-grounded material identification. First, we propose a dual-pipeline architecture: an optical pipeline uses the segment anything model and VLM for material candidate proposals, while an electromagnetic characterization pipeline extracts the intrinsic dielectric constant from radar signals via an effective peak reflection cell area (PRCA) method and weighted vector synthesis. Second, we employ a context-augmented generation (CAG) strategy to equip the VLM with radar-specific physical knowledge, enabling it to interpret electromagnetic parameters as stable references. Third, an adaptive fusion mechanism is introduced to intelligently integrate outputs from both sensors by resolving cross-modal conflicts based on uncertainty estimation. We evaluated VLMaterial in over 120 real-world experiments involving 41 diverse everyday objects and 4 typical visually deceptive counterfeits across varying environments. Experimental results demonstrate that VLMaterial achieves a recognition accuracy of 96.08%, delivering performance on par with state-of-the-art closed-set benchmarks while eliminating the need for extensive task-specific data collection and training.
Koopman operator theory is a key tool in data assimilation of complex dynamical systems, with the potential to be applied to multimodal data. We formulate the problem of learning Koopman eigenfunctions from observations at arbitrary, possibly non-vanishing, time intervals as an optimization problem. Analysis of the formulation reveals aliasing induced by oscillatory dynamics and the sampling pattern, making an inherent identifiability limit explicit. The analysis also uncovers phase alignment near the true Koopman frequency, which creates a steep loss valley and demands careful optimization. We further show that irregular sampling can break aliasing and lead to phase cancellation. Numerical results demonstrate the efficacy of the proposed method under large regular time intervals compared to generator extended dynamic mode decomposition, and support the idea that irregular sampling can help recover the true Koopman spectrum.
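The aliasing limit mentioned above can be illustrated directly: with a regular sampling interval $\Delta t$, a frequency $\omega$ and its alias $\omega + 2\pi/\Delta t$ generate identical observations, while an irregular sampling pattern separates them. The snippet below is a standalone illustration of this identifiability issue, not the paper's eigenfunction-learning formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
w_true, dt = 1.0, 2.0
w_alias = w_true + 2 * np.pi / dt      # differs from w_true by a full sampling period

# Regular sampling: the two frequencies are indistinguishable (aliased).
t_reg = dt * np.arange(20)
print(np.allclose(np.cos(w_true * t_reg), np.cos(w_alias * t_reg)))   # True

# Irregular sampling: the same two frequencies now produce different data.
t_irr = np.sort(rng.uniform(0, 40, 20))
print(np.allclose(np.cos(w_true * t_irr), np.cos(w_alias * t_irr)))   # False
```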
Compact, high-performance components in millimeter-wave (mmWave) communication systems demand new acoustic filter technology at increasingly higher frequencies. Among various promising mmWave platforms, first-order antisymmetric (A1) mode laterally excited bulk acoustic resonators (XBARs) in thin-film lithium niobate (LiNbO3) have perhaps the most impressive linear performance. Despite these advances, there are few reports of nonlinear characterization of LiNbO3 filters at mmWaves. Here, we address this gap by developing a nonlinear characterization methodology for high-frequency filters, based on power-dependent S-parameter and third-order intermodulation (IMD3) measurements. To test our methodology, we fabricated filters on transferred single-crystal LiNbO3 films on sapphire (Al2O3) and silicon (Si) substrates with an amorphous silicon (aSi) sacrificial layer. At 21.8 GHz, the filters on Al2O3 demonstrated an insertion loss of 1.48 dB, a 3 dB fractional bandwidth (FBW) of 17.7%, and in-band third-order input intercept points (IIP3) of 50.8 dBm. At 21.6 GHz, the filters on silicon demonstrated an insertion loss of 2.47 dB, a 3 dB FBW of 18.6%, and in-band IIP3 of 46.5 dBm. The nonlinear results conclusively show that thermal stability improves and passband distortion is reduced on the Al2O3 substrate, confirming that substrate selection plays a pivotal role in mitigating nonlinearity in acoustic front-end modules.
In this work, we study angle-based localization and rigidity maintenance control for multi-robot networks under sensing constraints. We establish the first equivalence between angle rigidity and bearing rigidity considering \textit{directed} sensing graphs and \textit{body-frame} bearing measurements in both $2$ and $3$-\textit{dimensional space}. In particular, we demonstrate that a framework in $\mathrm{SE}(d)$ is infinitesimally bearing rigid if and only if it is infinitesimally angle rigid and each robot obtains at least $d-1$ bearing measurements ($d \in \{2, 3\}$). Building on these findings, this paper proposes a distributed angle-based localization scheme and establishes local exponential stability under switching sensing graphs, requiring only infinitesimal angle rigidity across the visited topologies. Then, since angle rigidity strongly depends on the robots' spatial configuration, we investigate rigidity maintenance control. The \textit{angle rigidity eigenvalue} is presented as a metric for the degree of rigidity. A decentralized gradient-based controller capable of executing mission-specific commands while maintaining a sufficient level of angle rigidity is proposed. Simulations were conducted to evaluate the scheme's effectiveness and practicality.
Brain-computer interfaces (BCIs) have opened new platforms for human-computer interaction, medical diagnostics, and neurorehabilitation. Wearable BCI systems, which typically employ non-invasive electrodes for portable monitoring, hold great promise for real-world applications, but also face significant challenges of signal quality degradation caused by motion artifacts and environmental interference. Most existing wearable BCI datasets are collected under stationary or controlled lab settings, limiting their utility for evaluating performance under body movement. To bridge this gap, we introduce WearBCI, the first dataset that comprehensively evaluates wearable BCI signals under different motion dynamics with synchronized multimodal recordings, namely electroencephalography (EEG), inertial measurement unit (IMU) data, and egocentric video, together with systematic benchmark evaluations for studying the impact of motion artifacts. Specifically, we collect data from 36 participants across different motion dynamics, including body movements, walking, and navigation. We analyze the collected wearable EEG signals to understand the impact of motion artifacts across different conditions, and benchmark representative EEG signal enhancement techniques on our dataset. Furthermore, we explore two new case studies: cross-modal EEG signal enhancement and multi-dimensional human behavior understanding. These findings offer valuable insights into real-world wearable BCI deployment and new applications.
Motor Imagery (MI) is an emerging Brain-Computer Interface (BCI) paradigm where a person imagines body movements without physical action. By decoding scalp-recorded electroencephalography (EEG) signals, BCIs establish direct communication to control external devices, offering significant potential in prosthetics, rehabilitation, and human-computer interaction. However, existing solutions remain difficult to deploy. (i) Most employ independent, opaque models for each MI task, lacking a unified architectural foundation. Consequently, these models are trained in isolation, failing to learn robust representations from diverse datasets, resulting in modest performance. (ii) They primarily adopt fixed sensor deployment, whereas real-world setups vary in electrode number and placement, causing models to fail across configurations. (iii) Performance degrades sharply under low-SNR conditions typical of consumer-grade EEG. To address these challenges, we present NeuroPath, a neural architecture for robust MI decoding. NeuroPath takes inspiration from the brain's signal pathway from cortex to scalp, utilizing a deep neural architecture with specialized modules for signal filtering, spatial representation learning, and feature classification, enabling unified decoding. To handle varying electrode configurations, we introduce a spatially aware graph adapter accommodating different electrode numbers and placements. To enhance robustness under low-SNR conditions, NeuroPath incorporates multimodal auxiliary training to refine EEG representations and stabilize performance on noisy real-world data. Evaluations on three consumer-grade and three medical-grade public datasets demonstrate that NeuroPath achieves superior performance.
Digitizing magnetic media containing computer data is only the first step towards the preservation of early home computing era artifacts. The audio tape images must be decoded, verified, repaired if necessary, tested, and documented. If parts of this process could be effectively automated, volunteers could focus on contributing contextual and historical knowledge rather than struggling with technical tools. We therefore propose a feature representation based on Checksum Count Vectors and evaluate its applicability to detecting duplicates and variants of recordings within a large data store. The approach was tested on a collection of decoded tape images (n=4902), achieving 58\% accuracy in detecting variants and 97\% accuracy in identifying alternative copies, for damaged recordings with up to 75\% of records missing. These results represent an important step towards fully automated pipelines for restoration, de-duplication, and semantic integration of historical digital artifacts through sequence matching, automatic repair and knowledge discovery.
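The exact construction of a Checksum Count Vector is not spelled out here, so the sketch below takes one plausible reading, a multiset of per-record CRC32 checksums compared with a containment score, purely to illustrate why such a representation can tolerate heavily damaged copies. The function names and the comparison rule are illustrative, not the paper's method.

```python
import zlib
from collections import Counter

def checksum_counts(records):
    """Multiset of per-record checksums (one plausible reading of the representation)."""
    return Counter(zlib.crc32(r) for r in records)

def containment(a, b):
    """Fraction of the smaller recording's records found in the other one."""
    shared = sum((a & b).values())
    return shared / max(1, min(sum(a.values()), sum(b.values())))

tape_a = [bytes([i] * 32) for i in range(100)]   # reference recording (toy records)
tape_b = tape_a[:25]                              # damaged copy with 75% of records missing
print(containment(checksum_counts(tape_a), checksum_counts(tape_b)))  # -> 1.0
```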
Multi-person social interactions are inherently built on coherence and relationships among all individuals within the group, making multi-person localization and body pose estimation essential to understanding these social dynamics. One promising approach is 2D-to-3D pose lifting which provides a 3D human pose consisting of rich spatial details by building on the significant advances in 2D pose estimation. However, the existing 2D-to-3D pose lifting methods often neglect inter-person relationships or cannot handle varying group sizes, limiting their effectiveness in multi-person settings. We propose MuPPet, a novel multi-person 2D-to-3D pose lifting framework that explicitly models inter-person correlations. To leverage these inter-person dependencies, our approach introduces Person Encoding to structure individual representations, Permutation Augmentation to enhance training diversity, and Dynamic Multi-Person Attention to adaptively model correlations between individuals. Extensive experiments on group interaction datasets demonstrate MuPPet significantly outperforms state-of-the-art single- and multi-person 2D-to-3D pose lifting methods, and improves robustness in occlusion scenarios. Our findings highlight the importance of modeling inter-person correlations, paving the way for accurate and socially-aware 3D pose estimation. Our code is available at: this https URL
This paper presents an analytical framework to study the geometry arising when a soft continuum arm grasps a planar object. Both the arm centerline and the object boundary are modeled as smooth curves. The grasping problem is formulated as a kinematic boundary following problem, in which the object boundary acts as the arm's 'shadow curve'. This formulation leads to a set of reduced kinematic equations expressed in terms of relative geometric shape variables, with the arm curvature serving as the control input. An optimal control problem is formulated to determine feasible arm shapes that achieve optimal grasping configurations, and its solution is obtained using Pontryagin's Maximum Principle. Based on the resulting optimal grasp kinematics, a class of continuum grasp quality metrics is proposed using the algebraic properties of the associated continuum grasp map. Feedback control aspects in the dynamic setting are also discussed. The proposed methodology is illustrated through systematic numerical simulations.
Accurate volume estimation of objects from visual data is a long-standing challenge in computer vision with significant applications in robotics, logistics, and smart health. Existing methods often rely on complex 3D reconstruction pipelines or struggle with the ambiguity inherent in single-view images. To address these limitations, we introduce a new method that fuses implicit 3D cues from stereo vision with explicit prior knowledge from natural language text. Our approach extracts deep features from a stereo image pair and a descriptive text prompt that contains the object's class and an approximate volume, then integrates them using a simple yet effective projection layer into a unified, multi-modal representation for regression. We conduct extensive experiments on public datasets demonstrating that our text-guided approach significantly outperforms vision-only baselines. Our findings show that leveraging even simple textual priors can effectively guide the volume estimation task, paving the way for more context-aware visual measurement systems. Code: this https URL.
Simultaneous Speech Translation (SimulST) requires balancing high translation quality with low latency. Recent work introduced REINA, a method that trains a Read/Write policy based on estimating the information gain of reading more audio. However, we find that information-based policies often lack temporal context, leading the policy to bias itself toward reading most of the audio before starting to write. We improve REINA using two distinct strategies: a supervised alignment network (REINA-SAN) and a timestep-augmented network (REINA-TAN). Our results demonstrate that while both methods significantly outperform the baseline and resolve stability issues, REINA-TAN provides a slightly superior Pareto frontier for streaming efficiency, whereas REINA-SAN offers more robustness against 'read loops'. Applied to Whisper, both methods improve the Pareto frontier of streaming efficiency, as measured by Normalized Streaming Efficiency (NoSE) scores, by up to 7.1% over existing competitive baselines.
Hybrid approaches that combine data-driven learning with physics-based insight have shown promise for improving the reliability of industrial condition monitoring. This work develops a hybrid condition monitoring framework that integrates primary sensor measurements, lagged temporal features, and physics-informed residuals derived from nominal surrogate models. Two hybrid integration strategies are examined. The first is a feature-level fusion approach that augments the input space with residual and temporal information. The second is a model-level ensemble approach in which machine learning classifiers trained on different feature types are combined at the decision level. Both hybrid approaches of the condition monitoring framework are evaluated on a continuous stirred-tank reactor (CSTR) benchmark using several machine learning models and ensemble configurations. Both feature-level and model-level hybridization improve diagnostic accuracy relative to single-source baselines, with the best model-level ensemble achieving a 2.9\% improvement over the best baseline ensemble. To assess predictive reliability, conformal prediction is applied to quantify coverage, prediction-set size, and abstention behavior. The results show that hybrid integration enhances uncertainty management, producing smaller and well-calibrated prediction sets at matched coverage levels. These findings demonstrate that lightweight physics-informed residuals, temporal augmentation, and ensemble learning can be combined effectively to improve both accuracy and decision reliability in nonlinear industrial systems.
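The feature-level fusion strategy can be sketched compactly: augment each time step's raw measurements with lagged values and with residuals against a nominal surrogate, then train a single classifier on the fused features. The surrogate, data, and classifier below are placeholders, not the CSTR benchmark or the paper's model selection.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def build_features(y, surrogate_pred, n_lags=3):
    """Fuse raw measurements, physics-informed residuals, and lagged values."""
    resid = y - surrogate_pred                      # residuals against the nominal surrogate
    rows = []
    for t in range(n_lags, len(y)):
        lags = y[t - n_lags:t].ravel()              # temporal context
        rows.append(np.concatenate([y[t], resid[t], lags]))
    return np.array(rows)

rng = np.random.default_rng(0)
T = 500
y = rng.standard_normal((T, 2))                     # toy 2-sensor measurements
surrogate = np.zeros((T, 2))                        # assumed nominal surrogate: zero-mean output
labels = (np.arange(T) > T // 2).astype(int)        # second half labeled "faulty"
y[labels == 1] += 0.5                               # fault shifts the measurements

X = build_features(y, surrogate)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels[3:])
print(clf.score(X, labels[3:]))
```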
Synthetic aperture radar (SAR) imaging can be exploited to enhance wireless communication performance through high-precision environmental awareness. However, integrating sensing and communication functionalities in such wideband systems remains challenging, motivating the development of a joint SAR and communication (JSARC) framework. We propose a dynamic time-division JSARC (TD-JSARC) framework for secure aerial communications that is relevant for critical scenarios, such as surveillance or post-disaster communication, where conventional localization of mobile adversaries often fails. In particular, we consider a secure downlink communication scenario where an aerial base station (ABS) serves a ground user (UE) in the presence of a ground-moving eavesdropper. To detect and track the eavesdropper, the ABS uses cognitive SAR along-track interferometry (ATI) to estimate its position and velocity. Based on these estimates, the ABS applies adaptive beamforming and artificial-noise jamming to enhance secrecy. To this end, we jointly optimize the time and power allocation to maximize the worst-case secrecy rate, while satisfying both SAR and communication constraints. Using the estimated eavesdropper trajectory, we formulate the problem as a Markov decision process (MDP) and solve it via deep reinforcement learning (DRL). Simulation results show that the proposed learning-based approach outperforms both learning and non-learning baseline schemes employing equal-aperture and random time allocation. The proposed method also generalizes well to previously unseen eavesdropper motion patterns.
Managing stock efficiently remains a core issue in modern logistics, where companies must reconcile cost efficiency with dependable service despite unpredictable market conditions. Conventional models often overlook the direct connection between investment in inventory and overall financial performance. This study introduces a data-driven decision framework that combines stochastic simulations with a profit-oriented optimization routine to enhance decision-making under uncertainty. The simulation stage generates performance estimates across multiple operating scenarios, providing realistic data on expenditures, revenues, and service reliability. These outcomes inform a fractional optimization process that searches for policies yielding the highest financial returns while maintaining required availability levels. The algorithm iteratively refines parameter values through feedback between simulated outcomes and optimization results, ensuring adaptability to dynamic enterprise systems. Computational experiments using representative business settings confirm that this approach improves both service consistency and financial yield. Overall, the framework demonstrates a practical, data-driven path for firms seeking to align operational responsiveness with sustainable profitability.
End-to-end full-duplex Speech Language Models (SLMs) require precise turn-taking for natural interaction. However, optimizing temporal dynamics via standard raw-token reinforcement learning (RL) degrades semantic quality, causing severe generative collapse and repetition. We propose ASPIRin, an interactivity-optimized RL framework that explicitly decouples when to speak from what to say. Using Action Space Projection, ASPIRin maps the text vocabulary into a coarse-grained binary state (active speech vs. inactive silence). By applying Group Relative Policy Optimization (GRPO) with rule-based rewards, it balances user interruption and response latency. Empirical evaluations show ASPIRin optimizes interactivity across turn-taking, backchanneling, and pause handling. Crucially, isolating timing from token selection preserves semantic coherence and reduces the proportion of duplicate n-grams by over 50% compared to standard GRPO, effectively eliminating degenerative repetition.
In this exploratory paper we introduce the problem of cognitive agents that learn how to modify their environment according to local sensing in order to reach a global goal. We concentrate on discrete dynamics (cellular automata) in a two-dimensional system. We show that agents may learn how to approximate their goal when the environment is passive, while this task becomes impossible if the environment follows active dynamics.
In mathematics and engineering, control theory is concerned with the analysis of dynamical systems through the application of suitable control inputs. One of the prominent problems in control theory is controllability, which concerns whether there exists a control input that can steer a dynamical system from an initial state to a desired final state within a finite time horizon. There is a general theory for controlling linear or linearizable systems, but it cannot be applied to discrete systems such as cellular automata, which is the problem we address in this paper. We develop a general theory for linear (and affine) cellular automata, and apply it to examples of one-dimensional and two-dimensional Boolean cases. We introduce the concept of a controllability matrix and show that controllability holds if and only if the controllability matrix is invertible.
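The controllability-matrix criterion can be illustrated on a toy linear CA. The sketch below takes a periodic rule-90 automaton, which is linear over GF(2), drives a single cell for $n$ steps, and checks the rank of $[B, AB, \dots, A^{n-1}B]$ with GF(2) elimination; the single-cell input structure is an illustrative choice, not the paper's construction.

```python
import numpy as np

def gf2_rank(M):
    """Rank of a 0/1 integer matrix over GF(2) via Gaussian elimination."""
    M = M.copy() % 2
    r = 0
    for c in range(M.shape[1]):
        piv = np.nonzero(M[r:, c])[0]
        if piv.size == 0:
            continue
        M[[r, r + piv[0]]] = M[[r + piv[0], r]]                    # move pivot row up
        M[(M[:, c] == 1) & (np.arange(len(M)) != r)] ^= M[r]       # clear column c elsewhere
        r += 1
    return r

# Periodic rule-90 CA on n cells: x_{t+1} = A x_t + B u_t over GF(2),
# with the control acting on a single cell (illustrative choice).
n = 8
A = (np.roll(np.eye(n, dtype=int), 1, axis=1) + np.roll(np.eye(n, dtype=int), -1, axis=1)) % 2
B = np.zeros((n, 1), dtype=int)
B[0, 0] = 1

C = np.hstack([np.linalg.matrix_power(A, k) @ B % 2 for k in range(n)]) % 2
print("controllable over GF(2):", gf2_rank(C) == n)
```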
Multi-output Gaussian Processes provide principled uncertainty-aware learning of vector-valued fields but are difficult to deploy in large-scale, distributed, and streaming settings due to their computational and centralized nature. This paper proposes a Consensus-based Recursive Multi-Output Gaussian Process (CRMGP) framework that combines recursive inference on shared basis vectors with neighbour-to-neighbour information-consensus updates. The resulting method supports parallel, fully distributed learning with bounded per-step computation while preserving inter-output correlations and calibrated uncertainty. Experiments on synthetic wind fields and real LiDAR data demonstrate that CRMGP achieves competitive predictive performance and reliable uncertainty calibration, offering a scalable alternative to centralized Gaussian process models for multi-agent sensing applications.
Intelligent operation of thermal energy networks aims to improve energy efficiency, reliability, and operational flexibility through data-driven control, predictive optimization, and early fault detection. Achieving these goals relies on sufficient observability, requiring continuous and well-distributed monitoring of thermal and hydraulic states. However, district heating systems are typically sparsely instrumented and frequently affected by sensor faults, limiting monitoring coverage. Virtual sensing offers a cost-effective means to enhance observability, yet its development and validation remain limited in practice. Existing data-driven methods generally assume dense synchronized data, while analytical models rely on simplified hydraulic and thermal assumptions that may not adequately capture the behavior of heterogeneous network topologies. Consequently, modeling the coupled nonlinear dependencies between pressure, flow, and temperature under realistic operating conditions remains challenging. In addition, the lack of publicly available benchmark datasets hinders systematic comparison of virtual sensing approaches. To address these challenges, we propose a heterogeneous spatial-temporal graph neural network (HSTGNN) for constructing virtual smart heat meters. The model incorporates the functional relationships inherent in district heating networks and employs dedicated branches to learn graph structures and temporal dynamics for flow, temperature, and pressure measurements, thereby enabling the joint modeling of cross-variable and spatial correlations. To support further research, we introduce a controlled laboratory dataset collected at the Aalborg Smart Water Infrastructure Laboratory, providing synchronized high-resolution measurements representative of real operating conditions. Extensive experiments demonstrate that the proposed approach significantly outperforms existing baselines.
Rendering large-scale, unbounded scenes on AR/VR-class devices is constrained by the computation, bandwidth, and storage cost of 3D Gaussian Splatting (3DGS). We propose a low-power, low-cost 3DGS hardware accelerator that renders full-HD images in real time, together with a hardware-friendly compression pipeline that combines iterative Gaussian pruning and fine-tuning, progressive spherical harmonics (SH) degree reduction, and vector quantization of all SH coefficients and colors. The scheme achieves a $51.6\times$ model-size reduction with a 0.743 dB PSNR loss. The accelerator uses a frame-level pipeline that integrates point-based culling and projection with tile-based sorting and rasterization, skips zero-Jacobian matrix multiplications (reducing processing elements by 63\% and computation by 53\%), and adopts comparison-free tile-based sorting with deterministic latency. Implemented in a TSMC 28-nm process at 800 MHz, the design occupies $0.66~\text{mm}^2$ with 1.1438 M gates and 120 kB SRAM, consumes 0.219 W, and delivers 1219 Mpixels/J at 267.5 Mpixels/s, enabling 1080p at 129 FPS. Overall, it is $5.98\times$ smaller in area, delivers $5.94\times$ higher throughput, and achieves $7.5\times$ higher energy efficiency than prior 3DGS accelerators.
Mutual coupling is a dominant systematic effect in dense reflector arrays, imprinting direction-dependent and frequency-dependent structure on embedded element patterns (EEPs) and currently limiting sensitivity in precision radio measurements. Accurate modelling of these effects requires full-wave simulations of structures that are electrically large at both the array and element levels, making conventional approaches computationally prohibitive. We present a Method-of-Moments (MoM) framework accelerated by a fast direct solver (FDS). The rotational symmetry of reflector dishes is exploited to efficiently compress self-interaction blocks of the impedance matrix. Mutual interactions are treated using a broadband multipole decomposition that remains efficient and accurate for closely spaced elements. We demonstrate the method on arrays of tens of reflectors from the Hydrogen Epoch of Reionization Array (HERA) telescope. To scale to larger arrays, the FDS is used to construct macro-basis functions (MBFs) from a smaller representative array and embed them within a conventional MBF scheme. This allows the first computation of EEPs for the 320-element HERA core on a 128-core workstation.
We provide an approach that closely estimates an organization's cyber resources directly from vulnerability timestamps, using a non-stationary queueing framework. Traditional attack-surface metrics operate on static snapshots, ignoring the core attack-defense dynamics within information systems, which exhibit bursty, heavy-tailed, and capacity-constrained behavior. Our approach to modeling such dynamics is based on a queueing abstraction of attack surfaces. We utilize a segmentation method to identify piecewise-stationary regimes via Gaussian mixture modeling (GMM) of queue length distributions. We fit segment-specific arrival, service, and resource parameters through the minimization of Kullback--Leibler divergence (KL) between the empirical and estimated distributions. Applied to both large-scale software supply chain data and multi-year private logistics enterprise cyber-ticket workflows, the model estimates organizational resources, measured as the time-varying number of active personnel and the output rate per person, solely from bug-report and fix timings for software supply chains, and from discovery and patch timestamps in the enterprise setting. Our results provide 91--96\% accuracy in resource estimation, making the dynamic queueing framework a compelling approach for understanding attack surface dynamics. Further, our framework exposes resource bottlenecks, establishing a foundation for predictive workforce planning, patch-race modeling, and proactive cyber-risk management.
Reinforcement learning agent-based simulation (RL-ABS) has become an important tool for electricity market mechanism analysis and evaluation. In the modeling of monotone, bounded, multi-segment stepwise bids, existing methods typically let the policy network first output an unconstrained action and then convert it into a feasible bid curve satisfying monotonicity and boundedness through post-processing mappings such as sorting, clipping, or projection. However, such post-processing mappings often fail to satisfy continuous differentiability, injectivity, and invertibility at boundaries or kinks, thereby causing gradient distortion and leading to spurious convergence in simulation results. Meanwhile, most existing studies conduct mechanism analysis and evaluation mainly on the basis of training-curve convergence, without rigorously assessing the distance between the simulation outcomes and Nash equilibrium, which severely undermines the credibility of the results. To address these issues, this paper proposes...
In the limited-feedback downlink multiple-input single-output (MISO) non-orthogonal multiple access (NOMA) system, both the effective channel gain and the channel direction need to be quantized. The quantization error affects the feasible region of NOMA and the rate loss compared with the full channel state information (CSI) case. In this letter, we analyze this effect and obtain an upper bound on the rate loss. The numerical results show that the sum rate of the limited-feedback MISO-NOMA system approaches that of the full-CSI case as the number of feedback bits increases.
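For intuition about how the loss shrinks with feedback bits, the snippet below evaluates the classical random-vector-quantization error bound $E[\sin^2\theta] \le 2^{-B/(M-1)}$ for an $M$-antenna limited-feedback system and a resulting rate-loss proxy. This reproduces only the qualitative trend; it is not the letter's MISO-NOMA bound.

```python
import numpy as np

# Illustrative parameters: M transmit antennas, a fixed linear SNR, and a sweep
# over feedback budgets B for the channel-direction quantization.
M, snr_db = 4, 10.0
snr = 10 ** (snr_db / 10)
for B in (4, 8, 12, 16):
    err = 2.0 ** (-B / (M - 1))                         # RVQ quantization-error bound
    loss = np.log2(1 + snr * err)                       # simple rate-loss proxy (bps/Hz)
    print(f"B={B:2d}  E[sin^2] <= {err:.4f}  rate-loss proxy <= {loss:.3f} bps/Hz")
```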
We present a scalable method for geolocalizing buried fiber-optic cables using Distributed Acoustic Sensing (DAS) and traffic-induced quasi-static seismic signals. Assuming access to one end of the fiber, the method fuses DAS measurements with vehicle trajectories obtained from either video tracking or vehicle-mounted GPS. The fiber geometry is estimated by minimizing the mismatch between the measured and physics-based synthetic strain-rate maps. The framework combines a matched-filter initialization with neural-network-based trajectory optimization, enabling robust convergence under realistic noise and trajectory-uncertainty conditions. Simulation and field experiments demonstrate sub-meter localization accuracy, often on the order of tens of centimeters, and strong agreement with manual calibration by tap-testing. This approach provides a practical tool for mapping poorly documented underground fiber infrastructure and for supporting urban sensing applications.
In this paper, we develop a distributed algorithm for solving a class of distributed convex optimization problems where the local objective functions can be general nonsmooth functions, and all equalities and inequalities are coupled network-wide. This type of problem arises in many areas, such as economic dispatch, network utility maximization, and demand response. Integrating decomposition by right-hand-side allocation with primal-dual methods, the proposed algorithm is able to handle distributed optimization over networks with a time-varying directed graph in a fully distributed fashion. The algorithm does not require the communication of sensitive information, such as primal variables, thereby preserving privacy. Further, we show that the proposed algorithm is guaranteed to achieve an $O(1/k)$ rate of convergence in terms of optimality based on duality analysis, under the condition that the local objective functions are strongly convex but not necessarily differentiable, and that the subdifferentials of the local inequalities are bounded. We simulate the proposed algorithm to demonstrate its strong performance.
We develop a queueing-theoretic framework to model the temporal evolution of cyber-attack surfaces, where the number of active vulnerabilities is represented as the backlog of a queue. Vulnerabilities arrive as they are discovered or created, and leave the system when they are patched or successfully exploited. Building on this model, we study how automation affects attack and defense dynamics by introducing an AI amplification factor that scales arrival, exploit, and patching rates. Our analysis shows that even symmetric automation can increase the rate of successful exploits. We validate the model using vulnerability data collected from an open source software supply chain and show that it closely matches real-world attack surface dynamics. Empirical results reveal heavy-tailed patching times, which we prove induce long-range dependence in vulnerability backlog and help explain persistent cyber risk. Utilizing our queueing abstraction for the attack surface, we develop a systematic approach for cyber risk mitigation. We formulate the dynamic defense problem as a constrained Markov decision process with resource-budget and switching-cost constraints, and develop a reinforcement learning (RL) algorithm that achieves provably near-optimal regret. Numerical experiments validate the approach and demonstrate that our adaptive RL-based defense policies significantly reduce successful exploits and mitigate heavy-tail queue events. Using trace-driven experiments on the ARVO dataset, we show that the proposed RL-based defense policy reduces the average number of active vulnerabilities in a software supply chain by over 90% compared to existing defense practices, without increasing the overall maintenance budget. Our results allow defenders to quantify cumulative exposure risk under long-range dependent attack dynamics and to design adaptive defense strategies with provable efficiency.
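The effect of symmetric automation can be reproduced with a toy version of the queueing abstraction. In the sketch below, vulnerabilities arrive at rate $\lambda$ and each departs via patching or exploitation; an amplification factor $a$ scales all three rates, yet the exploit throughput still grows with $a$ (in steady state it behaves like $a\,\lambda\mu_e/(\mu_p+\mu_e)$). All rates are illustrative, not fitted to the paper's data.

```python
import numpy as np

def exploits_per_unit_time(a, lam=1.0, mu_p=1.5, mu_e=0.3, horizon=10_000, seed=0):
    """Gillespie-style simulation of the vulnerability backlog with AI factor a."""
    rng = np.random.default_rng(seed)
    lam, mu_p, mu_e = a * lam, a * mu_p, a * mu_e     # symmetric amplification
    t, backlog, exploited = 0.0, 0, 0
    while t < horizon:
        rates = [lam, backlog * mu_p, backlog * mu_e]  # arrival, patch, exploit
        total = sum(rates)
        t += rng.exponential(1.0 / total)
        event = rng.choice(3, p=np.array(rates) / total)
        if event == 0:
            backlog += 1
        else:
            backlog -= 1
            exploited += int(event == 2)
    return exploited / horizon

for a in (1.0, 2.0, 4.0):
    print(f"a={a}: exploits per unit time ~ {exploits_per_unit_time(a):.2f}")
```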
We present a distributionally robust PAC-Bayesian framework for certifying the performance of learning-based finite-horizon controllers. While existing PAC-Bayes control literature typically assumes bounded losses and matching training and deployment distributions, we explicitly address unbounded losses and environmental distribution shifts (the sim-to-real gap). We achieve this by drawing on two modern lines of research, namely the PAC-Bayes generalization theory and distributionally robust optimization via the type-1 Wasserstein distance. By leveraging the System Level Synthesis (SLS) reparametrization, we derive a sub-Gaussian loss proxy and a bound on the performance loss due to distribution shift. Both are tied directly to the operator norm of the closed-loop map. For linear time-invariant systems, this yields a computationally tractable optimization-based framework together with high-probability safety certificates for deployment in real-world environments that differ from those used in training.
Collecting large, aligned cross-modal datasets for music-flavor research is difficult because perceptual experiments are costly and small by design. We address this bottleneck through two complementary experiments. The first tests whether audio-flavor correlations, feature-importance rankings, and latent-factor structure transfer from an experimental soundtracks collection (257~tracks with human annotations) to a large FMA-derived corpus ($\sim$49,300 segments with synthetic labels). The second validates computational flavor targets -- derived from food chemistry via a reproducible pipeline -- against human perception in an online listener study (49~participants, 20~tracks). Results from both experiments converge: the quantitative transfer analysis confirms that cross-modal structure is preserved across supervision regimes, and the perceptual evaluation shows significant alignment between computational targets and listener ratings (permutation $p<0.0001$, Mantel $r=0.45$, Procrustes $m^2=0.51$). Together, these findings support the conclusion that sonic seasoning effects are present in synthetic FMA annotations. We release datasets and companion code to support reproducible cross-modal AI research.
We study whether second-order systems can be made to behave like prescribed first-order dynamical systems through feedback control. More precisely, we study whether prescribed vector fields on compact smooth manifolds, viewed geometrically as sections of the tangent bundle, can be asymptotically stabilized in a strong sense by second-order control systems on the base manifold. Our class of second-order systems includes most Lagrangian systems, and we obtain both positive and negative results. The positive result asserts that, for fully actuated systems, the section corresponding to any smooth vector field can be made globally exponentially stable, normally hyperbolic, and more. In particular, not only does each closed-loop solution asymptotically have the prescribed velocities, but it also converges to a trajectory of the first-order dynamics generated by the prescribed vector field at an exponential rate. Thus, the closed-loop second-order system asymptotically reproduces the prescribed first-order dynamics. In contrast, the negative result asserts that, for underactuated systems on manifolds with nonzero Euler characteristic, sections corresponding to "almost all" smooth vector fields cannot even be locally asymptotically stabilized. This includes, in particular, all vector fields with only isolated zeros. An example shows that the Euler characteristic assumption is necessary for the negative result.
We present Audio Flamingo Next (AF-Next), the next-generation and most capable large audio-language model in the Audio Flamingo series, designed to advance understanding and reasoning over speech, environmental sounds and music. Compared to Audio Flamingo 3, AF-Next introduces: (i) a stronger foundational audio-language model that significantly improves accuracy across diverse audio understanding tasks; (ii) scalable strategies for constructing large-scale audio understanding and reasoning data beyond existing academic benchmarks; (iii) support for long and complex audio inputs up to 30 minutes; and (iv) Temporal Audio Chain-of-Thought, a new reasoning paradigm that explicitly grounds intermediate reasoning steps to timestamps in long audio, enabling fine-grained temporal alignment and improved interpretability. To enable these capabilities, we first conduct a systematic analysis of Audio Flamingo 3 to identify key gaps in audio understanding and reasoning. We then curate and scale new large-scale datasets totaling over 1 million hours to address these limitations and expand the existing AudioSkills-XL, LongAudio-XL, AF-Think and AF-Chat datasets. AF-Next is trained using a curriculum-based strategy spanning pre-training, mid-training, and post-training stages. Extensive experiments across 20 audio understanding and reasoning benchmarks, including challenging long-audio tasks, show that AF-Next outperforms similarly sized open models by large margins and remains highly competitive with, and sometimes surpasses, much larger open-weight and closed models. Beyond benchmark performance, AF-Next exhibits strong real-world utility and transfers well to unseen tasks, highlighting its robustness and generalization ability. In addition to all data, code, and methods, we open-source 3 variants of AF-Next, including AF-Next-Instruct, AF-Next-Think and AF-Next-Captioner.
Large-scale three-dimensional (3D) scene reconstruction in low-altitude intelligent networks (LAIN) demands highly efficient wireless image transmission. However, existing schemes struggle to balance severe pilot overhead with the transmission accuracy required to maintain reconstruction fidelity. To strike a balance between efficiency and reliability, this paper proposes a novel deep learning-based end-to-end (E2E) transceiver design that integrates 3D Gaussian Splatting (3DGS) directly into the training process. By jointly optimizing the communication modules via the combined 3DGS rendering loss, our approach explicitly improves scene recovery quality. Furthermore, this task-driven framework enables the use of a sparse pilot scheme, significantly reducing transmission overhead while maintaining robust image recovery under low-altitude channel conditions. Extensive experiments on real-world aerial image datasets demonstrate that the proposed E2E design significantly outperforms existing baselines, delivering superior transmission performance and accurate 3D scene reconstructions.
Herding, where investors imitate others' decisions rather than relying on their own analysis, is a prevalent phenomenon in financial markets. Excessive herding distorts rational decisions, amplifies volatility, and can be exploited by manipulators to harm the market. Traditional regulatory tools, such as information disclosure and transaction restrictions, are often imprecise and lack theoretical guarantees for effectiveness. This calls for a quantitative approach to regulating herding. We propose a regulator-leader-follower trilateral game framework based on optimal control theory to study the complex dynamics among them. The leader makes rational decisions, the follower maximizes utility while aligning with the leader's decisions, whereas the regulator designs a mechanism to maximize social welfare and minimize regulatory cost. We derive the follower's decisions and the regulator's mechanisms, theoretically analyze the impact of regulation on decisions, and investigate effective mechanisms to improve social welfare.
Multi-path sensing, which aims to extract the geometric attributes of multiple propagation paths, is expected to be a key functionality of 6G. A movable antenna (MA) can enable this functionality by creating a synthetic aperture through sequential mechanical motion. However, existing MA-based sensing methods typically rely on exhaustive scanning over the entire movable plate, resulting in significant control overhead and sensing latency, which limits their practicality for agile sensing. To address this challenge, this paper develops a prior-guided agile multi-path sensing framework that leverages weak prior angle-of-arrival (AoA) statistics as side information. The proposed framework comprises two steps. First, the movable plate's three-dimensional orientation is optimized only once to maximize path visibility while preserving path discriminability, both induced from Fisher information analysis. Second, only two predetermined linear MA scans are made on the tilted plate to estimate the elevation and azimuth AoAs from the resulting sequence of received signals. By incorporating the prior AoA statistics, a maximum a posteriori (MAP)-based AoA estimation algorithm is developed. With only one orientation control and two linear scans, the proposed framework enables agile multi-path sensing with significantly reduced control overhead and latency, while achieving AoA estimation accuracy approaching that of the single-path benchmark.
Incentive design problems consider a system planner who steers self-interested agents toward a socially optimal Nash equilibrium by issuing incentives in the presence of information asymmetry, that is, uncertainty about the agents' cost functions. A common approach formulates the problem as a Mathematical Program with Equilibrium Constraints (MPEC) and optimizes incentives using hypergradients, i.e., the total derivatives of the planner's objective with respect to the incentives. However, computing or approximating the hypergradients typically requires full or partial knowledge of equilibrium sensitivities to incentives, which is generally unavailable under information asymmetry. In this paper, we propose a hypergradient-free incentive law, called the social-gradient flow, for incentive design when the planner's social cost depends on the agents' joint actions. We prove that the social cost gradient is always a descent direction for the planner's objective, irrespective of the agent cost landscape. In the idealized setting where equilibrium responses are observable, the social-gradient flow converges to the unique socially optimal incentive. When equilibria are not directly observable, the social-gradient flow emerges as the slow-timescale limit of a two-timescale interaction, in which agents' strategies evolve on a faster timescale. It is established that the joint strategy-incentive dynamics converge to the social optimum for any agent learning rule that asymptotically tracks the equilibrium. Theoretical results are also validated via numerical experiments.
The dominant paradigm for building LLM-based agents is the Agent Loop, an iterative cycle where a single language model decides what to do next by reading an ever-growing context window. This paradigm has three structural weaknesses: implicit dependencies between steps, unbounded recovery loops, and mutable execution history that complicates debugging. We characterize the Agent Loop as a single-ready-unit scheduler: at any moment, at most one executable unit is active, and the choice of which unit to activate comes from opaque LLM inference rather than an inspectable policy. This perspective places Agent Loops and graph-based execution engines on a single semantic continuum. We propose SGH, Structured Graph Harness, which lifts control flow from implicit context into an explicit static DAG. SGH makes three commitments: execution plans are immutable within a plan version; planning, execution, and recovery are separated into three layers; and recovery follows a strict escalation protocol. These choices trade some expressiveness for controllability, verifiability, and implementability. Our contributions are fourfold: a unified scheduler framework that applies classical scheduling theory to LLM agent execution and identifies challenges introduced by non-deterministic LLM nodes; a trade-off analysis of controllability, expressiveness, and implementability across 70 surveyed systems; a formal specification including a node state machine with termination and soundness guarantees; and an attributable experimental framework with a seven-group design for future validation. This is a position paper and design proposal. We provide a theoretical framework, design analysis, and experimental protocol, not a production implementation or empirical results.
We present a framework for bridging the gap between sensor attack detection and recovery in cyber-physical systems. The proposed framework models modern-day, complex perception pipelines as bipartite graphs, which combined with anomaly detector alerts defines a Bayesian network for inferring compromised sensors. An active probing strategy exploits system nonlinearities to maximize distinguishability between attack hypotheses, while compromised sensors are selectively disabled to maintain reliable state estimation. We propose a threshold-based probing strategy and show its effectiveness via a simplified partially observable Markov decision process (POMDP) formulation. Experiments on an inverted pendulum under single and multi-sensor attacks show that our method significantly outperforms outlier-robust and prediction-based baselines, especially under prolonged attacks.
Ensuring operational safety is critical for human-to-humanoid motion imitation. This paper presents a vision-based framework that enables a humanoid robot to imitate human movements while avoiding collisions. Human skeletal keypoints are captured by a single camera and converted into joint angles for motion retargeting. Safety is enforced through a Control Barrier Function (CBF) layer formulated as a Quadratic Program (QP), which filters imitation commands to prevent both self-collisions and human-robot collisions. Simulation results validate the effectiveness of the proposed framework for real-time collision-aware motion imitation.
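The safety-filter step described in this abstract follows a standard CBF-QP pattern, which can be sketched independently of the humanoid setting. Below is a minimal illustration for a planar single-integrator toy model with a circular obstacle; the barrier function, gain, and obstacle parameters are illustrative assumptions, not the paper's formulation.

```python
import numpy as np
import cvxpy as cp

def cbf_qp_filter(x, u_des, x_obs, r=0.5, alpha=1.0):
    """Minimally modify a desired command so the safe set {h(x) >= 0} stays invariant.

    Toy single-integrator model x_dot = u with a circular obstacle:
        h(x) = ||x - x_obs||^2 - r^2,   h_dot = 2 (x - x_obs)^T u,
    so the CBF condition h_dot + alpha * h >= 0 is linear in u and the filter is a QP.
    """
    h = float(np.sum((x - x_obs) ** 2) - r ** 2)
    grad_h = 2.0 * (x - x_obs)

    u = cp.Variable(2)
    objective = cp.Minimize(cp.sum_squares(u - u_des))   # stay close to the imitation command
    constraints = [grad_h @ u + alpha * h >= 0]          # CBF safety constraint
    cp.Problem(objective, constraints).solve()
    return u.value

# Example: the desired command points straight at the obstacle; the filter deflects it.
x = np.array([0.0, 0.0])
x_obs = np.array([1.0, 0.0])
u_des = np.array([1.0, 0.0])
print(cbf_qp_filter(x, u_des, x_obs))
```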
Artificial intelligence (AI) is moving increasingly beyond prediction to support decisions in complex, uncertain, and dynamic environments. This shift creates a natural intersection with operations research and management sciences (OR/MS), which have long offered conceptual and methodological foundations for sequential decision-making under uncertainty. At the same time, recent advances in deep learning, including feedforward neural networks, LSTMs, transformers, and deep reinforcement learning, have expanded the scope of data-driven modeling and opened new possibilities for large-scale decision systems. This tutorial presents an OR/MS-centered perspective on deep learning for sequential decision-making under uncertainty. Its central premise is that deep learning is valuable not as a replacement for optimization, but as a complement to it. Deep learning brings adaptability and scalable approximation, whereas OR/MS provides the structural rigor needed to represent constraints, recourse, and uncertainty. The tutorial reviews key decision-making foundations, connects them to the major neural architectures in modern AI, and discusses leading approaches to integrating learning and optimization. It also highlights emerging impact in domains such as supply chains, healthcare and epidemic response, agriculture, energy, and autonomous operations. More broadly, it frames these developments as part of a wider transition from predictive AI toward decision-capable AI and highlights the role of OR/MS in shaping the next generation of integrated learning--optimization systems.
Traditionally, industrial control systems (ICS) were designed without security in mind, prioritizing availability and real-time communication. As these systems increasingly become targets of powerful adversaries, security can no longer be neglected. Driven by flexibility and automation needs, ICS are transitioning from wired to 5G communication, introducing new attack surfaces and a less reliable communication medium, thereby exacerbating existing security challenges. Given their critical role in society, a comprehensive evaluation of their security is imperative. To this end, we introduce SWICS, a fully virtual testbed simulating an ICS in a realistic 5G environment, and study how this transition affects security under varying channel conditions. Our results yield three key findings: under optimal channel conditions, industrial 5G networks can achieve resilience comparable to wired systems; degraded channel conditions can amplify traditional attacks, threaten system stability, and undermine detection mechanisms based on predictable traffic patterns; and securing 5G channels for ICS has inherent limits, which we demonstrate through eavesdropping and jamming on the open-air interface. Our work highlights the interplay between security and 5G channel conditions, showing that traditional security controls may no longer be sufficient and motivating further research.
Microscale manipulation has advanced substantially in controlled locomotion and targeted transport, yet many biomedical applications require precise and adaptive interaction with biological micro-objects. At these scales, manipulation is realized through three main classes of platforms: embodied microrobots that physically interact as mobile agents, field-mediated systems that generate contactless trapping or manipulation forces, and externally actuated end-effectors that interact through remotely driven physical tools. Unlike macroscale manipulators, these systems function in fluidic, confined, and surface-dominated environments characterized by negligible inertia, dominant interfacial forces, and soft, heterogeneous, and fragile targets. Consequently, classical assumptions of dexterous manipulation, including rigid-body contact, stable grasping, and rich proprioceptive feedback, become difficult to maintain. This review introduces micro-dexterity as a framework for analyzing biological micromanipulation through the coupled roles of embodiment, perception, and control. We examine how classical manipulation primitives, including pushing, reorientation, grasping, and cooperative manipulation, are reformulated at the microscale; compare the architectures that enable them, from contact-based micromanipulators to contactless field-mediated systems and cooperative multi-agent platforms; and review the perception and control strategies required for task execution. We identify the current dexterity gap between laboratory demonstrations and clinically relevant biological manipulation, and outline key challenges for future translation.
Foundation models, including large language models (LLMs), are increasingly used for human-in-the-loop (HITL) cyber-physical systems (CPS) because foundation model-based AI agents can potentially interact with both the physical environments and human users. However, the unpredictable behavior of human users and AI agents, in addition to the dynamically changing physical environments, leads to uncontrollable nondeterminism. To address this urgent challenge of enabling agentic AI-powered HITL CPS, we propose a reactor-model-of-computation (MoC)-based approach, realized by the open-source Lingua Franca (LF) framework. We also carry out a concrete case study using the agentic driving coach as an application of HITL CPS. By evaluating the LF-based agentic HITL CPS, we identify practical challenges in reintroducing determinism into such agentic HITL CPS and present pathways to address them.
Open-source software for cyber-physical systems (CPS) often lacks robust testing involving robotic platforms, resulting in critical errors that remain undetected. This is especially challenging when multiple modules of CPS software are developed by various open-source contributors. To address this gap, we propose Automated CPS Testing (ACT) that performs automated, continuous testing of open-source software with its robotic platforms, integrated with the open-source infrastructure such as GitHub. We implement an ACT prototype and conduct a case study on an open-source CPS with an educational robotic platform to demonstrate its capabilities.
Deep learning underpins a wide range of applications in MRI, including reconstruction, artifact removal, and segmentation. However, progress has been driven largely by public datasets focused on brain and knee imaging, shaping how models are trained and evaluated. As a result, careful studies of the reliability of these models across diverse anatomical settings remain limited. In this work, we introduce MosaicMRI, a large and diverse collection of fully sampled raw musculoskeletal (MSK) MR measurements designed for training and evaluating machine-learning-based methods. MosaicMRI is the largest open-source raw MSK MRI dataset to date, comprising 2,671 volumes and 80,156 slices. The dataset offers substantial diversity in volume orientation (e.g., axial, sagittal), imaging contrasts (e.g., PD, T1, T2), anatomies (e.g., spine, knee, hip, ankle, and others), and numbers of acquisition coils. Using VarNet as a baseline for the accelerated reconstruction task, we perform a comprehensive set of experiments to study scaling behavior with respect to both model capacity and dataset size. Interestingly, models trained on the combined anatomies significantly outperform anatomy-specific models in low-sample regimes, highlighting the benefits of anatomical diversity and the presence of exploitable cross-anatomical correlations. We further evaluate robustness and cross-anatomy generalization by training models on one anatomy (e.g., spine) and testing them on another (e.g., knee). Notably, we identify groups of body parts (e.g., foot and elbow) that generalize well with each other, and highlight that performance under domain shift depends on training set size, anatomy, and protocol-specific factors.
The stable operation of autonomous off-grid photovoltaic systems dictates reliance on solar forecasting algorithms that respect atmospheric thermodynamics. Contemporary deep learning models consistently exhibit critical anomalies, primarily severe temporal phase lags during cloud transients and physically impossible nocturnal power generation. To resolve this divergence between data-driven modeling and deterministic celestial mechanics, this research introduces the Thermodynamic Liquid Manifold Network. The proposed methodology projects 15 meteorological and geometric variables into a Koopman-linearized Riemannian manifold to systematically map complex climatic dynamics. The architecture integrates a Spectral Calibration unit and a multiplicative Thermodynamic Alpha-Gate. This system synthesizes real-time atmospheric opacity with theoretical clear-sky boundary models, structurally enforcing strict celestial geometry compliance. This completely neutralizes phantom nocturnal generation while maintaining zero-lag synchronization during rapid weather shifts. Validated against a rigorous five-year testing horizon in a severe semi-arid climate, the framework achieves an RMSE of 18.31 Wh/m$^2$ and a Pearson correlation of 0.988. The model strictly maintains a zero-magnitude nocturnal error across all 1826 testing days and exhibits a sub-30-minute phase response during high-frequency transients. Comprising exactly 63,458 trainable parameters, this ultra-lightweight design establishes a robust, thermodynamically consistent standard for edge-deployable microgrid controllers.
In a variational denoising model, the weight in the data fidelity term plays the role of enhancing the noise-removal capability. It is strongly correlated with the noise characteristics, while also balancing the data fidelity and regularization terms. However, assigning the weight becomes substantially more difficult when the noise pattern goes beyond independent and identically distributed Gaussian noise, e.g., impulse noise, stripe noise, or a mixture of several patterns. Furthermore, how to leverage the weight to balance the data fidelity and regularization terms is even less evident. In this work, we propose a data-driven loss weighting (DLW) scheme to address these issues. Specifically, DLW trains a parameterized weight function (i.e., a neural network) that maps the noisy image to the weight. Training is carried out in a bilevel optimization framework, where the lower-level problem solves several denoising models with the same weight predicted by the weight function and the upper-level problem minimizes the distance between the restored image and the clean image. In this way, information from both the noise and the regularization can be efficiently extracted to determine the weight function. DLW also makes it easy to apply a trained weight function to denoising models. Numerical results verify the remarkable performance of DLW in improving the ability of various variational denoising models to handle different complex noise. This implies that DLW can transfer noise knowledge learned at the model level to heterogeneous tasks beyond those seen in training. The generalization theory underlying DLW is also studied, validating its intrinsic transferability.
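The bilevel structure of DLW can be sketched compactly with an unrolled lower-level solver. In the toy sketch below, the lower level is a single weighted quadratic (Tikhonov-type) denoising model rather than the several models used in the paper, the weight predictor is a tiny CNN, and the noise is Gaussian; all of these are illustrative simplifications.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightNet(nn.Module):
    """Tiny CNN mapping a noisy image to a positive per-pixel fidelity weight (illustrative)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Softplus())
    def forward(self, y):
        return self.net(y)

def neg_laplacian(x):
    """Gradient of the smoothness term 0.5 * ||grad x||^2 (discrete, zero-padded)."""
    k = torch.tensor([[0., -1., 0.], [-1., 4., -1.], [0., -1., 0.]]).view(1, 1, 3, 3)
    return F.conv2d(x, k, padding=1)

def lower_level_denoise(y, w, lam=0.1, steps=30, lr=0.2):
    """Unrolled gradient descent on the weighted quadratic denoising model
       min_x 0.5 * sum(w * (x - y)**2) + 0.5 * lam * ||grad x||^2,
    kept differentiable in w so the upper level can backpropagate through it."""
    x = y.clone()
    for _ in range(steps):
        grad = w * (x - y) + lam * neg_laplacian(x)
        x = x - lr * grad
    return x

weight_net = WeightNet()
opt = torch.optim.Adam(weight_net.parameters(), lr=1e-3)

clean = torch.rand(4, 1, 32, 32)                   # stand-in training batch
noisy = clean + 0.1 * torch.randn_like(clean)      # the real scheme targets non-Gaussian noise

for _ in range(5):                                 # a few upper-level iterations
    w = weight_net(noisy)                          # weight predicted from the noisy image only
    restored = lower_level_denoise(noisy, w)       # lower level: solve the denoising model
    loss = ((restored - clean) ** 2).mean()        # upper level: distance to the clean image
    opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```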
In recent years, novel communication strategies have emerged to face the challenges posed by the increasing number of connected devices and the higher quality of transmitted information. Among them, semantic communication has obtained promising results, especially when combined with state-of-the-art deep generative models, such as large language models or diffusion models, which can regenerate content from extremely compressed semantic information. However, most of these approaches focus on single-user scenarios, processing the received content at the receiver on top of conventional communication systems. In this paper, we propose to go beyond these methods by developing a novel generative semantic communication framework tailored for multi-user scenarios. This system assigns the channel to users knowing that the lost information can be filled in by a diffusion model at the receivers. Under this innovative perspective, OFDMA systems should not aim to transmit the largest possible amount of information, but solely the bits necessary for the generative model to semantically regenerate the missing ones. A thorough experimental evaluation shows the capabilities of the novel diffusion model and the effectiveness of the proposed framework, leading toward a GenAI-based next generation of communications.
Multiplicative noise is widespread in radar images, medical images, and images from other important fields. Compared with common noise types, multiplicative noise generally has a stronger effect on the visual appearance of images. To address multiplicative-noise denoising, we linearize the nonlocal means algorithm with deep learning and propose a linear attention mechanism-based deep nonlocal means filtering (LDNLM). Starting from traditional nonlocal means filtering, we employ deep channel convolutional neural networks to extract information from the neighborhood matrix and obtain a representation vector for every pixel. We then replace the similarity calculation and weighted averaging with the inner operations of the attention mechanism. To reduce the computational overhead, we derive a nonlocal filter with linear complexity from the formulas for similarity calculation and weighted averaging. Experiments on both simulated and real multiplicative noise demonstrate that LDNLM is more competitive than state-of-the-art methods. Additionally, we show that LDNLM possesses interpretability close to that of traditional NLM. The source code and pre-trained model are available at this https URL.
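The linear-complexity step at the heart of LDNLM can be illustrated on its own: with a positive feature map, the softmax-free weighted average can be computed as phi(Q)(phi(K)^T V) with a matching normalizer, so the N x N similarity matrix is never materialized. The elu+1 feature map and the sizes below are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    """Weighted averaging in O(N * d^2) instead of O(N^2 * d).

    q, k: (N, d) per-pixel representation vectors, v: (N, c) pixel values.
    With a positive feature map phi, the softmax-free weights w_ij = phi(q_i) . phi(k_j)
    let the weighted average be rewritten as
        out_i = phi(q_i) @ (phi(K)^T V) / (phi(q_i) @ phi(K)^T 1),
    so the N x N similarity matrix is never formed.
    """
    phi = lambda x: F.elu(x) + 1.0      # positive feature map (assumed)
    q, k = phi(q), phi(k)
    kv = k.T @ v                        # (d, c): summarizes all "neighbors" once
    z = k.sum(dim=0)                    # (d,): normalizer accumulator
    return (q @ kv) / (q @ z + eps).unsqueeze(-1)

# Sanity check against the explicit quadratic computation on a small problem.
N, d, c = 64, 8, 1
q, k, v = torch.randn(N, d), torch.randn(N, d), torch.randn(N, c)
w = (F.elu(q) + 1.0) @ (F.elu(k) + 1.0).T
explicit = (w @ v) / w.sum(dim=1, keepdim=True)
print(float((linear_attention(q, k, v) - explicit).abs().max()))   # ~1e-6: same result
```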
In this work, we propose a compositional scheme based on small-gain reasoning to synthesize safety controllers for interconnected stochastic hybrid systems. In our proposed setting, we first offer an augmented scheme that characterizes each stochastic hybrid subsystem, endowed with both continuous evolution and instantaneous jumps, within a unified framework including both scenarios, implying that its state trajectories coincide with those of the original hybrid subsystem. We then introduce the concept of augmented control sub-barrier certificates (A-CSBCs) for each subsystem, thereby enabling the construction of an augmented control barrier certificate (A-CBC) for an interconnected network (from A-CSBCs of its subsystems) along with its safety controller under small-gain compositional conditions. We eventually leverage the constructed A-CBC to derive a guaranteed lower bound on the safety probability of the interconnected network. While in a monolithic scheme the computational complexity of synthesizing a control barrier certificate via sum-of-squares (SOS) optimization scales polynomially with the overall network size, the proposed compositional framework reduces this dependence to the subsystem size. We illustrate the efficacy of the proposed approach on an interconnected network comprising 1000 stochastic hybrid subsystems with nonlinear dynamics under two distinct interconnection topologies.
Ensuring resilience in multi-energy systems (MESs) has become increasingly urgent and challenging due to the growing frequency and severity of extreme events, such as natural disasters, extreme weather, and cyber-physical attacks. Among the various approaches to enhancing MES resilience, hydrogen integration offers significant potential in cross-temporal, cross-spatial, and cross-sector flexibility, as well as black-start capability. Although considerable efforts have been devoted to this area, a systematic review of resilience enhancement in hydrogen-enabled MESs is still lacking. To address this gap, this paper presents a comprehensive review of hydrogen-enabled MES resilience enhancement. First, advantages, vulnerabilities, and challenges related to hydrogen-enabled MES resilience enhancement are summarized. Next, a resilience enhancement framework for hydrogen-enabled MESs is proposed, based on which existing resilience metrics and event-oriented contingency models are reviewed and discussed. Planning measures are then classified according to the types of hydrogen-related facilities, together with uncertainty handling methods, scenario generation methods, and planning problem formulation frameworks. In addition, operational enhancement measures are categorized into three response stages: prevention, emergency response, and restoration. Finally, research gaps are identified and future directions are discussed, including comprehensive resilience metric design, advanced extreme-event scenario generation, spatiotemporal cyber-physical contingency modeling under compound extreme events, coordinated planning and operation across multiple networks and timescales, low-carbon resilient planning and operation, and large language model-assisted whole-process resilience enhancement.
Image restoration (IR) often faces various complex and unknown degradations in real-world scenarios, such as noise, blurring, compression artifacts, and low resolution. Training specific models for specific degradations can lead to poor generalization. To handle multiple degradations simultaneously, All-in-One models may sacrifice performance on certain types of degradation and still struggle with degradations unseen during training. Existing IR agents rely on multimodal large language models (MLLMs) and a time-consuming rolling-back selection strategy that neglects image quality. As a result, they may misinterpret degradations and incur high time and computational costs by conducting unnecessary IR tasks in a redundant order. To address these issues, we propose a Quality-Driven agent (Q-Agent) via Chain-of-Thought (CoT) restoration. Specifically, Q-Agent consists of robust degradation perception and quality-driven greedy restoration. The former module first fine-tunes the MLLM and uses CoT to decompose multi-degradation perception into single-degradation perception tasks, enhancing the perception ability of the MLLM. The latter employs objective image quality assessment (IQA) metrics to determine the optimal restoration sequence and execute the corresponding restoration algorithms. Experimental results demonstrate that Q-Agent achieves superior IR performance compared to existing All-in-One models.
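The quality-driven greedy restoration module can be summarized by a short scheduling loop: at each step, apply the candidate restorer that most improves a no-reference IQA score and stop when no improvement remains. The restorers and the variance-based stand-in for an IQA metric below are placeholders, not the models used by Q-Agent.

```python
import numpy as np

def greedy_restore(image, restorers, iqa_score, max_steps=None):
    """Quality-driven greedy restoration ordering.

    restorers: dict mapping a task name to a callable image -> image.
    iqa_score: a no-reference quality metric (higher is better); a placeholder here.
    At each step, apply the single restorer that most improves the score; stop when
    no remaining candidate yields an improvement.
    """
    remaining = dict(restorers)
    current, score = image, iqa_score(image)
    order = []
    for _ in range(max_steps or len(restorers)):
        candidates = {name: fn(current) for name, fn in remaining.items()}
        scores = {name: iqa_score(img) for name, img in candidates.items()}
        best = max(scores, key=scores.get)
        if scores[best] <= score:
            break                                  # no remaining task improves quality
        current, score = candidates[best], scores[best]
        order.append(best)
        del remaining[best]
    return current, order

# Toy usage with placeholder restorers and a variance-based stand-in for an IQA metric.
img = np.random.rand(64, 64)
restorers = {"denoise": lambda x: x, "deblur": lambda x: x}   # placeholders
print(greedy_restore(img, restorers, iqa_score=lambda x: float(x.var()))[1])
```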
Hematoxylin and eosin (H&E)-stained slides are central to cancer diagnosis and monitoring, visualizing tissue architecture and cellular morphology. However, H&E lacks the molecular specificity needed to distinguish cell states and functional activation. Antibody-based stains, such as immunohistochemistry (IHC), are therefore required to identify specific phenotypes (e.g., CD3$^+$ T cells or HER2-positive tumor cells) but are costly, time-consuming, and not universally available. Deep learning-based image translation methods, often termed virtual staining, offer a complementary alternative by generating virtual immunostains directly from H&E images. Most existing virtual staining methods are patch-based and operate at fixed resolutions, often requiring large datasets and additional post-hoc super-resolution models to generate high-resolution images. Furthermore, GAN- and diffusion-based approaches introduce stochasticity into generated stains which, although beneficial for visual realism in natural images, can lead to hallucinations and structural distortions that affect the accuracy and reliability required for clinical use. We propose IMPLICITSTAINER, a deterministic framework that reformulates virtual staining as a continuous pixel-level translation problem. In contrast to existing patch-based approaches, IMPLICITSTAINER formulates image translation as a continuous spatial mapping using neural implicit deep learning models. Each target-domain (IHC) pixel is predicted from a high-dimensional embedding of the corresponding source-domain H&E pixel, its local spatial neighborhood, and explicit coordinate information. IMPLICITSTAINER enables resolution-agnostic inference, improves robustness in low-data regimes, and yields deterministic, reproducible outputs. Across more than twenty baselines, IMPLICITSTAINER achieves SOTA performance on virtual staining tasks, including IHC and mIF.
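The continuous pixel-level formulation can be sketched as a small coordinate-conditioned module: each target pixel is predicted from an embedding of the source pixel's local neighborhood plus its normalized coordinate. The neighborhood size, embedding width, and MLP below are illustrative assumptions rather than the IMPLICITSTAINER architecture.

```python
import torch
import torch.nn as nn

class PixelTranslator(nn.Module):
    """Predict a target-stain pixel from a source-pixel neighborhood embedding and
    its normalized (x, y) coordinate (illustrative sizes)."""
    def __init__(self, neighborhood=3, emb_dim=32):
        super().__init__()
        self.embed = nn.Sequential(nn.Linear(neighborhood * neighborhood, emb_dim), nn.ReLU())
        self.head = nn.Sequential(
            nn.Linear(emb_dim + 2, 64), nn.ReLU(),   # +2 for the (x, y) coordinate
            nn.Linear(64, 3))                        # RGB value of the virtual stain
    def forward(self, patches, coords):
        # patches: (N, k*k) local source-stain neighborhoods, coords: (N, 2) in [0, 1]
        z = self.embed(patches)
        return self.head(torch.cat([z, coords], dim=-1))

# Because the mapping is queried per coordinate, inference resolution is decoupled
# from the training grid: denser coordinates simply mean more queries.
model = PixelTranslator()
patches = torch.rand(1024, 9)
coords = torch.rand(1024, 2)
print(model(patches, coords).shape)   # torch.Size([1024, 3])
```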
Infrastructure-based sensing systems, such as Wi-Fi, thermal, and vibration-based approaches, provide continuous and unobtrusive indoor human monitoring services. They are often deployed statically for long-term continuous monitoring, which leads to inefficient sensing and inflexible deployment under human mobility, or to high maintenance effort and data volume for dense deployments. In contrast, autonomous and human-carried mobile devices can better adapt to human mobility. However, their physical presence (e.g., drones or robots) may induce observer effects, while their operation often imposes additional burdens, such as wearing (e.g., wearables) and frequent charging. We present GEM, a hybrid scheme that introduces mobility to infrastructure-based sensing. GEM integrates a matrix of gears into everyday surfaces (e.g., floors, walls) to turn them into "public transportation" for moving infrastructure sensors around. We design and fabricate a 3 x 3 gear matrix prototype that can effectively move sensors from one location to another. We further validate the scalability of the design through simulation of up to a 64 x 64 gear matrix with concurrent sensors.
Extremely large aperture arrays operating in the near-field regime unlock additional spatial resources, which can be exploited to simultaneously serve multiple users even when they share the same angular direction. This work investigates the distance-domain degrees of freedom (DoF), defined as the DoF when a user varies only its distance to the base station and not the angle. To obtain the distance-domain DoF, we investigate a line-of-sight (LoS) channel between a base station (source) and observation region representing users. The base station is modeled as a large two-dimensional transmit (Tx) array with an arbitrary shape. The observation region is modeled as an arbitrarily long linear receive (Rx) array, where elements are collinearly aligned but located at varying distances from the Tx array. We assume that both the Tx and Rx arrays have continuous apertures with an infinite number of elements and infinitesimal spacing, which establishes an upper bound for the distance-domain DoF in the case of a finite number of elements. First, we analyze an ideal case where the Tx array is a single piece and the Rx array is on the broadside of the Tx array. By reformulating the channel as an integral operator with a Hermitian convolution kernel, we derive a closed-form expression for the distance-domain DoF via the Fourier transform. Our analysis shows that the distance-domain DoF is predominantly determined by the extreme boundaries of both the Tx and Rx arrays rather than their detailed interior structure. We further extend the framework to non-broadside configurations by employing a projection method that converts the problem to an equivalent broadside case. Finally, we extend the analytical framework to modular arrays and show the distance-domain DoF gain over a single-piece array under a fixed total physical length.
In this paper, we present a robust and adaptive model predictive control (MPC) framework for uncertain nonlinear systems affected by bounded disturbances and unmodeled nonlinearities. We use Gaussian Processes (GPs) to learn the uncertain dynamics based on noisy measurements, including those collected during system operation. As a key contribution, we derive robust predictions for GP models using contraction metrics, which are incorporated in the MPC formulation. The proposed design guarantees recursive feasibility, robust constraint satisfaction and convergence to a reference state, with high probability. We provide a numerical example of a planar quadrotor subject to difficult-to-model ground effects, which highlights significant improvements achieved through the proposed robust prediction method and through online learning.
The computational pipelines of single-particle cryo-electron microscopy (cryo-EM) and cryo-electron tomography (cryo-ET) include an early particle-picking stage, in which a micrograph or tomogram is scanned to extract candidate particles, typically via template matching or deep-learning-based techniques. The extracted particles are then passed to downstream tasks such as classification and 3D reconstruction. Although it is well understood empirically that particle picking can be sensitive to the choice of templates or learned priors, a quantitative theory of the bias introduced by this stage has been lacking. Here, we develop a mathematical framework for analyzing bias in template matching-based detection with concrete applications to cryo-EM and cryo-ET. We study this bias through two downstream tasks: (i) maximum-likelihood estimation of class means in a Gaussian mixture model (GMM) and (ii) 3D volume reconstruction from the extracted particle stack. We show that when template matching is applied to pure noise, then under broad noise models, the resulting maximum-likelihood estimates converge asymptotically to deterministic, noise-dependent transforms of the user-specified templates, yielding a structure-from-noise effect. We further characterize how the resulting bias depends on the noise statistics, sample size, dimension, and detection threshold. Finally, controlled experiments using standard cryo-EM software corroborate the theory, demonstrating reproducible structure-from-noise artifacts in low-SNR data.
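The structure-from-noise effect itself is easy to reproduce numerically: matching a template against pure noise and averaging the top-scoring windows returns something that strongly resembles the template. The sketch below uses a 1-D toy signal and plain correlation, so it illustrates the phenomenon rather than the paper's estimators.

```python
import numpy as np

rng = np.random.default_rng(0)

# Template: a 1-D "bump" that plays the role of the search template.
L = 32
template = np.exp(-0.5 * ((np.arange(L) - L / 2) / 4.0) ** 2)
template -= template.mean()

# Pure-noise "micrographs": there is no particle anywhere in the data.
n_micrographs, N = 2000, 4096
picked = []
for _ in range(n_micrographs):
    noise = rng.standard_normal(N)
    scores = np.correlate(noise, template, mode="valid")   # template matching
    best = int(np.argmax(scores))                          # pick the top-scoring window
    picked.append(noise[best:best + L])

avg = np.mean(picked, axis=0)
corr = np.corrcoef(avg, template)[0, 1]
print(f"correlation of the noise-only average with the template: {corr:.2f}")
```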
Training effective artificial intelligence models for telecommunications is challenging due to the scarcity of deployment-specific data. Real data collection is expensive, and available datasets often fail to capture the unique operational conditions and contextual variability of the network environment. Digital twinning provides a potential solution to this problem, as simulators tailored to the current network deployment can generate site-specific data to augment the available training datasets. However, there is a need to develop solutions to bridge the inherent simulation-to-reality (sim-to-real) gap between synthetic and real-world data. This paper reviews recent advances on two complementary strategies: 1) the calibration of digital twins (DTs) through real-world measurements, and 2) the use of sim-to-real gap-aware training strategies to robustly handle residual discrepancies between digital twin-generated and real data. For the latter, we evaluate two conceptually distinct methods that model the sim-to-real gap either at the level of the environment via Bayesian learning or at the level of the training loss via prediction-powered inference.
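The second gap-aware strategy can be illustrated with the basic prediction-powered estimate of a mean: combine abundant digital-twin predictions with a small paired set of real measurements that rectifies the simulator bias. All quantities below are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins: f(.) is the digital-twin / simulator prediction, y is reality.
n_labeled, n_unlabeled = 50, 5000
bias = 0.3                                                 # unknown sim-to-real gap
y_labeled = rng.normal(2.0, 1.0, n_labeled)                # small set of real measurements
f_labeled = y_labeled + bias + 0.1 * rng.standard_normal(n_labeled)
f_unlabeled = rng.normal(2.0 + bias, 1.0, n_unlabeled)     # predictions only, no labels

# Prediction-powered estimate of the mean: simulator average plus a rectifier
# estimated from the paired real data (the classical estimate uses labels only).
rectifier = np.mean(y_labeled - f_labeled)
ppi_estimate = np.mean(f_unlabeled) + rectifier
classical_estimate = np.mean(y_labeled)
print(f"PPI: {ppi_estimate:.3f}, labels-only: {classical_estimate:.3f}")
```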
Demand charge, a utility fee based on an electricity customer's peak power consumption, often constitutes a significant portion of costs for commercial electric vehicle (EV) charging station operators. This paper explores control methods to reduce peak power consumption at workplace EV charging stations in a joint price and power optimization framework. We optimize a menu of price options to incentivize users to select controllable charging service. Using this framework, we propose a model predictive control approach to reduce both demand charge and overall operator costs. Through a Monte Carlo simulation, we find that our algorithm outperforms a state-of-the-art benchmark optimization strategy and can significantly reduce station operator costs.
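The demand-charge term inside such a scheduling problem can be sketched as a small convex program: the station peak appears as a decision variable that upper-bounds the total power in every period and is priced in the objective. The horizon, tariffs, and energy requirements below are illustrative assumptions, and the full framework additionally optimizes the price menu and runs in a receding-horizon fashion.

```python
import numpy as np
import cvxpy as cp

T, n_ev = 24, 3                                    # hourly horizon, controllable sessions
energy_price = 0.2 + 0.1 * np.sin(np.linspace(0, 2 * np.pi, T))   # $/kWh (toy tariff)
demand_rate = 15.0                                 # $/kW of peak demand over the horizon
energy_req = np.array([20.0, 30.0, 25.0])          # kWh each session must receive
p_max = 7.0                                        # kW per charger

p = cp.Variable((n_ev, T), nonneg=True)            # charging power schedule
peak = cp.Variable()                               # station peak power (drives demand charge)

constraints = [
    p <= p_max,
    cp.sum(p, axis=1) >= energy_req,               # deliver the requested energy (1 h steps)
    cp.sum(p, axis=0) <= peak,                     # peak upper-bounds total power each hour
]
cost = energy_price @ cp.sum(p, axis=0) + demand_rate * peak
cp.Problem(cp.Minimize(cost), constraints).solve()
print(f"peak demand: {peak.value:.2f} kW, total cost: ${cost.value:.2f}")
```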
With the advanced reasoning, contextual understanding, and information synthesis capabilities of large language models (LLMs), a novel paradigm emerges for the autonomous generation of dispatch strategies in modern power systems. In this paper, we propose an LLM-based, experience-driven day-ahead Volt/Var scheduling solution for distribution networks, which enables the self-evolution of the LLM agent's strategies through the collaboration and interaction of multiple modules, specifically experience storage, experience retrieval, experience generation, and experience modification. The experience storage module archives historical operational records and decisions, while the retrieval module selects relevant past cases according to current forecasting conditions. The LLM agent then leverages these retrieved experiences to generate new, context-aware decisions for the current situation, which are subsequently refined by the modification module to realize self-evolution of the dispatch policy. Comprehensive experimental results validate the effectiveness of the proposed method and highlight the applicability of LLMs to power system dispatch problems with incomplete information.
Grid-connected power converters are ubiquitous in modern power systems, acting as grid interfaces of renewable energy sources, energy storage systems, electric vehicles, high-voltage DC systems, etc. Conventionally, power converters use multiple PID regulators to achieve different control objectives such as grid synchronization and voltage/power regulation, where the PID parameters are usually tuned based on a presumed (and often overly-simplified) power grid model. However, this may lead to inferior performance or even instabilities in practice, as the real power grid is highly complex, variable, and generally unknown. To tackle this problem, we employ a data-enabled predictive control (DeePC) to perform data-driven, optimal, robust, and adaptive control for power converters. We call the converters that are operated in this way DeePConverters. A DeePConverter can implicitly perceive the characteristics of the power grid from measured data and adjust its control strategy to achieve optimal, robust, and adaptive performance. We present the modular configurations, generalized structure, control behavior specification, inherent robustness, detailed implementation, computational aspects, and online adaptation of DeePConverters. High-fidelity simulations and hardware-in-the-loop (HIL) tests are provided to validate the effectiveness of DeePConverters.
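The data-enabled predictive control step can be sketched on a toy system: stack recorded input/output data into Hankel matrices and optimize a future trajectory consistent with them. The first-order system, horizons, and weights below are illustrative assumptions; a converter-grade implementation adds regularization, slack variables, and receding-horizon updates.

```python
import numpy as np
import cvxpy as cp

def hankel(w, L):
    """Hankel matrix with L rows built from the 1-D signal w."""
    T = len(w)
    return np.array([w[i:i + T - L + 1] for i in range(L)])

# Data-collection experiment on an unknown first-order system y+ = 0.9 y + 0.5 u.
rng = np.random.default_rng(0)
T_d = 200
u_d = rng.uniform(-1, 1, T_d)
y_d = np.zeros(T_d)
for t in range(T_d - 1):
    y_d[t + 1] = 0.9 * y_d[t] + 0.5 * u_d[t]

T_ini, N = 4, 10                                   # past window and prediction horizon
L = T_ini + N
U, Y = hankel(u_d, L), hankel(y_d, L)
Up, Uf = U[:T_ini], U[T_ini:]
Yp, Yf = Y[:T_ini], Y[T_ini:]

u_ini, y_ini = np.zeros(T_ini), np.zeros(T_ini)    # most recent measured data
ref = np.ones(N)                                   # output reference to track

g = cp.Variable(U.shape[1])
u = cp.Variable(N)
y = cp.Variable(N)
cost = cp.sum_squares(y - ref) + 0.1 * cp.sum_squares(u)
constraints = [Up @ g == u_ini, Yp @ g == y_ini,   # consistency with the recent past
               Uf @ g == u, Yf @ g == y,           # future trajectory spanned by the data
               cp.abs(u) <= 2]                     # input constraint
cp.Problem(cp.Minimize(cost), constraints).solve()
print("first planned input:", float(u.value[0]))
```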
Numerous deep learning-based solutions have been developed for the automatic recognition of breast cancer using mammography images. However, their performance often declines when applied to data from different domains, primarily due to domain shift, i.e., the variation in data distributions between source and target domains. This performance drop limits the safe and equitable deployment of AI in real-world clinical settings. In this study, we present DoSReMC (Domain Shift Resilient Mammography Classification), a batch normalization (BN) adaptation framework designed to enhance cross-domain generalization without retraining the entire model. Using three large-scale full-field digital mammography (FFDM) datasets, including HCTP, a newly introduced, pathologically confirmed in-house dataset, we conduct a systematic cross-domain evaluation with convolutional neural networks (CNNs). Our results demonstrate that BN layers are a primary source of domain dependence: they perform effectively when training and testing occur within the same domain, and they significantly impair model generalization under domain shift. DoSReMC addresses this limitation by fine-tuning only the BN and fully connected (FC) layers, while preserving the pretrained convolutional filters. We further integrate this targeted adaptation with an adversarial training scheme, yielding additional improvements in cross-domain generalizability while reducing the computational cost of model training. DoSReMC can be readily incorporated into existing AI pipelines and applied across diverse clinical environments, providing a practical pathway toward more robust and generalizable mammography classification systems.
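The targeted adaptation can be expressed in a few lines of PyTorch: freeze all parameters, then re-enable gradients only for BatchNorm and fully connected layers. The torchvision ResNet-50 backbone and the two-class head below are illustrative assumptions.

```python
import torch.nn as nn
from torchvision import models

model = models.resnet50(weights=None)                 # backbone choice is illustrative
model.fc = nn.Linear(model.fc.in_features, 2)         # benign vs. malignant head

# Freeze all pretrained parameters, then unfreeze only BN and FC layers.
for p in model.parameters():
    p.requires_grad = False
for m in model.modules():
    if isinstance(m, (nn.BatchNorm2d, nn.Linear)):
        for p in m.parameters():
            p.requires_grad = True

trainable = [p for p in model.parameters() if p.requires_grad]
print(sum(p.numel() for p in trainable), "trainable parameters")
# An optimizer then receives only these parameters, e.g. torch.optim.SGD(trainable, lr=1e-3);
# the convolutional filters keep their pretrained values, mirroring the recipe above.
```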
The electrification and automation of mobility are reshaping how cities operate on-demand transport systems. Managing Electric Autonomous Mobility-on-Demand (EAMoD) fleets effectively requires coordinating dispatch, rebalancing, and charging decisions under multiple uncertainties, including travel demand, travel time, energy consumption, and charger availability. We address this challenge with a combined stochastic and robust model predictive control (MPC) framework. The framework integrates spatio-temporal Bayesian neural network forecasts with a multi-stage stochastic optimization model, formulated as a large-scale mixed-integer linear program. To ensure real-time applicability, we develop a tailored Nested Benders Decomposition that exploits the scenario tree structure and enables efficient parallelized solution. Stochastic optimization is employed to anticipate demand and infrastructure variability, while robust constraints on energy consumption and travel times safeguard feasibility under worst-case realizations. We evaluate the framework using high-fidelity simulations of San Francisco and Chicago. Compared with deterministic, reactive, and robust baselines, the combined stochastic and robust approach reduces median passenger waiting times by up to 36% and 95th-percentile delays by nearly 20%, while also lowering rebalancing distance by 27% and electricity costs by more than 35%. We also conduct a sensitivity analysis of battery size and vehicle efficiency, finding that energy-efficient vehicles maintain stable performance even with small batteries, whereas less efficient vehicles require larger batteries and greater infrastructure support. Our results emphasize the importance of jointly optimizing predictive control, vehicle capabilities, and infrastructure planning to enable scalable, cost-efficient EAMoD operations.
As the number of satellites in orbit has increased exponentially in recent years, ensuring their correct functionality has started to require automated methods to decrease human workload. In this work, we present an algorithm that analyzes the on-board data related to friction from the Reaction Wheel Assemblies (RWA) of a satellite and determines their operating status, distinguishing between nominal status and several possible anomalies that require preventive measures to be taken. The algorithm first uses a model based on hybrid systems theory to extract the information relevant to the problem. The extraction process combines techniques in changepoint detection, dynamic programming, and maximum likelihood estimation in a structured way. A classifier then uses the extracted information to determine the status of the RWA. This classifier has previously been trained on a labelled dataset produced by a high-fidelity simulator, composed mostly of nominal data. The final algorithm combines model-based and data-based approaches to obtain satisfactory results, with an accuracy of around 95%.
Integrated sensing and communication (ISAC) is emerging as a key enabler for spectrum-efficient and hardware-converged wireless networks. However, classical radar systems within ISAC architectures face fundamental limitations under low signal power and high-noise conditions. This paper proposes a novel framework that embeds quantum illumination radar into a base station to simultaneously support full-duplex classical communication and quantum-enhanced target detection. The resulting integrated quantum sensing and classical communication (IQSCC) system is optimized via a sum-rate maximization formulation subject to radar sensing constraints. The non-convex joint optimization of transmit power and beamforming vectors is tackled using the successive convex approximation technique. Furthermore, we derive performance bounds for classical and quantum radar protocols under the statistical detection theory, highlighting the quantum advantage in low signal-to-interference-plus-noise ratio regimes. Simulation results demonstrate that the proposed IQSCC system achieves a higher communication throughput than the conventional ISAC baseline while satisfying the sensing requirement.
As definitions of new architectural aspects, use cases, and standards for integrated sensing and communication (ISAC) continue to appear, cellular systems based on massive multiple-input multiple-output (MIMO) antenna technology are also experiencing a parallel evolution through the integration of novel network components. This evolution should support emerging ISAC use cases and services. In particular, this paper explores a recent vision for cost-efficient cellular network densification through the deployment of swarms of repeaters. Leveraging their ability to retransmit signals instantaneously, we investigate how these repeaters can enhance radar sensing capabilities for drone detection in a repeater swarm-assisted MIMO ISAC system.
Dynamic Metasurface Antennas (DMAs) are recently attracting considerable research interests due to their potential to enable low-cost, reconfigurable, and highly scalable antenna array architectures for next generation wireless systems. However, most of the existing literature relies on idealized models for the DMA operation, often overlooking critical structural and physical constraints inherent to their constituent metamaterials. In this paper, leveraging a recently proposed model for this antenna architecture incorporating physically consistent modeling of mutual coupling and waveguide propagation losses, we optimize DMA-based transmission for bistatic sensing. A tractable approximation for the DMA response is first presented, which enables efficient optimization of the dynamically reconfigurable Lorentzian-constrained responses of the array's metamaterials. In particular, we formulate a robust beamforming optimization problem with the objective to minimize the worst-case position error bound, in the presence of spatial uncertainties for the environment's scatterers as well as synchronization uncertainties at the analog combining multi-antenna receiver. To address the resulting high computational complexity due to the possibly excessive number of metamaterial-based antennas and their operation constraints, two low complexity beamforming design approaches are presented that perform offline searching over a novel beam codebook. The accuracy of all presented DMA designs is assessed by means of Monte Carlo simulations for various system parameters, confirming that accurately modeling mutual coupling is essential for maintaining increased localization performance. It is also shown that, even under positioning and synchronization uncertainties, the proposed designs yield accuracy comparable to their fully digital and analog counterparts, while adhering to the structural DMA constraints.
Diffusion models have found extensive use in solving inverse problems, by sampling from an approximate posterior distribution of data given the measurements. Recently, consistency models (CMs) have been proposed to directly predict the final output from any point on the diffusion ODE trajectory, enabling high-quality sampling in just a few neural function evaluations (NFEs). CMs have also been utilized for inverse problems, but existing CM-based solvers either require additional task-specific training or utilize data fidelity operations with slow convergence, limiting their applicability to large-scale problems and making them difficult to extend to nonlinear settings. In this work, we reinterpret CMs as proximal operators of a prior, enabling their integration into plug-and-play (PnP) frameworks. Specifically, we propose PnP-CM, an ADMM-based PnP solver that provides a unified framework for solving a wide range of inverse problems, and incorporates noise perturbations and momentum-based updates to improve performance in the low-NFE regime. We evaluate our approach on a diverse set of linear and nonlinear inverse problems. We also train and apply CMs to MRI data for the first time. Our results show that PnP-CM achieves high-quality reconstructions in as few as 4 NFEs, and produces meaningful results in 2 steps, highlighting its effectiveness in real-world inverse problems while outperforming existing CM-based approaches.
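The plug-and-play reading of a denoiser as a proximal operator can be sketched generically with an ADMM loop: a data-fidelity step followed by a prior step in which a pretrained denoiser replaces the proximal map. The blur forward operator and the simple smoother standing in for the learned denoiser below are illustrative; PnP-CM plugs in a consistency model and adds noise perturbations and momentum-based updates not shown here.

```python
import numpy as np

def pnp_admm(y, A, At, denoiser, rho=1.0, iters=20):
    """Generic plug-and-play ADMM for  y = A x + noise.

    x-update: proximal step on the data term, solved here by a few gradient steps;
    z-update: the plugged-in denoiser stands in for the proximal operator of the prior;
    u: scaled dual variable.
    """
    x = At(y)
    z = x.copy()
    u = np.zeros_like(x)
    for _ in range(iters):
        v = z - u
        for _ in range(10):   # x-update: minimize 0.5||Ax - y||^2 + 0.5*rho||x - v||^2
            grad = At(A(x) - y) + rho * (x - v)
            x = x - 0.5 / (1.0 + rho) * grad
        z = denoiser(x + u)   # z-update: prior step via the plugged-in denoiser
        u = u + x - z         # dual update
    return z

# Toy usage: A is a mild blur, and the "denoiser" is a simple smoother (placeholder).
rng = np.random.default_rng(0)
blur = lambda x: np.convolve(x, np.ones(5) / 5, mode="same")
x_true = np.sin(np.linspace(0, 4 * np.pi, 256))
y = blur(x_true) + 0.05 * rng.standard_normal(256)
smooth = lambda x: np.convolve(x, np.ones(3) / 3, mode="same")
x_hat = pnp_admm(y, blur, blur, smooth)       # the averaging kernel is symmetric, so A^T = A
print(f"relative error: {np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true):.3f}")
```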
Deep learning has shown impressive results in reducing noise and artifacts in X-ray computed tomography (CT) reconstruction. Self-supervised CT reconstruction methods are especially appealing for real-world applications because they require no ground truth training examples. However, these methods involve a simplified X-ray physics model during training, which may make inaccurate assumptions, for example, about scintillator blurring, the scanning geometry, or the distribution of the noise. As a result, they can be less robust to real-world imaging circumstances. In this paper, we review the model assumptions of six recent self-supervised CT reconstruction methods. Based on this, we combined concepts of the Robust Equivariant Imaging and Sparse2Inverse methods in a new self-supervised CT reconstruction method called Equivariance2Inverse that is robust to scintillator blurring and limited-angle data. We benchmarked Equivariance2Inverse and the existing methods on the real-world 2DeteCT dataset and on synthetic data with and without scintillator blurring and a limited-angle scanning geometry. The results of our benchmark show that methods that assume that the noise is pixel-wise independent do not perform well on data with scintillator blurring. Moreover, they show that when the distribution of objects is rotationally invariant, this invariance can be used to reduce artifacts in limited-angle reconstructions.
The joint transmit and pinching beamforming design for spectral efficiency (SE) and energy efficiency (EE) tradeoff in pinching-antenna systems (PASS) is proposed, under practical channel and energy consumption models. In the single-user scenario, it is proved that the optimal pinching antenna (PA) positions are independent of the transmit beamforming. Based on this insight, a two-stage joint beamforming design is proposed. Specifically, in the first stage, a general PA placement framework is proposed for multi-waveguide systems. In the second stage, the closed-form solution for the optimal transmit beamformer is derived given the optimized PA positions. In the multi-user scenario, an alternating optimization (AO)-based joint beamforming design is proposed to balance the SE-EE performance while taking the quality-of-service (QoS) requirements into account. It is proved that the proposed AO-based algorithm is guaranteed to converge when no constraints are violated in PA placement subproblem. Numerical results demonstrate that: 1) the proposed algorithms effectively improve joint SE-EE performance; 2) PASS exhibits strong robustness against variations in the service area along the waveguide direction.
Distributed MIMO (D-MIMO) has emerged as a key architecture for future sixth-generation (6G) networks, enabling cooperative transmission across spatially distributed access points (APs). However, most existing studies rely on idealized channel models and lack hardware validation, leaving a gap between algorithmic design and practical deployment. Meanwhile, recent advances in artificial intelligence (AI)-driven precoding have shown strong potential for learning nonlinear channel-to-precoder mappings, but their real-world deployment remains limited due to challenges in data collection and model generalization. This work presents a framework for implementing and validating an AI-based precoder on a D-MIMO testbed with hardware reciprocity calibration. A pre-trained graph neural network (GNN)-based model is fine-tuned using real-world channel state information (CSI) collected from the Techtile platform and evaluated under both interpolation and extrapolation scenarios before end-to-end validation. Experimental results demonstrate a 15.7% performance gain over the pre-trained model in the multi-user case after fine-tuning, while in the single-user scenario the model achieves near-maximum ratio transmission (MRT) performance with less than 0.7 bits/channel use degradation out of a total throughput of 5.19 bits/channel use on unseen positions. Further analysis confirms the data efficiency of real-world measurements, showing consistent gains with increasing training samples, and end-to-end validation verifies coherent power focusing comparable to MRT.
With the rapid deployments of 5G and 6G networks, accurate modeling of urban radio propagation has become critical for system design and network planning. However, conventional statistical or empirical models fail to fully capture the influence of detailed geometric features on site-specific channel variances in dense urban environments. In this paper, we propose a geometry map-based propagation channel model that directly extracts key parameters from a 3D geometry map and incorporates the Uniform Theory of Diffraction (UTD) to recursively compute multiple diffraction fields, thereby enabling accurate prediction of site-specific large-scale path loss and time-varying Doppler characteristics in urban scenarios. A well-designed identification algorithm is developed to efficiently detect buildings that significantly affect signal propagation. The proposed model is validated using urban measurement data, showing excellent agreement of path loss in both line-of-sight (LOS) and non-line-of-sight (NLOS) conditions. In particular, for NLOS scenarios with complex diffractions, it outperforms the 3GPP and simplified models, reducing the RMSE by 7.1 dB and 3.18 dB, respectively. Doppler analysis further demonstrates its accuracy in capturing time-varying propagation characteristics, confirming the scalability and generalization of the model in urban environments.
This paper investigates the sensing potential of affine frequency division multiplexing (AFDM) in high-mobility integrated sensing and communication (ISAC) from the perspective of radar waveforms. We introduce an innovative parameter selection criterion that establishes a precise mathematical equivalence between AFDM subcarriers and Nyquist-sampled frequency-modulated continuous-wave (FMCW). This connection not only provides a clear physical insight into AFDM's sensing mechanism but also enables a direct mapping from the DAFT index to delay-Doppler (DD) parameters of wireless channels. Building on this, we develop a novel input-output model in a DD-parameterized DAFT (DD-DAFT) domain for AFDM, which explicitly reveals the inherent DD coupling effect arising from the chirp-channel interaction. Subsequently, we design two matched-filtering sensing algorithms. The first is performed in the time-frequency domain with low complexity, while the second is operated in the DD-DAFT domain to precisely resolve the DD coupling. Simulations show that our algorithms achieve effective pilot-free sensing and demonstrate a fundamental trade-off between sensing performance, communication overhead, and computational complexity. The proposed AFDM outperforms classical AFDM and other variants in most scenarios.
This paper proposes a vision-conditioned flow matching (FM) framework for beam prediction in millimeter-wave vehicle-to-infrastructure links. Instead of modeling discrete beam-index sequences, the proposed method learns the temporal evolution of normalized beam receive power vectors through a continuous vector field governed by an ordinary differential equation, enabling smooth dynamics and efficient sampling. By imposing FM over beam-state transitions and jointly optimizing beam prediction and flow consistency, the proposed framework provides a unified model for future beam prediction. Experimental results show that the proposed FM-based model significantly improves beam prediction performance over baselines, approaches the performance of large language model-based methods, and reduces predictor-side inference latency by about $6.9\times$ on GPU and $2.8\times10^3\times$ on CPU, respectively.
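The core conditional flow matching recipe can be written compactly: sample a time t, linearly interpolate between a reference sample and the target beam-power vector, and regress a network onto the constant velocity between them; sampling then integrates the learned ODE. The tiny MLP and the random stand-ins for beam powers and visual features below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class VectorField(nn.Module):
    """v_theta(x_t, t, c): predicts the velocity of the probability flow (illustrative MLP)."""
    def __init__(self, dim=8, cond_dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + cond_dim + 1, 128), nn.SiLU(),
            nn.Linear(128, dim))
    def forward(self, x, t, cond):
        return self.net(torch.cat([x, cond, t], dim=-1))

dim = 8                                            # e.g. a normalized beam-power vector
model = VectorField(dim=dim)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(100):                               # training loop on stand-in data
    x1 = torch.softmax(torch.randn(64, dim), dim=-1)   # target beam-power vectors (toy)
    cond = torch.randn(64, 16)                          # visual conditioning features (toy)
    x0 = torch.randn(64, dim)                           # reference (noise) samples
    t = torch.rand(64, 1)
    xt = (1 - t) * x0 + t * x1                     # linear interpolation path
    target_v = x1 - x0                             # its constant velocity
    loss = ((model(xt, t, cond) - target_v) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Sampling: integrate dx/dt = v_theta(x, t, cond) from t = 0 to 1 with a few Euler steps.
x = torch.randn(1, dim)
c = torch.randn(1, 16)
for k in range(10):
    t = torch.full((1, 1), k / 10)
    x = x + 0.1 * model(x, t, c)
print(x)
```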
Low Earth orbit (LEO) satellites are a promising technology for providing low-latency, high-data-rate, and wide-coverage communication services. However, with growing demand for data transmission, future non-terrestrial networks (NTNs) require high spectral efficiency, especially with low-gain antennas at the ground devices. This motivates the adoption of in-band full-duplex (FD) systems. In addition, the potential imbalance between downlink (DL) and uplink (UL) transmissions necessitates flexibility in resource allocation. To overcome these challenges, we propose an FD LEO satellite system, where a non-reciprocal beyond-diagonal reconfigurable intelligent surface (NR-BD-RIS) and multiple transmit and receive antennas are attached to the LEO satellite. The NR-BD-RIS reflects the DL and UL signals via passive beamforming. By incorporating non-reciprocal components into the impedance network of the RIS, the NR-BD-RIS breaks channel reciprocity, facilitating simultaneous support for multiple beam directions. To achieve wide coverage, we propose a time-sharing scheduling framework in which the NR-BD-RIS simultaneously serves multiple DL and multiple UL ground devices within each time slot. An optimization problem is formulated to maximize the weighted sum-rate over the entire scheduling period. Numerical results demonstrate that the proposed NR-BD-RIS performs significantly better than both conventional BD-RIS and diagonal RIS (D-RIS) in terms of DL and UL sum-rate performance under both single-user (SU) and multi-user (MU) cases. Additionally, the NR-BD-RIS requires less frequent reconfiguration compared to the other two types of RIS, making it more practical for implementation.
Force feedback gloves in haptic applications remain constrained by limited adaptability, simplified feedback, and fixed architectures that limit force feedback versatility. To address these challenges, we present KinesCeTI, a modular force feedback exoskeleton for the index and thumb, designed as a multipurpose device adaptable to a wide range of hand sizes. The glove incorporates interchangeable thimbles for fingertip or phalanx attachment and a bidirectional tendon transmission that supports both passive and active feedback. It is combined with a modular actuation design, where different feedback systems may be attached. The system was tested with two actuation modules: a compliant ratchet-pawl braking mechanism for passive feedback and a novel one-way clutch for variable active feedback, newly introduced here. The system was evaluated in three user studies with 20 participants each, assessing ergonomics, actuation performance and usability in both real and virtual tasks. Results indicate that the glove adapts to different hand sizes and provides effective feedback with both mechanisms, highlighting its potential as a versatile platform for haptic research.
Modern microelectronic systems require long-term operational stability, necessitating precise reliability models to predict device lifecycles and identify governing failure mechanisms. This is particularly critical for high-power GaN High-Electron-Mobility Transistors (HEMTs), where reliability research has historically trailed behind low-power digital counterparts. This study introduces a novel application of a modified boost converter circuit designed to investigate GaN failure mechanisms, specifically targeting the determination of reliability factors for the MTOL model. By utilizing a high duty cycle, the circuit stresses the device at maximum rated voltages and currents with minimal input requirements, accelerating hot carrier and trap generation without immediate detrimental failure. Experimental validation was conducted using an EPC 2038 GaN transistor under a constant drain current of 400 mA and a duty cycle of 0.7. The results confirmed that the increase in Drain-Source on-resistance ($R_{DS(on)}$) follows a logarithmic trend over time, consistent with the EPC Phase 12 reliability model. While initial tests at 40 V did not conclusively validate the longitudinal optical phonon scattering energy ($\hbar\omega_{LO}$), the results were reasonably acceptable, and subsequent stress tests at 70 V and 100 V yielded $\hbar\omega_{LO}$ values that were successfully validated against existing theoretical and experimental data. This methodology provides a robust framework for predicting performance and lifetime across varying operational parameters in modern power electronics.
The transition to electric transportation is a key enabler for intelligent and sustainable cities; however, inadequate charging infrastructure remains a major barrier to large-scale electric vehicle (EV) adoption. This paper presents a scalable Electric Road System (ERS) architecture that enables Dynamic Wireless Charging (DWC) of EVs during motion. The proposed framework integrates inductive charging coils embedded in road pavement, real-time vehicle-to-infrastructure (V2I) communication, and adaptive energy management coordinated with smart grid systems. Modular road segments with a standardized charging process are employed to ensure scalability across urban corridors and interoperability among different EV platforms. System performance is evaluated using a co-simulation framework combining MATLAB-based power analysis with traffic inputs generated in SUMO. Key performance metrics include charging efficiency, energy cost per kilometer, and battery lifecycle improvement. Simulation results indicate a potential reduction in range anxiety and an increase in battery lifespan due to frequent shallow charging cycles. The study further discusses deployment challenges, policy considerations, and energy distribution strategies aligned with climate-resilient urban development. A case study of a tier-1 Indian city is presented to analyze the cost-benefit trade-offs of retrofitting high-density urban corridors with ERS. The proposed framework provides a practical foundation for next-generation EV infrastructure planning in smart cities.
This paper integrates the emerging ultra-massive multiple-input multiple-output (UM-MIMO) technique with orthogonal chirp division multiplexing (OCDM) waveform to tackle the challenging near-field integrated sensing and communication (ISAC) problem. Specifically, we conceive a comprehensive ISAC architecture, where an UM-MIMO base station adopts OCDM waveform for communications and a co-located sensing receiver adopts the frequency-modulated continuous wave (FMCW) detection principle to simplify the associated hardware. For sensing tasks, several OCDM subcarriers, namely, dedicated sensing subcarriers (DSSs), are each transmitted through a dedicated sensing antenna (DSA) within the transmit antenna array. By judiciously designing the DSS selection scheme and optimizing receiver parameters, the FMCW-based sensing receiver can decouple the echo signals from different DSAs with significantly reduced hardware complexity. This setup enables the estimation of ranges and velocities of near-field targets in an antenna-pairwise manner. Moreover, by leveraging the spatial diversity of UM-MIMO, we introduce the concept of virtual bistatic sensing (VIBS), which incorporates the estimates from multiple antenna pairs to achieve high-accuracy target positioning and three-dimensional velocity measurement. The VIBS paradigm is immune to hostile channel environments characterized by spatial non-stationarity and uncorrelated multipath environment. Furthermore, the channel estimation of UM-MIMO OCDM systems enhanced by the sensing results is investigated. Simulation results demonstrate that the proposed ISAC scheme enhances sensing accuracy, and also benefits communication performance.
This work addresses the output consensus problem of constrained heterogeneous multi-agent systems under a switching network with potential communication delays, where the outputs are periodic and characterized by an exosystem. Because periodic references have more complex dynamics, tracking them and achieving consensus on them is more challenging. In this paper, a model predictive control method incorporating an artificial reference and a modified cost function is proposed to track periodic references, which maintains recursive feasibility even when the references switch. Moreover, consensus protocols are proposed to achieve consensus on periodic references in different scenarios without relying on global information such as the set of globally admissible references or a global time index. Theoretical analysis proves that constrained output consensus is asymptotically achieved with the proposed algorithm, as the references of each agent converge and agents track their references while maintaining constraint satisfaction. Finally, numerical examples are provided to verify the effectiveness of the proposed algorithm.
ASVspoof 5 is the fifth edition in a series of challenges that promote the study of speech spoofing and deepfake detection solutions. A significant change from previous challenge editions is a new crowdsourced database collected from a substantially greater number of speakers under diverse recording conditions, and a mix of cutting-edge and legacy generative speech technology. With the new database described elsewhere, we provide in this paper an overview of the ASVspoof 5 challenge results for the submissions of 53 participating teams. While many solutions perform well, performance degrades under adversarial attacks and the application of neural encoding/compression schemes. Together with a review of post-challenge results, we report a study of calibration, discuss other principal challenges, and outline a road map for the future of ASVspoof.
Modulation effects such as phasers, flangers, and chorus are heavily used in conjunction with the electric guitar. Machine-learning-based emulation of analog modulation units has been investigated in recent years, but most methods have either been limited to one class of effect or suffer from a high computational cost or latency compared to canonical digital implementations. Here, we build on previous work and present a framework for modelling flanger, chorus, and phaser effects based on differentiable digital signal processing. The model is trained in the time-frequency domain but operates in the time domain at inference, incurring zero latency. We investigate the challenges associated with gradient-based optimisation of such effects and show that low-frequency weighting of loss functions avoids convergence to local minima when learning delay times. We show that when trained against analog effects units, the sound output from the model is in some cases perceptually indistinguishable from the reference, but challenges remain for effects with long delay times and feedback.
This paper proposes a novel reinforcement learning framework, named Self-Organizing Dual-buffer Adaptive Clustering Experience Replay (SODACER), designed to achieve safe and scalable optimal control of nonlinear systems. The proposed SODACER mechanism consists of a Fast-Buffer for rapid adaptation to recent experiences and a Slow-Buffer equipped with a self-organizing adaptive clustering mechanism to maintain diverse and non-redundant historical experiences. The adaptive clustering mechanism dynamically prunes redundant samples, optimizing memory efficiency while retaining critical environmental patterns. The approach integrates SODACER with Control Barrier Functions (CBFs) to guarantee safety by enforcing state and input constraints throughout the learning process. To enhance convergence and stability, the framework is combined with the Sophia optimizer, enabling adaptive second-order gradient updates. The proposed SODACER-Sophia architecture ensures reliable, effective, and robust learning in dynamic, safety-critical environments, offering a generalizable solution for applications in robotics, healthcare, and large-scale system optimization. The proposed approach is validated on a nonlinear Human Papillomavirus (HPV) transmission model with multiple control inputs and safety constraints. Comparative evaluations against random and clustering-based experience replay methods, validated via the Friedman test, demonstrate that SODACER achieves faster convergence, improved sample efficiency, and a superior bias-variance trade-off while maintaining safe system trajectories.
Early clinical assessment of Alzheimer's disease relies on behavior scores that measure a subject's language, memory, and cognitive skills. On the medical imaging side, functional magnetic resonance imaging has provided invaluable insights into the neural pathways underlying Alzheimer's disease. While prior studies have used resting-state functional MRI by extracting functional connectivity matrices, these approaches neglect the temporal dynamics inherent in functional data. In this work, we present a deep state space modeling framework that directly leverages the blood-oxygenation-level-dependent time series to learn a sparse collection of brain regions to predict behavior scores. Our model extracts temporal features that encapsulate nuanced patterns of intrinsic brain activity, thereby enhancing predictive performance compared to traditional connectivity methods. We identify specific brain regions that are most predictive of cognitive impairment through experiments on data provided by the Michigan Alzheimer's Disease Research Center, providing new insights into the neural substrates of early Alzheimer's pathology. These findings have important implications for the possible development of risk monitoring and intervention strategies in Alzheimer's disease.
Using Bayesian decision theory, we modify the perfect-information, differential game-based guidance law (DGL1) to address the inevitable estimation error occurring when driving this guidance law with a separately-designed state estimator. This yields a stochastic guidance law complying with the generalized separation theorem, as opposed to the common approach, which implicitly, but unjustifiably, assumes the validity of the regular separation theorem. The required posterior probability density function of the game's state is derived from the available noisy measurements using an interacting multiple model particle filter. When the resulting optimal decision turns out to be nonunique, this feature is harnessed to appropriately shape the trajectory of the pursuer so as to enhance its estimator's performance. In addition, certain properties of the particle-based computation of the Bayesian cost are exploited to render the algorithm amenable to real-time implementation. The performance of the entire estimation-decision-guidance scheme is demonstrated using an extensive Monte Carlo simulation study.
Objectives: To determine whether targeted T2 fluid-attenuated inversion recovery (T2-FLAIR) dropout training improves robustness of glioblastoma MRI tumor segmentation and whole-tumor volumetry when T2-FLAIR is unavailable, without degrading performance when T2-FLAIR is available. Materials and Methods: In this retrospective multi-dataset study, 3D nnU-Net models were trained on a subset of the BraTS 2021 cohort (n=848) and externally validated on the University of Pennsylvania glioblastoma cohort (n=403). Models were trained with no dropout or targeted T2-FLAIR dropout (dropout probability r=0.35 or 0.50) by replacing only the T2-FLAIR channel with zeros during training. Testing used prespecified T2-FLAIR-present and T2-FLAIR-absent scenarios, with the absent scenario simulated by zeroing the T2-FLAIR channel at inference. The primary endpoint was per-patient overall region-wise Dice similarity coefficient (DSC); secondary endpoints were region-specific DSC, 95th percentile Hausdorff distance, and Bland-Altman whole-tumor volume bias. Results: With T2-FLAIR present, overall median DSC was 94.8% (interquartile range [IQR] 90.0%-97.1%) with dropout (r=0.35) and 95.0% (IQR 90.3%-97.1%) without dropout, supporting equivalence (p<0.001). With T2-FLAIR absent, overall median DSC improved from 81.0% (IQR 75.1%-86.4%) without dropout to 93.4% (IQR 89.1%-96.2%) with dropout (r=0.35). Whole-tumor DSC improved from 60.4% to 92.6%, whole tumor 95th percentile Hausdorff distance improved from 17.24 mm to 2.45 mm, and whole-tumor volume bias improved from -45.6 mL to 0.83 mL. Conclusions: In a simulated T2-FLAIR-unavailable scenario, targeted T2-FLAIR dropout preserved segmentation performance when T2-FLAIR was available and substantially reduced whole-tumor segmentation error and volumetric bias when T2-FLAIR was absent.
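For concreteness, a minimal PyTorch sketch of the targeted-dropout idea described above, where only the T2-FLAIR channel is zeroed with probability r during training; the channel index and per-sample dropout granularity are assumptions, and the paper's nnU-Net integration is not reproduced here:

```python
import torch

def targeted_flair_dropout(x: torch.Tensor, flair_channel: int = 3, r: float = 0.35,
                           training: bool = True) -> torch.Tensor:
    """Randomly zero the T2-FLAIR channel of a (B, C, D, H, W) multimodal MRI batch.

    Each sample is dropped independently with probability r during training;
    the other modality channels pass through unchanged.
    """
    if not training or r <= 0.0:
        return x
    x = x.clone()
    drop = torch.rand(x.shape[0], device=x.device) < r   # per-sample Bernoulli(r)
    x[drop, flair_channel] = 0.0                          # zero only the FLAIR channel
    return x

def zero_flair(x: torch.Tensor, flair_channel: int = 3) -> torch.Tensor:
    """Simulate the T2-FLAIR-absent inference scenario by zeroing the channel."""
    x = x.clone()
    x[:, flair_channel] = 0.0
    return x
```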
Safety filters based on Control Barrier Functions (CBFs) provide formal guarantees of forward invariance but are often difficult to implement in networked dynamical systems due to global coupling and communication requirements. This paper develops locally implementable approximations of networked CBF safety filters that require no coordination across subsystems. The proposed approach is based on a two-time-scale dynamic implementation inspired by singular perturbation theory, where a small parameter $\epsilon$ separates the fast filter dynamics from the plant dynamics; a local implementation is then enabled via derivative estimation. Explicit bounds are derived to quantify the mismatch between the trajectories of the system with the dynamic filter and those with the ideal centralized safety filter. These results characterize how safety degradation depends on the time-scale parameter $\epsilon$, estimation errors, and filter activation time, thereby quantifying trade-offs between safety guarantees and local implementability.
This paper investigates distributed safety-critical control for multi-agent systems (MASs) in the presence of uncontrollable agents with uncertain behaviors. To ensure system safety, the control barrier function (CBF) is employed. A key challenge, however, is that the CBF constraints are coupled when MASs perform collaborative tasks: they depend on information from multiple agents, which impedes the design of a fully distributed safe control scheme. To overcome this, a novel reconstructed CBF approach is proposed. In this method, the coupled CBF is reconstructed by leveraging state estimates of other agents obtained from a distributed adaptive observer. Furthermore, a prescribed performance adaptive parameter is designed to modify this reconstruction, ensuring that satisfying the reconstructed CBF constraint is sufficient to meet the original coupled one. Based on the reconstructed CBF, we design a safety-critical quadratic programming (QP) controller and prove that the proposed distributed control scheme rigorously guarantees the safety of the MAS, even in uncertain dynamic environments involving uncontrollable agents. The effectiveness of the proposed method is illustrated through simulation.
This paper is motivated by controllers developed for autonomous vehicles that occasionally result in conditions where safety is no longer guaranteed. We develop an exact-time safety recovery framework for any control-affine nonlinear system whose state is outside a safe region, using time-varying Control Barrier Functions (CBFs) with optimal barrier tracking. Unlike conventional formulations that provide only conservative upper bounds on the recovery time, the proposed approach guarantees recovery to the safe set at a prescribed time. The key mechanism is an active barrier tracking condition that forces the barrier function to follow exactly a designer-specified recovery trajectory, transforming safety recovery into a trajectory design problem. The recovery trajectory is parameterized and optimized to achieve optimal performance while preserving feasibility under input constraints, avoiding the aggressive corrective actions typically induced by conventional finite-time formulations. The safety recovery framework is applied to the roundabout traffic coordination problem for Connected and Automated Vehicles (CAVs), where any initially violated safe merging constraint is replaced by an exact-time recovery barrier constraint to restore safety guarantees before CAV conflict points are reached. Simulation results demonstrate improved feasibility and performance.
A unified structural framework is presented for model-based fault diagnosis that explicitly incorporates both fault locations and constraints imposed by the residual generation methodology. Building on the concepts of proper and minimal structurally overdetermined (PSO/MSO) sets and Test Equation Supports (TES/MTES), the framework introduces testable PSO sets, Residual Generation (RG) sets, irreducible fault signatures (IFS), and Irreducible RG (IRG) sets to characterize which submodels are suitable for residual generation under given computational restrictions. An operator $M^*$ is defined to extract, from any model, the largest testable PSO subset consistent with a specified residual generation method. Using this operator, an algorithm is developed to compute all RG sets, and it is shown that irreducible fault signature sets form the join-irreducible elements of a join-semilattice of sets and fully capture the multiple-fault isolability properties in the method-constrained setting. The approach is exemplified on a semi-explicit linear DAE model, where a low structural differential index can be used to define $M^*$. The results demonstrate that the proposed framework generalizes MTES-based analysis to residual generation scenarios with explicit computational limitations.
Site-specific channel inference plays a critical role in the design and evaluation of next-generation wireless communication systems by considering the surrounding propagation environment. However, traditional methods are unscalable. Recently, satellite imagery has emerged as a valuable modality containing rich propagation information for AI-based channel prediction. However, existing approaches using these images are limited to predicting large-scale fading parameters, lacking the capacity to reconstruct the complete channel impulse response (CIR). To address this limitation, we propose a deep learning-based site-specific channel modeling and inference framework using satellite images to predict structured Tapped Delay Line (TDL) parameters. We first establish a joint channel-satellite dataset based on measurements. Then, a novel deep learning network is developed to reconstruct the channel parameters. Specifically, a cross-attention-fused dual-branch pipeline extracts macroscopic and microscopic environmental features, while a recurrent tracking module captures the long-term dynamic evolution of multipath components. Experimental results demonstrate that the proposed method achieves high-quality reconstruction of the CIR in unseen scenarios, with a Power Delay Profile (PDP) Average Cosine Similarity exceeding 0.96. This work provides a pathway toward site-specific channel inference for future dynamic wireless networks.
Coordinating multiple autonomous agents to reach a target region while avoiding collisions and maintaining communication connectivity is a core problem in multi-agent systems. In practice, agents have a limited communication range. Thus, network links can appear and disappear as agents move, making the topology state-dependent and time-varying. Existing distributed solutions to multi-agent reach-avoid problems typically assume a fixed communication topology and thus cannot handle the discontinuities introduced by time-varying topologies. This paper presents a distributed optimization-based control framework that addresses these challenges through two complementary mechanisms. First, we introduce a truncation function that converts the time-varying communication graph into a smoothly state-dependent one, ensuring that constraints remain continuous as communication links are created or removed. Second, we employ auxiliary mismatch variables with two-time-scale dynamics to decouple globally coupled state-dependent constraints, yielding a singular perturbation system that each agent can solve using only local information and neighbor communication. Through singular perturbation analysis, we prove that the distributed controller guarantees collision avoidance, connectivity preservation, and convergence to the target region. We validate the proposed framework through numerical simulations involving multi-agent navigation with obstacles and time-varying communication topologies.
This work formalizes the differential topology of redundancy resolution for systems governed by signed-quadratic actuation maps. By analyzing the minimally redundant case, the global topology of the continuous fiber bundle defining the nonlinear actuation null-space is established. The distribution orthogonal to these fibers is proven to be globally integrable and governed by an exact logarithmic potential field. This field foliates the actuator space, inducing a structural stratification of all orthants into transverse layers whose combinatorial sizes follow a strictly binomial progression. Within these layers, adjacent orthants are continuously connected via lower-dimensional strata termed reciprocal hinges, while the layers themselves are separated by boundary hyperplanes, or portals, that act as global sections of the fibers. This partition formally distinguishes extremal and transitional layers, which exhibit fundamentally distinct fiber topologies and foliation properties. Exploiting this geometric framework, we prove that the orthogonal manifolds within the extremal orthants form a global diffeomorphism to the entire unbounded task space. This establishes the theoretical existence of globally smooth right-inverses that permanently confine the system to a single orthant, guaranteeing the absolute avoidance of kinematic singularities. While motivated by the physical actuation maps of multirotor and marine vehicles, the results provide a strictly foundational topological classification of signed-quadratic surjective systems.
Tomographic synthetic aperture radar (TomoSAR) enables three-dimensional imaging by resolving targets along the elevation dimension, which is essential for environment reconstruction and infrastructure monitoring. A critical challenge in TomoSAR is the severe multipath propagation that causes ghost targets, range offsets, and elevation ambiguities. To address this, this paper proposes an enhanced Newtonized orthogonal matching pursuit (NOMP) algorithm to extract the delay, Doppler, and complex amplitude parameters of each propagation path, effectively separating line-of-sight (LoS) and multipath components prior to TomoSAR processing. Additionally, a height fusion strategy combining TomoSAR estimates with LoS-ground reflection delay-based inversion improves elevation accuracy. Simulation results demonstrate that the proposed method achieves improved positioning and elevation accuracy while effectively suppressing multipath-induced artifacts.
Smartphone cameras have gained immense popularity with the adoption of high-resolution and high-dynamic range imaging. As a result, high-performance camera Image Signal Processors (ISPs) are crucial in generating high-quality images for the end user while keeping computational costs low. In this paper, we propose DRIFT (Deep Restoration, ISP Fusion, and Tone-mapping): an efficient AI mobile camera pipeline that generates high-quality RGB images from hand-held raw captures. The first stage of DRIFT is a Multi-Frame Processing (MFP) network that is trained using an adversarial perceptual loss to perform multi-frame alignment, denoising, demosaicing, and super-resolution. Then, the output of DRIFT-MFP is processed by a novel deep-learning-based tone-mapping (DRIFT-TM) solution that allows for tone tunability, ensures tone consistency with a reference pipeline, and can be run efficiently for high-resolution images on a mobile device. We show qualitative and quantitative comparisons against state-of-the-art MFP and tone-mapping methods to demonstrate the effectiveness of our approach.
Recently, artificial intelligence-based dubbing technology has advanced, enabling automated dubbing (AD) to convert the source speech of a video into target speech in different languages. However, natural AD still faces synchronization challenges such as duration and lip synchronization (lip-sync), which are crucial for preserving the viewer experience. Therefore, this paper proposes a synchronization method for AD processes that paraphrases translated text, comprising two steps: isochrony for timing constraints and phonetic synchronization (PS) to preserve lip-sync. First, we achieve isochrony by paraphrasing the translated text with a language model, ensuring that the target speech duration matches that of the source speech. Second, we introduce PS, which employs dynamic time warping (DTW) with local costs given by vowel distances measured from training data, so that the target text is composed of vowels whose pronunciations are similar to the source vowels. Third, we extend this approach to PS-Comet, which jointly considers semantic and phonetic similarity to better preserve meaning. The proposed methods are incorporated into text-to-speech systems, PS-TTS and PS-Comet TTS. Performance evaluation using Korean and English lip-reading datasets and a voice-actor dubbing dataset demonstrates that both systems outperform TTS without PS on several objective metrics and outperform voice actors in Korean-to-English and English-to-Korean dubbing. We extend the experiments to French, testing all pairs among these languages to evaluate cross-linguistic applicability. Across all language pairs, PS-Comet performed best, balancing lip-sync accuracy with semantic preservation and confirming that it achieves more accurate lip-sync with better semantic preservation than PS alone.
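As a rough illustration of the PS step, a standard DTW recursion with a pluggable vowel-distance local cost could look as follows; the toy distance table is invented for the example, whereas the paper measures vowel distances from training data:

```python
import numpy as np

def dtw_cost(src_vowels, tgt_vowels, vowel_dist):
    """Dynamic time warping between source and target vowel sequences.

    vowel_dist(a, b) is a local cost, e.g. a distance between vowel categories;
    the function returns the accumulated alignment cost.
    """
    n, m = len(src_vowels), len(tgt_vowels)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = vowel_dist(src_vowels[i - 1], tgt_vowels[j - 1])
            D[i, j] = c + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Toy symmetric distance table between a few vowel symbols (illustrative values).
toy = {("a", "a"): 0.0, ("e", "e"): 0.0, ("o", "o"): 0.0,
       ("a", "e"): 0.4, ("e", "a"): 0.4, ("a", "o"): 0.7,
       ("o", "a"): 0.7, ("e", "o"): 0.5, ("o", "e"): 0.5}
print(dtw_cost("aeo", "aao", lambda a, b: toy[(a, b)]))
```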
The realization of Urban Air Mobility (UAM) necessitates scalable global path planning algorithms capable of ensuring safe navigation within complex urban environments. This paper proposes a multi-scale risk-aware cell decomposition method that efficiently partitions city-scale airspace into variable-granularity sectors based on obstacle proximity and potential risk. Unlike uniform grid approaches or sampling-based methods, our approach dynamically balances resolution with computational speed. Comparative experiments against classical A*, Artificial Potential Fields (APF), and Informed RRT* across diverse urban topologies demonstrate that our method generates significantly safer paths (lower cumulative risk) while reducing computation time by orders of magnitude. The proposed framework, the LARP Path Planner, is open-sourced and integrates directly with OpenStreetMap to facilitate reproducible research in city-wide aerial navigation.
An electroencephalogram (EEG)-based brain-computer interface (BCI) enables direct communication between the brain and external devices. However, such systems face at least three major challenges in real-world applications: limited decoding accuracy, poor robustness, and privacy risks. Although prior studies have addressed one or two of these issues, methods that simultaneously improve accuracy, robustness, and privacy remain largely unexplored. In this paper, we propose Privacy-preserving Adversarial Transfer (PAT), a unified training framework that combines data alignment, adversarial training, and privacy-preserving transfer. PAT provides a single pipeline that can be instantiated under three privacy-preserving scenarios, i.e., centralized source-free transfer, federated source-free transfer, and transfer with privacy-preserved source data, while jointly improving accuracy and robustness. Experiments on five public EEG datasets under these three scenarios show that PAT outperforms more than ten classic and state-of-the-art methods in both accuracy and robustness. PAT also outperforms leading transfer learning approaches that do not incorporate any privacy mechanism by 9.76% in terms of average accuracy and robustness. To our knowledge, this is the first approach that simultaneously addresses all three major challenges in EEG-based BCIs. We believe this work can help motivate further research on more accurate, robust, and privacy-preserving EEG decoding.
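The abstract does not specify which data-alignment technique PAT instantiates; Euclidean alignment is a common choice for EEG transfer learning and is sketched below purely as an illustration of what the alignment step typically does:

```python
import numpy as np
from scipy.linalg import fractional_matrix_power

def euclidean_align(trials: np.ndarray) -> np.ndarray:
    """Euclidean alignment of EEG trials with shape (n_trials, n_channels, n_samples).

    Whitens one subject's trials by the inverse square root of their mean spatial
    covariance, bringing covariance statistics of different subjects closer together.
    Illustrative only; not necessarily the alignment used in PAT.
    """
    covs = np.stack([t @ t.T / t.shape[1] for t in trials])   # per-trial spatial covariance
    R = covs.mean(axis=0)                                     # reference covariance
    R_inv_sqrt = fractional_matrix_power(R, -0.5).real        # R^{-1/2}
    return np.stack([R_inv_sqrt @ t for t in trials])
```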
In this paper, we focus on a class of decentralized constraint-coupled optimization problem: $\min_{x_i \in \mathbb{R}^{d_i}, i \in \mathcal{I};\; y \in \mathbb{R}^p} \sum_{i=1}^n\left(f_i(x_i) + g_i(x_i)\right) + h(y) \ \text{s.t.} \ \sum_{i=1}^{n}A_ix_i = y$, over an undirected and connected network of $n$ agents. Here, $f_i$, $g_i$, and $A_i$ represent private information of agent $i \in \mathcal{I} = \{1, \cdots, n\}$, while $h$ is public for all agents. Building on a novel dual$^2$ approach, we develop two accelerated algorithms to solve this problem: the inexact Dual$^2$ Accelerated (iD2A) gradient method and the Multi-consensus inexact Dual$^2$ Accelerated (MiD2A) gradient method. We demonstrate that both iD2A and MiD2A can guarantee asymptotic convergence under a milder condition on $h$ compared to existing algorithms. Furthermore, under additional assumptions, we establish linear convergence rates and derive significantly lower communication and computational complexity bounds than those of existing algorithms. Several numerical experiments validate our theoretical analysis and demonstrate the practical superiority of the proposed algorithms.
Autonomous highway driving involves high-speed safety risks due to limited reaction time, where rare but dangerous events may lead to severe consequences. This places stringent requirements on trajectory planning in terms of both reliability and computational efficiency. This paper proposes a hybrid highway trajectory planning (H-HTP) framework that integrates learning-based adaptability with optimization-based formal safety guarantees. The key design principle is a deliberate division of labor: a learning module generates a traffic-adaptive velocity profile, while all safety-critical decisions including collision avoidance and kinematic feasibility are delegated to a Mixed-Integer Quadratic Program (MIQP). This design ensures that formal safety constraints are always enforced, regardless of the complexity of multi-vehicle interactions. A linearization strategy for the vehicle geometry substantially reduces the number of integer variables, enabling real-time optimization without sacrificing formal safety guarantees. Experiments on the HighD dataset demonstrate that H-HTP achieves a scenario success rate above 97% with an average planning-cycle time of approximately 54 ms, reliably producing smooth, kinematically feasible, and collision-free trajectories in safety-critical highway scenarios.
We propose the Soft Graph Transformer (SGT), a soft-input-soft-output neural architecture designed for MIMO detection. While Maximum Likelihood (ML) detection achieves optimal accuracy, its exponential complexity makes it infeasible in large systems, and conventional message-passing algorithms rely on asymptotic assumptions that often fail in finite dimensions. Recent Transformer-based detectors show strong performance but typically overlook the MIMO factor graph structure and cannot exploit prior soft information. SGT addresses these limitations by combining self-attention, which encodes contextual dependencies within symbol and constraint subgraphs, with graph-aware cross-attention, which performs structured message passing across subgraphs. Its soft-input interface allows the integration of auxiliary priors, producing effective soft outputs while maintaining computational efficiency. Experiments demonstrate that SGT achieves near-ML performance and offers a flexible and interpretable framework for receiver systems that leverage soft priors.
Consider $N$ players and $K$ games taking place simultaneously. Each of these games is modeled as a Tug-of-War (ToW) game in which increasing the action of one player decreases the reward for all other players. Each player participates in only one game at any given time. At each time step, a player decides the game in which they wish to participate and the action they take in that game. Their reward depends on the actions of all players that are in the same game. This system of $K$ games is termed a 'Meta Tug-of-War' (Meta-ToW) game. Such games can model scenarios such as power control, distributed task allocation, and activation in sensor networks. We propose the Meta Tug-of-Peace algorithm, a distributed algorithm in which the action updates are performed using a simple stochastic approximation scheme, and the decision to switch games is made using infrequent 1-bit communication between the players. We prove that in Meta-ToW games, our algorithm converges to an equilibrium that satisfies a target Quality of Service reward vector for the players. We then demonstrate the efficacy of our algorithm through simulations for the scenarios mentioned above.
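The abstract does not give the exact update, so the following is only a generic Robbins-Monro-style sketch of what a "simple stochastic approximation" action update driving a player's reward toward a QoS target could look like:

```python
def sa_action_update(action: float, reward: float, target_reward: float, step: float) -> float:
    """One hypothetical stochastic-approximation step for a single player.

    Increase the action when the observed reward falls short of the QoS target,
    decrease it otherwise; the action is kept non-negative.
    """
    return max(0.0, action + step * (target_reward - reward))

# Diminishing step sizes satisfying the usual SA conditions (sum diverges, sum of squares converges).
steps = [1.0 / (t + 1) for t in range(1000)]
```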
Ensuring power grid resilience requires the timely and unsupervised detection of anomalies in synchrophasor data streams. We introduce T-BiGAN, a novel framework that integrates window-attention Transformers within a bidirectional Generative Adversarial Network (BiGAN) to address this challenge. Its self-attention encoder-decoder architecture captures complex spatio-temporal dependencies across the grid, while a joint discriminator enforces cycle consistency to align the learned latent space with the true data distribution. Anomalies are flagged in real time using an adaptive score that combines reconstruction error, latent space drift, and discriminator confidence. Evaluated on a realistic hardware-in-the-loop PMU benchmark, T-BiGAN achieves an ROC-AUC of 0.95 and an average precision of 0.996, significantly outperforming leading supervised and unsupervised methods. It shows particular strength in detecting subtle frequency and voltage deviations, demonstrating its practical value for live, wide-area monitoring without relying on manually labeled fault data.
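A minimal sketch of how the three cues named above could be combined into a single score; the fixed weights are placeholders, whereas the paper uses an adaptive weighting that is not detailed in the abstract:

```python
import numpy as np

def anomaly_score(x, x_rec, z, z_rec, d_conf, w=(0.5, 0.3, 0.2)):
    """Combine reconstruction error, latent drift, and discriminator confidence.

    x, x_rec : PMU window and its reconstruction
    z, z_rec : latent code of the window and of its reconstruction (latent-space drift)
    d_conf   : discriminator confidence that the pair is 'real', in [0, 1]
    """
    recon = np.mean((x - x_rec) ** 2)
    drift = np.mean((z - z_rec) ** 2)
    return w[0] * recon + w[1] * drift + w[2] * (1.0 - d_conf)
```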
Autonomous robots operating in dynamic environments must balance global path optimality with real-time responsiveness to disturbances. This requires addressing a fundamental trade-off between computationally expensive global planning and fast local adaptation. Sampling-based planners such as RRT* produce near-optimal paths but struggle under perturbations, while dynamical systems approaches like SEDS enable smooth reactive behavior but rely on offline data-driven optimization. We introduce Sampling-Based Adaptive Motion Planning (SBAMP), a hybrid framework that combines RRT*-based global planning with an online, Lyapunov-stable SEDS-inspired controller that requires no pre-trained data. By integrating lightweight constrained optimization into the control loop, SBAMP enables stable, real-time adaptation while preserving global path structure. Experiments in simulation and on RoboRacer hardware demonstrate robust recovery from disturbances, reliable obstacle handling, and consistent performance under dynamic conditions.
Time-series anomaly detection plays a critical role in numerous real-world applications, including industrial monitoring and fault diagnosis. Recently, Mamba-based state-space models have shown remarkable efficiency in long-sequence modeling. However, directly applying Mamba to anomaly detection still faces challenges in capturing complex temporal patterns and nonlinear dynamics. In this paper, we propose Fourier-KAN-Mamba, a novel hybrid architecture that integrates a Fourier layer, Kolmogorov-Arnold Networks (KAN), and the Mamba selective state-space model. The Fourier layer extracts multi-scale frequency features, KAN enhances nonlinear representation capability, and a temporal gating control mechanism further improves the model's ability to distinguish normal from anomalous patterns. Extensive experiments on the MSL, SMAP, and SWaT datasets demonstrate that our method significantly outperforms existing state-of-the-art approaches. Keywords: time-series anomaly detection, state-space model, Mamba, Fourier transform, Kolmogorov-Arnold Network
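A minimal PyTorch sketch of a frequency-domain feature extractor in the spirit of the Fourier layer; the number of retained modes and the complex per-channel weighting are assumptions, not the paper's exact design:

```python
import torch
import torch.nn as nn

class FourierLayer(nn.Module):
    """Keep the lowest `modes` rFFT bins, mix them with a learnable complex weight,
    and transform back to the time domain."""
    def __init__(self, channels: int, modes: int = 16):
        super().__init__()
        self.modes = modes
        self.weight = nn.Parameter(torch.randn(channels, modes, dtype=torch.cfloat) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (batch, channels, time)
        x_ft = torch.fft.rfft(x, dim=-1)
        out_ft = torch.zeros_like(x_ft)
        m = min(self.modes, x_ft.shape[-1])
        out_ft[..., :m] = x_ft[..., :m] * self.weight[..., :m]
        return torch.fft.irfft(out_ft, n=x.shape[-1], dim=-1)
```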
While the zero-drift first arrival position (FAP) channel exhibits a Cauchy-distributed lateral displacement, nonzero drift in practical systems introduces advective transport that regularizes this singular limit. This letter characterizes the drift-induced transition of the FAP distribution from a heavy-tailed algebraic regime to exponential regularization. By asymptotically examining the exact FAP density, we identify a characteristic propagation distance (CPD) that serves as the fundamental boundary separating diffusion-dominated and drift-dominated regimes. Numerical experiments demonstrate that in low-drift environments, variance-matched Gaussian approximations severely underestimate the true communication potential, whereas the zero-drift Cauchy law provides a robust, physically grounded performance baseline.
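For context, in the planar setting the zero-drift baseline referenced above is the Cauchy law for the lateral position at which Brownian motion first hits a receiver line at distance $\lambda$ (a standard result recalled here, not a formula taken from the letter):
$$
f_X(x) \;=\; \frac{1}{\pi}\,\frac{\lambda}{\lambda^{2} + x^{2}}, \qquad x \in \mathbb{R},
$$
whose undefined variance is precisely what makes variance-matched Gaussian fits misleading in the low-drift regime.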
Spatially inhomogeneous magnetic fields offer a valuable, non-visual information source for positioning. Among systems leveraging this, magnetic field-based simultaneous localization and mapping (SLAM) systems are particularly attractive because they can provide positioning information and build a magnetic field map on the fly. Moreover, they have bounded error within mapped regions. However, state-of-the-art methods typically require low-drift odometry data provided by visual odometry, a wheel encoder, or similar sources, because these systems need to reduce positioning errors while exploring unmapped regions. To address these limitations, this work proposes a loosely coupled and a tightly coupled inertial magnetic SLAM (IM-SLAM) system. The proposed systems use commonly available low-cost sensors: an inertial measurement unit (IMU), a magnetometer array, and a barometer. The use of non-visual data provides a significant advantage over visual-based systems, making them robust to low-visibility conditions. Both systems employ state-space representations and magnetic field models at different scales; the difference lies in how they use the local and global magnetic field models. The loosely coupled system uses these models separately in two state-space models, while the tightly coupled system integrates them into one state-space model. Experimental results show that the tightly coupled IM-SLAM system achieves lower positioning errors than the loosely coupled system in most scenarios, with typical errors on the order of meters per 100 meters traveled. These results demonstrate the feasibility of developing full 3D IM-SLAM systems using low-cost sensors and their potential in emergency response scenarios such as mine and fire rescue.
Accurate and early diagnosis of Alzheimer's disease (AD) is critical for effective intervention and requires integrating complementary information from multimodal neuroimaging data. However, conventional fusion approaches often rely on simple concatenation of features, which cannot adaptively balance the contributions of biomarkers such as amyloid PET and MRI across brain regions. In this work, we propose MREF-AD, a Multimodal Regional Expert Fusion model for AD diagnosis. It is a Mixture-of-Experts (MoE) framework that models mesoscopic brain regions within each modality as independent experts and employs a gating network to learn subject-specific fusion weights. Utilizing tabular neuroimaging and demographic information from the Alzheimer's Disease Neuroimaging Initiative (ADNI), MREF-AD achieves competitive performance over strong classic and deep baselines while providing interpretable, modality- and region-level insight into how structural and molecular imaging jointly contribute to AD diagnosis.
This paper introduces a fuzzy reinforcement learning framework, Enhanced-FQL($\lambda$), that integrates novel Fuzzified Eligibility Traces (FET) and Segmented Experience Replay (SER) into fuzzy Q-learning with the Fuzzified Bellman Equation (FBE) for continuous control. The proposed approach employs an interpretable fuzzy rule base instead of complex neural architectures, while maintaining competitive performance through two key innovations: a fuzzified Bellman equation with eligibility traces for stable multi-step credit assignment, and a memory-efficient segment-based experience replay mechanism for enhanced sample efficiency. Theoretical analysis proves convergence of the proposed method under standard assumptions. On the Cart--Pole benchmark, Enhanced-FQL($\lambda$) improves sample efficiency and reduces variance relative to $n$-step fuzzy TD and fuzzy SARSA($\lambda$), while remaining competitive with the tested DDPG baseline. These results support the proposed framework as an interpretable and computationally compact alternative for moderate-scale continuous control problems.
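To ground the eligibility-trace idea, here is a generic fuzzy Q($\lambda$) step in which traces are accumulated at the rule level according to rule firing strengths; this is a textbook-style sketch, not the paper's exact FET/FBE formulation:

```python
import numpy as np

def fql_lambda_step(q, e, phi, action_idx, reward, q_next_max,
                    alpha=0.1, gamma=0.99, lam=0.9):
    """One fuzzy Q(lambda) update with rule-level eligibility traces.

    q   : (n_rules, n_actions) rule-action values
    e   : (n_rules, n_actions) eligibility traces
    phi : (n_rules,) normalized firing strengths of the fuzzy rules for the current state
    """
    q_sa = float((phi * q[:, action_idx]).sum())     # fuzzy-blended value of the taken action
    td_error = reward + gamma * q_next_max - q_sa
    e *= gamma * lam                                  # decay all traces
    e[:, action_idx] += phi                           # accumulate trace on the taken action
    q += alpha * td_error * e                         # multi-step credit assignment
    return q, e

q, e = np.zeros((5, 3)), np.zeros((5, 3))
phi = np.array([0.1, 0.4, 0.3, 0.2, 0.0])
q, e = fql_lambda_step(q, e, phi, action_idx=1, reward=1.0, q_next_max=0.0)
```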
Until open-world foundation models match the performance of specialized approaches, deep learning systems remain dependent on task- and sensor-specific data availability. To bridge the gap between available datasets and deployment domains, domain adaptation strategies are widely used. In this work, we propose XD-MAP, a novel approach to transfer sensor-specific knowledge from an image dataset to LiDAR, an entirely different sensing domain. Our method leverages detections on camera images to create a semantic parametric map. The map elements are modeled to produce pseudo labels in the target domain without any manual annotation effort. Unlike previous domain transfer approaches, our method does not require direct overlap between sensors and enables extending the angular perception range from a front-view camera to a full 360° view. On our large-scale road feature dataset, XD-MAP outperforms single-shot baseline approaches by +19.5 mIoU for 2D semantic segmentation, +19.5 PQth for 2D panoptic segmentation, and +32.3 mIoU in 3D semantic segmentation. The results demonstrate the effectiveness of our approach, achieving strong performance on LiDAR data without any manual labeling.
Sum-of-squares (SOS) optimization provides a computationally tractable framework for certifying polynomial nonnegativity. If the considered problem is convex, the SOS problem can be transcribed into and solved by semidefinite programs. In the case of nonconvex problems, however, iterative procedures are needed, and tractable, efficient solution methods are still lacking, limiting their application, for instance, in control engineering. To address this gap, we propose a filter line-search algorithm that solves a sequence of quadratic subproblems. Numerical benchmarks demonstrate that the algorithm can significantly reduce the number of iterations, resulting in a substantial decrease in computation time compared to established methods for nonconvex SOS programs.
Ensuring the microbiological safety of large, heterogeneous water distribution systems (WDS) typically requires managing appropriate levels of disinfectant residuals including chlorine. WDS include complex fluid interactions that are nonlinear and noisy, making such maintenance a challenging problem for traditional control algorithms. This paper proposes an evolutionary framework for this problem based on neuroevolution, multi-objective optimization, and surrogate modeling. Neural networks were evolved with NEAT to inject chlorine at strategic locations in the distribution network at select times. NSGA-II was employed to optimize four objectives: minimizing the total amount of chlorine injected, keeping chlorine concentrations homogeneous across the network, ensuring that maximum concentrations did not exceed safe bounds, and distributing the injections regularly over time. Each network was evaluated against a surrogate model, i.e., a neural network trained to emulate EPANET, an industry-level hydraulic WDS simulator that is accurate but too computationally expensive to support machine learning directly. The evolved controllers produced a diverse range of Pareto-optimal policies that could be implemented in practice, outperforming PPO, a standard reinforcement learning method. The results thus suggest a pathway toward improving urban water systems, and highlight the potential of using evolution with surrogate modeling to optimize complex real-world systems.
Integrating domain knowledge into neural networks is a central challenge in scientific machine learning. Three paradigms have emerged -- data-driven (Neural Ordinary Differential Equations, NODEs), soft-constrained (Physics-Informed Neural Networks, PINNs), and hard-constrained (Differentiable Programming, DP) -- each encoding physical knowledge at different levels of structural commitment. However, how these strategies impact not only predictive accuracy but also downstream tasks such as control synthesis remains insufficiently understood. This paper presents a comparative study of NODEs, PINNs, and DP for dynamical system modeling, using the Single Machine Infinite Bus power system as a benchmark. We evaluate these paradigms across three tasks: trajectory prediction, parameter identification, and Linear Quadratic Regulator control synthesis. Our results yield three principal findings. First, knowledge representation determines generalization: NODE, which learns the system operator, enables robust extrapolation, whereas PINN, which approximates a solution map, restricts generalization to the training horizon. Second, hard-constrained formulations (DP) reduce learning to a low-dimensional physical parameter space, achieving faster and more reliable convergence than soft-constrained approaches. Third, knowledge fidelity propagates to control performance: DP produces controllers that closely match those obtained from true system parameters, while NODE provides a viable data-driven alternative by recovering control-relevant Jacobians with $3-4\%$ relative error and yielding LQR gains within $0.36\%$ of the ground truth. Based on these findings, we propose a practical decision framework for selecting knowledge integration strategies in neural modeling of dynamical systems.
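As a pointer to the control-synthesis stage compared above, a minimal LQR computation from linearized (Jacobian) matrices is sketched below; the toy A, B, Q, R values are illustrative and are not the SMIB benchmark's actual parameters:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def lqr_gain(A, B, Q, R):
    """Continuous-time LQR gain K such that u = -K x minimizes the quadratic cost.

    Once a learned model yields (A, B) Jacobians around an operating point,
    the gain follows from the algebraic Riccati equation: K = R^{-1} B^T P.
    """
    P = solve_continuous_are(A, B, Q, R)
    return np.linalg.solve(R, B.T @ P)

# Toy second-order system standing in for linearized swing dynamics (illustrative values).
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
K = lqr_gain(A, B, np.eye(2), np.eye(1))
```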
Flow matching has emerged as a powerful framework for generative modeling, with recent empirical successes highlighting the effectiveness of signal-space prediction ($x$-prediction). In this work, we investigate the transfer of this paradigm to binary manifolds, a fundamental setting for generative modeling of discrete data. While $x$-prediction remains effective, we identify a latent structural mismatch that arises when it is coupled with velocity-based objectives ($v$-loss), leading to a time-dependent singular weighting that amplifies gradient sensitivity to approximation errors. Motivated by this observation, we formalize prediction-loss alignment as a necessary condition for flow matching training. We prove that re-aligning the objective to the signal space ($x$-loss) eliminates the singular weighting, yielding uniformly bounded gradients and enabling robust training under uniform timestep sampling without reliance on heuristic schedules. Finally, with alignment secured, we examine design choices specific to binary data, revealing a topology-dependent distinction between probabilistic objectives (e.g., cross-entropy) and geometric losses (e.g., mean squared error). Together, these results provide theoretical foundations and practical guidelines for robust flow matching on binary -- and related discrete -- domains, positioning signal-space alignment as a key principle for robust diffusion learning.
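The weighting argument can be made concrete with a small sketch; it assumes the linear interpolation $x_t = (1-t)x_0 + t x_1$ with Gaussian $x_0$, under which the target velocity is $(x_1 - x_t)/(1-t)$, so a v-space objective on an x-predicting model (`model` is a stand-in callable) acquires a $1/(1-t)^2$ weighting that the x-space objective removes. Conventions vary, so this is a sketch of the argument rather than the paper's exact binary-manifold setup:

```python
import torch

def flow_matching_losses(model, x1, t):
    """Contrast v-space and x-space objectives for an x-predicting flow-matching model."""
    x0 = torch.randn_like(x1)
    t_ = t.view(-1, *([1] * (x1.dim() - 1)))
    xt = (1.0 - t_) * x0 + t_ * x1
    x_pred = model(xt, t)                                # signal-space prediction of x1

    v_loss = (((x_pred - x1) / (1.0 - t_)) ** 2).mean()  # implicit 1/(1-t)^2 weighting, singular as t -> 1
    x_loss = ((x_pred - x1) ** 2).mean()                 # same target, uniformly bounded gradients
    return x_loss, v_loss
```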
Neural codec language models enable high-quality discrete speech synthesis, yet their inference remains vulnerable to token-level artifacts and distributional drift that degrade perceptual realism. Rather than relying on preference optimization or retraining, we propose MSpoof-TTS, a training-free inference framework that improves zero-shot synthesis through multi-resolution spoof guidance. We introduce a Multi-Resolution Token-based Spoof Detection framework that evaluates codec sequences at different temporal granularities to detect locally inconsistent or unnatural patterns. We then integrate the spoof detectors into a hierarchical decoding strategy, progressively pruning low-quality candidates and re-ranking hypotheses. This discriminator-guided generation enhances robustness without modifying model parameters. Experiments validate the effectiveness of our framework for robust and high-quality codec-based speech generation. Audio samples are available at this https URL.
Large language model (LLM) agents are increasingly deployed in competitive multi-agent settings, raising fundamental questions about whether they converge to equilibria and how their strategic behavior can be characterized. In this paper, we study LLM agent interactions in two standard games: a network resource allocation game and a Cournot competition game. Rather than converging to Nash equilibria, we find that LLM agents tend to cooperate when given multi-round prompts and non-zero-sum context. Chain-of-thought analysis reveals that fairness reasoning is central to this behavior. We propose an analytical framework that captures the dynamics of LLM agent reasoning across rounds and explains these experimental findings.
We extend the algebraic diversity (AD) framework from classical signal processing to quantum measurement theory. The central result -- the Quantum Algebraic Diversity (QAD) Theorem -- establishes that a group-structured positive operator-valued measure (POVM) applied to a single copy of a quantum state produces a group-averaged density matrix estimator that recovers the spectral structure of the true density matrix, analogous to the classical result that a group-averaged outer product recovers covariance eigenstructure from a single observation. We establish a formal Classical-Quantum Duality Map connecting classical covariance estimation to quantum state tomography, and prove an Optimality Inheritance Theorem showing that classical group optimality transfers to quantum settings via the Born map. SIC-POVMs are identified as algebraic diversity with the Heisenberg-Weyl group, and mutually unbiased bases (MUBs) as algebraic diversity with the Clifford group, revealing the hierarchy $\mathrm{HW}(d) \subseteq \mathcal{C}(d) \subseteq S_d$ that mirrors the classical hierarchy $\mathbb{Z}_M \subseteq G_{\min} \subseteq S_M$. The double-commutator eigenvalue theorem provides polynomial-time adaptive POVM selection. A worked qubit example demonstrates that the group-averaged estimator from a single Pauli measurement recovers a full-rank approximation to a mixed qubit state, achieving fidelity 0.91 where standard single-basis tomography produces a rank-1 estimate with fidelity 0.71. Monte Carlo simulations on qudits of dimension $d = 2$ through $d = 13$ (200 random states per dimension) confirm that the Heisenberg-Weyl QAD estimator maintains fidelity above 0.90 across all dimensions from a single measurement outcome, while standard tomography fidelity degrades as $\sim 1/d$, with the improvement ratio scaling linearly with $d$ as predicted by the $O(d)$ copy reduction theorem.
A key challenge in disaster response is maintaining situational awareness of an evolving landscape, which requires balancing exploration of unobserved regions with sustained monitoring of changing Regions of Interest (ROIs). Unmanned Aerial Vehicles (UAVs) have emerged as an effective response tool, particularly in applications like environmental monitoring and search-and-rescue, due to their ability to provide aerial coverage, withstand hazardous conditions, and navigate quickly and flexibly. However, efficient and adaptable multi-robot coverage with limited sensing in disaster settings and evolving time-varying information maps remains a significant challenge, necessitating better methods for UAVs to continuously adapt their trajectories in response to changes. In this paper, we propose a decentralized multi-agent coverage framework that serves as a high-level planning strategy for adaptive coverage in unknown, time-varying environments under partial observability. Each agent computes an adaptive ergodic policy, implemented via a Markov-chain transition model, that tracks a continuously updated belief over the underlying importance map. Gaussian Processes are used to perform those online belief updates. The resulting policy drives agents to spend time in ROIs proportional to their estimated importance, while preserving sufficient exploration to detect and adapt to time-varying environmental changes. Unlike existing approaches that assume known importance maps, require centralized coordination, or assume a static environment, our framework addresses the combined challenges of unknown, time-varying distributions in a more realistic decentralized and partially observable setting. We compare against alternative coverage strategies and analyze our method's response to simulated disaster evolution, highlighting its improved adaptability and transient performance in dynamic scenarios.
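A minimal sketch of the online belief update over the importance map, using a Gaussian process regressor purely as an illustration; the kernel hyperparameters and library choice are assumptions, not the paper's implementation:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def update_belief(obs_locations, obs_values, grid):
    """Refit a GP to an agent's accumulated observations and return the belief over a grid.

    obs_locations : (n, 2) visited cell centers
    obs_values    : (n,) sensed importance at those cells
    grid          : (m, 2) cell centers of the whole map
    """
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=5.0) + WhiteKernel(1e-2),
                                  normalize_y=True)
    gp.fit(obs_locations, obs_values)
    mean, std = gp.predict(grid, return_std=True)
    return np.clip(mean, 0.0, None), std   # non-negative importance estimate plus uncertainty
```

The ergodic policy would then allocate visitation time in proportion to this estimated importance while the predictive uncertainty keeps driving exploration of rarely visited cells.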
Reconfigurable Intelligent Surfaces (RISs) have the potential to engineer smart radio environments for next-generation millimeter-wave (mmWave) networks. However, the prohibitive computational overhead of Channel State Information (CSI) estimation and the dimensionality explosion inherent in centralized optimization severely hinder practical large-scale deployments. To overcome these bottlenecks, we introduce a ``CSI-free'' paradigm powered by a Hierarchical Multi-Agent Reinforcement Learning (HMARL) architecture to control mechanically reconfigurable reflective surfaces. By substituting pilot-based channel estimation with accessible user localization data, our framework leverages spatial intelligence for macro-scale wave propagation management. The control problem is decomposed into a two-tier neural architecture: a high-level controller executes temporally extended, discrete user-to-reflector allocations, while low-level controllers autonomously optimize continuous focal points using Multi-Agent Proximal Policy Optimization (MAPPO) under a Centralized Training with Decentralized Execution (CTDE) scheme. Comprehensive deterministic ray-tracing evaluations demonstrate that this hierarchical framework achieves RSSI improvements of up to 7.79 dB over centralized baselines. Furthermore, the system exhibits robust multi-user scalability and maintains highly resilient beam-focusing performance under practical sub-meter localization tracking errors. By eliminating CSI overhead while maintaining high-fidelity signal redirection, this work establishes a scalable and cost-effective blueprint for intelligent wireless environments.
Dexterous robotic manipulation requires more than geometrically valid grasps: it demands physically grounded contact strategies that account for the spatially non-uniform mechanical properties of the object. However, existing grasp planners typically treat the surface as structurally homogeneous, even though contact in a weak region can damage the object despite a geometrically perfect grasp. We present a pipeline for grasp selection and force regulation in a five-fingered robotic hand, based on a map of locally admissible contact loads. From an operator command, the system identifies the target object, reconstructs its 3D geometry using SAM3D, and imports the model into Isaac Sim. A physics-informed geometric analysis then computes a force map that encodes the maximum lateral contact force admissible at each surface location without deformation. Grasp candidates are filtered by geometric validity and task-goal consistency. When multiple candidates are comparable under classical metrics, they are re-ranked using a force-map-aware criterion that favors grasps with contacts in mechanically admissible regions. An impedance controller scales the stiffness of each finger according to the locally admissible force at the contact point, enabling safe and reliable grasp execution. Validation on paper, plastic, and glass cups shows that the proposed approach consistently selects structurally stronger contact regions and keeps grip forces within safe bounds. In this way, the work reframes dexterous manipulation from a purely geometric problem into a physically grounded joint planning problem of grasp selection and grip execution for future humanoid systems.
This paper develops a sparsity-promoting integral concurrent learning (SP-ICL) adaptation law for a linearly parametrized uncertain nonlinear control-affine system. The unknown parameters are learned using ICL with sparsity-promoting $\ell_1$ regularization. The use of $\ell_1$ regularization for sparsity promotion is common in system identification and machine learning; however, unlike existing approaches, this paper develops an online parameter update law that integrates the regularization penalty with ICL via sliding modes. Using the SP-ICL update law, we show via non-smooth Lyapunov analysis that the trajectories of the closed-loop system are ultimately bounded. Simulations verify the effectiveness of the sparsity penalty in the SP-ICL update law on recovering sparse dynamics during trajectory tracking.
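One plausible shape for such an update, assuming standard ICL history stacks $\mathcal{Y}_j$, $\mathcal{U}_j$ over integration windows, a concurrent-learning gain $k_{\mathrm{CL}}$, and a gain matrix $\Gamma$ (a hedged sketch of the idea, not the paper's exact law), is
$$
\dot{\hat\theta} \;=\; \Gamma\Big(k_{\mathrm{CL}}\sum_{j=1}^{N}\mathcal{Y}_j^{\top}\big(\mathcal{U}_j - \mathcal{Y}_j\hat\theta\big)\;-\;\lambda\,\mathrm{sgn}(\hat\theta)\Big),
$$
where the discontinuous $\mathrm{sgn}(\cdot)$ term implements the $\ell_1$ sparsity penalty through a sliding-mode action, consistent with the non-smooth Lyapunov analysis mentioned above.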