The solid electrolyte interphase (SEI), a critical passivation layer that governs the longevity, safety, and efficiency of lithium-ion batteries, is created during the last step in cell manufacturing, called cell formation. Conventional cell formation protocols are largely empirical, resulting in long processing times and limited control over the SEI growth rate that influences SEI quality and lifetime performance. This paper develops a control-oriented, semi-empirical model to estimate SEI thickness growth from terminal voltage and cell expansion measurements acquired in-operando during manufacturing using a low-cost, micrometer-precision, integrated-sensing fixture. Model parameters are calibrated against cell formation data, and an unscented Kalman filter is employed to estimate the SEI film growth. The results lay the foundation for future closed-loop control of SEI growth, enabling high-quality and more efficient formation processes.
Thermoplastic composite structures enable lightweight, recyclable, and high-throughput aerospace manufacturing, but reliable quality assurance of advanced joining processes remains a key challenge. This work presents a compact, low-cost, and wireless ultrasonic non-destructive testing system for real-time, inline monitoring of continuous ultrasonic welding of thermoplastic carbon fiber composites. The system integrates custom-fabricated polymer-based capacitive micromachined ultrasonic transducers (polyCMUTs) with the ultra-low-power WULPUS platform, enabling operation in the harsh, high-interference welding environment. An eight-element linear polyCMUT array operating at a center frequency of approximately 3.6 MHz is designed, fabricated, packaged, and integrated into an industrial welding setup. Inline measurements are performed during welding of carbon fiber laminates with intentionally introduced defects. Process-synchronous ultrasonic data reveal consistent depth-of-echo shifts at defect locations, in strong agreement with X-ray computed tomography ground truth. Across 21 welds, all induced defects are detected without false negatives and with limited false positives. The results demonstrate that polymer-based CMUT technology enables robust, scalable, and manufacturing-compatible ultrasonic sensing, representing a decisive step toward intelligent process monitoring and quality assurance for next-generation thermoplastic composite welding.
We study a two-node stochastic resource system operating over a finite horizon. Each node experiences uncertain supply and demand and is equipped with finite storage. The objective is to ensure that resource levels remain within prescribed limits with high probability. To this end, we formulate a chance-constrained capacity-design problem in which resources can be exchanged through a capacity-limited transport link. We characterize the minimum storage required at each node, derive the optimal transport policy, and quantify the trade-off between storage and transport capacities. Our results show the existence of a critical transport-capacity threshold that enables full risk pooling between the nodes. Moreover, this threshold decreases with the operating horizon, implying that full-pooling performance can be achieved with progressively smaller transport capacity over longer horizons.
We propose an abstraction-free framework for controller synthesis for continuous-time dynamical systems subject to Linear Temporal Logic (LTL) specifications and bounded control inputs. The proposed method combines the sequential decomposition of LTL tasks with the use of formally certified Control Lyapunov-Barrier Functions (CLBFs). By formulating local specifications as a sequence of safe-stabilization problems, we systematically approximate and patch the winning sets of the decomposed subtasks. The satisfaction of these local constraints is guaranteed by the offline-computed level sets of the CLBFs. As a result, our framework yields formally verified switching feedback controllers that enable efficient online planning and dynamic re-planning. This ensures robust continuous specification satisfaction in the presence of state perturbations, avoiding the explicit state-space abstractions commonly required in the literature. The approach is validated through numerical simulations and a hardware demonstration on a Crazyflie quadrotor.
While MPC effectively handles structured, diverse, and low-level specifications, it lacks the capability to dynamically incorporate high-level contextual information such as social norms, user intent, or natural language instructions. To address this limitation, this manuscript introduces an agentic MPC framework that enables context-aware, semantically adaptive control synthesis by integrating with large language model-based agents. The agent interprets heterogeneous inputs, including natural language messages, environmental observations, and external knowledge, to resynthesize the control specifications. The effectiveness of the framework is demonstrated in an autonomous driving scenario, where the system aligns with personal preferences or responds to social situations such as emergency vehicle yielding.
No model of the Korean transmission system at native resolution is publicly available, which makes reproducible research difficult on one of the world's most distinctive grids: an islanded interconnection with extreme separation between generation and the Seoul Metropolitan Area load center, low renewable penetration, and heavy reliance on extra-high-voltage (EHV) transmission. Working strictly from public data, and for research purposes only, we present the GIST 2064-bus test system, a geographically grounded synthetic model of the Korean grid. Unlike fully synthetic cases, whose lines match no real corridor, and aggregated public Korean models, it derives its 345 and 154 kV layout from the OpenStreetMap/OpenInfraMap power layer by a multi-source shortest-path reassembly of overhead-line geometry, gap-fills unreachable substations with a geographic minimum-spanning-tree backbone, and calibrates the aggregate circuit length to published national statistics (108/107/97% at 765/345/154 kV). The model spans 2064 buses, 512 generation and renewable sources (144 GW), 3044 AC line circuits plus high-voltage direct-current (HVDC) equivalents, 3073 transformers, and reactive resources (shunts and 11 FACTS devices), serialized to a PSS/E-compatible CSV schema. A general-purpose pandapower Newton-Raphson solver (with generator reactive limit enforcement, a secant-gain remote voltage-control loop, tap-changer and switched-shunt fixed-point control, and zero-impedance regularization) solves an 85 GW high-demand snapshot to a single connected, converged operating point (mean voltage 0.996 pu, 2.3% losses, no undervoltage buses), structurally consistent with the independent public KPG-193 model. The dataset, maps, and tooling are released as a citable platform for power flow, planning, and decarbonization studies.
Floating solar photovoltaic (FSPV) systems provide a land-efficient pathway to expand clean electricity access in energy-poor regions. South America has among the highest global FSPV potential (approx 38.26 TWh per million acres of water surface), yet deployment remains limited. This study presents a techno-socio-economic framework to assess FSPV for energy access, water security, and grid flexibility, with case studies in Nicaragua, Honduras, and Guyana. Estimated yields for 50 to 398 MW systems exceed 1,500 to 2,000 kWh per kW annually with capacity factors above 20 percent. At El Cajon, FSPV could significantly reduce emissions relative to fossil generation. Results show competitive costs with land-based PV when accounting for avoided land use, shared hydropower infrastructure, and water benefits. The framework also highlights co-location with hydropower and AI data centers, offering a scalable model for deployment in underserved regions.
The simultaneous solution of switched differential-algebraic equations (DAEs) in power system transient simulation may suffer convergence loss following discontinuous events. This difficulty is typically interpreted as a poor post-event initialization problem. This letter presents a geometric framework that explains the underlying convergence mechanism and clarifies why standard convergence-restoration methods may fail at discontinuities. Based on this interpretation, a homotopy-continuation based globalized re-initialization scheme is developed to restore convergence. The proposed method is validated through numerical simulations of representative discontinuities in power system transient simulation. Results show that in the cases where direct post-event solution fails, the proposed scheme can reliably recover convergence.
Affine frequency division multiplexing (AFDM) exhibits excellent Doppler robustness and the ability to characterize doubly selective channels. However, its signal dispersion characteristics make it challenging to directly adopt traditional time-frequency multiple access schemes. To address this issue, we introduce cooperative rate splitting multiple access (RSMA) for AFDM systems. The flexible configuration of AFDM chirp parameters can reduce the correlation between users' equivalent channels, which decreases the interference from RSMA private streams. We conduct a theoretical analysis of the cooperative RSMA-AFDM system and demonstrate that minimizing the overlap in the channel column spaces among users can effectively enhance the system performance. Guided by this analysis, we design a chirp parameter optimization scheme that reduces multi-user interference and maximizes diversity gain. To fully exploit the diversity gain brought by the proposed chirp parameter optimization, two expectation propagation (EP)-based distributed cooperative detection schemes are proposed. First, a decision-fusion-based method is developed, where local information and cooperative information are fused by maximum ratio combining, achieving a globally consistent estimate of the common stream. Second, we develop a belief-consensus EP-based detection scheme. In each iteration, user nodes exchange and fuse the first- and second-order statistics of the common stream, and the resulting beliefs gradually converge to a consistent global decision, which significantly improves the overall reliability.
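The fusion of first- and second-order statistics mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each node holds a scalar Gaussian belief (mean and variance) about the same common-stream symbol and fuses them by inverse-variance weighting, which is one standard way to combine such beliefs; it is not the paper's detector. The function name `fuse_gaussian_beliefs` and the numbers are hypothetical, and real-valued scalars are used for simplicity where a practical system would handle complex symbols.

```python
def fuse_gaussian_beliefs(means, variances):
    """Fuse independent scalar Gaussian beliefs by inverse-variance weighting.

    ASSUMPTION: this is a generic product-of-Gaussians rule used for
    illustration, not the belief-consensus EP scheme of the paper.
    """
    if len(means) != len(variances) or not means:
        raise ValueError("means and variances must be non-empty and equal length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    precisions = [1.0 / v for v in variances]  # second-order statistics
    total_precision = sum(precisions)
    fused_mean = sum(p * m for p, m in zip(precisions, means)) / total_precision
    fused_variance = 1.0 / total_precision
    return fused_mean, fused_variance


# Hypothetical beliefs: a local estimate and one received from a cooperating user.
fused_mean, fused_variance = fuse_gaussian_beliefs([1.0, 0.6], [0.5, 0.25])
```

The fused variance is never larger than the smallest input variance, which is the sense in which exchanging beliefs can improve reliability in this simplified setting.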
AI governance for medical imaging is formalizing: the 2026 ACR-SIIM Practice Parameter recommends local acceptance testing and ongoing drift monitoring, and the ACR Assess-AI registry monitors AI outputs using DICOM metadata for context. We argue that a necessary, currently unmonitored layer sits beneath output metrics: whether incoming studies remain within the acquisition envelope a model was validated on. Using a LUNA16-trained MONAI RetinaNet lung-nodule detector, we test whether acquisition state behaves as a structured, measurable variable. On real paired CT differing only in reconstruction kernel (NLST B30f vs B80f), kernel alone shifted AI-measured diameter and flipped a Fleischner size category in 5.2% (8 of 155) of nodules at fixed patient and acquisition, while detection confidence was unchanged (Wilcoxon p=0.22). Under controlled LIDC-IDRI perturbations the effects dissociated by axis: the noise axis degraded detection confidence (p=5.9e-32, concentrated in nodules under 6 mm) but not measurement, while the frequency/kernel axis corrupted measurement (p=8.6e-13) but not detection. A 4-feature pixel fingerprint recovered reconstruction identity (patient-level AUC about 0.95 on real CT, 0.995 on a QIBA phantom) where the ConvolutionKernel DICOM tag was uninformative (identical labels across reconstructions). The kernel axis transported across four manufacturers (leave-one-vendor-out AUC 0.94-0.98, matching the within-vendor ceiling). Acquisition state thus maps to distinct AI failure modes, frequency content to measurement reliability and noise to detection sensitivity, and is not recoverable from metadata. Acquisition-aware, input-side validation is the missing layer for the acceptance-testing and drift-monitoring requirements now entering imaging-AI accreditation.
This paper proposes a beamforming optimization scheme with joint antenna sub-array selection (SAS) and angular perturbation-based nulling (APN) for full-duplex (FD) massive multiple-input multiple-output (mMIMO) systems, to simultaneously suppress self-interference (SI) and multi-user interference (MUI). A comprehensive over-the-air SI channel measurement campaign, conducted with an 8x8Tx-8x8Rx FD array prototype, reveals significant variations across sub-arrays at different spatial locations, as well as reconfigurable characteristics of the SI channel under diverse Tx and Rx sub-array configurations. To exploit the selective SI channels, a particle swarm optimization (PSO)-based algorithm is developed to jointly determine optimal sub-array indices and perturbed steering angles, thereby effectively nullifying potential interference. Selecting sub-arrays with inherently lower SI channels notably enhances the beam-level isolation, while the added selection flexibility among comparable SI channels ensures more uniform SI suppression across diverse DL/UL locations and significantly improves worst-case isolation. Experimental evaluation based on the measured SI channel demonstrates that the proposed SAS technique achieves residual Tx-Rx beam-level SI suppression improvements of 29.2 dB and 26.6 dB for the sample 1x2 and 1x4 sub-arrays, respectively. A worst-case improvement greater than 30.7 dB is observed. Overall, the joint SAS and APN optimization scheme achieves average beam-level isolation of 85.2 dB and 83.3 dB with the 1x2 and 1x4 sub-arrays, respectively. With the application of a baseband precoder, all tested sub-array configurations achieve average MUI suppression better than -181.3 dB. These results confirm the potential of the proposed optimization algorithm to successfully reduce interference to the noise floor, thereby guaranteeing reliable FD mMIMO operation.
Radio maps provide the essential foundation for low-altitude networking systems. Unlike terrestrial radio maps that are typically generated via drive-test measurements, mapping the air-ground environment requires the deployment of unmanned aerial vehicles (UAVs). This shift introduces two formidable challenges in uncharted 3D scenarios. First, sparse radio measurements and incomplete geometric observations hinder accurate reconstruction. Second, the large 3D action space and strict power constraints from high spectrum scanner energy consumption make informative exploration difficult. To address these issues, this paper proposes 3D uncertainty aware radio active mapping (3D-URAM), a closed-loop active perception framework that decouples the mapping process into two offline-trained stages. In Stage I, a Bayesian UNet is developed to recover radio maps from sparse measurements and partial geometry while providing calibrated predictive uncertainty. In Stage II, a dynamic probabilistic roadmap and a transformer-based waypoint selection policy trained via proximal policy optimization maximize long-horizon uncertainty reduction under travel budgets. Experimental results demonstrate that 3D-URAM reduces reconstruction error by over 50% compared to representative baselines. Real-world field tests within a 300 m x 200 m x 100 m space also validate the potential of active radio map reconstruction.
Vehicle bunching is a major problem for transit operators. When vehicles bunch together, the lead vehicle will service the majority of passenger demand, leaving the following vehicles to operate below capacity, wasting fuel and money. Furthermore, after the last vehicle in the bunch passes, the time before the next vehicle's arrival (headway) will be large. Transit operators can combat bunching by holding buses at stops along a route, trading riding time for even headway times. While prior work has focused on developing holding policies to minimize average-case bunching, no work has focused on analyzing the longest and shortest possible headway times under a broad group of such policies. We assume that dwell times at stops and travel times between stops are bounded and develop a dynamic program that computes the maximum and minimum headway times for a single bus route with an arbitrary number of control points, vehicles, and holding policies. These bounds are tight in the sense that it is always possible to identify the specific sequence of events that leads to their occurrence. We use these bounds to investigate the effects of different holding policies, stop placement, and number of vehicles on route headways and worst-case bunching. Finally, we apply these analysis techniques to a real-world transit system in Nashville, TN and show their utility for transit planning.
In this paper, we propose leveraging rotatable antennas (RAs) to enhance near-field communication and sensing by exploiting a new orientation-domain spatial degree-of-freedom (DoF) provided by element-wise antenna rotation. Specifically, we investigate an RA-enabled near-field integrated sensing and communication (ISAC) system with sub-connected hybrid beamforming, where each transmit RA can independently adjust its boresight direction under a practical rotation constraint. A spherical-wave channel model incorporating orientation-dependent antenna gains is established to characterize multi-user communication and target sensing in the presence of clutters. Based on this model, a weighted communication-sensing utility maximization problem is formulated by jointly optimizing the receive beamformer, digital beamformer, analog beamformer, and RA boresight directions. To solve the resulting non-convex problem, an alternating optimization algorithm is developed by combining fractional programming, Riemannian optimization, and a spherical-cap Frank--Wolfe-based boresight update. To further understand the impact of RA rotation on near-field sensing, we derive a closed-form root Cramer--Rao bound (RCRB) expression. Simulation results demonstrate the convergence and effectiveness of the proposed algorithm. It is shown that the RA-enabled hybrid design can match or even outperform the fully-digital FPA benchmark in some regimes, indicating that the orientation-domain DoF introduced by element-wise rotation can compensate for limited RF chains. The RCRB and beampattern results further show that RA rotation improves off-broadside sensing accuracy, enhances range-domain focusing, and suppresses same-angle clutters in the near field.
Driven by the massive video transmission requirements in the Internet of Everything, semantic communication holds great promise for striking a balance between transmission efficiency and quality. This paper introduces a large-model-driven generative video semantic communication (LGVSC) framework, enabling efficient video semantic transmission under extremely low bandwidth conditions. First, by decoupling the encoder and decoder as well as exposing explicit intermediate semantic representations, LGVSC maintains interpretability, avoiding the black-box behavior commonly observed in end-to-end systems. Next, we introduce a new metric, i.e., the probability-based semantic similarity score (PSSS), which quantifies semantic similarity for complex modalities within a continuous range, allowing for more precise evaluation of semantic content. Building on PSSS, we propose a semantic-guided keyframe extraction module driven by a multimodal large model. This module can enhance fine-grained semantic consistency during keyframe selection at the transmitter, optimizing transmission bandwidth without compromising semantic fidelity. Additionally, we design a generative large-model-driven dynamic semantic-adaptive decoder at the receiver, which can adapt to videos of arbitrary lengths. Simulation results demonstrate that LGVSC significantly outperforms traditional schemes, achieving a channel bandwidth ratio on the order of 10^-4 to 10^-3, while maintaining strong zero-shot generalization across downstream tasks.
This paper studies output regulation for a class of unknown continuous-time nonlinear systems driven by almost periodic exosignals. The plant dynamics are assumed to be linearly parameterized over a prescribed nonlinear dictionary, while all coefficient matrices in the plant, input channel, output map, and exosignal channel are unknown. Since the plant model is unavailable, exact nonlinear output regulation would generally require model identification followed by the solution of nonlinear regulator equations. To avoid these steps, we pursue a frequency-selective regulation objective: the steady-state regulation error is allowed to be almost periodic, but its Fourier-Bohr coefficients at prescribed exosystem frequencies are guaranteed to vanish, and the residual error energy is explicitly bounded. To this end, a p-copy internal model is embedded in a dynamic controller, yielding an augmented nonlinear system whose unknown constant matrices are represented directly by measured data. A noise-robust semidefinite program is derived to synthesize the controller gain without model identification and without measuring the exosignal amplitudes or phases. The resulting closed-loop vector field is made exponentially contractive on a prescribed operating set, which implies the existence and uniqueness of a bounded and attracting trajectory. By combining contraction theory with Fourier-Bohr analysis, we prove that this steady-state trajectory is almost periodic, that the embedded-frequency components of the regulation error are eliminated, and that the unmodeled spectral components satisfy a Parseval-type time-averaged energy bound. Numerical and physics-based simulations on a quadrotor with a cable-suspended payload illustrate the effectiveness of the proposed data-driven internal-model design.
Gears are an integral component of electromechanical applications, but accurate condition monitoring methods, including data-driven predictive maintenance, are strongly dependent on high-quality data, especially from faulty components. To address the scarcity of data, we proposed a multibody simulation using an advanced polygonal contact method to replicate torsional vibrations from an experimental azimuth thruster test rig. The key novelty is the ability to simulate both healthy and faulty gears with arbitrary fault geometries. The simulated signals closely matched the measurements in both time and frequency domains. In the time domain, average torque levels and periodic fluctuations aligned well, although measured signals exhibited higher peak-to-peak amplitudes and greater noise, particularly in healthy conditions at lower rotational speeds. In the frequency domain, the simulations accurately reproduced expected fault frequencies and corresponding sidebands, with larger faults producing higher amplitudes. While the simulations tended to overestimate peak amplitudes and underestimate external noise, the results were highly comparable to measurements and consistent with the physical expectations. These findings provide a robust foundation for enhancing data-driven condition monitoring methods, particularly those employing machine learning or deep learning.
Controlling the power output of a wind farm to track a target signal can be useful for power grid frequency regulation. It can be achieved by dividing the target into individual setpoints, which are then followed by each turbine's controller. In this article, we are interested in finding power allocations that fairly spread the power reserves (i.e., the unused fraction of available power) among turbines, helping with robustness to uncertainties and changing wind conditions. In particular, we study the fairness properties of proportional dispatch, which is the most common power dispatching method. We show that, due to the wake effects in a wind farm, proportional dispatch has to be applied iteratively to achieve a fair distribution of power reserves. We study the convergence of this iterative process (referred to as IPD) to equalized reserves, and then illustrate it in simulated experiments using steady-state and dynamic simulators. Numerical results show that IPD closely approaches max-min fairness, a related fairness objective, at a low computational cost compared to black-box optimization. Finally, IPD is also shown to reduce the complexity of the problem of fair power dispatch combined with yaw wake steering optimization.
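To make the idea of iterating proportional dispatch concrete, here is a toy sketch. It assumes proportional dispatch means splitting the farm target in proportion to each turbine's available power, and it uses a deliberately crude two-turbine wake model (`toy_available`, with free-stream powers and a coupling factor `k`) that is entirely hypothetical. It is meant only to show why one pass may not equalize reserves when available power depends on the setpoints, and is not the IPD algorithm or the simulators used in the paper.

```python
def proportional_dispatch(target, available):
    """Split a farm-level target among turbines in proportion to available power."""
    total = sum(available)
    if not 0.0 <= target <= total:
        raise ValueError("target must lie between 0 and the total available power")
    return [target * a / total for a in available]


def toy_available(setpoints, free_power=(2.0, 2.0), k=0.3):
    """HYPOTHETICAL wake model: turbine 0 is upstream and unaffected; turbine 1
    loses a fraction k * (upstream utilisation) of its free-stream power."""
    upstream_utilisation = setpoints[0] / free_power[0]
    return [free_power[0], free_power[1] * (1.0 - k * upstream_utilisation)]


def reserves(setpoints, available):
    """Reserve = unused fraction of each turbine's available power."""
    return [1.0 - s / a for s, a in zip(setpoints, available)]


def iterated_dispatch(target, n_iter=50):
    """Re-apply proportional dispatch, updating available power each time."""
    setpoints = [0.0, 0.0]
    for _ in range(n_iter):
        available = toy_available(setpoints)
        setpoints = proportional_dispatch(target, available)
    return setpoints, reserves(setpoints, toy_available(setpoints))


# One pass: dispatch against wake-free available power, then evaluate with wakes.
one_pass = proportional_dispatch(2.4, toy_available([0.0, 0.0]))
one_pass_reserves = reserves(one_pass, toy_available(one_pass))

# Many passes: reserves settle to a common value in this toy model.
setpoints, final_reserves = iterated_dispatch(2.4)
```

In this toy setting the single pass leaves the downstream turbine with a smaller reserve than the upstream one, while repeated passes bring both reserves to the same value. Whether and how fast such an iteration converges in a real farm depends on the wake model, which is the question the abstract addresses.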
Scanning Photothermal Radiometry (SPR) is an active thermal technique that is non-destructive and contactless, and that offers temporal resolutions on the order of nanoseconds and spatial resolutions down to the sub-micrometre scale at different depths. Because this scanning method can be time-consuming, this work shows that the number of measurements can be reduced by a factor of 6 when using SPR on a sample consisting of carbon fibres in an aluminium matrix. The approach applies irregular sampling to sparse signals and uses a weighted random technique to further decrease the number of samples needed.
This paper studies feedback Nash equilibria (FNE) in scalar discounted linear quadratic (LQ) games with $N$ players. By explicitly incorporating the discount factor, we show that finite-cost equilibria may fail to stabilize the original system, motivating a distinction between FNE and stable FNE together with a sufficient stability condition. Based on a parametric characterization of the policies, we propose numerical methods for computing all equilibria. Particular attention is devoted to the symmetric game, where a closed-form expression of the symmetric FNE and conditions for the existence of up to $M\leq2^N-2$ equilibria are derived. Numerical experiments illustrate how equilibrium multiplicity depends on the game configuration and highlight the emergence of finite-cost non-stabilizing equilibria.
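The gap between finite discounted cost and stability noted in the abstract above can be seen in a small numerical sketch. It assumes a discrete-time scalar closed loop x_{k+1} = a_cl * x_k under fixed linear feedback and a discounted quadratic cost with discount factor gamma; the abstract does not state the time domain, so this setting, the function names, and the numbers are all assumptions. The sketch does not compute any equilibrium.

```python
import math


def discounted_cost(a_cl, weight, gamma, x0=1.0):
    """Closed form of sum_{k>=0} gamma**k * weight * x_k**2 with x_{k+1} = a_cl * x_k.

    The geometric series converges only when gamma * a_cl**2 < 1.
    """
    ratio = gamma * a_cl ** 2
    if ratio >= 1.0:
        return float("inf")
    return weight * x0 ** 2 / (1.0 - ratio)


def is_stable(a_cl):
    """Discrete-time scalar stability: the state decays iff |a_cl| < 1."""
    return abs(a_cl) < 1.0


gamma = 0.8   # hypothetical discount factor
a_cl = 1.05   # hypothetical closed-loop gain, slightly unstable
cost = discounted_cost(a_cl, weight=1.0, gamma=gamma)
```

Under these assumptions, any closed-loop gain with 1 <= |a_cl| < 1/sqrt(gamma) gives a finite discounted cost even though the state grows without bound, because the discounting outpaces the growth of the squared state.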
6G is expected to bring unprecedented advancements in the capabilities of vehicular networks. However, the advent of 6G will also introduce changes in the operation of vehicular communication infrastructures such as roadside units (RSUs), including the incorporation of the autonomous intent-based network paradigm and integrated sensing and communication (ISAC) capabilities. While ISAC enables sensing and communication within a single 6G network node, the intent-based network design paradigm ensures that network nodes such as RSUs act as autonomous cognitive agents to fulfill the objectives of their respective communication service providers. This paradigm shift necessitates the development of V2I communication strategies that learn and adapt to the sensing-assisted communication and the autonomous decision-making strategies of RSUs. We model the RSU as a constrained utility maximizer, where the utility function characterizes the RSU intent, and formulate an inverse learning (IL) problem to infer the underlying utility function from observed ISAC RSU actions, for example the adaptive beamwidth allocation in response to the kinematic states of vehicles within a vehicular micro-cloud (VMC). The main contributions of this paper are: (i) ATIL, a nonparametric method based on Afriat's theorem for fixed utility learning; (ii) FICNNIL, a parametric approach using fully input-concave neural networks for structured fixed utility learning; (iii) PICNNIL, a parametric approach based on partially input-concave neural networks for inverse learning of state-dependent utilities; and (iv) the federated inverse learning algorithms FedFICNNIL and FedPICNNIL for fixed and state-dependent utilities, respectively. We demonstrate the proposed IL-based framework for two V2I communication applications in VMCs, namely predictive scheduling for cooperative data downloading and dynamic cluster-head selection.
Multi-talker speech recognition is often addressed by combining automatic speech recognition (ASR) and speaker diarization in a pipeline system. Recently, LLM-based approaches have shown promise by jointly modeling semantic and speaker information, but they typically require large-scale multi-talker corpora that are costly to annotate. In this paper, we investigate how to efficiently train an LLM-based system with limited real-recorded data while maintaining high accuracy in speaker attribution. We propose several strategies: (1) a dual-encoder architecture to extract semantic and speaker features, (2) a feature interleaving format to merge these features as the inputs to the LLM, (3) a length-aware speaker ID loss to enhance diarization capability, and (4) an adaptive threshold strategy for ASR loss computation to mitigate hallucinations caused by speech overlaps. These strategies balance training between ASR and diarization tasks. Our system outperforms open-source baseline approaches, achieving relative improvements of 18% on the AliMeeting corpus and 24% on the Aishell4 corpus.
Training neural networks (NNs) for speech enhancement (SE) in distant speech-capturing scenarios requires paired distorted and clean reference speech signals. While such data are often generated through simulation, the mismatch between simulated and real recordings significantly limits SE accuracy. To address this issue, we propose Close-to-Distant microphone Projection (C2D projection), a method that generates paired data from real recordings captured by close and distant microphones. C2D projection estimates an optimal projection matrix that transforms close-microphone inputs into clean reference signals aligned with distant-microphone recordings, while simultaneously performing denoising. We show this projection can be effectively realized using a variant of the Parametric Multichannel Wiener Filter (PMWF). Experimental results demonstrate that an NN trained with C2D-projected data outperforms the state-of-the-art Guided Source Separation (GSS) on the challenging CHiME6 dinner party ASR task under oracle diarization, when using the enhanced output from GSS as an auxiliary input to the NN.
Variational autoencoder-based neural video coding has demonstrated impressive rate-distortion performance. However, its adoption in real-world applications remains hindered by challenges such as prohibitively high computational complexity and limited cross-platform interoperability. These issues are often overlooked, as most neural video codecs rely on floating-point arithmetic to fully explore their rate-distortion potential. Practical deployment, however, requires integer-based implementations. Converting floating-point implementations into integer-based networks is non-trivial, since it involves quantizing inter-dependent coding components, whose sensitivity to precision may vary across codec designs. This paper introduces a Jointly-Optimized Mixed-Precision (JOMP) framework, in which both quantization parameters and bit widths are treated as learnable variables during training. This enables different codec modules to operate at varying precision levels, thereby jointly optimizing the rate-distortion-complexity trade-off. To the best of our knowledge, JOMP is the first mixed-precision quantization framework for neural video codecs. Its effectiveness is validated through a systematic investigation of quantization across different coding frameworks and temporal buffering strategies. Our study marks the first attempt at a unified understanding of the combined effects of modern coding frameworks and temporal buffering strategies, with the aim of informing future development of neural video codecs from a practicality perspective. In addition, we develop a complete integerization pipeline to achieve deterministic decoding. Overall, when applied to our best-performing model, JOMP enables end-to-end mixed-precision learning for integer neural video codecs, achieving rate-distortion performance comparable to that of the state-of-the-art DCVC-FM while reducing bit operations by 87.6%.
Underactuated spacecraft face controllability limitations and heightened sensitivity to environmental disturbances, complicating attitude maneuvering and stabilization. Due to the lack of control authority along the underactuated axis, conventional controllers cannot directly stabilize all attitude components and therefore require reference planning strategies. Furthermore, MPC approaches remain sensitive to inertia uncertainty and unmodeled dynamic couplings, resulting in degraded tracking performance under mismatch. To address these issues, we consider a hierarchical architecture integrating three layers: (i) a nonlinear model predictive controller (NMPC) for constraint and underactuation-aware maneuver planning and nominal closed-loop stability under actuator limits; (ii) a physics-informed neural network (PINN) trained offline on simulation data to estimate residual disturbance torques, with loss terms that enforce consistency with rigid-body rotational dynamics; (iii) a Lyapunov-based supervisory safety mechanism that evaluates the learned correction online and bounds or suppresses its influence to preserve the stability properties of the baseline controller. The architecture is evaluated in a high-fidelity simulation environment modelling reaction wheel dynamics, actuator saturation, and environmental disturbances. Monte Carlo studies show statistically significant reductions in steady-state attitude error relative to standalone NMPC while maintaining robust behavior under uncertainty. The supervisory layer ensures graceful degradation to purely model-based control when the learning-based augmentation is unreliable.
Chopper-stabilized amplifiers are widely used to achieve low offset and to reject flicker noise. One of their main limitations is the low input impedance (Zin) produced by the switched-capacitor input network. Zin here is resistive due to the switched-capacitor action and is inversely proportional to the product of the chopping frequency (Fch) and the input capacitance (Ci). Since Fch should be greater than the flicker noise corner frequency, this results in a low Zin. When interfacing with sensors that have a high output impedance (Zo), chopper-stabilized amplifiers load the sensors, resulting in reduced sensitivity. This paper presents a novel input impedance boosting technique, the differential capacitor flipping technique, for the chopper-based Capacitively Coupled Instrumentation Amplifier (CCIA). The technique prevents the discharge and recharge of the Ci's in every cycle by reconfiguring the capacitor positions while preserving the chopping operation. This ideally results in a purely capacitive Zin that is independent of Fch. The proposed architecture is used to demonstrate electrocardiogram (ECG) signal acquisition with dry electrodes that have Zo on the order of a few megaohms. The circuit, implemented in the TSMC 65 nm CMOS technology node, features a Zin of 21 GΩ at DC. It consumes 2.6 µW (2.8 µW including clock generation circuits), with 7.2 µVrms (1 Hz-150 Hz) of total integrated input-referred noise.
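The loading effect described in the abstract above can be illustrated numerically. The sketch below assumes the common switched-capacitor approximation Zin ≈ 1/(2·Fch·Ci); the factor of 2, the 10 kHz chopping frequency, the 10 pF input capacitor, and the 2 MΩ electrode impedance are illustrative assumptions and are not taken from the abstract. Only the 21 GΩ figure comes from the text.

```python
def chopper_zin(f_ch_hz, c_in_farad):
    """Resistive input impedance of a chopped capacitive input.

    Uses the assumed approximation Zin = 1 / (2 * Fch * Ci).
    """
    return 1.0 / (2.0 * f_ch_hz * c_in_farad)


def sensor_attenuation(z_in_ohm, z_o_ohm):
    """Fraction of the sensor voltage reaching the amplifier (resistive divider)."""
    return z_in_ohm / (z_in_ohm + z_o_ohm)


# Hypothetical operating point: 10 kHz chopping, 10 pF input capacitor.
z_in_plain = chopper_zin(10e3, 10e-12)

# Hypothetical dry-electrode impedance of 2 megaohms.
z_o = 2e6
loss_plain = sensor_attenuation(z_in_plain, z_o)
loss_boosted = sensor_attenuation(21e9, z_o)  # 21 GOhm is the figure reported above

print(f"Zin without boosting: {z_in_plain / 1e6:.1f} MOhm")
print(f"signal retained without boosting: {loss_plain:.3f}")
print(f"signal retained with 21 GOhm Zin: {loss_boosted:.5f}")
```

Under these assumed values the unboosted input retains only about 71% of the electrode signal, which is the sensitivity loss that motivates impedance boosting.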
We describe faust2clap, a framework establishing the first officially maintained compilation pathway from Faust DSP specifications to the CLAP format. The system operates in two different modes. A static mode employs ahead-of-time compilation to yield native binaries of optimal efficiency, while a dynamic mode uses runtime interpretation to permit DSP code modification without interrupting the host application. This latter capability addresses a persistent friction in audio software development, namely the cumulative overhead of the edit, compile, and reload cycle. We detail the algorithmic machinery underlying both modes, focusing specifically on the problem of parameter identity. To preserve both parameter values and their bindings to host automation across structural DSP mutations, we introduce an address-based identity matching algorithm and a stable slot allocation scheme. The implementation, comprising approximately 2,400 lines of C++ architecture and Python tooling code, has been integrated into the main Faust distribution.
As energy prices surge for the second time in recent years, driven by the ongoing crisis in the Middle East, the European Union's continuing reliance on fossil energy imports is becoming increasingly apparent. However, despite offering an intriguing prospect of improved energy resilience, the ramp-up of local green hydrogen production lags far behind the officially stated ambitions set after the 2022 energy crisis. A prominent reason for the widening implementation gap between announced and realised production projects is overly strict rules on renewable power sourcing, prompting Member States' ministries and the European Commission to propose advancing a planned rules review from 2028 to 2026. To contribute to a successful review and rule adjustments, we address an important gap in understanding the effects of power purchase rules on green hydrogen production. By taking the perspective of European electrolyser operators, we show how the criterion of additionality and its interaction with required temporal correlation can jeopardise the fulfilment of green hydrogen offtake agreements and affect green hydrogen production costs across different European bidding zones. Applying different design paradigms to a green hydrogen production system reveals that electrolyser operator measures, such as PPA and storage upsizing, can help to mitigate the business risks posed by the additionality criterion but come with increased costs. Alternatively, relaxed temporal correlation and increased offtake flexibility both increase production system robustness and reduce production costs simultaneously. Moreover, relaxing temporal correlation rules does not cause emission intensity thresholds to be exceeded, underlining the potential of extended transitional rules to support the ramp-up of European green hydrogen production.
The transition to heavy-duty battery electric vehicles requires an efficient and cost-effective deployment of the charging infrastructure, particularly when multiple operators share resources. This paper presents a multi-phase optimization framework for the joint planning of charging stations in a shared network, using high-resolution empirical truck trajectory data from two freight companies with distinct operational characteristics. The model is formulated to minimize the total number of charging stations while ensuring that the predefined electrification targets are met over successive expansion stages. The analysis captures heterogeneity in fleet usage, with one company operating a spatially concentrated network with shorter and more consistent routes, and the other exhibiting more dispersed operations with longer and more variable driving patterns. The results show that early-stage infrastructure deployment primarily supports fleets with concentrated operations, while later expansion phases are essential to accommodate long-haul and geographically dispersed transport demand. Furthermore, shared infrastructure not only enables reductions in redundant investments, but also introduces dependencies where certain fleets rely heavily on the full network to sustain electrified operations. In general, the findings highlight the importance of coordinated and data-driven infrastructure planning, and demonstrate that fleet-specific characteristics strongly influence both infrastructure requirements and electrification outcomes. The proposed framework provides practical insights on how collaborative and phased deployment strategies can enhance the scalability and efficiency of freight transport electrification.
Line-commutated converter high-voltage direct current (LCC-HVDC) has proven to be a reliable technology for bulk power transmission over long distances. However, the growing penetration of converter interfaced generation (CIG) is resulting in weaker AC grids, rendering the operation of LCC-HVDC systems vulnerable and posing a serious challenge to their stability. Grid-forming (GFM) controlled voltage source converters (VSCs) have been shown to have a stabilizing impact in weak grid conditions. However, the impact of GFM controlled VSCs (GFM-VSCs) on the stability of LCC-HVDC in weak grid conditions has not been studied in depth in the literature. In this paper, a simplified model of LCC-HVDC is proposed and validated. Then a small-signal state-space model of a system consisting of the aforementioned LCC-HVDC, a GFM-VSC, and an infinite grid is developed to study the interactions between the different components. The small-signal stability analysis shows the stabilizing effect of the GFM-VSC on the LCC-HVDC link in weak grid conditions. Furthermore, the study on the sizing of the GFM power converter reveals that, in the test system analyzed in this study, even a modest share of GFM power converter capacity relative to the total nominal apparent power (the sum of the nominal power of the LCC-HVDC and the nominal apparent power of the GFM-VSC) is sufficient to ensure the stability of the system. This work focuses only on small-signal stability, but it is important to highlight that other stability phenomena should also be taken into account when selecting the final size of the GFM-VSC.
Practical implementations of phased arrays suffer from per-antenna gain, phase, and delay mismatches, which can significantly worsen the maximum sidelobe level (SLL) of beampatterns. The existing literature either analyzes specific structured mismatch patterns or derives per-angle marginal statistics under random mismatches, which fail to characterize global beampattern metrics such as the maximum SLL. To address this limitation, we propose a frequency-domain framework in which the beampattern is described by a tapering-window-dependent base function evaluated along a deformation determined by the array architecture and signal bandwidth. This formulation enables a spectral analysis of mismatches, revealing that element-wise errors generate weighted replicas of the ideal beampattern whose amplitudes are given by the discrete Fourier transform of the mismatch sequence. Building on this insight, we derive an approximation of the maximum SLL distribution under random gain and phase mismatches. The resulting expressions enable yield-oriented design and rapid design-space exploration without relying on computationally intensive Monte-Carlo simulations.
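The replica interpretation in the abstract above can be checked numerically in a simplified setting. The sketch below assumes a narrowband uniform linear array with normalized spatial frequency u, a Hamming taper, and small complex per-element errors; these choices and all function names are illustrative assumptions, not the paper's frequency-domain framework. It verifies that the mismatched beampattern equals the ideal one plus shifted copies weighted by the DFT of the error sequence.

```python
import cmath
import math
import random

N = 16  # hypothetical number of array elements
rng = random.Random(0)

# Hamming taper (assumed window) and small complex per-element errors.
taper = [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]
err = [complex(rng.gauss(0, 0.05), rng.gauss(0, 0.05)) for _ in range(N)]


def pattern(weights, u):
    """Narrowband array factor at normalized spatial frequency u."""
    return sum(w * cmath.exp(2j * math.pi * n * u) for n, w in enumerate(weights))


def dft(seq):
    """Plain O(N^2) discrete Fourier transform."""
    size = len(seq)
    return [
        sum(s * cmath.exp(-2j * math.pi * k * n / size) for n, s in enumerate(seq))
        for k in range(size)
    ]


E = dft(err)


def direct(u):
    """Beampattern computed directly from the mismatched weights w_n * (1 + e_n)."""
    return pattern([w * (1 + e) for w, e in zip(taper, err)], u)


def replica_sum(u):
    """Ideal beampattern plus N shifted replicas weighted by E[k] / N."""
    replicas = sum(E[k] * pattern(taper, u + k / N) for k in range(N)) / N
    return pattern(taper, u) + replicas


gap = max(abs(direct(u) - replica_sum(u)) for u in (0.0, 0.1, 0.37))
print(f"largest difference between the two forms: {gap:.2e}")
```

The agreement follows from writing each error as an inverse DFT, so every DFT bin k contributes one copy of the ideal pattern shifted by k/N in spatial frequency.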
Affine frequency division multiplexing (AFDM) has emerged as a compelling waveform candidate for future wireless networks, owing to its strong resilience to doubly selective channels and its ability to enable the seamless integration of communication and sensing functionalities. Against this backdrop, this article provides a systematic study of AFDM from a standardization perspective. We first introduce the principles of AFDM and discuss the major considerations involved in waveform standardization. We then examine the backwards compatibility of AFDM with 4G/5G multi-numerology frameworks and their anticipated evolution, frequency-modulated continuous-wave (FMCW) radar waveforms, and long-range (LoRa) modulation, demonstrating that AFDM can be incorporated into legacy processing chains with limited modification. Key standardization-critical capabilities are further discussed, including multiple-antenna and multi-user support, and peak-to-average power ratio (PAPR). Finally, we investigate the potential of AFDM in several emerging scenarios, including non-terrestrial networks (NTN), integrated sensing and communications (ISAC), vehicle-to-everything (V2X), and underwater acoustic (UWA) communications, in which severe delay-Doppler dispersion places stringent demands on waveform robustness. Through these explorations, it is shown that AFDM represents a timely and compelling technology for future wireless networks.
Low-resolution data converters can significantly reduce the power consumption and silicon area of all-digital massive multi-user (MU) multiple-input multiple-output (MIMO) basestations. However, the existing literature almost exclusively focuses on idealistic quantization models, neglecting the inherent non-idealities present in real-world analog-to-digital converter (ADC) implementations. To overcome this limitation, we propose two affine models, one based on Bussgang's decomposition and one that maximizes the signal-to-distortion ratio (SDR), both accounting for the most prominent non-idealities in successive approximation register (SAR) ADCs. Subsequently, we utilize these models to devise low-complexity methods that mitigate SAR-ADC non-idealities in massive MU-MIMO wireless systems.
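The idea of an affine model of a converter, mentioned in the abstract above, can be sketched with a least-squares affine fit in the spirit of Bussgang's decomposition. The quantizer below (3 bits, clipping at ±3, constant offset error) is a hypothetical stand-in and not the SAR-ADC model of the paper; the function names are likewise illustrative. The fit yields a gain and a bias such that the residual distortion is uncorrelated with the input.

```python
import random


def quantize(x, bits=3, full_scale=3.0, offset=0.1):
    """Hypothetical uniform mid-rise quantizer with clipping and a constant offset error."""
    levels = 2 ** bits
    step = 2.0 * full_scale / levels
    idx = int((x + offset + full_scale) // step)
    idx = min(max(idx, 0), levels - 1)  # clip to the valid code range
    return -full_scale + (idx + 0.5) * step


def affine_fit(xs, ys):
    """Least-squares gain and bias such that y is approximated by gain * x + bias."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var = sum((x - mean_x) ** 2 for x in xs) / n
    gain = cov / var
    return gain, mean_y - gain * mean_x


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(20000)]  # Gaussian input samples
ys = [quantize(x) for x in xs]

gain, bias = affine_fit(xs, ys)
distortion = [y - (gain * x + bias) for x, y in zip(xs, ys)]

# By construction of the least-squares fit, the distortion is orthogonal to the input.
corr = sum(x * d for x, d in zip(xs, distortion)) / len(xs)
print(f"gain={gain:.3f} bias={bias:.3f} input-distortion correlation={corr:.1e}")
```

The bias term is what makes the model affine rather than linear: it absorbs the constant offset that a purely linear gain cannot represent.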
We consider stochastic model predictive control (MPC) for constrained linear systems subject to multiplicative binary input uncertainty, motivated by applications such as networked control with packet losses and intermittent actuation. A common approach in this setting replaces the stochastic dynamics with their expectation, yielding tractable formulations that admit standard terminal ingredients and stability guarantees in expectation. We show that such formulations can exhibit structural properties that differ fundamentally from those of deterministic MPC and may be misleading as indicators of realized closed-loop behaviour. In particular, the expected value function is not necessarily monotonic in the prediction horizon, and value function-based inner approximations of the region of attraction may deteriorate as the horizon increases. Furthermore, we establish a probabilistic comparison with certainty-equivalent (optimistic) MPC, showing that the latter can ensure a strictly positive probability of recursive feasibility in situations where stochastic MPC certifies feasibility but fails with probability one. These results highlight inherent limitations of expectation-based stochastic MPC for systems with multiplicative binary uncertainty and motivate a re-examination of how stochasticity is incorporated into constrained predictive control design for such systems.
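A toy example can show how expected dynamics may differ from realized behaviour under multiplicative binary input uncertainty, as the abstract above cautions. The sketch assumes a scalar system x+ = A·x + θ·B·u with θ drawn as a Bernoulli variable of success probability P, and a linear feedback chosen so that the expected model reaches the origin in one step. The plant values and gain are hypothetical, and this is not the paper's MPC formulation.

```python
import random

A, B, P = 1.2, 1.0, 0.5  # hypothetical unstable scalar plant and actuation probability
K = A / (P * B)          # feedback gain that makes the expected model deadbeat


def expected_step(x):
    """Propagate the expectation-based model: x+ = A x + P B u with u = -K x."""
    return A * x + P * B * (-K * x)


def realized_step(x, rng):
    """Propagate one realization: the input is applied only when theta = 1."""
    theta = 1 if rng.random() < P else 0
    return A * x + theta * B * (-K * x)


rng = random.Random(1)
x_mean, x_real = 1.0, 1.0
for _ in range(10):
    x_mean = expected_step(x_mean)
    x_real = realized_step(x_real, rng)

print(f"expected-model state after 10 steps: {x_mean}")
print(f"realized |state| after 10 steps:     {abs(x_real):.3f}")
```

In this example the expected model predicts convergence, yet every realization multiplies the state by either 1.2 or -1.2, so the realized magnitude grows regardless of the random draws.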
While low-latency interaction is critical for spoken dialogue, cascaded architectures are often bottlenecked by reactive turn-completion detection. We propose Endpoint Anticipation, shifting from reactive detection to proactive forecasting of end-of-turn signals. Our speech-based model anticipates endpoints up to 2.56 seconds in advance, enabling speculative execution of LLM and TTS pipelines on partial context. We introduce metrics to quantify the trade-off between realized latency reduction and computational redundancy. Evaluation across conversational and task-oriented datasets shows that our model consistently outperforms competitive VAP-based baselines. Integration with the Unmute framework demonstrates a 505 ms average latency reduction with a 28.4% increase in speculative computation, effectively masking sequential bottlenecks to enable complex reasoning in real-time speech-to-speech interaction.
This paper proposes a novel adaptive control framework that embeds nonlinear opinion dynamics within the dynamical sensorimotor layers of an automated vehicle governed by second-order nonholonomic bicycle kinematics. The framework enables an ego vehicle to perform adaptive decision-making and achieve safe motion control under interaction uncertainty with non-cooperative neighboring agents. We consider a representative case study in which an ego vehicle autonomously attempts to merge into a lane occupied by human-driven or automated vehicles whose intentions are unknown. Within the proposed framework, the ego vehicle adaptively selects and executes merging versus non-merging behaviors in response to changing environmental conditions. Formal safety guarantees, as well as equilibrium and stability analyses of the closed-loop system, are provided. Numerical simulations further demonstrate the effectiveness of the proposed approach.
We present a communication-aware task decomposition framework for multi-agent systems with collaborative relative configuration objectives specified in Signal Temporal Logic (STL), allowing for dynamic task reallocation under time-varying communication networks. Building on our prior work, the framework supports the direct use of existing feedback controllers for reactive task satisfaction. We address two key challenges: disjunctive STL specifications and time-varying communication networks. Disjunctive specifications are handled through a graph transition system that captures the alternative task sequences induced by logical OR operators. To address time-varying connectivity, we introduce a redistribution mechanism that transfers tasks from disconnected agents to connected ones as the network evolves while preserving decentralized execution. Simulations and experiments on a swarm of Crazyflie drones demonstrate scalability in the number of agents, communication connectivity, and specification complexity.
Knee rehabilitation exoskeletons must enforce a prescribed joint trajectory while remaining safely compliant with involuntary spasm and voluntary patient effort, objectives that are in tension for any fixed-gain impedance controller. We present an Impedance Model Predictive Control framework for knee rehabilitation exoskeletons, demonstrated on a series-elastic-actuator (SEA) platform: an algebraic feedforward reduces the knee dynamics to a constant-coefficient scalar double integrator, and a receding-horizon quadratic program (QP) computes corrective torques while enforcing hard range-of-motion, torque, and velocity limits (ISO 13482). A Kalman disturbance state driven by direct SEA-based torque sensing (the series-elastic spring deflection measured through the elastic element, which is an intrinsic, EMG-free patient-torque estimate rather than a separate load cell) gives a nominal offset-free guarantee and, via its sign and the desired-motion direction, sensorless Assist-as-Needed. The constant state matrix permits offline precomputation of the QP cost inverse, enabling 500 Hz operation with a multi-step horizon. Across seven-controller benchmarks (sinusoidal tracking, isometric hold), the 500 Hz Kalman MPC is offset-free (0.1 mrad RMS, 0.1 mrad steady-state, and 0.2 mrad peak under a 15 Nm spasm), versus a 515 mrad steady-state offset for classical impedance at the same stiffness, with the direct-measurement channel converging the estimate near-immediately (within a few sampling periods). Without the estimator, it realizes a classical impedance (4.8 mrad RMS, 8.3 mrad steady-state). All MPC variants meet the 87 mrad clinical criterion; no classical controller does. The architecture is formulated for the 20 DOF MyoSuite myoLeg via coupling-aware per-joint QPs.
The frequency bands between 7 and 24 GHz, also known as upper midband or Frequency Range (FR) 3, are being considered as an enabler of 6th Generation (6G) mobile networks. This portion of the spectrum exhibits different propagation characteristics compared to frequencies above 24 GHz, while also offering the potential to provide larger bandwidth allocations for mobile systems than those available in the sub-6 GHz range. 6G technology and spectrum policy, however, will need to guarantee coexistence with the incumbents that already use these frequency bands, which include a variety of services, from radiolocation to satellite-based communications, remote sensing, and radioastronomy. In this paper, we consider the challenge of coexistence between 6G terrestrial systems and satellite incumbents in different portions of the FR3 bands. Using a large-scale 3D model of a terrestrial deployment in the city of Boston and an open-source ray tracing solution, we evaluate the level of Radio Frequency Interference (RFI) that tens of terrestrial Next Generation Node Bs (gNBs) generate toward satellites at different elevation angles. Our model, based on realistic obstruction, clutter, diffraction, and reflections, shows that sidelobes and Non-Line-of-Sight (NLoS) paths can significantly contribute to RFI. Besides directionality, the spatial distribution of gNBs also plays a key role in defining the RFI levels, suggesting that a careful design and operation of terrestrial deployments can create coexistence opportunities.
Turn-taking in multi-party spoken conversations remains a fundamental challenge for voice-based agents, particularly under dynamic floor competition and varying user expectations. We propose ModeratorLM, a role-playing voice agent that conditions turn-taking behavior on an explicitly assigned role in multi-party settings. The system is built on a speech large language model operating in a chunk-wise streaming manner. We further introduce a reasoning-augmented variant that incorporates chain-of-thought reasoning over conversational context and the assigned role. We construct RolePlayConv, a large-scale synthetic dataset of spoken multi-party conversations with diverse assistant roles. Experiments on real-world meeting data and RolePlayConv show that turn-taking precision improves by over 40% and recall by more than 70%, while false-positive interruptions are substantially reduced compared to non-role-conditioned baselines.
In this paper, we investigate a secure integrated sensing and communication (ISAC) system in which multiple communication users (CUs) coexist with multiple untrusted sensing users (SUs) that may eavesdrop on the confidential information intended for the CUs. To promote security fairness among users, we formulate a max-min secrecy rate optimization problem subject to a transmit power budget and sensing quality requirements characterized by beampattern matching error constraints. The resulting design problem is highly non-convex due to the secrecy rate expressions and non-convex sensing constraints. To address these challenges, we first reformulate the problem using semidefinite relaxation (SDR). Based on the reformulated problem, we develop a branch-and-bound (BB) framework combined with convex relaxations to obtain the globally optimal solution within a prescribed accuracy. To further reduce computational complexity, we propose a low-complexity algorithm based on successive convex approximation (SCA), which iteratively solves a sequence of convex subproblems and converges to a local solution. Numerical results demonstrate that the proposed BB algorithm achieves the global optimum and provides a benchmark for performance evaluation. Moreover, the proposed SCA-based algorithm attains near-optimal secrecy performance with significantly lower computational complexity, making it attractive for practical ISAC deployments.
Aerial wildfire suppression requires not only predicting fire spread, but also designing effective intervention strategies under operational and environmental uncertainty. We present a modeling and optimization framework for aerial wildfire suppression that combines a hybrid neural-cellular automaton wildfire model with gradient-based design of targeted aerial drops. The wildfire model predicts spatially varying spread behavior from terrain, fuel, and wind data, while the intervention module determines binary drop actions with continuous-valued location and orientation parameters mapped to the simulation grid. Water and retardant are represented with distinct suppression effects, corresponding to immediate reduction of active burning and persistent reduction of future spread. To evaluate the robustness of the resulting suppression plans, we quantify both aleatoric uncertainty through Monte Carlo sampling of daily fire-state realizations and epistemic uncertainty through spatially correlated prediction-error perturbations. A case study based on the 2020 Bear Fire shows that the framework can generate coherent aerial suppression schedules for reducing total fire-affected area and can support uncertainty-aware analysis of wildfire intervention strategies.
Offline reinforcement learning allows control policies to be learned directly from data without online interaction, making it suitable for safety-critical tasks. Recent studies have applied diffusion models to offline reinforcement learning to leverage their strong capacity for modeling complex data distributions. However, existing approaches primarily focus on single-agent settings, leaving the safety challenges in multi-agent environments largely unexplored. In this work, we propose a safe offline multi-agent reinforcement learning algorithm that embeds neural individual control barrier functions into the diffusion model to enhance safety during trajectory generation, with control policies recovered through inverse dynamics. We evaluate our algorithm across diverse benchmarks, demonstrating substantial safety improvements while maintaining competitive rewards.
Rapidly expanding low Earth orbit satellite constellations are placing increasing demands on terrestrial ground networks, motivating the development of more efficient ground station network designs. Current approaches select sites from predefined locations, limiting optimization to existing infrastructure and constraining performance. In contrast, free-placement optimization operates over a continuous spatial domain on Earth, broadening the search space and allowing higher-throughput configurations at the cost of potentially requiring new infrastructure deployment. In this work, we introduce SCORE (Sequential Cyclic Optimization via Refinement & Evaluation), a two-stage free-placement method for ground station design. SCORE combines sequential coordinate selection with cyclic refinement to manage high-dimensionality, non-convexity, and local minima that challenge global optimizers. We benchmark SCORE against one-shot methods such as differential evolution (DE) and integer programming approaches using locations from Kongsberg Satellite Services and the World Teleport Association. Tests across two commercial Earth observation constellations (Capella Space and ICEYE) and one synthetic Walker-Star constellation show that SCORE requires up to 5x fewer function evaluations to converge relative to DE while improving downlink throughput by up to 13%. Compared to fixed-site methods, unconstrained SCORE achieves up to 15% greater total downlink, establishing a strong empirical performance benchmark for flexible placement; infrastructure-constrained SCORE retains over 92% of this gain while restricting placement to within proximity of existing fiber and power infrastructure. We also explore trade-offs between expanding existing stations and deploying new sites, informing future ground network design for operational constellations.
Federated learning (FL) enables collaborative model training without sharing raw patient data, but standard approaches such as FedAvg treat each client as a black box and provide no mechanism for isolating an adversarial contributor, auditing per-client influence, or honoring a departed participant's right to be forgotten. We present Fed-FBD (Federated Functional Block Diversification), a modular federated architecture that decomposes a ResNet backbone into six functional blocks (the stem, four residual groups, and the classification head) and maintains a warehouse of N color variants, each assembled from independently tracked and contributor-stamped blocks. Fed-FBD provides three capabilities absent in FedAvg: (i) architecturally guaranteed block-level isolation, so that an adversarial or mislabelled client cannot contaminate the clean colors; (ii) privacy-by-design, where membership inference advantage is already indistinguishable from chance before any privacy mechanism is applied; and (iii) surgical machine unlearning of a departed participant's contribution at sub-second cost and without retraining. Experiments on six MedMNIST-2D datasets, PathMNIST at 224x224, and CIFAR-10 show that Fed-FBD trades a modest 0.3%-3.1% IID accuracy gap on the adequately sized datasets for these guarantees, remains within 0.8%-4.0% of FedAvg at Dirichlet alpha=1.0 on three of four datasets, and confines all six adversarial attacks we study to the poisoned client's own blocks with at most +/-0.01 AUC drift on the clean colors.
Auto-regressive models have emerged as powerful tools for sequential data, from language to video. Understanding how and why these models learn latent representations remains an open theoretical question. In this work, we demonstrate that when trained by empirical risk minimization on data from partially observed linear dynamical systems, two-layer linear auto-regressive models naturally learn to approximate Kalman filtering. In particular, we show that the learned hidden representation coincides, up to a similarity transformation, with the state estimates produced by the optimal (Kalman) filter, even though the model has no explicit knowledge of the underlying dynamics or state. The result follows from three main insights. First, we establish that the Kalman filter is well approximated by an auto-regressive model with bounded truncation error. Second, we show that despite non-convexity, the two-layer optimization landscape is benign, i.e., all stationary points are either strict saddles or global minima. Finally, as our main contributions, we provide finite-sample guarantees on prediction error, parameter estimation error, and latent state recovery. Numerical simulations support the theoretical results and demonstrate that the latent representations of auto-regressive models recover state estimates.
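The first insight in the abstract above, that a Kalman filter is well approximated by a truncated auto-regressive model, can be illustrated on a scalar system. The sketch assumes hypothetical parameters (a = 0.9, c = 1, unit noise variances) and uses the steady-state one-step predictor; it is an illustration of the general principle, not the paper's construction. Unrolling the recursion xhat[t+1] = F·xhat[t] + L·y[t], with F = a - L·c, gives AR weights F^k·L, and truncating after a few lags leaves an error that scales with F raised to the number of lags.

```python
import random

a, c, q, r = 0.9, 1.0, 1.0, 1.0  # hypothetical scalar system and noise variances


def steady_state_gain(iters=200):
    """Iterate the scalar Riccati recursion and return the predictor gain L."""
    p = q
    for _ in range(iters):
        p = a * a * p - (a * p * c) ** 2 / (c * c * p + r) + q
    return a * p * c / (c * c * p + r)


L = steady_state_gain()
F = a - L * c  # closed-loop predictor pole, |F| < 1

# Simulate observations from the partially observed system.
rng = random.Random(0)
x, ys = 0.0, []
for _ in range(300):
    ys.append(c * x + rng.gauss(0.0, r ** 0.5))
    x = a * x + rng.gauss(0.0, q ** 0.5)

# Recursive steady-state Kalman predictor: xhat[t + 1] = F * xhat[t] + L * ys[t].
xhat = [0.0]
for y in ys:
    xhat.append(F * xhat[-1] + L * y)

# Truncated auto-regressive form with a fixed number of lags.
lags = 12
weights = [F ** k * L for k in range(lags)]


def ar_predict(t):
    """Predict the state at time t + 1 from ys[t], ..., ys[t - lags + 1]."""
    return sum(w * ys[t - k] for k, w in enumerate(weights))


max_gap = max(abs(ar_predict(t) - xhat[t + 1]) for t in range(lags, len(ys)))
print(f"L={L:.4f} F={F:.4f} max truncation gap={max_gap:.2e}")
```

Because F is strictly inside the unit circle, adding lags shrinks the truncation gap geometrically, which is why a finite-order AR model can track the filter closely.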
Radio-frequency (RF) fingerprinting systems must operate in open-world environments where signals from unknown transmitters and temporal drift introduce distribution shift at test time. Out-of-distribution (OOD) detection provides a natural framework for this problem, yet its application to RF fingerprinting (RFF) remains limited. A key barrier to adoption is that most OOD detectors require auxiliary OOD data for parameter tuning, an assumption that is difficult to satisfy in RF environments where representative OOD data is impractical to collect. In this work, we introduce a promising set of OOD detection methods from the machine learning literature to the open-set RFF domain. We present these methods within a unified mathematical framework based on information theory, which is a natural framework for communication systems. Our framework allows for the systematic analysis of existing methods and the development of new ones. We further demonstrate the applicability of recent work on tuning OOD detectors without given OOD tuning data to open-set RFF. We evaluate on the POWDER RF fingerprinting dataset, showing that detectors tuned without any given OOD data achieve performance comparable to baselines with access to true OOD tuning data and greatly outperform baseline approaches without access to true OOD tuning data, showcasing their practical viability for the RFF problem.
The resilience literature measures urban performance as recovery: the degree to which a city returns to its pre-shock baseline. This paper develops a stronger concept -- civic ascent -- as part of a broader research program on the ethology of coupled agent-environment systems, of which the city is the deepest available empirical instance. Civic ascent is defined as the condition in which a city emerges from shock with higher functional capacity than before. We develop a conceptual framework in the ethological tradition, treating the city as a coupled system of three slow state variables -- topos (physical structure), nomos (institutional structure), and hexis (civic judgment) -- together with a fast affective channel (delta) through which shocks to topos and nomos reach hexis. The framework distinguishes three structurally distinct pressures on civic systems: shocks (discontinuities in T or M), decay (continuous entropy), and leakage (active extraction of civic surplus into non-civic pools). The ascent condition is that reinforcement from cross-coupling of T, M, and H exceeds the combined loss from decay and leakage. Post-shock ascent is measured by a normalised improvement index A(T) applied to a composite civic performance signal P(t) constructed from scale-adjusted key performance indicators, distinguishing intrinsic civic ascent from demographically driven growth. New York City after September 11, 2001, is proposed as the primary empirical case; the operational measurement program is specified in the companion NYC Civic Data Map (Washburn 2026c, 133 KPIs) and executed in Paper 2. The reader for whom only the urban contribution is of interest will find it complete in itself; the reader interested in the larger program will find this paper its formal core.
Global Business Services (GBS) have emerged as a "living laboratory" for the Twin Transition of Green and Digital Transformation, as multinational corporations (MNCs) face increasing pressure to harmonize digital efficiency with environmental stewardship. Aiming to derive a socio-technical framework, this paper synthesizes Technology Roadmapping (TRM) with the International Telecommunication Union (ITU) ICT-centric innovation ecosystem toolkit. A bibliometric analysis of research clusters reveals an evolutionary shift from basic process automation toward "Sustainable Intelligence," identifying the GBS unit as a central "operational airlock" that mediates between landscape pressures -- such as the EU's dual mandate and Carbon Border Adjustment Mechanisms -- and niche innovations in AI-native workflows. The study further maps these clusters onto a stakeholder engagement canvas, highlighting how resilient "Middle Power" hubs in Poland, Portugal, and Malaysia are bypassing the middle-income trap to provide a "third way" for global value chains amidst a bifurcated geopolitical cloud. The results offer a data-driven design approach for leaders and entrepreneurial support networks to orchestrate talent and supply chain flows, thereby enriching the conceptual understanding of Industry 5.0 and the role of GBS as a primary mechanism for navigating a volatile, multipolar digital economy.
As the EU Carbon Border Adjustment Mechanism (CBAM) approaches, the global semiconductor value chain faces growing structural tensions between regulatory transparency and data sovereignty. This article proposes a RegTech reference architecture using the International Data Spaces (IDSA) framework to orchestrate trustworthy environmental telemetry across the semiconductor-petrochemical nexus. The framework distinguishes the mandatory CBAM requirements from voluntary Science Based Targets initiative (SBTi) frameworks, while addressing the additive complexities of the Safe-and-Sustainable-by-Design (SSbD) framework. Moving beyond standard linear technology stacks, we introduce a prospective roadmapping methodology that transforms upstream physical vulnerabilities into circular, negative feedback loops. Focusing on the Taipei and Penang technology corridor, the article details how sovereign data exchange enables Digital Product Passports (DPPs) to drive Global Business Services (GBSs) capability demands. Finally, we discuss the integration of Agentic AI for autonomous compliance and FinTech green financing, providing a scalable blueprint for global industrial clusters to achieve sovereign, sustainable, and transparent value chains.
The monomial parameterization of finite-memory Volterra identification is ill-conditioned under non-Gaussian input, and the Wiener--Hermite expansion removes this ill-conditioning only for Gaussian white-noise input. We construct the distribution-matched Volterra--Wiener--Kunchenko (VWK) basis by oriented Gram--Schmidt orthogonalization of monomials in $L^2(P)$ and use it as an arbitrary-polynomial-chaos coordinate system for finite-memory Volterra identification from data, following the generalized polynomial chaos of Xiu and Karniadakis (2002) and the data-driven arbitrary polynomial chaos of Oladyshkin and Nowak (2012). The basis itself is classical; the contribution is the Volterra-estimation reading. First, an order-2 misspecification-penalty theorem shows that a self-normalized diagonal estimator in the variance-matched Gaussian basis incurs an excess $L^2(P)$ risk governed by the skew coefficient $\delta=\mu_3/\sigma^2$, vanishing exactly for symmetric inputs. Second, conditioning experiments separate the constructional fact that the population matched Gram is the identity from the finite-sample design Gram: at $n=2000$, the centered-exponential empirical VWK Gram remains far better conditioned than the power Gram, although it degrades with degree. Third, a machine-checked Lean 4 proof establishes the Binomial$(N,p)$ Krawtchouk row for arbitrary $N$. Full least squares over a fixed span is basis-invariant, so VWK stabilizes diagonal cross-correlation and regularized coordinate fits rather than claiming universal prediction superiority. The analysis is moment-based, finite-memory, and restricted to product input laws.
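The conditioning claim can be illustrated with a small numerical sketch. It assumes a scalar centered-exponential input and degree 3, and it builds the orthonormal polynomial basis by a Cholesky factorization of the empirical moment Gram, which is equivalent to Gram-Schmidt on 1, x, x^2, ... with positive leading coefficients. The function names, sample sizes, and degree are illustrative choices, not the paper's implementation.

```python
import numpy as np

def monomials(x, degree):
    """Columns 1, x, x^2, ..., x^degree evaluated at the samples x."""
    return np.vander(x, degree + 1, increasing=True)

def matched_basis_coeffs(x, degree):
    """Lower-triangular C whose row k holds the monomial coefficients of the
    k-th polynomial orthonormal under the empirical law of x.
    With G = L L^T the moment Gram, C = L^{-1}."""
    V = monomials(x, degree)
    G = V.T @ V / len(x)
    L = np.linalg.cholesky(G)
    return np.linalg.inv(L)

def gram(x, degree, C=None):
    """Empirical Gram of the monomial basis (C is None) or of the matched basis."""
    V = monomials(x, degree)
    Phi = V if C is None else V @ C.T
    return Phi.T @ Phi / len(x)

rng = np.random.default_rng(0)
degree = 3
x_fit = rng.exponential(1.0, size=200_000) - 1.0   # large sample standing in for P
x_new = rng.exponential(1.0, size=2_000) - 1.0     # fresh design sample, n = 2000

C = matched_basis_coeffs(x_fit, degree)
G_fit = gram(x_fit, degree, C)                          # identity by construction
cond_power = np.linalg.cond(gram(x_new, degree))        # monomial (power) Gram
cond_matched = np.linalg.cond(gram(x_new, degree, C))   # matched-basis Gram
```

On the fitting sample the matched Gram is the identity by construction; on the fresh sample it is only approximately so, which mirrors the distinction between the population Gram and the finite-sample design Gram.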
For robotics to be effectively integrated into household or industrial environments, machines must adapt to natural-language prompts in real time. Although Vision-Language Models (VLMs) have enabled zero-shot generalization in robot task and motion planning (TAMP), current state-of-the-art approaches often remain computationally "heavyweight" or require extensive training on thousands of demonstrations. We present GRASP (Grounded Reasoning and Symbolic Planning), a framework designed as a step toward open-vocabulary tabletop manipulation. Our approach leverages a pretrained VLM to translate natural-language queries into neuro-symbolic goal states, grounded in the physical world via a bounding-box detection pipeline. Unlike methods that rely on fixed color lists or hard-coded coordinates, GRASP enables robots to interpret abstract spatial concepts such as "top shelf" and execute tasks without additional fine-tuning. We achieve 73.3% overall success across 90 real-robot trials at three difficulty levels, requiring no task-specific training.
We present OpenMedQ, a medical vision-language model pretrained on the broadest fully-open medical mix to date: 14 datasets totaling ~3.35M pretraining samples spanning pathology, radiology, microscopy, and text-only clinical QA. OpenMedQ reaches state-of-the-art BLEU-1 on PathVQA (75.9), beating Med-PaLM M variants up to 562B parameters (~80x larger), and matches the best reported VQA-MED BLEU-1 (64.5). Its vision encoder, transferred to 8 unseen medical classification benchmarks under an identical downstream recipe, obtains the highest average macro-F1 (0.757), ahead of BiomedCLIP (0.745), PMC-CLIP (0.745), PubMedCLIP (0.746), and a from-scratch baseline (0.616). We release our code and an interactive demo as a reproducible baseline for the community.
Vision-language-action (VLA) policies bring natural language into closed-loop robot control, enabling robots to execute manipulation tasks directly from text instructions. The same interface gives text a recurring role in control because the prompt is reused at every replanning step, and each prompt-conditioned action changes the future observations on which the policy acts. Existing VLA attacks study adversarial prompts that elicit targeted low-level actions or make such actions persist across changing images. We identify a stronger trajectory-level failure mode: a prompt that still $\textit{appears}$ to specify the intended task but redirects the final physical outcome. We mathematically formalize this setting as $\textit{command-preserving trajectory redirection}$, a prompt-only threat model in which the attacker chooses one prompt before the episode, all policy and environment components remain fixed, and the prompt must stay close to the benign instruction while omitting target words and correction language. To find such prompts, we introduce an on-policy prompt search method that uses rollouts to discover perturbations whose closed-loop behavior tracks a target task while satisfying the command-preserving constraints. Experiments in simulation and on hardware show that near-benign prompt perturbations can redirect VLA rollouts to attacker-specified targets. These results expose a trajectory-level vulnerability in VLA instruction grounding: text that appears to preserve the intended command can still give an adversary control over the robot's final physical outcome. Project website: this https URL
Pixel-bin image sensors are becoming the default choice for smartphone cameras due to their resolution vs light-gathering trade-off. However, their larger inter-color separation compared to the Bayer color filter array (CFA) makes them challenging to demosaic. Furthermore, existing deep learning-based demosaicing methods are CFA-specific, requiring multiple individual models that take up precious onboard resources and demand larger development and maintenance efforts. In this work, we propose a modular unified architecture for demosaicing various pixel-bin sensors that provides higher image quality while being extensible and lightweight. Additionally, to enable plug-and-play operation, we introduce a learning-free CFA-identification module to detect the CFA type of raw data accurately.
Self-supervised foundation models have shown strong promise in medical imaging. However, existing MRI foundation-model studies have primarily emphasized segmentation and dense prediction tasks, while systematic investigation of self-supervised foundation models for MRI-based disease detection remains limited. In this work, we investigate two major self-supervised pretraining paradigms for MRI-based disease detection: reconstruction-based learning via Masked Autoencoders (MAE) and predictive representation learning via Joint Embedding Predictive Architectures (JEPA). We study the role of auxiliary objectives by introducing a novel spectral-domain reconstruction loss for MAE to enhance sensitivity to fine-grained anatomical structure, and by integrating variance--covariance regularization (VCR) within our JEPA framework to encourage decorrelated latent representations. Our models are pretrained on heterogeneous single-contrast MRI volumes in a contrast-agnostic setting, without modality concatenation. Across five downstream disease detection tasks, our results highlight the importance of self-supervised objective design for medical foundation model pretraining, demonstrating that the downstream benefit of each objective is determined by its relevance to the task's structure. Specifically, spectral regularization yields the largest improvements when the downstream discriminative signal is characterized by strong high-frequency anatomical structures, while covariance regularization is most beneficial when discriminative information spans multiple decorrelated feature dimensions. MAE with spectral-domain supervision consistently achieves superior downstream performance for MRI-based disease detection. These findings suggest that self-supervised objectives in medical imaging encode specific biases, and their downstream benefit is fundamentally conditioned on the task's structure.
The posture of the vocal folds produced by laryngeal muscle activation plays a central role in determining the dynamics of voice production. Abnormal vocal fold configurations are frequently associated with inefficient phonation and a variety of voice disorders. Although diverse glottal closure patterns have been observed clinically, the biomechanical mechanisms governing their dynamic behavior and resulting phonatory characteristics remain incompletely understood. Moreover, existing numerical models that incorporate the effects of the intrinsic musculature on posturing and glottal conformation are computationally expensive, which limits their suitability for large-scale parametric investigations. In this work, we introduce a computationally inexpensive vocal fold (VF) model wherein the body and cover VF layers are treated as a composite beam and a coupled membrane, respectively. Intrinsic laryngeal muscle activation, in addition to positioning the arytenoid cartilages and cricothyroid joint, introduces moments at the boundaries of the structure that influence glottal conformation. The model produces phonatory characteristics that are qualitatively consistent with those reported in high-fidelity finite-element models and clinical studies, thereby supporting its predictive capability while offering substantial computational advantage. The proposed framework provides biomechanical insights into the influence of incomplete glottal closure on phonation dynamics and may serve as a computationally tractable tool for investigating mechanisms underlying certain voice disorders.
Koopman linearization opens many possibilities for control synthesis and analysis of nonlinear systems. Whether a given nonlinear control system admits a finite-dimensional Koopman representation remains a crucial open question. A related problem is to categorize the class of all Koopman-linearizable nonlinear control systems. In this work, we present differential-geometric conditions on the drift and control vector fields of a control-affine nonlinear system that must necessarily be satisfied for a Koopman linearizing transformation to exist. The same conditions are also shown to be sufficient for (a slightly weaker notion of) Koopman linearizability on control-invariant manifolds. Further, these conditions, together with an additional condition, become necessary and sufficient for Koopman linearizability to a controllable linear system. Our examples illustrate the ease of checking these conditions, and also shed light on how a Koopman linearizing transformation may not exist for a control-affine system even though one can linearize the autonomous part of the system via Koopman lifting.
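To make the idea of a finite-dimensional Koopman lifting concrete, the sketch below uses a standard textbook system, not one of the paper's examples: the autonomous dynamics dx1/dt = mu*x1, dx2/dt = lam*(x2 - x1^2) become exactly linear in the lifted coordinates z = (x1, x2, x1^2). The parameter values and function names are illustrative assumptions. The check compares the chain-rule derivative of the lifted state with the linear prediction K z.

```python
import numpy as np

MU, LAM = -0.1, -1.0   # illustrative parameter values

def f(x):
    """Autonomous nonlinear vector field."""
    x1, x2 = x
    return np.array([MU * x1, LAM * (x2 - x1**2)])

def lift(x):
    """Lifted observables z = (x1, x2, x1^2)."""
    x1, x2 = x
    return np.array([x1, x2, x1**2])

def lift_jacobian(x):
    """Jacobian of the lifting map with respect to x."""
    x1, _ = x
    return np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [2.0 * x1, 0.0]])

# Linear dynamics of the lifted state: dz/dt = K z.
K = np.array([[MU, 0.0, 0.0],
              [0.0, LAM, -LAM],
              [0.0, 0.0, 2.0 * MU]])

def lifting_residual(x):
    """Chain-rule derivative of lift(x) minus the linear prediction K lift(x)."""
    return lift_jacobian(x) @ f(x) - K @ lift(x)

rng = np.random.default_rng(0)
max_residual = max(np.abs(lifting_residual(x)).max()
                   for x in rng.normal(size=(100, 2)))
```

If a control input is added to the first equation, dx1/dt = mu*x1 + u, the third lifted coordinate evolves as 2*mu*x1^2 + 2*x1*u, which is bilinear in state and input. This particular lifting is then no longer linear in (z, u); that observation does not by itself rule out other liftings, which is the kind of question the stated conditions address.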
Dexterous robotic hands are usually formulated as high-dimensional active control systems governed by degrees of freedom, actuation, and algorithms. Human hand dexterity, however, is partly encoded in the physical architecture of bones, ligaments, tendons, aponeuroses, and intrinsic muscles. This work describes that contribution as two linked forms of structural intelligence: structural prior generation, in which wrist-to-finger tenodesis, FDS/FDP routing, and the dorsal extensor hood transform low-dimensional posture inputs into default grasp configurations and PIP-to-DIP coordination; and muscle-mediated modulation, in which extrinsic muscles, lumbricals, and interossei regulate MCP posture, distal stability, fingertip force paths, and contact states around that default state. Based on this framework, MCR-Bionic Hand is developed as a 1:1 musculoskeletal biomimetic hand integrating a two-row, eight-bone wrist, cross-wrist tendons, anatomical flexor routing, volar plate and collateral ligament constraints, the dorsal extensor hood, and intrinsic muscle pathways within one body. Functional demonstrations and geometric mechanical models show that wrist posture induces multi-joint pre-shaping, the extensor hood maps PIP posture to a coupled DIP response, and intrinsic-plus pathways modulate distal stability and fingertip action direction after grasp formation. Contact-rich tasks, including coin rotation, pen transfer, dorsal coin flipping, and cube manipulation, show that MCR-Bionic links low-dimensional state generation with fine post-contact modulation. These results suggest that anatomical biomimetics is valuable not for visual similarity, but for identifying human hand structures that perform part of the control.
This paper presents a distribution-agnostic robust trajectory-optimization framework based on chance-constrained reinforcement learning. The uncertainty is represented here through initial conditions and process noise, with the only requirement being that it can be sampled. A deterministic nominal trajectory is first computed offline, and reinforcement learning is then used only to robustify that baseline through a structured affine closed-loop correction law comprising a feedforward control adjustment and time-varying feedback gains. Probabilistic feasibility is enforced empirically through rollout-based upper-tail quantiles, while terminal dispersion is regulated through covariance-feasibility penalties. The framework is assessed on two materially different trajectory design problems. The flagship case study is a three-dimensional multi-impulse Earth-Mars transfer, where the learned policy is benchmarked against a recent robust trajectory-optimization reference under Gaussian uncertainty and then evaluated under bounded uniform uncertainty and under process disturbances not seen during training. The second case study is a stochastic atmospheric pinpoint rocket landing problem, used to assess portability to a short-horizon continuous-thrust setting with drag, mass depletion, and glide-slope constraints. The results show that the proposed framework can remain competitive in upper-tail fuel cost while preserving probabilistic feasibility, and that the same robustification scaffold can be carried across heterogeneous spacecraft trajectory planning problems without redesign of its core stochastic-control structure.
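The rollout-based feasibility check can be sketched on a toy problem. The example assumes a scalar system x+ = x + u + w with a feedback correction u = u_nominal - k * (x - x_nominal), a path constraint on the deviation from the nominal trajectory, and a risk level of 5%. All names, the dynamics, and the numbers are hypothetical; the only ingredient carried over from the text is that uncertainty is sampled and the chance constraint is tested through an empirical upper-tail quantile.

```python
import numpy as np

LIMIT = 0.3   # allowed deviation from the nominal trajectory (illustrative)

def rollout_constraint(k_gain, rng, n_steps=20, noise_std=0.05):
    """One closed-loop rollout of the deviation dynamics dev+ = (1 - k) dev + w.
    Returns g = max|dev| - LIMIT, so g <= 0 means the constraint held."""
    dev = rng.normal(0.0, 0.1)            # sampled initial-condition error
    worst = abs(dev)
    for _ in range(n_steps):
        dev = (1.0 - k_gain) * dev + rng.normal(0.0, noise_std)  # process noise
        worst = max(worst, abs(dev))
    return worst - LIMIT

def upper_tail_quantile(k_gain, eps=0.05, n_rollouts=2000, seed=0):
    """Empirical (1 - eps)-quantile of g over sampled rollouts."""
    rng = np.random.default_rng(seed)
    g = np.array([rollout_constraint(k_gain, rng) for _ in range(n_rollouts)])
    return np.quantile(g, 1.0 - eps)

q_open_loop = upper_tail_quantile(k_gain=0.0)    # no feedback: deviation random-walks
q_closed_loop = upper_tail_quantile(k_gain=0.5)  # feedback contracts the deviation
feasible = q_closed_loop <= 0.0                  # empirical chance constraint
```

Because the check only needs samples of the uncertainty, the Gaussian draws above could be swapped for any other sampler without changing the feasibility test, which is the sense in which such a scheme is distribution-agnostic.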
In this article, we employ active simultaneously transmitting and reflecting reconfigurable intelligent surfaces (ASRIS) to enhance the quality of 6G cellular network services. The network integrates commensal symbiotic radio (CSR) subsystems to facilitate communication between passive Internet of Things (IoT) users and active users, referred to as symbiotic backscatter devices (SBDs) and symbiotic user equipments (SUEs), respectively. Since the SBDs are passive, transmitting information to the SUEs poses significant challenges. To overcome this challenge, we harness the capabilities of massive multiple input multiple output (MIMO) antennas within the base station (BS) to relay the information transmitted by SBDs with greater power. This scheme uses the non-orthogonal multiple access (NOMA) technique for multiple access among all users, and potential interferences are eliminated using successive interference cancellation (SIC). The primary objective is to maximize the throughput between SBDs and SUEs. To achieve this, we formulate an optimization problem involving variables such as active beamforming coefficients at the BS and ASRIS, phase adjustments of ASRIS, and scheduling parameters between CSR and cellular networks. To solve this optimization problem, we used three deep reinforcement learning (DRL) methods: proximal policy optimization (PPO), twin delayed deep deterministic policy gradient (TD3), and asynchronous advantage actor critic (A3C). These methods were simulated, and the results demonstrate that A3C, TD3, and PPO have the best convergence speeds and achieve the highest increases in network throughput, respectively. Finally, the proposed scheme was evaluated using passive simultaneously transmitting and reflecting RIS (STAR-RIS), which demonstrated poorer performance compared to ASRIS.
Plug-and-Play (PnP) algorithms are a class of iterative algorithms that address image inverse problems by combining a physical model and a deep neural network for regularization. Even if they produce impressive image restoration results, these algorithms rely on a non-standard use of a denoiser on images that are less and less noisy along the iterations, which contrasts with recent algorithms based on Diffusion Models (DM), where the denoiser is applied only to re-noised images. We propose a new PnP framework, called Stochastic deNOising REgularization (SNORE), which applies the denoiser only to images with noise of the adequate level. It is based on an explicit stochastic regularization, which leads to a stochastic gradient descent algorithm to solve ill-posed inverse problems. A convergence analysis of this algorithm and its annealing extension is provided. Experimentally, we show that SNORE is competitive with state-of-the-art methods on deblurring and inpainting tasks, both quantitatively and qualitatively.
Existing Audio Deepfake Detection (ADD) systems often struggle to generalise effectively due to the significantly degraded audio quality caused by audio codec compression and channel transmission effects in real-world communication scenarios. To address this challenge, we developed a rigorous benchmark to evaluate the performance of the ADD system under such scenarios. We introduced ADD-C, a new test dataset to evaluate the robustness of ADD systems under diverse communication conditions, including different combinations of audio codecs for compression and packet loss rates. Benchmarking three baseline ADD models on the ADD-C dataset demonstrated a significant decline in robustness under such conditions. A novel Data Augmentation (DA) strategy was proposed to improve the robustness of ADD systems. Experimental results demonstrated that the proposed approach significantly enhances the performance of ADD systems on the proposed ADD-C dataset. Our benchmark can assist future efforts towards building practical and robustly generalisable ADD systems.
This paper investigates a low Earth orbit (LEO) satellite communication system enhanced by an active stacked intelligent metasurface (ASIM), mounted on the backplate of the satellite solar panels to efficiently utilize limited onboard space and reduce the main satellite power amplifier requirements. The system serves multiple ground users via rate-splitting multiple access (RSMA) and IoT devices through a symbiotic radio network. Multi-layer sequential processing in the ASIM improves effective channel gains and suppresses inter-user interference, outperforming active RIS and beyond-diagonal RIS designs. Three optimization approaches are evaluated: block coordinate descent with successive convex approximation (BCD-SCA), model-assisted multi-agent constraint soft actor-critic (MA-CSAC), and multi-constraint proximal policy optimization (MCPPO). Simulation results show that BCD-SCA converges fast and stably in convex scenarios without learning, MCPPO achieves rapid initial convergence with moderate stability, and MA-CSAC attains the highest long-term spectral and energy efficiency in large-scale networks. Energy-spectral efficiency trade-offs are analyzed for different ASIM elements, satellite antennas, and transmit power. Overall, the study demonstrates that integrating multi-layer ASIM with suitable optimization algorithms offers a scalable, energy-efficient, and high-performance solution for next-generation LEO satellite communications.
Networked Control Systems (NCSs) have been instrumental in realizing fully connected and responsive intelligent environments within the context of real-time virtual control and management. However, traditional NCSs face considerable challenges in handling the vast amounts of data generated by large-scale control applications, particularly in terms of data acquisition, storage, and computational processing. To address these challenges, the emergence of cloud computing and advancements in control theory have empowered the new paradigm known as Cloud Control Systems (CCSs). Recently, CCSs have received substantial attention from industry for their capabilities, such as large-scale data management, complex computations, and data-centric optimized decisions. This study presents an extensive review of recent progress in CCSs, covering studies published between 2012 and 2025. Specifically, the focus is on providing a taxonomy of the current findings in CCS research, encompassing various perspectives, such as efficient implementations in industrial automation, security and privacy considerations, and cloud-based control techniques. Each category is examined in depth through selected state-of-the-art analyses of different approaches and contrasting methodologies. Furthermore, we discuss future directions aimed at designing more efficient and practical CCSs. The insights gained from this study can help researchers, practitioners, and decision-makers design and deploy CCSs effectively.
This paper develops a generalized finite horizon recursive solution to the discrete time stage bound disturbance attenuation regulator (StDAR) for state feedback control. This problem addresses linear dynamical systems subject to stage bound disturbances, i.e., disturbance sequences constrained independently at each time step through stagewise squared two-norm bounds. The term generalized indicates that the results accommodate arbitrary initial states. By combining game theory and dynamic programming, this work derives a recursive solution for the optimal state feedback policy. The optimal policy is nonlinear in the state and requires solving a tractable convex optimization for the Lagrange multiplier vector at each stage; the control is then explicit. For systems with constant stage bound, the problem admits a steady-state optimization expressed as a tractable linear matrix inequality (LMI) whose empirical computational cost is approximately cubic in $n$. Numerical examples illustrate the properties of the solution. This work provides a complete feedback solution to the StDAR for arbitrary initial states. Companion papers address the signal bound disturbance attenuation regulator (SiDAR): the finite horizon solution in Part~I-A and convergence properties in Part~I-B.
The Internet of Underwater Things (IoUT) supports marine sensing, environmental monitoring, subsea inspection, and autonomous underwater operations. However, IoUT communication is constrained by limited bandwidth, long propagation delay, time-varying underwater channels, intermittent connectivity, and strict energy budgets. Semantic Communication (SC) offers a promising alternative by transmitting task-relevant meaning rather than raw data, thereby improving communication efficiency in resource-constrained underwater networks. This paper presents a critical and feasibility-aware survey of SC for IoUT, focusing on opportunities, challenges, limitations, and future research directions. We first review the fundamentals of SC-enabled IoUT systems, including semantic representations, layered architectures, semantic channel modeling, and task-oriented evaluation metrics. We then examine learning-driven approaches based on machine learning (ML), knowledge graphs (KGs), vision-language models (VLMs), generative models, and federated learning (FL), with emphasis on their feasibility under underwater edge constraints. Representative applications, including environmental monitoring, marine ecology, subsea infrastructure inspection, disaster response, and autonomous underwater vehicle (AUV) coordination, are analyzed from an SC perspective. Finally, we identify key research directions involving standardized semantic models, reproducible testbeds, compute--communication trade-offs, trustworthy reconstruction, hybrid underwater links, energy-aware edge intelligence, semantic security, digital twins (DTs), and cross-domain interoperability. This survey provides a structured foundation for developing reliable, efficient, and meaning-driven IoUT communication systems.
Neural network controllers are increasingly deployed in robotic systems for tasks such as trajectory tracking and pose stabilization. However, their reliance on potentially untrusted training pipelines or supply chains introduces significant security vulnerabilities. This paper investigates backdoor (Trojan) attacks against neural controllers, using a differential-drive mobile robot platform as a case study. In particular, assuming that the robot's tracking controller is implemented as a neural network, we design a lightweight, parallel Trojan network that can be embedded within the controller. This malicious module remains dormant during normal operation but, upon detecting a highly specific trigger condition defined by the robot's pose and goal parameters, compromises the primary controller's wheel velocity commands, resulting in undesired and potentially unsafe robot behaviours. We provide a proof-of-concept implementation of the proposed Trojan network, which is validated through simulation under two different attack scenarios. The results confirm the effectiveness of the proposed attack and demonstrate that neural network-based robotic control systems are subject to potentially critical security threats.
Classical first-order optimization methods for imaging inverse problems scale poorly with image resolution. Wavelet based multilevel strategies can accelerate convergence under strong blur, but their fixed coarse-to-fine schedules lose effectiveness in moderate-blur or noise-dominated regimes. In this work, we propose an adaptive multiresolution block coordinate Forward-Backward algorithm for image restoration. Multiresolution block selection is driven by the local magnitude of the proximal update via a stochastic non-smooth Gauss-Southwell rule applied to the wavelet decomposition of the image. This adaptive selection strategy dynamically balances updates across scales, emphasizing coarse or fine blocks according to the degradation regime. As a result, the proposed method automatically adapts to varying blur and noise levels without relying on a predefined hierarchical update scheme.
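A minimal sketch of a Gauss-Southwell block selection rule inside a forward-backward iteration is given below. It makes several simplifying assumptions that differ from the method described above: the problem is a small l1-regularized least-squares fit rather than image restoration, the blocks are plain index ranges standing in for wavelet scales, and the selection is deterministic and greedy rather than stochastic. Function names and sizes are illustrative.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(A, b, x, lam):
    """0.5 * ||A x - b||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def greedy_block_forward_backward(A, b, lam, blocks, n_iter=200):
    """Block coordinate forward-backward with a greedy Gauss-Southwell rule:
    each iteration updates only the block whose proximal step is largest."""
    x = np.zeros(A.shape[1])
    tau = 1.0 / np.linalg.norm(A, 2) ** 2       # step 1/L with L = ||A||_2^2
    history = [objective(A, b, x, lam)]
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)
        step = soft_threshold(x - tau * grad, tau * lam) - x   # full proximal update
        best = max(blocks, key=lambda idx: np.linalg.norm(step[idx]))
        x[best] += step[best]                   # apply the update on one block only
        history.append(objective(A, b, x, lam))
    return x, history

rng = np.random.default_rng(0)
A = rng.normal(size=(40, 12))
x_true = np.zeros(12)
x_true[[1, 7]] = [2.0, -1.5]
b = A @ x_true + 0.01 * rng.normal(size=40)
# Three index blocks as stand-ins for coarse-to-fine wavelet scales.
blocks = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]

x_hat, history = greedy_block_forward_backward(A, b, lam=0.1, blocks=blocks)
```

The selection depends only on the current magnitude of the proximal step per block, so no coarse-to-fine schedule is fixed in advance; with the step size 1/L the objective values in `history` do not increase from one iteration to the next.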
Single-snapshot FDA-MIMO-GPR requires clutter models that account for dispersive-medium uncertainty, yet the statistical link between complex-medium characterization and clutter covariance analysis has remained unclear. This paper develops a propagation-side statistical framework that maps random perturbations of the relaxation spectrum to complex permittivity, complex wavenumber, steering-vector perturbation, medium-induced clutter covariance, and total clutter covariance. Within this framework, the effects of medium uncertainty on effective rank, effective clutter-subspace dimension, and target--clutter separability are characterized through a KL-based modal decomposition and a subspace-projection analysis. Numerical validation uses five literature-informed dielectric families to define physically traceable prior scenarios, a controlled random-field model to exercise the main propagation chain, and gprMax-based full-wave FDTD snapshots for an independent solver-level consistency check. Monte Carlo closure shows stage-wise numerical consistency, identifies steering linearization as the dominant approximation-sensitive step, and supports a weak perturbation regime with a bounded extension into a moderate regime. In a representative whitening-and-detection benchmark, the structured covariance model raises AUC from 0.593 for a diagonal baseline to 0.753, while prior-mismatch experiments indicate gradual rather than abrupt degradation. These results provide an explicit and interpretable interface for embedding complex-medium uncertainty into FDA-MIMO-GPR clutter analysis within a first-order, propagation-dominated setting.
Increasing penetration of electric vehicles, heat pumps, and rooftop photovoltaics is creating thermal and voltage stress in low-voltage distribution grids. This work links the German Federal Government energy transition pathway (2025-2045) with state estimation performance requirements, evaluated on two SimBench reference networks across three equipment quality levels (good, medium, poor) and three VDE Forum Netztechnik/Netzbetrieb (VDE FNN) measurement constellations that differ in the availability of transformer and feeder-level instrumentation. Within this work's analysis, congestion is caused exclusively by transformer overloading and voltage-band violations. No individual line exceeds its thermal rating (maximum: 89.5%). Equipment quality governs congestion onset for a given deployment trajectory: under good equipment, congestion remains absent through 2045, under medium equipment it emerges from 2035 (3/6 scenarios), under poor equipment from 2025 (6/6). Without transformer instrumentation, median voltage estimation errors reach 6-42% regardless of smart meter penetration. Adding a single transformer measurement reduces errors by an order of magnitude, achieving median errors of 0.5-1.7%. In urban networks, transformer-level instrumentation meets the VDE FNN voltage accuracy target (99th percentile voltage error below 2%) in all configurations. In rural networks under poor equipment, the target is approached but not met. These findings motivate prioritizing transformer instrumentation as an effective first step for grid observability and supplementing the current consumption-driven metering rollout with risk-based deployment criteria linked to local congestion exposure.
This paper presents a novel nonlinear backstepping control law for continuous, low-thrust station-keeping in the Earth-Moon system. Quasi-periodic libration point orbits are targeted under a high-fidelity model of the dynamics. Almost global uniform exponential stability guarantees are attained, as shown through Lyapunov's stability theory. Saturation of the actuators is formally included in the controller design, such that these guarantees hold even in the event of saturation. The relationship between saturation threshold, control gains, and deviation is studied and an optimal procedure for gain selection is discussed. The control solution is tested numerically through a Monte Carlo analysis over representative application cases, subject to operational errors, constraints, and external perturbations. Station-keeping under actuation saturation is validated considering a conservative threshold for typical electric propulsion systems.
Recent advances in spoken dialogue language models have shifted from turn-based to full-duplex designs, where the model continuously listens to the user while generating responses. However, existing duplex backbones still lack a native channel for in-conversation planning and tool calling, leaving real-time agentic behaviour either tied to turn boundaries or relegated to an external cascade. We propose DuplexSLA, a native full-duplex Speech-Language-Action foundation model that decodes assistant audio together with a structured action stream on a shared 160 ms chunk timeline. DuplexSLA is built on a dual-stream three-channel formulation: a continuous user audio channel, a discrete assistant audio channel, and a rate-limited textual action channel, all decoded jointly by a single backbone, so that listening, speaking, planning, and tool calling unfold on one shared clock. Two capabilities define the model: (1) semantic-driven turn-taking control, where interruption, pause, and backchannel are handled inside the same backbone instead of by an external semantic VAD; and (2) in-conversation planning and tool calling, where planning text and structured tool calls are emitted on the action channel without halting assistant audio, so that multi-action and backchannel-triggered tool use are interleaved with ongoing speech. To evaluate these capabilities together, we further construct DuplexSLA-Bench, a duplex benchmark covering pause, interrupt, and backchannel turn-taking together with three styles of in-conversation tool calling. Our project page, interactive demos, and the DuplexSLA-Bench evaluation suite are publicly available at this https URL.
Engineered infrastructure systems pose inverse problems in which hidden states, unknown parameters, and subsystem couplings must be inferred from sparse and noisy measurements. These problems are difficult because physical subsystems are heterogeneous, sensing is partial, uncertainty is distributed across subsystem interfaces, and computational cost grows rapidly with system size. We address this challenge with probabilistic compositional inference, a graph-based architecture that represents a coupled system as interacting subsystems, each retaining its own local model, estimator, and uncertainty representation, while coupling is handled through physically meaningful stochastic messages exchanged across subsystem interfaces. This formulation allows mechanistic, learned, and deterministic components to coexist within a single inference framework and propagates calibrated uncertainty without assembling a global augmented state or covariance. We validate the framework in three increasingly demanding settings: a sparse-sensing canonical inverse problem, where interface couplings can also be learned from data; infrastructure-scale power networks, where the method matches centralized joint state-and-parameter inference while reducing computational scaling from approximately cubic to approximately linear; and a multi-physics turbine embedded in a power-grid network, where heterogeneous subsystems compose hierarchically without degrading local inference or collapsing local posteriors into a global estimate. Together, these results show that subsystem structure can be exploited as the organizing principle for uncertainty-aware inverse inference in coupled engineered systems.
A unified, closed-form analytical PI/PID tuning method is presented for all-pole plants up to third order that yields a strictly monotonic (zero-overshoot) step response with minimum settling time. The design target is the binomial closed loop p^n/(s+p)^n, which is monotonic with robustness depending only on the order n. Because a fixed PI/PID cannot assign the closed-loop poles and the controller zeros independently, realizing this target exactly requires the controller zeros to be cancelled, which forces the controller numerator to divide the plant denominator. It follows that an exact, real-gained solution exists for any stable plant precisely up to second order with a PI controller and third order with a PID controller; beyond that the residual binomial factor acquires a complex pair of damping sqrt(3)/2, which a generic plant does not contain. Explicit gains are derived for first-order plants (PI), second-order plants with real and complex poles (PI and PID), and third-order plants with three real poles or one real pole plus a complex pair (PID). The freedom of the coincident designs is shown to be bounded: a quadratic nonnegativity condition gives the exact window of the design pole for strict monotonicity, which collapses at the pole-ratio-2 changeover for real poles and is nonempty for damping ratios above approximately 0.443 for complex poles. Monotonicity guarantees Mt = 1, hence Ms <= 2, phase margin >= 60 degrees, and gain margin >= 6 dB, tightening to universal constants for the binomial family. Load-disturbance attenuation obeys IAEd = 1/Ki, making the cost of cancellation explicit, and comparisons with SIMC, the CHR zero-overshoot rule, and deadbeat-fitted explicit formulas quantify the trade: at matched maximum sensitivity the proposed design settles faster than SIMC on the third-order example, with markedly lower controller gains and peak control effort.
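The cancellation argument can be illustrated on the simplest case. The sketch below assumes a first-order plant K/(tau*s + 1) and a PI controller Kp + Ki/s. Placing the controller zero on the plant pole (Kp/Ki = tau) leaves the loop Ki*K/s, so the closed loop is p/(s + p) with p = Ki*K. The gain expressions are derived here from that argument and are not copied from the paper; the function names and the numerical values are illustrative.

```python
def pi_gains_first_order(K, tau, p):
    """PI gains that cancel the pole of K/(tau*s + 1) and give closed loop p/(s + p)."""
    Kp = p * tau / K
    Ki = p / K
    return Kp, Ki

def step_response(K, tau, Kp, Ki, t_end=10.0, dt=1e-3):
    """Forward-Euler simulation of the unit-step setpoint response."""
    y, integral, ys = 0.0, 0.0, []
    for _ in range(int(t_end / dt)):
        e = 1.0 - y
        u = Kp * e + Ki * integral
        y_next = y + dt * (-y + K * u) / tau
        integral += dt * e
        y = y_next
        ys.append(y)
    return ys

K, tau, p = 2.0, 3.0, 1.5
Kp, Ki = pi_gains_first_order(K, tau, p)
ys = step_response(K, tau, Kp, Ki)
iae_d = 1.0 / Ki  # load-disturbance IAE, using the relation IAEd = 1/Ki stated above
```

The simulated output rises monotonically to the setpoint, and choosing a faster design pole p raises Ki and therefore lowers 1/Ki, which makes the trade-off between speed and disturbance attenuation visible in this toy case.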
Recent speech-aware large language models (Speech-LLMs) rely on a pre-trained speech encoder to convert audio into semantically rich representations consumable by an LLM. In this work, we instead ask: can an LLM learn to read Mel spectrograms directly, without a dedicated speech encoder? We propose Mel-LLM, an encoder-free Speech-LLM that feeds lightly pre-processed Mel spectrogram patches directly into the LLM through a linear projection, allowing the LLM to learn speech-text alignment purely through its own parameters. We conduct extensive experiments on both automatic speech recognition (ASR) and text-to-speech (TTS) tasks. For ASR, we evaluate on the OpenASR leaderboard public sets and in production-level scaling experiments, demonstrating that the encoder-free solution achieves competitive performance with only limited degradation compared to encoder-initialized counterparts. We find that when data is limited, initialization from a multimodal checkpoint (Phi-4-MM) is crucial for maintaining performance. We also present ablation studies revealing which LLM layers are less relevant to speech encoding. For TTS, we show preliminary results with a next-token VAE approach. While TTS performance is not yet optimal, these results establish the feasibility of a fully unified encoder-free architecture for autoregressive speech-text modeling.
Coordinated multi-satellite (CoMS) transmission and non-orthogonal multiple access (NOMA) are envisioned to jointly enhance coverage, capacity, and spectrum efficiency for satellite networks. Their integration into a unified CoMS-NOMA framework will allow more efficient, reliable, and energy-efficient multi-user access. This paper investigates the downlink performance of CoMS-NOMA networks from a system-level perspective, in which multiple satellites cooperatively serve multiple users via NOMA. Leveraging tools from stochastic geometry, related angles and distances in CoMS-NOMA are first derived as intermediate results. Then, we obtain the combined signal power distributions and analyze coverage and spectrum performance under both inter- and intra-satellite interference, accounting for potential imperfect successive interference cancellation (SIC). The analytical model is validated across a range of system parameters, including the number of satellites, service region angle, error-propagation factor, and power allocation coefficients. Numerical results indicate that increasing the number of cooperative satellites does not always improve coverage and spectrum efficiency. Additionally, while a higher main-lobe gain improves coverage, a near-perfect SIC provides only slightly greater benefits than a reasonably good SIC. With properly selected power allocation coefficients, CoMS-NOMA achieves up to a 270% improvement in coverage and a 56% gain in sum spectral efficiency, compared with conventional orthogonal and single-satellite schemes, indicating potential for green, energy-efficient satellite networking.
The mechanical complexity of soft robots creates significant challenges for their model-based control. Specifically, linear data-driven models have struggled to control soft robots on complex, spatially extended paths that explore regions with significant nonlinear behavior. To account for these nonlinearities, we develop here a model-predictive control strategy based on the recent theory of adiabatic spectral submanifolds (aSSMs). This theory is applicable because the internal vibrations of heavily overdamped robots decay at a speed that is much faster than the desired speed of the robot along its intended path. In that case, low-dimensional attracting invariant manifolds (aSSMs) emanate from the path and carry the dominant dynamics of the robot. Aided by this recent theory, we devise an aSSM-based model-predictive control scheme purely from data. We demonstrate the effectiveness of our data-driven model in tracking dynamic trajectories across diverse tasks. We validate on high-fidelity, high-dimensional finite-element models of a soft trunk robot and Cosserat-rod-based elastic soft arms, with additional experiments confirming robust performance even in the presence of experimental noise. Notably, we find that five- or six-dimensional aSSM-reduced models outperform the tracking performance of other data-driven modeling methods by a factor up to 10 across all closed-loop control tasks.
Unlike conventional "black-box" transformers with a classical self-attention mechanism, we build a lightweight and interpretable transformer-like neural net by unrolling a mixed-graph-based optimization algorithm to forecast traffic with spatial and temporal dimensions. We construct two graphs: an undirected graph $\mathcal{G}^u$ capturing spatial correlations across geography, and a directed graph $\mathcal{G}^d$ capturing sequential relationships over time. We predict future samples of signal $\mathbf{x}$, assuming it is "smooth" with respect to both $\mathcal{G}^u$ and $\mathcal{G}^d$, where we design new $\ell_2$- and $\ell_1$-norm variational terms to quantify and promote signal smoothness (low-frequency reconstruction) on a directed graph. We design an iterative algorithm based on the alternating direction method of multipliers (ADMM), and unroll it into a feed-forward network for data-driven parameter learning. We periodically insert graph learning modules for $\mathcal{G}^u$ and $\mathcal{G}^d$ that play the role of self-attention. Experiments show that our unrolled networks achieve traffic forecast performance competitive with state-of-the-art prediction schemes, while reducing parameter counts drastically.
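For the undirected graph, the usual way to quantify smoothness is the graph Laplacian quadratic form, which equals the weighted sum of squared differences across edges. The sketch below shows only that standard undirected term; the new directed-graph $\ell_2$ and $\ell_1$ terms proposed in the paper are not reproduced here, and the edge list and signals are illustrative.

```python
def laplacian_quadratic_form(edges, x):
    """x^T L x for an undirected weighted graph: sum of w_ij * (x_i - x_j)^2 over edges."""
    return sum(w * (x[i] - x[j]) ** 2 for i, j, w in edges)

# Path graph on four nodes with unit weights (hypothetical example).
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
smooth_signal = [1.0, 1.1, 1.2, 1.3]
rough_signal = [1.0, -1.0, 1.0, -1.0]

smooth_cost = laplacian_quadratic_form(edges, smooth_signal)
rough_cost = laplacian_quadratic_form(edges, rough_signal)
```

A slowly varying signal gives a small value and an oscillating one gives a large value, so minimizing this term as a regularizer favors low-frequency reconstructions on the graph.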
Deep learning-based machine listening is broadening the scope of industrial acoustic analysis, yet its widespread implementation on live shop floors is hindered by the reliance on large, task-specific annotated datasets for every new task. While emerging general-purpose sound foundation models aim to alleviate data dependency, they reveal critical dilemmas in practice. General-purpose sound foundation models are computationally expensive and fail in industrial scenarios characterized by tonal harmonics, broadband noise, and transient fault events, making instant, on-site deployment impractical. These challenges combined mean that a practical, end-to-end system for deploying a sound foundation model on a live shop floor has remained elusive. To address this challenge, this study introduces LISTEN (Lightweight Industrial Sound-representable Transformer for Edge Notification), the first lightweight foundation model specialized for industrial sound. Through Knowledge Distillation (KD) from the large-scale teacher model IMPACT (Industrial Machine Perception via Acoustic Cognitive Transformer), we construct LISTEN optimized for resource-constrained edge environments. By freezing the backbone and training only a shallow head on minimal target-process data, rather than performing full fine-tuning or retraining, LISTEN achieves nearly identical performance to IMPACT across diverse manufacturing processes. This study further demonstrates a complete system for real-time machine monitoring, encompassing data acquisition with Industrial Internet of Things (IIoT) devices, rapid model adaptation using minimal annotated data, and real-time monitoring on a low-cost edge device. By validating the entire system on a live CNC machine, this work establishes the first feasible end-to-end system for deploying a lightweight industrial sound foundation model in an active industrial environment.
Dynamic control of soft continuum robots (SCRs) holds great potential for expanding their applications, but remains a challenging problem due to the high computational demands of accurate dynamic models. While data-driven approaches like Koopman-operator-based methods have been proposed, they typically lack adaptability and cannot reconstruct the full robot shape, limiting their applicability. This work introduces a real-time-capable nonlinear model-predictive control (MPC) framework for SCRs based on a domain-decoupled physics-informed neural network (DD-PINN) with adaptable bending stiffness. The DD-PINN serves as a surrogate for the dynamic Cosserat rod model with a speed-up factor of up to 44,000. It is also used within an unscented Kalman filter for estimating the model states and bending compliance from end-effector position measurements. We implement a nonlinear evolutionary MPC running at 70 Hz on the GPU. In simulation, it demonstrates accurate tracking of dynamic trajectories and setpoint control with end-effector position errors below 3 mm (2.3\% of the actuator's length). In real-world experiments, the controller achieves similar accuracy and accelerations up to 3.55 m/s$^2$.
Biophysical models in diffusion MRI (dMRI) hold promise for characterizing gray matter tissue microstructure. Yet, the reliability of their parameter estimates remains largely under-studied, especially in models that incorporate water exchange. In this study, we investigate the accuracy, precision, and presence of degeneracy of two recently proposed gray matter models, NEXI and SANDIX, using established acquisition protocols, on both simulated and \textit{in vivo} data. We employ $\mu$GUIDE, a Bayesian inference framework based on deep learning, to quantify parameter uncertainty and detect degeneracies, enabling a more interpretable assessment of model fits. Our results show that while some microstructural parameters, such as extra-cellular diffusivity and neurite signal fraction, are robustly estimated, others, including exchange time and soma radius, are often associated with high uncertainty and estimation bias, particularly under realistic noise conditions and reduced acquisition protocols. Comparison with non-linear least squares fitting highlights the critical advantage of uncertainty-aware methods: the ability to flag and filter out unreliable estimates. Together, these findings emphasize the need to report uncertainty and account for model degeneracies when interpreting model-based estimates. Our study advocates for the integration of probabilistic fitting approaches into imaging pipelines to improve reproducibility and biological interpretability.
Integrating the Alternating Direction Method of Multipliers (ADMM) with Differential Dynamic Programming (DDP) provides a scalable framework for distributed multi-agent trajectory optimization. In practice, ADMM is typically truncated for computational efficiency, tightly coupling parameters that would otherwise separately govern coordination quality and task performance. In this paper, we propose Differentiable Coordination (DiffCoord), a unified framework that jointly meta-learns these coupled parameters for the truncated ADMM-DDP pipeline. These parameters are generated by agent-wise neural networks for task adaptation, and the same networks are shared among isomorphic agents to enable scalability to varying agent counts. We achieve efficient meta-learning by differentiating the ADMM-DDP pipeline end-to-end. Notably, this yields an auxiliary ADMM-LQR distributed gradient solver that computes and coordinates meta-gradients with respect to these parameters. This solver inherits the computational structure of the pipeline, enabling reuse of key computation results and efficient parallelization over agents and along trajectory horizons. We validate DiffCoord through numerical and physical experiments on a cooperative aerial transport system, where it reconfigures quadrotor formations for safe 6-DoF load manipulation in tight spaces. It adapts robustly to varying team sizes and load dynamics, while reducing per-agent gradient computation time by up to 70% compared with state-of-the-art trajectory-gradient methods.
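The effect of truncating ADMM can be seen on a toy consensus problem. The sketch below is not the ADMM-DDP pipeline of the paper: it assumes each agent holds a scalar quadratic cost 0.5 * a_i * (x - c_i)^2 and all agents must agree on one value, and it uses the standard scaled-form consensus ADMM updates. The function name, the penalty `rho`, and the numbers are illustrative.

```python
def consensus_admm(a, c, rho=1.0, iterations=3):
    """Scaled-form consensus ADMM for agents with costs 0.5 * a_i * (x - c_i)^2."""
    n = len(a)
    z = 0.0              # shared consensus variable
    u = [0.0] * n        # scaled dual variables, one per agent
    for _ in range(iterations):
        # Local updates: each agent minimizes its cost plus the penalty term.
        x = [(ai * ci + rho * (z - ui)) / (ai + rho) for ai, ci, ui in zip(a, c, u)]
        # Coordination step: average of local estimates plus duals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual ascent on the consensus constraint x_i = z.
        u = [ui + xi - z for ui, xi in zip(u, x)]
    return z

a = [1.0, 2.0, 4.0]
c = [0.0, 3.0, 5.0]
optimum = sum(ai * ci for ai, ci in zip(a, c)) / sum(a)
z_truncated = consensus_admm(a, c, iterations=3)
z_converged = consensus_admm(a, c, iterations=200)
```

After three iterations the consensus value is still noticeably away from the optimum, and how far away depends on `rho`. That dependence is the kind of coupling between coordination quality and task performance that motivates learning such parameters jointly.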
Deploying reliable bioacoustic monitoring systems requires models that generalize under high-noise, low-SNR conditions and evaluation protocols that expose deployment-relevant failure modes, gaps largely unaddressed in current UPAM practice. Intrinsic noise, variable propagation, and mixed biological and anthropogenic sources induce distribution shifts that conventional models and single-split evaluations obscure, inflating performance and masking instability. We introduce GetNetUPAM, a hierarchical nested cross-validation framework that uses the nested stage to quantify model stability rather than tune for inflated hold-out scores. By partitioning data into site-year blocks, GetNetUPAM preserves ecological heterogeneity and forces each outer fold to represent a distinct environmental regime, preventing overfitting to localized noise or sensor artifacts. Inner stratified folds measure generalization across the full UPAM signal distribution, enforcing strict separation between model development and the outer held-out deployment condition. Using GetNetUPAM, we evaluate the Adaptive Resolution Pooling and Attention Network (ARPA-N), a CNN architecture for irregular spectrogram dimensions. ARPA-N integrates CBAM spatial attention as a learned noise suppressor, producing attention maps that localize true call structure and avoid the global, non-biological cues exploited by standard CNNs on long-window data. Under GetNetUPAM, ARPA-N generalizes robustly across diverse environmental regimes. In the zero-training support Balleny Islands region, it reduces false positives per hour by over an order of magnitude (approximately 10x) at fixed 90 percent recall, yielding consistently improved metrics across folds. These advances provide a reproducible benchmark and move UPAM toward scalable, deployment-reliable ecological monitoring.
Metriplectic conditional flow matching (MCFM) learns dissipative dynamics without violating first principles. Neural surrogates often inject energy and destabilize long-horizon rollouts; MCFM instead builds the conservative-dissipative split into both the vector field and a structure preserving sampler. MCFM trains via conditional flow matching on short transitions, avoiding long rollout adjoints. In inference, a Strang-prox scheme alternates a symplectic update with a proximal metric step, ensuring discrete energy decay; an optional projection enforces strict decay when a trusted energy is available. We provide continuous and discrete time guarantees linking this parameterization and sampler to conservation, monotonic dissipation, and stable rollouts. On a controlled mechanical benchmark, MCFM yields phase portraits closer to ground truth and markedly fewer energy-increase and positive energy rate events than an equally expressive unconstrained neural flow, while matching terminal distributional fit.
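The alternation of a symplectic update with a proximal dissipative step can be sketched on a hand-written toy system. The example below is not the learned MCFM model: it assumes a damped harmonic oscillator with energy 0.5 * (q^2 + p^2) and linear friction on the momentum, uses the exact rotation as the symplectic substep (possible only because the toy system is linear), and uses the proximal map of (gamma/2) * p^2 as the metric substep. All names and parameter values are illustrative.

```python
import math

def prox_damping(p, gamma, h):
    """Proximal (implicit Euler) step for dp/dt = -gamma * p over step h."""
    return p / (1.0 + gamma * h)

def rotate(q, p, h):
    """Exact Hamiltonian flow of H = 0.5 * (q^2 + p^2); conserves H."""
    c, s = math.cos(h), math.sin(h)
    return q * c + p * s, -q * s + p * c

def strang_prox_step(q, p, gamma, h):
    """Strang splitting: half dissipative step, full conservative step, half dissipative step."""
    p = prox_damping(p, gamma, 0.5 * h)
    q, p = rotate(q, p, h)
    p = prox_damping(p, gamma, 0.5 * h)
    return q, p

def energy(q, p):
    return 0.5 * (q * q + p * p)

q, p = 1.0, 0.0
energies = [energy(q, p)]
for _ in range(400):
    q, p = strang_prox_step(q, p, gamma=0.2, h=0.05)
    energies.append(energy(q, p))
```

Because the conservative substep preserves the energy and each proximal substep can only shrink the momentum, the recorded energy sequence is non-increasing up to floating-point rounding, which is the discrete decay property the splitting is meant to provide.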
The Hamilton-Jacobi skeleton, also known as the medial axis, is a powerful shape descriptor that represents binary objects in terms of the centres of maximal inscribed discs. Despite its broad applicability, the medial axis suffers from sensitivity to noise: Minor boundary variations can lead to disproportionately large and undesirable expansions of the skeleton. Classical pruning methods mitigate this shortcoming by systematically removing extraneous skeletal branches. This sequential simplification of skeletons resembles the principle of sparsification scale-spaces that embed images into a family of reconstructions from increasingly sparse pixel representations. We combine both worlds by introducing skeletonisation scale-spaces: They leverage sparsification of the medial axis to achieve hierarchical simplification of shapes. Unlike conventional pruning, our framework inherently satisfies key scale-space properties such as hierarchical architecture, controllable simplification, and equivariance to geometric transformations. We provide a rigorous theoretical foundation in both continuous and discrete formulations and extend the concept further with densification. By growing the skeleton successively instead of shrinking it, we allow inverse progression from coarse to fine scales. Densification scale-spaces can even reach beyond the original skeleton to produce overcomplete shape representations with relevancy for practical applications. Through proof-of-concept experiments, we demonstrate the effectiveness of our framework for practical tasks including robust skeletonisation, shape compression, and stiffness enhancement for additive manufacturing.
Robust and generalizable segmentation of brain tumors on multi-parametric magnetic resonance imaging (MRI) remains difficult because tumor types differ widely. The BraTS 2025 Lighthouse Challenge benchmarks segmentation methods on diverse high-quality datasets of adult and pediatric tumors: multi-consortium international pediatric brain tumor segmentation (PED), preoperative meningioma tumor segmentation (MEN), meningioma radiotherapy segmentation (MEN-RT), and segmentation of pre- and post-treatment brain metastases (MET). We present a flexible, modular, and adaptable pipeline that improves segmentation performance by selecting and combining state-of-the-art models and applying tumor- and lesion-specific processing before and after training. Radiomic features extracted from MRI help detect the tumor subtype, supporting more balanced training. Custom lesion-level performance metrics determine the influence of each model in the ensemble and optimize the post-processing that further refines the predictions, enabling the workflow to tailor every step to each case. On the BraTS testing sets, our pipeline achieved performance comparable to top-ranked algorithms across multiple challenges. These findings confirm that custom lesion-aware processing and model selection yield robust segmentations without locking the method to a specific network architecture. Our method has the potential for quantitative tumor measurement in clinical practice, supporting diagnosis and prognosis.
Uncrewed aerial vehicles (UAVs) play an important role in the low-altitude economy and are used in various applications. However, with the increasing number of UAVs and the explosive growth of wireless data, existing bit-oriented communication networks are approaching the Shannon capacity and cannot satisfy the quality of service (QoS) with ultra-reliable low-latency communication (URLLC) requirements for command and control (C\&C) transmission. To address this issue, we propose a novel semantic-aware C\&C transmission scheme for multiple UAVs under limited wireless resources. Specifically, we leverage semantic similarity to measure the variation in C\&C messages for each UAV over continuous transmission time intervals (TTIs) and capture the correlation of C\&C messages among UAVs, enabling multicast transmission. Based on the semantic similarity and the importance of UAV commands, we design a trigger function to quantify the QoS of UAVs. Then, to maximize the long-term QoS and exploit multicast opportunities of C\&C messages induced by semantic similarity, we develop a proximal policy optimization (PPO) algorithm to jointly determine the transmission mode (unicast/multicast/idle) and the allocation of limited resource blocks (RBs) between a base station (BS) and UAVs. Experimental results show that our proposed semantic-aware framework significantly increases transmission efficiency and improves effectiveness compared with bit-oriented UAV transmission.
While music generation models have evolved to handle complex multimodal inputs mixing text, lyrics, and reference audio, evaluation mechanisms have lagged behind. In this paper, we bridge this critical gap by establishing a comprehensive ecosystem for music reward modeling under Compositional Multimodal Instruction (CMI), where the generated music may be conditioned on text descriptions, lyrics, and audio prompts. We first introduce CMI-Pref-Pseudo, a large-scale preference dataset comprising 110k pseudo-labeled samples, and CMI-Pref, a high-quality, human-annotated corpus tailored for fine-grained alignment tasks. To unify the evaluation landscape, we propose CMI-RewardBench, a unified benchmark that evaluates music reward models on heterogeneous samples across musicality, text-music alignment, and compositional instruction alignment. Leveraging these resources, we develop CMI reward models (CMI-RMs), a parameter-efficient reward model family capable of processing heterogeneous inputs. We evaluate their correlation with human judgment scores on musicality and alignment on CMI-Pref along with previous datasets. Further experiments demonstrate that CMI-RM not only correlates strongly with human judgments, but also enables effective inference-time scaling via top-k filtering. Code is available at GitHub (this https URL). Model weights: CMI-RM (this https URL). Datasets: CMI-Pref-Pseudo (this https URL) and CMI-Pref (this https URL)
In this work, we address the need for efficient and formally stable Recurrent Neural Networks (RNNs) in environments with limited computational resources by analyzing the stability of the Minimal Gated Unit (MGU) network, a lightweight alternative to common gated RNNs used in system identification. We derive sufficient parametric conditions for the MGU network's input-to-state stability and incremental input-to-state stability properties. These conditions enable a-posteriori validation of model stability and form the basis for novel stability-promoting training methodologies, including a warm-start of the network's parameters and a projected gradient-based optimization scheme, both of which are presented in this work. Comparative evaluation, including robustness analysis and validation on synthetic and real-world data (i.e., the Silverbox benchmark), demonstrates that the minimal gated unit network successfully combines formal stability guarantees with superior parameter efficiency and faster inference times compared to other state-of-the-art recurrent neural networks, while maintaining comparable and satisfactory accuracy. Notably, the results attained on the Silverbox benchmark illustrate that the stable MGU network effectively captures the system dynamics, whereas other stable RNNs fail to converge to a reliable model.
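For readers unfamiliar with the cell, the sketch below writes out one common formulation of the Minimal Gated Unit: a single forget gate f, a candidate state computed from the gated previous state, and a convex combination of the two. The parametrization analyzed in the paper may differ in detail, and the weights and inputs here are arbitrary illustrative numbers.

```python
import math

def matvec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def mgu_step(h, x, Wf, Uf, bf, Wh, Uh, bh):
    """One MGU update (assumed standard form):
    f = sigmoid(Wf x + Uf h + bf)
    c = tanh(Wh x + Uh (f * h) + bh)
    h_next = (1 - f) * h + f * c
    """
    f = [1.0 / (1.0 + math.exp(-(a + b + c)))
         for a, b, c in zip(matvec(Wf, x), matvec(Uf, h), bf)]
    gated = [fi * hi for fi, hi in zip(f, h)]
    cand = [math.tanh(a + b + c)
            for a, b, c in zip(matvec(Wh, x), matvec(Uh, gated), bh)]
    return [(1.0 - fi) * hi + fi * ci for fi, hi, ci in zip(f, h, cand)]

# Two hidden units, one input channel, hypothetical weights.
Wf, Uf, bf = [[0.5], [-0.3]], [[0.2, -0.1], [0.4, 0.3]], [0.0, 0.1]
Wh, Uh, bh = [[1.0], [0.7]], [[0.6, -0.5], [0.2, 0.9]], [0.0, -0.2]

h = [0.0, 0.0]
trajectory = []
for x_t in [1.0, -2.0, 5.0, 0.0, -10.0, 3.0]:
    h = mgu_step(h, [x_t], Wf, Uf, bf, Wh, Uh, bh)
    trajectory.append(h)
```

Since each new state is a convex combination of the previous state and a tanh output, the state stays inside [-1, 1] whenever it starts there, regardless of the input magnitude. This boundedness is a basic property of the cell and is weaker than the input-to-state stability conditions derived in the paper.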
As autonomous and agentic AI systems scale in robotic and human-machine environments, managing hallucination and persistent but unjustified action remains an open challenge. Rather than attributing these failures solely to model or alignment limitations, this paper explores the architectural vulnerability of unbounded autonomy - the presumption that an agent should continue operating regardless of rising uncertainty. It introduces a theory of managed autonomy that defines intelligent behavior through the formal capacity to detect epistemic drift, suspend reasoning, attempt recovery, and ultimately surrender control when reliability diminishes. We instantiate this theory via the SMARt (Self-Managing Multi-tier Autonomous Reasoning with Regulated/Revoked transitions) model, a four-layer framework featuring Stable, Meta-cognitive, Assisted, and Regulated states. By developing a timed, guarded Petri net formulation, we establish theoretically bounded properties for the system, demonstrating how architecture can formally mandate escalation, constrain invalid outputs, and ensure governance reachability under specified conditions. We further analyze how incorporating domain-specific trigger sets across varied operational settings (e.g., healthcare, robotics, etc.) can systematically preserve safety, assuming completeness and soundness criteria are met. Because these triggers are designed to be adaptive, the SMARt model accommodates the safe, controlled expansion of an agent's operational scope over time. We conclude that formalizing failure management within the autonomy lifecycle is a crucial step toward realizing reliable and governed artificial intelligence.
Speech Large Language Models (SLLMs) underperform their text counterparts on complex reasoning. We reveal that this gap is not a uniform cognitive deficit. Evaluating two architecturally diverse SLLMs, we show speech-to-text (S2T) matches or exceeds text-to-text (T2T) on spatial, syntactic, and factual tasks. Yet on logical tasks requiring entity tracking, S2T accuracy collapses to chance. We diagnose this as an entity binding failure: continuous speech features blur precise entity-property associations during implicit reasoning. To validate this diagnosis, we introduce Entity-Aware Chain-of-Thought (EA-CoT), a lightweight inference-time intervention forcing SLLMs to enumerate entities and bind them to claims before reasoning. EA-CoT bridges the gap, even when spoken names are misrecognized, yielding up to a 24.4 percentage-point accuracy gain. Ablations confirm the gains stem from explicit semantic binding, reframing the gap as an elicitation failure rather than a missing capability.