New articles on Electrical Engineering and Systems Science


[1] 2606.27405

Automated brain tumor detection in MRI images using CNN and ResNet architectures

Deep learning has shown significant potential in medical image analysis, particularly for disease detection using MRI scans. Accurate and early diagnosis of brain tumors remains challenging due to the complexity of brain structures and reliance on manual interpretation. This work presents an automated deep learning-based approach for brain tumor detection from MRI images using Convolutional Neural Networks and Residual Networks. Transfer learning is applied with two pretrained architectures, ResNet18 and ResNet50, to classify MRI scans into tumor and non-tumor categories. Experiments are conducted on a dataset of 3,929 brain MRI images, evaluating the impact of model depth and fine-tuning strategies. The results show that ResNet18 achieves a higher accuracy of 97% compared to 96% for ResNet50, demonstrating better generalization on limited medical data. The proposed framework enables fast, accurate, and cost-effective brain tumor detection, supporting early diagnosis and clinical decision-making.


[2] 2606.27410

DFM: Difference Feature Modeling with Text-Guided Gated Contrastive Loss for Remote Sensing Image Change Captioning

The primary goal of Remote Sensing Image Change Captioning (RSICC) is to automatically generate descriptions of changes between remote sensing images captured at different time points. Existing models still rely on a single autoregressive generation paradigm, which tends to prioritize learning easily generated vocabulary over capturing discriminative differences between images. To address this, we reframe the training paradigm and propose a novel Difference Feature Modeling (DFM) framework. Specifically, we introduce a Text-guided Gated Contrastive Loss (TGCL) to guide the vision encoder to extract critical features from a text-modal perspective. Additionally, we incorporate a pre-trained Change Detection model to transfer stable change detection knowledge. In order to further enhance the representation, we design a Joint Feature Modeling (JFM) module to achieve the fusion of multi-scale difference representations, thereby capturing comprehensive spatiotemporal variations between multi-temporal images. Extensive experiments on multiple datasets demonstrate the effectiveness of our approach.


[3] 2606.27461

Access Selection for Finite-SNR Modal Recoverability in Sampled-Wave Receivers

Large-aperture wave receivers can contain a large number of candidate sensor locations, antenna ports, or measurement blocks, while hardware and processing constraints allow only a subset to be activated. In this paper, receiver selection is formulated as a finite-SNR modal-recoverability problem, where the selected subset is required to support reliable recovery in every direction of a prescribed modal subspace. However, large trace, log-det, or codebook-distance values alone do not ensure that every prescribed modal direction is resolved. Specifically, the proposed framework consists of three nested recoverability criteria for individual modal degrees, a joint target subspace, and target modes in the presence of active non-target modes. The third criterion, the Schur-complement information floor, provides an exact worst-direction posterior-error interpretation. We further show that stricter recoverability criteria require at least as many active receiver accesses and derive tests that identify reliability targets that remain unattainable even when all available accesses are activated. Next, we specialize the framework to finite spherical-wave sampling and compare greedy receiver-selection rules. Numerical results demonstrate that global log-det is generally more access-efficient at moderate reliability floors, whereas Schur-based selection is more effective at stringent floors. While this paper is motivated by holographic and XL-MIMO receivers, the framework can be applied to general sampled wave systems.
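A greedy log-det selection rule of the kind compared above can be sketched as follows. This is only an illustration under stated assumptions: the measurement rows are real-valued, the information matrix is taken as I + snr * sum of a_i a_i^T over the active accesses, and all function names are hypothetical rather than taken from the paper.

```python
import math

def log_det_spd(m):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(m)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(n))

def information_matrix(rows, selected, snr):
    """Assumed model: I + snr * sum_i a_i a_i^T over the selected accesses."""
    n = len(rows[0])
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for idx in selected:
        a = rows[idx]
        for i in range(n):
            for j in range(n):
                m[i][j] += snr * a[i] * a[j]
    return m

def greedy_log_det(rows, budget, snr=1.0):
    """Add, one at a time, the access that most increases the log-det."""
    selected = []
    for _ in range(min(budget, len(rows))):
        best_idx, best_val = None, -math.inf
        for idx in range(len(rows)):
            if idx in selected:
                continue
            val = log_det_spd(information_matrix(rows, selected + [idx], snr))
            if val > best_val:
                best_idx, best_val = idx, val
        selected.append(best_idx)
    return selected

# Three candidate accesses observing a two-dimensional modal subspace.
candidates = [[2.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
chosen = greedy_log_det(candidates, budget=2)
```

In this toy case the rule picks the strongest access first and then the one covering the second modal direction. As the abstract notes, a large log-det value alone does not certify that every prescribed direction is resolved, so a sketch like this would still need a separate worst-direction check.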


[4] 2606.27464

Comparison of Non-Deterministic Nonlinear Systems

We characterize a notion of system comparison, termed $(T_e,\gamma,\delta)$-similarity, for non-deterministic nonlinear systems. Building on a similar notion recently proposed for stable linear systems, the proposed notion characterizes the dissimilarity between the outputs, measured using the $L_2$ norm, of two nonlinear dynamical systems in terms of their inputs and disturbances. By relating $(T_e,\gamma,\delta)$-similarity to differential dissipativity, we establish equivalence between the $(T_e,\gamma,\delta)$-similarity of nonlinear systems and the $(T_e,\gamma,\delta)$-similarity of their differential dynamics. We characterize the $(T_e,\gamma,\delta)$-similarity for nonlinear systems as a Linear Matrix Inequality feasibility problem and also provide necessary and sufficient conditions for solving this feasibility problem. We demonstrate the utility of the proposed notion through its use in two applications: (i) robust hierarchical control applied to a planar aircraft and (ii) the improvement (or design) of abstract models applied to the Moore-Greitzer model and an electronic circuit.


[5] 2606.27471

Electronically Reconfigurable Pinching Antennas for Millimeter-Wave Communication in LoS and NLoS Environments

This letter presents the design and operation of an electronically reconfigurable pinching antenna (E-pinching antenna) and examines its capability to establish controllable millimeter-wave links that can circumvent blockages. The antenna consists of a low-loss rectangular dielectric waveguide that leaks energy through modular varactor-loaded elements to form tunable radiation points. A copper reflector ensures unidirectional radiation, thereby boosting forward-link efficiency and spatial selectivity. Full-wave simulations demonstrate that the radiated power and transmission coefficient to a receiving patch antenna can be dynamically tuned by adjusting the varactor capacitances. A multi-user scenario is investigated by activating two radiation modules along the waveguide to serve spatially separated receiving antennas isolated by a metallic partition (blockage). Simulation results confirm the capability to establish links to both LoS and NLoS users with minimal propagation losses. The proposed architecture enables a scalable and electronically controllable distributed antenna platform for reconfigurable wireless systems with enhanced blockage mitigation.


[6] 2606.27542

Threshold Optimization and Dynamic Adaptation of Distributed Optimal Power Flow in 5G Networks

In this paper, we present an experimental evaluation of the Alternating Direction Method of Multipliers (ADMM), a widely used technique for the distributed optimization of power distribution networks. The study focuses on how real 5G communication performance affects ADMM on a fully experimental platform that features commercial 5G connectivity and real-time control. The ADMM-based Distributed Optimal Power Flow (DOPF) problem is solved using the IEEE 123-bus unbalanced distribution feeder subdivided into five areas, each managed by a local controller implemented on a Raspberry Pi. To mitigate the impact of communication network variability, we propose a delay threshold-based mechanism that yields a 7.75% reduction in convergence time compared to a no-threshold baseline. We also devise a policy that dynamically updates the threshold value based on communication and computation conditions, achieving a 26.42% reduction in convergence time compared with the static optimal threshold. These results demonstrate the potential of adaptive, communication-aware control strategies for real-world Smart Grid (SG) deployments.


[7] 2606.27642

Real-Time State Estimation in Smart Grids over 5G Networks: Experimental Validation Using Raspberry Pis and Typhoon HIL

Reliable, low-latency communication is critical for real-time monitoring and control in modern Smart Grids (SGs). The emergence of 5G networks, with enhanced reliability, significantly lower latency, and native support for massive machine-type communication, offers strong potential to enable advanced grid applications such as state estimation (SE) and fault detection. While existing studies investigate 5G for SG use cases, most rely on simulations or analytical models; experimental validation using real hardware and SG data remains limited. This paper fills this gap by presenting a fully experimental validation of real-time SE over a commercial 5G network using a multi-node testbed built with Raspberry Pi (RPi)-based SG nodes and a Typhoon Hardware-in-the-Loop (HIL) real-time simulator. We first characterize 5G communication performance using simulated SG data under varying reporting rates and deployment environments by evaluating Key Performance Indicators (KPIs) such as end-to-end delay, jitter, and frame loss. Experimental results show that the worst-case mean delay observed for 5G is approximately 6.5x lower than that of our previous LTE Cat-M study at the corresponding reporting rate. We then stream real-time voltage, current, and phase-angle measurements, generated by an IEEE 4-node feeder model in the Typhoon HIL simulator, to a remote Phasor Data Concentrator (PDC) for SE and fault detection. Results demonstrate that 5G-enabled measurements support accurate SE under both steady-state and dynamic load variations. Furthermore, fault-detection experiments confirm reliable and prompt fault detection, with detection delays as low as 0.80 s.
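The KPIs named above (end-to-end delay, jitter, frame loss) could be computed from timestamped frames roughly as follows. This is a minimal sketch, not the authors' measurement code: the dictionary layout, the function name, and the jitter definition (mean absolute change between consecutive delivered-frame delays) are assumptions chosen for illustration.

```python
from statistics import mean

def link_kpis(sent, received):
    """Compute simple link KPIs.

    sent:     {sequence_number: send_time_s}
    received: {sequence_number: receive_time_s}, lost frames are absent
    """
    delivered = sorted(seq for seq in sent if seq in received)
    delays = [received[seq] - sent[seq] for seq in delivered]
    # Assumed jitter definition: mean absolute difference between the delays
    # of consecutive delivered frames.
    jitter_samples = [abs(b - a) for a, b in zip(delays, delays[1:])]
    return {
        "mean_delay_s": mean(delays) if delays else None,
        "jitter_s": mean(jitter_samples) if jitter_samples else None,
        "frame_loss": 1.0 - len(delivered) / len(sent) if sent else None,
    }

# Four frames reported every 100 ms; frame 2 never arrives.
sent = {0: 0.00, 1: 0.10, 2: 0.20, 3: 0.30}
received = {0: 0.02, 1: 0.13, 3: 0.34}
kpis = link_kpis(sent, received)
```

Other jitter definitions exist (for example, the smoothed interarrival estimator used in RTP), so reported numbers are only comparable when the same definition is applied.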


[8] 2606.27649

LightFARM: Model Predictive Lighting Control with Battery-Free IoT for Energy-Efficient Indoor Farming

Lighting is the dominant energy load in indoor farming, yet most deployed systems still rely on fixed rule-based or schedule-based control. We present LightFARM, a predictive lighting control framework that couples crop illumination with battery-free sensing for more energy-efficient indoor farming. LightFARM combines finite-horizon predictive control with compact models of photosynthesis, thermal dynamics, and sensor energy state. The controller adjusts lighting intensity to balance photosynthetic benefit, electrical power consumption, thermal safety, and sensing-energy feasibility. A key design feature is that the same light-emitting diode (LED) fixtures serve both as the photosynthetic light source for crops and as a controllable energy source for self-powered sensor nodes. We implement LightFARM in a real indoor basil cultivation system and evaluate it through two independent 12-day cultivation trials. Compared with a conventional rule-based baseline, LightFARM reduces lighting energy consumption by approximately 41% and improves energy productivity from 36.1 to 52.9 $\mathrm{g\,kWh^{-1}}$ and from 41.1 to 60.2 $\mathrm{g\,kWh^{-1}}$ ($\approx 46.5\%$ on average). These results suggest that energy-cooperative predictive lighting control is a promising approach to improving indoor farming efficiency under practical resource constraints, while explicitly accounting for the trade-off between energy savings and crop yield.
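The average productivity gain quoted above can be checked directly from the two reported trial pairs. The variable names below are illustrative only.

```python
# (baseline, LightFARM) energy productivity in g/kWh for the two trials.
trials = [(36.1, 52.9), (41.1, 60.2)]

# Relative improvement per trial, in percent.
gains = [(new - base) / base * 100.0 for base, new in trials]

# Mean of the two per-trial improvements.
average_gain = sum(gains) / len(gains)
```

Both trials improve by roughly the same relative amount, which is why the average sits close to each individual figure.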


[9] 2606.27656

Characterizing Driver Interactions with Autonomous Vehicles via Response Maps

Understanding human responses to autonomous vehicle (AV) behaviors is essential for socially aware interaction and socially compatible navigation in shared traffic environments. We characterize human driving responses in interactions with AVs as feedback laws over the coupled state space of the human-driven vehicle and the AV. We model the human driver's actions using a response map, a concept rooted in game theory, and employ a linear representation to capture driver behaviors as a function of AV behaviors, based on empirical data from a driving simulator study. Our results show that 1) human driver acceleration behavior can be captured using response maps, and 2) human driver responses differ significantly across AV behaviors that are yielding, non-yielding, or responsive to the human driver.


[10] 2606.27712

A Bi-Layer TSN Formulation for Separable Scheduling of Mobile Emergency Resources

Separable scheduling unleashes the deployment flexibility of mobile emergency resources by dispatching carriers and functional modules separately yet in a coordinated manner, offering a promising avenue to enhance power system resilience. However, this flexibility induces a distinct carrier-supported module routing structure, where non-self-mobile modules must be routed through compatible carrier movements. The resulting carrier-module spatio-temporal coupling makes exact and tractable optimization challenging. This letter identifies this structure and develops a novel exact bi-layer time-space network formulation as a mixed-integer linear program. The proposed formulation represents carrier and module trajectories as interacting network flows and enforces their support relations through explicit arc-level coupling. Compared with the prior logic-based model, the proposed formulation preserves exactness while improving modeling flexibility by eliminating mandatory post-arrival dwelling. Numerical studies validate its correctness and demonstrate substantial computational advantages.


[11] 2606.27719

Bearing-based Circumnavigation with Collision Avoidance in Time-varying Graphs under Limited Target Information

In this paper, we study distributed circumnavigation of a stationary target by a heterogeneous team of agents. Each agent is modelled as a disk rather than a point mass to account for its physical dimensions. The target location is assumed to be accessible only to a small subset of agents, called leaders. The rest, called followers, therefore use only local information available from their designated out-neighbour in the interaction graph characterised by the selection of nearest neighbours. By controlling only angular speeds, we develop a distributed guidance law to circumnavigate a stationary target. The proposed guidance law works for both static and time-varying interaction graphs. Inter-agent collision avoidance is enforced through a logarithmic Barrier Lyapunov Function (BLF), which guarantees forward invariance of the collision-free set. We show that every follower converges to circumnavigation about the same target as the leader at the end of its directed path in the interaction graph, provided the initial conditions are admissible. Numerical simulations illustrate the effectiveness of the proposed method for both static and time-varying topologies.
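For orientation, a commonly used logarithmic barrier Lyapunov function for a constrained scalar quantity is shown below. The symbols $z$ (constrained error) and $k$ (constraint bound) are generic placeholders; the abstract does not give the paper's exact form, which would presumably be built from inter-agent distances and disk radii.

```latex
V(z) = \frac{1}{2}\,\ln\!\left(\frac{k^{2}}{k^{2}-z^{2}}\right),
\qquad |z| < k .
```

Because $V(z) \to \infty$ as $|z| \to k$, keeping $V$ bounded along trajectories keeps $z$ strictly inside the constraint, which is the mechanism behind forward invariance of a collision-free set.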


[12] 2606.27781

Repair-before-veto control for safe lithium-ion fast charging under unknown ambient and cooling-fault conditions

Fast charging is decisive for electric-vehicle adoption, but field chargers are deployed as one setting while the cell's true thermal state, ambient temperature, and cooling-system health are uncertain. A current that is safe for a healthy cell at room temperature can overheat the same cell when it is hot or its cooling is degraded. We formulate this as a single-setting, unknown-state safe-fast-charging problem and solve it with a margin-aware repair-before-veto controller (RACL-B). RACL-B requests an aggressive current and repairs it online to the tightest measured margin among terminal voltage, cell temperature, and negative-electrode lithium-plating overpotential, rather than committing to a fixed schedule or shutting charging down. We evaluate one deployed setting across nine conditions, spanning 10/25/40 $^\circ$C ambient temperature and 100/60/40\% cooling health, in a high-fidelity Doyle--Fuller--Newman model with partially reversible lithium plating and lumped thermal coupling. Under a strict 45.0 $^\circ$C peak-temperature audit, fixed and ambient-scheduled protocols overheat in five of nine conditions because neither observes hidden cooling degradation, and rigid protective shutdown fails to deliver the charge in every condition. RACL-B safely completes all nine conditions, is 37.9\% faster than the fastest fixed current safe across the whole envelope, produces the least plated lithium, and remains safe across thermal guard bands. The same margin-aware principle drives a transient-credit fault readout (CREST-B) that, on a real introduced-fault battery-pack dataset, gives the strongest learned sequence-to-global monitor for localizing cooling-fault onset under operating-condition shift. The framework provides a deployable thermal-safety guarantee for fast charging together with a margin-aware monitor for the same physical fault class.


[13] 2606.27800

Distributed Air-Gap Flux and Rotor-Current Fusion for Operating-Regime Identification in a 10-MW Kaplan Hydrogenerator

Reliable monitoring of hydroelectric generators requires descriptors that capture both electrical loading and electromagnetic field behavior. This work investigates operating-regime identification in the Porjus U9 10-MW Kaplan hydrogenerator using synchronized measurements from ten stator-mounted Hall probes and six rotor-current channels. Seven steady guide-vane-opening settings are considered, and each 300 s record is divided into 1 s windows. The resulting windows are represented by spatial Fourier descriptors of the circumferential air-gap field, probe-wise temporal flux indicators, and channel-wise RMS rotor-current features. Correlation analysis and principal component analysis are used to examine how the feature groups vary with the operating point, and Random Forest, radial-basis-function support vector classification, and multilayer perceptron models are evaluated for supervised identification of the guide-vane-opening state. The analysis shows that RMS rotor-current features mainly track the loading axis, while the magnetic-flux features reveal complementary information associated with spatial imbalance, waveform distortion, and weak low-frequency modulation. Spatial descriptors alone provide limited separability, yielding test accuracies below 27%, whereas rotor-current features alone reach about 84-85%. Combining flux and current information gives the most discriminative representation; the SVC-RBF model achieves 99.5% test accuracy and macro-F1 score. The results indicate that distributed air-gap magnetic sensing, when fused with rotor-current measurements, can support accurate and interpretable data-driven monitoring of Kaplan hydrogenerator operating regimes.


[14] 2606.27970

A Beamforming Microwave Interferometric Radiometer for High-resolution Passive Imaging: Concept, Modeling, and Preliminary Demonstration

High-resolution passive microwave imaging is important for numerical weather prediction, disaster monitoring, and oceanographic studies, but kilometer-level spatial resolution remains difficult to achieve because of aperture limitations and the high complexity of large interferometric arrays. This paper proposes a beamforming microwave interferometric radiometer (BF-MIR) for high-resolution passive microwave imaging. BF-MIR employs beamforming-capable antennas as interferometric elements in a large sparse array. The enlarged spatial-frequency sampling interval reduces the required number of elements and the cross-correlation burden, while a large aperture-to-sampling-interval ratio factor (ASRF) array design enables narrow-beam spatial filtering to suppress brightness temperature (TB) aliasing caused by spatial-frequency undersampling. In addition, beamforming enables dynamic beam steering across multiple pointing directions, thereby compensating for the limited instantaneous coverage of narrow beams. A beamforming interferometric imaging model is established, and the relationships among spatial resolution, radiometric sensitivity, and effective field of view are analyzed. An image-domain Shift-Accumulate method is further introduced to analyze aliasing, based on which an aliasing suppression strategy is developed. In addition, a three-element proof-of-concept prototype provides preliminary experimental validation of dynamic beam interferometric measurement and dynamic beam observation modes. These results indicate that BF-MIR is a promising architecture for future spaceborne high-resolution passive microwave imaging.


[15] 2606.28011

From Detection to Action: Using LLM Agents for Fault-Tolerant Control

We propose an agentic Large Language Model (LLM) framework for active Fault-Tolerant Control (FTC) that transforms fault detection outputs into constraint-aware recovery actions grounded in plant-specific knowledge. The approach couples (i) a multi-agent workflow that decomposes operator duties into monitoring, planning, action synthesis, simulation, validation, and reprompting; (ii) a Digital Process Plant Twin (DPPT) that exposes plant data, models, and a simulation service for pre-execution testing; and (iii) a Graph Retrieval-Augmented Generation (Graph RAG) layer built on the CPSMod ontology, which organizes plant knowledge (structure, function, hybrid dynamics, control context, and fault semantics) into a graph that supports relation-aware, multi-hop retrieval for the agents. Corrective actions are generated as minimal-risk state-machine recovery paths and corresponding discrete commands or continuous setpoint adaptations, then validated deterministically against interlocks, envelopes, and dynamic feasibility before any actuation. If no acceptable plan is found within a bounded time window, control is handed to a safety fallback. The framework is evaluated in simulation on two representative benchmarks: a discrete batch Mixing Module and a Continuous Stirred-Tank Reactor (CSTR) under closed-loop PID regulation. Results with lightweight LLMs (GPT-4o-mini and GPT-4.1-mini) show that semantically grounded agents can derive valid recovery decisions within latency budgets compatible with the respective process dynamics, demonstrating a practical pathway from detection to validated corrective action across both discrete and continuous FTC tasks.


[16] 2606.28023

Decentralized Stability of IBR-dominated Power Grids Using Block Diagonal Dominance

The growing penetration of inverter-based resources (IBRs) necessitates stability assessment methods that are scalable, decentralized, and model-agnostic. This paper develops a block diagonal dominance (BDD) criterion for decentralized small-signal stability of IBR-dominated power grids. The proposed approach forms the basis for an enhanced IBR connection compliance condition from a small-signal stability perspective that can be evaluated locally for IBRs to be connected to the grid. The proposed approach is shown to be much less conservative than strict diagonal dominance (SDD). Beyond mere stability, we ensure a minimum decay rate or maximum settling time for IBR-induced oscillations. Crucially, these are achieved without imposing restrictive assumptions on network or IBR models. The framework therefore offers a practical and theoretically grounded basis for decentralized stability certification of IBR-dominated power grids.
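As background, the classical notion of block strict diagonal dominance for a partitioned matrix $A = [A_{ij}]$ with nonsingular diagonal blocks can be written as below. This is the textbook definition, stated here as an assumption about what BDD generalizes; the paper's own criterion may differ, for example by being posed in the frequency domain or by including a decay-rate shift.

```latex
\left( \lVert A_{ii}^{-1} \rVert \right)^{-1}
\; > \; \sum_{j \neq i} \lVert A_{ij} \rVert
\qquad \text{for all } i .
```

When every block is $1 \times 1$ this reduces to strict diagonal dominance, $|a_{ii}| > \sum_{j \neq i} |a_{ij}|$. Each inequality involves only one subsystem's diagonal block and its coupling blocks, which is what allows such a condition to be checked locally.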


[17] 2606.28027

MLVC: Multi-platform Learned Video Codec for Real-World Deployment

Neural video codecs have surpassed classical codecs in coding efficiency but remain impractical for deployment due to cross-platform incompatibility and high computational cost. Existing quantization-based solutions fail to produce deterministic results across diverse hardware platforms, leading to catastrophic decoding failures. We introduce MLVC, a hardware-robust neural video codec designed for practical cross-platform inference. The key idea is to explicitly transmit scale parameters through the hyperprior, which guarantees entropy coding consistency across devices without requiring bit-exact arithmetic. While this increases bitrate overhead, we recover most of the coding efficiency through architectural improvements (gated memory, ReGLU activation), a long-term reference recovery mechanism, and domain-specific perceptual training. On the VCD video conferencing benchmark, MLVC achieves >70% BD-rate (MOS) improvement over hardware HEVC, the strongest deployable baseline, while reaching subjective quality competitive with DCVC-RT, which cannot operate across diverse platforms. Both the encoder and decoder run at 100 FPS on average on commodity NPUs from Apple, Intel, and Qualcomm. MLVC is the first neural video codec to combine competitive compression performance, real-time speed, and cross-platform robustness across diverse consumer devices, making it suitable for widespread deployment. Code will be released.


[18] 2606.28046

Optimized Beamforming and Bandwidth Allocation in Multi-Antenna UAV-Assisted Vehicular Networks

Ensuring reliable communication for mission-critical vehicles in dynamic environments with limited infrastructure is a significant challenge due to interference and spectrum scarcity. This paper investigates a UAV-assisted vehicular communication framework that leverages multi-antenna beamforming and dynamic bandwidth allocation to provide prioritized and interference-mitigated wireless links. Vehicles are classified according to their service priority, with each class assigned a distinct frequency band to reduce interference. Within each class, optimized beamforming further minimizes transmission overlap and enhances spectral efficiency. The optimization problem is solved using an alternating optimization framework, incorporating two beamforming strategies: one based on successive convex approximation (SCA) and the other derived in closed form. Numerical results indicate that the proposed scheme outperforms baseline approaches that optimize only bandwidth allocation or beamforming in terms of overall system performance. Among the two joint optimization methods, the closed-form solution achieves higher sum rates and generally requires less transmit power, while also exhibiting lower computational complexity compared with the SCA-based approach.


[19] 2606.28055

Effects of motion cueing on longitudinal acceleration perception in a driving simulator

The driveability of a new heavy-truck driveline is traditionally assessed using physical prototypes. Enabling early evaluation of the driving experience in a human-in-the-loop driving simulator using a virtual prototype has the potential to significantly improve development efficiency. To enable driveability assessment using a moving-base simulator, participants must be able to perceive small differences in longitudinal acceleration. The just-noticeable difference (JND) was therefore evaluated for two variants of the classical motion-cueing algorithm (MCA) tuned specifically for tip-in/launch tests and compared to a more general variant in a driving simulator with a long linear track. Psychometric functions were fitted to responses obtained using a weighted staircase procedure and analysed using a generalized linear model. No significant differences in JND were found between the motion cueing variants. The mean JND across all participants and MCA variants was 5.4%. The mean point of subjective equality in the JND experiment was -1.9%, suggesting that participants perceived the acceleration as higher in the second stimulus of a pair. In a subjective comparison, most participants preferred the motion cueing variants that were tuned for launch manoeuvres over the general variant.


[20] 2606.28114

Screening Matters: A Comparative Study of Conventional and Crowdsourced Listening Tests

Subjective evaluation remains the most reliable way of testing speech and audio coding techniques. Crowdsourcing the listening task is a cost-efficient and fast way of conducting this evaluation, but the quality of the results tends to be inferior to that of conventional listening tests done in the controlled environment of a laboratory. In this paper, classical and neural speech codecs are evaluated to compare P.808 against P.800 DCR tests. A statistical analysis is conducted to investigate the effectiveness of selected screening methods. The analysis shows that the crowdsourced evaluation can be improved by employing postscreening methods based on anchor ordering and rating span, and continuous screening methods like traps and gold standard questions, thus giving more value to the ratings obtained for the codecs under test. Based on these outcomes, a set of suitable screening methods is proposed for cost-effective, simplified, and bias-free enhancement of listening-test results.


[21] 2606.28143

Specification-aware Robustness Margins for Symbolic Controllers

We address the problem of robust controller synthesis for a class of linear temporal logic (LTL) specifications over families of perturbed systems using symbolic control techniques. Given a dynamical system, a specification, and a symbolic controller synthesized using the fixed-point algorithm of the specification, the objective is to find the maximal perturbation we can apply to the system while the system continues to satisfy the same specification under the same controller. We first provide general results, by demonstrating that controllers synthesized based on the symbolic model can be refined back to a perturbed version of the concrete system while preserving their correctness. Focusing on four fundamental temporal logic specifications, namely safety, reachability, persistence, and recurrence, we introduce a general measure of the maximal robustness margin. Then, for each class of specifications, we derive a customized version of the measure and establish the corresponding theoretical guarantees. Importantly, the robustness margin depends explicitly on the sequence of sets generated during the fixed-point computation, allowing for specification-dependent and less conservative bounds compared to generic abstraction-based approaches. The theoretical developments are illustrated on two examples, demonstrating the practical applicability and effectiveness of the proposed approach.
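The fixed-point computation referred to above can be illustrated for the safety case on a small finite symbolic model. The sketch below is a generic greatest-fixed-point iteration, not the paper's algorithm or its robustness-margin computation; the function name, the transition table, and the convention that an empty successor set means "input not enabled" are assumptions.

```python
def safe_controller(inputs, post, safe):
    """Greatest fixed point for a safety specification on a finite model.

    post(x, u) returns the set of possible successors of state x under
    input u; an empty set means u is not enabled at x (assumed convention).
    Returns the winning set, a controller map, and the sequence of iterates.
    """
    winning = set(safe)
    iterates = [set(winning)]
    while True:
        # Keep states with at least one input whose successors all stay inside.
        nxt = {
            x for x in winning
            if any(post(x, u) and post(x, u) <= winning for u in inputs)
        }
        if nxt == winning:
            break
        winning = nxt
        iterates.append(set(winning))
    controller = {
        x: [u for u in inputs if post(x, u) and post(x, u) <= winning]
        for x in winning
    }
    return winning, controller, iterates

# Hypothetical non-deterministic symbolic model with four states.
transitions = {
    (0, "a"): {0, 1}, (0, "b"): {1},
    (1, "a"): {0},    (1, "b"): {2},
    (2, "a"): {3},    (2, "b"): {2, 3},
}
post = lambda x, u: transitions.get((x, u), set())

winning, controller, iterates = safe_controller(["a", "b"], post, safe={0, 1, 2})
```

The list `iterates` is the sequence of sets produced by the iteration; the abstract indicates that the robustness margin is derived from such a sequence, though how it is derived is not shown here.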


[22] 2606.28163

Enhanced Neural Video Representation Compression across Extreme Complexity and Quality Scales

Implicit neural representations (INRs) have recently emerged as a promising approach to video compression, delivering competitive rate-distortion performance alongside rapid decoding. However, existing neural video codecs struggle to balance complexity and scalability. Lightweight models often suffer from degraded compression performance when scaled to different bitrate/quality levels, whereas high-performance models exhibit limited scalability, as their model complexity typically increases with quality. This lack of a unified architecture capable of maintaining consistent complexity across a wide range of bitrates severely limits their diverse real-world deployment. To address these challenges, we introduce NVRC++, a novel INR-based video codec that utilizes a lightweight INR with multiple high-resolution feature grids, providing high scalability at any given complexity level. This is paired with an optimization framework that enables efficient overfitting on high-resolution grids for long video sequences, thereby exploiting spatio-temporal redundancies without prohibitive computational or memory overhead. Additionally, an advanced entropy model is designed for efficiently compressing the high-dimensional grid parameters. As a result, NVRC++ provides four complexity levels (from 7kMACs/pixel to 360kMACs/pixel), each spanning wide bitrate and quality ranges while supporting real-time decoding. The experimental results show that NVRC++ offers a much faster decoding speed (up to 7.6x) compared to the SOTA INR-based video codec, NVRC, while delivering comparable performance.


[23] 2606.28202

An Enhanced Source-Free Unsupervised Domain Adaptation Framework for Cross-Dataset EEG Emotion Recognition via Predictive Coding and Test-Time Training

EEG-based emotion recognition is widely used in affective computing but suffers from poor generalization due to domain shifts caused by inter-subject variability, dataset differences, and recording conditions, especially in cross-dataset settings. Conventional unsupervised domain adaptation methods require source data, which is often unavailable due to privacy constraints. Although source-free UDA addresses this limitation, existing methods still struggle with large domain gaps, noisy pseudo-labels, and unstable adaptation. To address these challenges, we propose an enhanced source-free unsupervised domain adaptation (SF-UDA) framework for cross-dataset EEG emotion recognition. The framework introduces a non-contrastive predictive coding-based self-supervised pretraining strategy to learn robust and transferable EEG representations by modeling temporal dependencies in a reconstruction-based manner. During adaptation, we estimate target-domain structure through class-wise clustering and prediction disagreement, and optimize the model using a dual-stage strategy consisting of Multi-Loss Adaptive Regularization and Localized Consistency Learning, improving stability and neighborhood consistency under noisy pseudo-labels. We also propose a lightweight test-time training mechanism that enables selective online updates for uncertain samples using predictive reconstruction loss and entropy minimization. Experiments on DEAP, SEED, and DREAMER show consistent improvements over state-of-the-art SF-UDA methods, achieving 69.56% and 63.03% accuracy on SEED and DREAMER when trained on DEAP, and 61.38% and 68.90% when trained on SEED.


[24] 2606.28249

HPRO: Hierarchical Progressive Reward Optimization via Preference Extraction for Emotional Text-to-Speech

Recently, Large Language Model (LLM)-based Text-to-Speech (TTS) models have achieved remarkable naturalness. However, the standard Supervised Fine-Tuning paradigm often converges to statistically averaged prosody, limiting emotional expressiveness. While preference-driven optimization offers a promising alternative, existing approaches suffer from two structural mismatches: information conflict, where content and emotion in a shared latent space produce conflicting gradients, leading to reward hacking and semantic degradation; and scale gap, where sparse sentence-level rewards struggle to guide dense frame-level generation. To overcome these challenges, we propose HPRO, a hierarchical progressive reward optimization framework. Within HPRO, we introduce the HD-Emo codec as a novel differentiable reward model to resolve the information conflict. It extracts speech into distinct content and style preference tokens, structurally isolating emotional optimization from semantic content. Building upon this structured preference space, HPRO bridges the scale gap by progressively aligning frame-, word- and sentence-level objectives. Experiments demonstrate that HPRO significantly enhances emotional expressiveness, while effectively preserving linguistic intelligibility. The code and audio samples are publicly available at this https URL.


[25] 2606.28281

PAC-Bayesian Certificates for Quadratic Closed-Loop Control

PAC-Bayesian bounds provide finite-sample guarantees for data-dependent randomized predictors, but applying them to learning-based control is difficult because the natural objective is a quadratic trajectory cost. Such losses are unbounded, non-Lipschitz, and lead to response-dependent Chernoff terms. We employ System Level Synthesis parameterization, which exposes the closed-loop trajectory map of a linear system directly and makes the quadratic control loss amenable to explicit certification. Moreover, we provide a set of PAC-Bayes-Chernoff certificates for posterior distributions over feasible closed-loop responses. For Gaussian disturbance trajectories with arbitrary covariance, we derive an exact one-sided Gaussian transform and a tractable quadratic upper bound expressed through closed-loop sensitivity quantities. We also derive a posterior-localized surrogate for settings where pointwise closed-loop response certificates are unavailable or have support-related admissibility issues. Although PAC-Bayes certifies a non-degenerate posterior, the convex quadratic form of the SLS loss transfers the certificate to the posterior mean response. We present a deterministic mean response deployment result that is particularly suitable for control while retaining the stochastic posterior in the bound. Additionally, we provide a data-driven bound for this deployment, transitioning away from an oracle bound. Minimizing this bound naturally results in a learning algorithm for control selection from data. Numerical experiments on a double integrator show that the algorithm acts as a sensitivity-aware finite-sample regularizer, improving held-out cost and reducing closed-loop sensitivity in the low-data regime.


[26] 2606.27320

Elastic Time: Dynamic Frame Rate Bottlenecks for Neural Audio Coding

Neural audio autoencoders have become a core component of compression, feature extraction, and generation. However, while existing systems support variable bitrate, the vast majority of models still operate at a fixed latent frame-rate, allocating equal temporal budget to regions with very different information density, which can result in unnecessarily long sequences. We introduce Elastic Time, a dynamic frame-rate bottleneck that converts fixed-frame-rate autoencoders to dynamic ones. Our method learns a lightweight latent predictor used to decide which frames can be skipped and later reconstructed, enabling efficient greedy boundary selection at inference. Experiments show our method enables deployment-time rate control while improving efficiency-quality tradeoffs relative to baselines. Overall, we provide a flexible mechanism for adjusting temporal resolution in audio autoencoders, potentially facilitating more efficient downstream modeling for generation and long-context tasks.


[27] 2606.27409

Delayed Verification Destabilizes Multi-Agent LLM Belief: Instability Thresholds and Optimal Corrector Placement

Multi-agent large language model (LLM) systems often rely on verifier and critic agents to suppress hallucinations, but verification is delayed. During this delay, false claims can propagate through the agent network. We model this process as delayed consensus on a graph with grounded corrector nodes. Spectral decomposition by the grounded Laplacian yields a closed-form stability threshold for the verification dose: correction that is too strong or too delayed can turn consensus into oscillation. The most unstable regime occurs when the communication and verification delays coincide; for delay two, the threshold is the inverse golden ratio. The same framework gives a supermodular placement objective and a greedy (1-1/e)-approximation rule for assigning a limited corrector budget to influential nodes. Experiments across five open models confirm the predicted dose-delay oscillations. By contrast, grounded factual answering makes truth an absorbing boundary and eliminates the effect, suggesting that the instability is specific to signed-belief tasks while grounded verification remains stabilizing.
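The delay-two threshold can be checked numerically on a toy model. The sketch below assumes a scalar delayed recursion x[t+1] = x[t] - k * x[t-d], which is an illustrative stand-in and not the graph model of the paper; the function name `spectral_radius` and the gain symbol `k` are hypothetical. For d = 2 this recursion loses stability at k = (sqrt(5) - 1) / 2, the inverse golden ratio.

```python
import numpy as np

# Inverse golden ratio, roughly 0.618.
INV_GOLDEN = (np.sqrt(5.0) - 1.0) / 2.0


def spectral_radius(k: float, d: int) -> float:
    """Largest root modulus of z^(d+1) - z^d + k = 0.

    This is the characteristic polynomial of the assumed toy recursion
    x[t+1] = x[t] - k * x[t-d]; the recursion is stable when the
    returned value is below 1.
    """
    coeffs = np.zeros(d + 2)
    coeffs[0] = 1.0    # coefficient of z^(d+1)
    coeffs[1] = -1.0   # coefficient of z^d
    coeffs[-1] += k    # constant term (merges with -1 when d == 0)
    return float(np.max(np.abs(np.roots(coeffs))))


if __name__ == "__main__":
    for k in (0.60, INV_GOLDEN, 0.64):
        print(f"k = {k:.4f}  spectral radius = {spectral_radius(k, 2):.6f}")
```

Gains just below the threshold give a spectral radius under 1 (decaying oscillation), and gains just above it give a radius over 1 (growing oscillation).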


[28] 2606.27411

Compression-Driven Anomaly Detection in Brain MRI Using an Interpretable Quantum Autoencoder

We study a quantum autoencoder (QAE) for compression-driven anomaly detection in brain MRI data. The approach leverages angle encoding to map image patches into quantum states, followed by a variational encoder-decoder architecture trained to discard information via auxiliary trash qubits. Anomaly scores reflect the degree to which inputs resist compression relative to normal data, with higher scores corresponding to deviations from the learned normal manifold. Evaluated on publicly available brain MRI DICOM datasets, the method achieves a slice-level ROC-AUC of approximately 0.95 and a patch-level ROC-AUC of approximately 0.813, outperforming classical autoencoder and PCA baselines. Analysis of the learned parameters reveals a pronounced encoder-decoder asymmetry, where effective anomaly detection arises from structured information compression within the encoder rather than increased parameter magnitude or decoder expressivity. This results in a controlled compression-reconstruction trade-off with a clear operating regime that supports principled threshold selection. Qualitative evaluation further shows that the QAE produces spatially localized anomaly heatmaps aligned with tumorous regions. The results, supported by promising baseline performances, demonstrate that quantum autoencoders provide an interpretable and controllable mechanism for anomaly detection based on incompressibility with respect to a learned latent representation. This work highlights the potential of quantum autoencoders as a principled tool for studying compression dynamics in quantum machine learning, with promising implications for decision support in medical imaging workflows.


[29] 2606.27412

Not All Relations Rotate Alike: Transformation-Aware Decoupling for Viewpoint-Robust 3D Scene Graph Generation

3D Scene Graph Generation (3DSGG) represents 3D scenes as structured object-relation-object graphs, providing a compact relational abstraction for spatial understanding. In embodied intelligence settings, the same 3D scene may be observed by agents from viewpoints that differ by yaw rotations. However, current 3DSGG models often fail to produce relation predictions that follow the expected transformation behavior under such viewpoint shifts. This behavior reveals an empirical mismatch related to predicate-level transformation heterogeneity: directional predicates such as left, front, right, and behind should transform with the observation frame, whereas most contact, support, and semantic predicates such as standing on and attached to should remain stable. To reduce this mismatch, we propose Transformation-Aware Decoupling (TAD), a viewpoint-robust 3DSGG framework that decouples relation reasoning according to predicate transformation behavior and is supported by viewpoint-stable object representations. TAD decomposes relation reasoning into two parts: one learns cues that should stay stable across viewpoints, while the other learns directional cues that should change with the observation frame. The two parts are merged for standard multi-label predicate prediction. Transformation-specific descriptors and group-aware auxiliary supervision encourage the two branches to capture complementary relation cues. Extensive experiments on 3DSSG show that TAD achieves state-of-the-art robustness under yaw viewpoint changes without training-time rotation augmentation, while maintaining competitive performance under the standard benchmark. The project page is available at this https URL.


[30] 2606.27455

Directed Graph Topology Inference via Graph Filter Identification

We address the problem of inferring a directed network from nodal measurements generated by linear diffusion dynamics on the sought graph. Observations are modeled as the outputs of a graph convolutional filter, i.e., a polynomial (with unknown coefficients) of a local diffusion graph-shift operator encoding the latent graph topology, excited with an ensemble of independent graph signals with arbitrarily-correlated nodal components. Unlike prior efforts that considered undirected graphs and white signal excitations, here the graph-shift operator and the observations' covariance matrix are not simultaneously diagonalizable. In this challenging context, we first rely on measurements of the output signals along with prior statistical information on the inputs to identify the diffusion filter. This system identification problem involves solving a system of quadratic matrix equations, which we show is identifiable under spectral-diversity assumptions on the input covariances. For algorithmic purposes we recast it as a smooth quadratic minimization subject to Stiefel manifold constraints. Subsequent identification of the network topology given the graph filter estimate boils down to finding a sparse and structurally admissible shift that commutes with the given filter, thus forcing the latter to be a polynomial in the sought graph-shift operator. A joint graph filter and topology identification algorithm is also proposed, which alternates between the aforementioned steps in a mutually reinforcing fashion to offer improved sample complexity. Numerical tests corroborate the effectiveness of the proposed algorithms in recovering synthetic digraphs and in real-data case studies, and illustrate their potential utility in urban mobility analyses as well as portfolio optimization.


[31] 2606.27609

Training Observable Control Policies to Expose Agent State Through Actions

Physical or operational constraints often impose communications limitations on autonomous agents. Such limitations complicate monitoring or multiagent coordination. Even when strong communications are absent, some information may still be available. The remainder of the relevant agent state may be reconstructed via estimation. The actions taken by an agent are a potential source of information -- as the agent interacts with the environment, these actions may be observed even in the absence of explicit communication. We investigate using actions to estimate the state of an agent, using reinforcement learning to develop policies which make the estimation problem more tractable. Policy observability is encouraged through the training reward and is analyzed using simulation of the trained agent. In an aircraft tracking problem a policy with enhanced observability is found that has minimal impact on nominal task performance.


[32] 2606.27612

Enhancing Co-packaging Optics Enabled Silicon Photonics Security Assurance Hardware Fingerprinting

Silicon photonics enables integration of optical components using standard semiconductor processes, greatly improving data communication bandwidth and energy efficiency. However, photonic integrated circuits (PICs) face unique security challenges, such as counterfeit or tampering threats, that conventional electronic security methods do not address. We propose a novel hardware fingerprinting technique that embeds two-dimensional photonic crystal (PhC) patterns into the density control filler regions of a PIC. Each PhC pattern is designed to resonate at specific visible to near-infrared wavelengths, producing a distinctive optical signature (based on wavelength, polarization, and incident angle) for each device. Finite-difference time-domain (FDTD) simulation using ANSYS Lumerical is employed to optimize nanostructure dimensions and spacing so that each device's reflection/absorption spectrum contains unique narrowband peaks. No extra fabrication steps or materials are required beyond standard lithography, keeping costs low. The embedded nanostructures have sub-50nm precision, making forgery extremely difficult. Our method yields a high-resolution, scalable fingerprint for silicon photonic chips, enabling cost-effective device authentication and improved supply chain security.


[33] 2606.27629

Cross-Platform Chinese Offensive Comment Detection via Dual-Threshold Hard Example Mining

Cross-platform deployment of offensive comment detection for Chinese social media suffers from performance degradation. The paper proposes a dual-threshold hard example mining method to address this. First, the clean-Chinese-base RoBERTa is fine-tuned on COLD to establish a binary baseline for fair comparison. Second, a three-class fine-labeled test set covering Weibo, Xiaohongshu, Tieba, and Zhihu is constructed, domain distances from the source are quantified using Jaccard and Proxy-A Distance, and the degradation bottleneck of the baseline under domain shift is systematically revealed. Third, a dual-threshold hard example mining strategy is proposed: high- and low-confidence error-prone samples are filtered from unlabeled corpora by prediction confidence. The model is then fine-tuned a second time under implicit contexts with only a small set of manually labeled hard examples, realizing low-cost cross-platform domain adaptation. Experiments reveal significant performance gains of the optimized model across four platforms.
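The confidence-based filtering step can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: it reads "dual threshold" as keeping unlabeled samples whose top-class probability is either at or below a low threshold or at or above a high threshold, and the function name `dual_threshold_select` and the values 0.6 and 0.9 are hypothetical.

```python
import numpy as np


def dual_threshold_select(probs, low=0.6, high=0.9):
    """Split unlabeled samples into low- and high-confidence candidates.

    probs: array of shape (n_samples, n_classes) with class probabilities
           predicted by the baseline model.
    low, high: hypothetical confidence thresholds (assumptions).
    Returns two index arrays: samples with top-class confidence <= low,
    and samples with top-class confidence >= high. Both sets would then
    be sent for manual labeling to find the actual errors.
    """
    probs = np.asarray(probs, dtype=float)
    confidence = probs.max(axis=1)          # top-class probability per sample
    low_idx = np.flatnonzero(confidence <= low)
    high_idx = np.flatnonzero(confidence >= high)
    return low_idx, high_idx


if __name__ == "__main__":
    demo = [[0.55, 0.45], [0.95, 0.05], [0.75, 0.25]]
    low_idx, high_idx = dual_threshold_select(demo)
    print("low-confidence candidates:", low_idx.tolist())
    print("high-confidence candidates:", high_idx.tolist())
```

Samples in the middle band (here the third one) are left out, which keeps the manual labeling budget small.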


[34] 2606.27717

Do Speech Emphasis Models Generalize across Languages and Emotions?

Prosodic emphasis varies across languages, emotions, and speaking styles, yet existing emphasis detection models are largely trained and evaluated on monolingual neutral read speech. We introduce MMEE (Multilingual Multi-Emotion Emphasis), a corpus of 10,000 professionally recorded expressive utterances (14.13 hours) across 7 languages and 34 emotion/style categories, with three-level perceptual labels (10 annotations per sample). We benchmark two state-of-the-art architectures under monolingual, cross-lingual, multilingual, cross-emotion, cross-dataset, and data-scale settings. Monolingual models show limited zero-shot transfer, degrading across typologically distant languages, while multilingual training substantially improves robustness. Models transfer robustly between high- and low-arousal emotions; bidirectional transfer between synthetic and perceptual benchmarks suggests shared prosodic structure; and performance stays robust even at smaller training scales.


[35] 2606.27772

An Embedded Real-Time License Plate Recognition System for Complex Traffic Scenes

Vehicle license plate recognition is an integral component of intelligent transportation systems. In this work, we present an embedded real-time license plate recognition system customized for developing countries. We address the challenge of handling complex, unstructured traffic scenes with diverse vehicle types while implementing the system on an embedded platform for low-cost deployment. Our method consists of license plate detection on a multi-vehicle image, followed by character recognition on the detected license plates. Both steps use lightweight convolutional neural networks to balance accuracy and efficiency. We also introduce the SL-LPR dataset of Sri Lankan road images, which contains a variety of vehicle types and traffic conditions typically seen in developing countries. On this dataset, the license plate detection and character recognition models achieved 93.6% mAP and 87.88% accuracy, respectively, and were competitive against larger models on several public datasets. To achieve real-time performance in a resource-constrained embedded environment, we applied low-bitwidth quantization using the Brevitas library and implemented FPGA acceleration for the models using the FINN framework. The end-to-end system can operate at 11.5 FPS when implemented on the Xilinx Kria KV260 platform. These results demonstrate that our system is effective for real-time license plate recognition on an embedded device, even in complex traffic scenarios. The SL-LPR dataset is available for research use at: this https URL.


[36] 2606.27914

Drifting in the Future: Stabilizing Path Following Drifting on High-Latency Vehicle Systems

Autonomously controlling and handling a vehicle at and beyond its stability limit is a mathematically and computationally demanding task. Prior demonstrations of automated drifting have been limited to research platforms with instantaneous torque delivery and independently actuated wheels, leaving their applicability to production vehicles with actuator latencies and mechanically coupled axles uncertain. To overcome these issues, we design a predictor to compensate for powertrain delays, develop a revised control formulation to accommodate higher actuation latencies as well as a differential coupling on the driven axle, and introduce brake-based velocity stabilization. This paper presents the controller framework, the model extensions, and real-world experimental results. We observe that our controller enables a production sports car with a combustion engine to robustly sustain circular and figure-eight drifts, limiting lateral error to 1.1 m and sideslip overshoot to 0.06 rad despite actuator delays exceeding 250 ms, while mitigating oscillations and maintaining stable path and sideslip tracking. In conclusion, our results establish that autonomous drifting is feasible on production-ready vehicles, opening pathways to advanced safety systems capable of stabilizing cars in scenarios where traditional control fails.


[37] 2606.27965

Grammar-Guided Hierarchical Parsing for Long-form Audio Activity Recognition

Long-form audio exhibits an inherent hierarchy: fine-grained events form sub-activities, which in turn constitute higher-level activities. Prior work often models these levels separately, leading to cross-level inconsistencies and requiring supervision at multiple levels. We formulate the problem as hierarchical parsing from event-level evidence: given detected event segments with class posteriors, we infer an order-consistent Act-Sub-Event parse tree. We propose Hierarchical Activity Grammar, encoding hierarchical composition and temporal-order constraints, and perform grammar-guided decoding that combines event evidence with a grammar prior. This yields a temporally grounded parse tree from which sub-activity segmentation and activity classification are derived, without requiring sub-activity or activity labels for training. Experiments on the long-form MultiAct audio dataset demonstrate improved temporal-order consistency (Edit score) and interpretable hierarchies.


[38] 2606.28002

Dialogue to Detection: A Multimodal Hybrid NLP Pipeline for Insurance Fraud Detection

Insurance fraud imposes substantial financial losses and operational inefficiencies, raising premiums and impacting trust among legitimate policyholders. Early detection at first notice of loss (FNOL) remains a persistent challenge. Existing approaches rely largely on private, text-only datasets, limiting progress on multimodal methods that integrate linguistic, behavioural, and speaker-based indicators. We introduce a synthetic multimodal framework that replicates FNOL conditions. It generates agent-customer dialogue transcripts and two-speaker audio, and performs ASR and diarisation. Downstream modules combine NER, regex-based feature extraction, LLM-RAG retrieval, and speaker embeddings in a rule-based risk score to flag narrative reuse, structural inconsistencies, and cross-case voice repetition while balancing sensitivity and false positives. Dataset validation and component-level evaluations show stability and transfer potential, offering a reproducible baseline beyond text-only fraud detection.


[39] 2606.28032

A Flexible Encoding Model for Non-Unique Note Alignments

Symbolic music alignment links notes in a symbolic performance to their counterparts in a score. While existing alignment encoding formats provide unique correspondences between these notes, there are various musical practices and forms, such as practice repetitions in rehearsal and improvised realizations in basso continuo, that require a more flexible approach to encoding their alignments. In this paper, we propose a minimal, backward-compatible extension to the Match file format to support such non-unique and semantically complex alignments. We introduce two virtual pointer notes, virtual score notes and virtual performance notes, which allow multiple links between performance and score notes to be encoded. In addition, we expand the Match file's 'section' line to include semantically meaningful annotations of performance regions beyond score-indicated musical repetitions. We further demonstrate the utility of these extensions through two representative use-cases in piano rehearsal and basso continuo.


[40] 2606.28048

DG^VoiC: Speaker Clustering for Fraud Investigation under Real Call-Centre Conditions

Insurance fraud remains costly and operationally difficult, particularly in call-centre workflows where many customer interactions begin at FNOL. While recent fraud detection methods mainly rely on structured data, text, or images, repeated speaker identity across calls remains underused as an investigative signal. This paper presents DG^VoiC, a voice clustering framework for customer verification and cross-profile speaker linking on anonymised real call-centre audio. The approach combines sensitive information-aligned anonymisation, speech-focused preprocessing, sliding-window speaker embedding extraction, and cosine-similarity-based clustering to identify repeated speakers under real telephony conditions. The method was evaluated on 121 recordings, with a curated reference subset of 56 samples in 22 human-agreed speaker clusters used for validation. The best configuration achieved 96% AMI, 95% ARI, 98% completeness, 100% homogeneity, and 99% V-measure. These results show that speaker clustering can provide a strong additional signal for fraud investigation by helping analysts verify speaker consistency and surface repeated voices across customers.
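Cosine-similarity-based clustering of speaker embeddings can be sketched as below. This is a minimal illustration, not the DG^VoiC pipeline: it assumes one non-zero embedding vector per recording and links recordings whose cosine similarity reaches a threshold, taking connected components as clusters. The function name `cluster_by_cosine` and the threshold 0.7 are hypothetical.

```python
import numpy as np


def cluster_by_cosine(embeddings, threshold=0.7):
    """Group embeddings whose cosine similarity is at least `threshold`.

    embeddings: array of shape (n_recordings, dim), rows must be non-zero.
    threshold: hypothetical linking threshold (an assumption).
    Returns an integer label per recording; recordings connected through
    a chain of above-threshold similarities share a label.
    """
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)   # unit-normalise rows
    sim = x @ x.T                                      # pairwise cosine similarity
    n = len(x)
    labels = -np.ones(n, dtype=int)
    current = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        # Depth-first search over the similarity graph from this seed.
        labels[seed] = current
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(sim[i] >= threshold):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(int(j))
        current += 1
    return labels


if __name__ == "__main__":
    demo = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95]]
    print(cluster_by_cosine(demo).tolist())
```

Single-link grouping like this is simple but can chain distinct speakers together through borderline pairs, so the threshold would need tuning on a labelled reference subset.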


[41] 2606.28158

Recovering Sharp Conductivity Features in the Finite-Data Calderón Problem with Physics-Informed Neural Networks

Physics-informed neural networks (PINNs) have recently emerged as a promising framework for addressing the Calderón inverse problem from limited boundary data. In this work, we revisit neural Calderón inversion by introducing multiscale boundary excitations based on randomized wavelet functions and investigating the role of Fourier-feature encoding (FFE) for representing sharp conductivity variations. We propose a physics-informed reconstruction framework that represents the unknown conductivity and the associated family of electric potentials with separate neural networks conditioned on the applied boundary excitations. The governing elliptic PDE is enforced through physics-informed residuals, while finite Dirichlet-to-Neumann (DtN) data are incorporated through boundary losses. Using synthetic data from a finite-difference forward solver, we evaluate the method on conductivity fields with inclusions, sharp interfaces, smooth profiles, and heterogeneous media. Results show that the framework recovers dominant conductivity structures from finite boundary measurements with relative errors of approximately $3\%$ to $12\%$. We show that FFE improves the reconstruction of localized sharp features, particularly for inclusions and interfaces, but is not universally optimal, with raw-coordinate networks performing competitively for smoother fields. These results highlight coordinate representations and boundary excitation design as key factors in neural Calderón inversion.


[42] 2606.28225

Estimation--Prediction Tradeoff in Causal Probabilistic Temporal Graphs

Temporal link prediction is usually evaluated by predictive performance on unseen edges, but in probabilistic temporal graphs this criterion can conflate model error with irreducible uncertainty. We study this issue by characterising an inherent estimation--prediction tradeoff in binary logistic models, where regimes that maximise Fisher information and improve parameter recoverability are also those with the highest entropy, making individual predictions intrinsically harder even under perfect parameter recovery. We propose a probabilistic causal framework for generating temporal graphs with transient edges and known ground-truth causal structure, allowing temporal link prediction to be evaluated jointly with causal parameter recovery. For the proposed binary logistic parametrisation, we derive the Cramér--Rao bound and validate the tradeoff between parameter estimation error and irreducible predictive loss. Our results show that predictive accuracy alone may not reflect whether a model has learned the underlying causal mechanism, motivating benchmarks that distinguish reducible model error from intrinsic process uncertainty.
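The tradeoff can be seen in the simplest case of a single Bernoulli edge with probability p = sigmoid(theta). The sketch below is an illustration under that assumption, not the paper's temporal-graph model: the Fisher information with respect to the logit theta is p(1 - p), and both it and the Shannon entropy peak at p = 0.5. The function names are hypothetical.

```python
import numpy as np


def fisher_information(p):
    """Fisher information of one Bernoulli(p) draw w.r.t. its logit."""
    return p * (1.0 - p)


def binary_entropy(p):
    """Shannon entropy of Bernoulli(p) in nats (requires 0 < p < 1)."""
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


if __name__ == "__main__":
    p = np.linspace(0.01, 0.99, 99)
    p_best_fisher = p[np.argmax(fisher_information(p))]
    p_best_entropy = p[np.argmax(binary_entropy(p))]
    # Both maxima sit at the same edge probability: the regime that is
    # most informative about theta is also the hardest to predict.
    print(f"Fisher information peaks at p = {p_best_fisher:.2f}")
    print(f"Entropy peaks at p = {p_best_entropy:.2f}")
```

Near p = 0 or p = 1 the outcome is easy to predict but each observation carries little information about theta, which is the other side of the same tradeoff.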


[43] 2211.01720

Response time central-limit and failure rate estimation for stationary periodic rate monotonic real-time systems

Real-time systems consist of a set of tasks, a scheduling policy, and a system architecture, all constrained by timing requirements. Many everyday embedded systems, within devices such as airplanes, cars, trains, and space probes, operate as real-time systems. To ensure safe failure rates, response times (the time required for the execution of a task) must be bounded. Rate Monotonic real-time systems prioritize tasks according to their arrival rate. This paper focuses on the use of the central limit of response times built in \cite{zagalo2022} and an approximation of their distribution with an inverse Gaussian mixture distribution. The distribution parameters and their associated failure rates are estimated through a suitable re-parameterization of the inverse Gaussian distribution and an adapted Expectation-Maximization algorithm. Extensive simulations demonstrate that the method is well-suited for the approximation of failure rates. We discuss the extension of this method to a chi-squared independence test adapted to real-time systems.


[44] 2412.17100

TG-OT: Topology-guided CCTA-IVUS registration via optimal transport matching

Registering coronary CT angiography (CCTA) and intravascular ultrasound (IVUS) enables comprehensive coronary analysis that neither modality can provide alone, yet their fusion remains challenging due to differences in imaging geometry, resolution, and artifact profiles. Existing methods depend on pre-computed lumen or vessel wall segmentations that are unreliable under IVUS acoustic shadowing from calcifications, limiting their clinical applicability. We propose TG-OT, a fully automatic CCTA-IVUS registration framework that eliminates this dependency by integrating trained feature detectors directly into the registration pipeline. Lightweight CNNs are trained to predict calcifications, bifurcations, and lumen radii on the topological $(\theta, z)$ cylinder, encouraging topologically coherent detections without requiring explicit segmentation. Registration is formulated as an optimization over centerline warping parameters, driven by an unbalanced Sinkhorn optimal transport loss on the cylindrical geometry that provides spatially informative gradients even for spatially disjoint predictions, complemented by a lumen matching term. Evaluated on $N{=}47$ paired CCTA-IVUS cases in a 5-fold cross-validation setup, TG-OT achieves strong longitudinal ($\overline{\text{Dice}}_\text{ctl}{=}0.99$), rotational ($\overline{S}_c{=}0.96$), and lumen alignment ($\overline{\text{Dice}}_\text{L}{=}0.69$) without manual interaction or prior segmentation, marking a meaningful step toward clinical integration of automatic CCTA-IVUS fusion.


[45] 2504.17408

Introducing Combined Effects of Filtering and ASE Noise in Optical Links Supposing Different Equalization Algorithms

This paper develops and validates a discrete-time modeling framework for the joint impact of cascaded optical filtering, distributed ASE noise, and transceiver noise in coherent optical links. The work focuses on the post-equalization signal-to-noise ratio, which is the central quantity used to quantify filtering penalty under receiver DSP. Starting from an optical-link abstraction with arbitrary filter transfer functions and colored noise spectra, we derive analytical expressions for the matched-filter bound, Zero-Forcing Equalization, Minimum Mean Square Error Equalization, Fractionally Spaced Equalization, and finite-length equalizers with and without explicit colored-noise treatment. The model is coupled to a measurement-based transceiver SNR characterization, so that optical-link penalties and receiver impairments can be evaluated within the same formulation. Time-domain simulations with LMS equalization validate the analytical predictions over severe filtering conditions, different tap lengths, and different ASE-noise positions along the link. Experimental results with commercial transceivers and ROADMs further confirm the accuracy of the MMSE and FSE models, while highlighting the role of realistic filter modeling and equalizer implementation limits. The resulting framework provides a tractable basis for quality-of-transmission estimation and optical-network digital-twin implementations.
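The ordering between the matched-filter bound, MMSE, and Zero-Forcing results can be illustrated with textbook infinite-length linear-equalizer formulas. The sketch below is an assumption-laden illustration and not the expressions derived in the paper: it takes a sampled, folded per-frequency SNR spectrum (noise already whitened) and uses the arithmetic mean for the matched-filter bound, the harmonic mean for ZF, and the unbiased MMSE form. The function name `equalizer_snrs` is hypothetical.

```python
import numpy as np


def equalizer_snrs(snr_spectrum):
    """Post-equalization SNRs from a sampled folded SNR spectrum.

    snr_spectrum: positive per-frequency SNR values (linear scale),
                  assumed uniformly sampled over the Nyquist band.
    Returns (matched-filter bound, ZF SNR, unbiased MMSE SNR) using the
    standard infinite-length linear-equalizer expressions.
    """
    s = np.asarray(snr_spectrum, dtype=float)
    mfb = s.mean()                                  # arithmetic mean
    zf = 1.0 / np.mean(1.0 / s)                     # harmonic mean
    mmse = 1.0 / np.mean(1.0 / (1.0 + s)) - 1.0     # unbiased MMSE-LE
    return mfb, zf, mmse


if __name__ == "__main__":
    # A strongly filtered channel: 20 dB spread across the band.
    mfb, zf, mmse = equalizer_snrs([10.0, 1.0, 0.1])
    for name, value in (("MFB", mfb), ("MMSE", mmse), ("ZF", zf)):
        print(f"{name:4s} SNR = {10 * np.log10(value):6.2f} dB")
```

For a flat spectrum all three values coincide; the gap between them grows as the filtering becomes more severe, which is the filtering penalty the abstract refers to.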


[46] 2505.22783

Temporal Convolutional Autoencoder for Interference Mitigation in FMCW Radar Altimeters

Reliable altitude estimation with frequency-modulated continuous wave (FMCW) radar altimeters is increasingly a challenge due to in-band interference from modern communication systems. In this paper, we present a temporal convolutional autoencoder (TCAE) that directly processes in-phase and quadrature (IQ) samples to suppress structured interference while preserving signal phase and frequency content for range estimation. The model is trained and initially evaluated within a full radar altimeter simulation chain, then further validated via over-the-air (OTA) experiments using a universal software radio peripheral (USRP)-based testbed. Results show that the TCAE reduces altitude estimation error by more than 85% compared to least mean squares (LMS) adaptive filtering under severe interference conditions, including low signal-to-interference-plus-noise ratio (SINR) and full temporal overlap between interfering and radar signals. Unlike conventional methods, the TCAE maintains phase fidelity and beat structure, enabling accurate range estimation even when interferers occupy more than one-quarter of the radar bandwidth. The implemented TCAE performs mitigation directly on fixed-length IQ windows using a single feed-forward pass and was integrated into the MATLAB/ONNX-based evaluation chain used for both simulation and OTA testing. These findings demonstrate that learned IQ-domain interference mitigation can enhance radar-altimeter resilience under a range of tested interference conditions.


[47] 2507.19785

Radar and Acoustic Sensor Fusion using a Transformer Encoder for Robust Drone Detection and Classification

The use of drones in a wide range of applications is steadily increasing. However, this has also raised critical security concerns such as unauthorized drone intrusions into restricted zones. Therefore, robust and accurate drone detection and classification mechanisms are required despite significant challenges due to the small size of drones, low-altitude flight, and environmental noise. In this letter, we propose a multi-modal approach combining radar and acoustic sensing for detecting and classifying drones. We employ radar due to its long-range capabilities and robustness to different weather conditions. We utilize raw acoustic signals without converting them to other domains such as spectrograms or Mel-frequency cepstral coefficients. This enables us to use fewer parameters than the state-of-the-art approaches. Furthermore, we explore the effectiveness of the transformer encoder architecture in fusing these sensors. Experimental results obtained in outdoor settings verify the superior performance of the proposed approach compared to the state-of-the-art methods.


[48] 2508.09348

Inference-Driven Uplink for 6G: Architecture, Principles, and Challenges

Next-generation wireless networks (6G) face a critical uplink challenge arising from stringent device-side resource constraints and the growing demand for intelligent services. This article introduces InferCom, an inference-driven uplink architecture designed to enable robust communication under low signal-to-noise ratio (SNR) conditions. It adopts a compute-asymmetric design with a lightweight transmitter and an inference-capable receiver empowered by generative artificial intelligence models. Grounded in the information bottleneck principle, InferCom redefines communications through task-agnostic compression, inference-driven reconstruction, error distribution channel code, and quality of experience-aware retransmission. A case study demonstrates that InferCom outperforms conventional 5G NR and Deep-JSCC in terms of transmitter-side computational complexity, uplink coverage, and retransmission efficiency. Finally, we outline key challenges and research directions for inference-driven uplink design in future intelligent 6G networks.


[49] 2509.15628

Blind Room Impulse Response Identification via Reverberant Speech Spectrum Reconstruction

This paper proposes Rec-RIR for blind room impulse response (RIR) identification. Based on the convolutive transfer function (CTF) approximation, we propose a multi-task deep neural network, which sequentially removes noise and reverberation from speech recordings, and estimates the CTF filter by reverberant speech spectrum reconstruction. Subsequently, a pseudo-intrusive measurement process is employed to convert the CTF filter into an RIR by simulating a common intrusive RIR measurement procedure. Experimental results demonstrate that Rec-RIR achieves state-of-the-art (SOTA) performance in blind RIR identification.


[50] 2511.08370

Power Hardware-in-the-loop Interfacing via $\mathcal{H}_\infty$ Model Matching

This paper presents an $\mathcal{H}_\infty$ model matching control-based approach to the problem of power hardware-in-the-loop (PHIL) interfacing. The objective is to interconnect a grid simulation and a physical device via an interface in a way that is stable and accurate. Conventional approaches include the ideal transformer method (ITM) and its impedance-based variants, which trade accuracy for stability, as well as some $\mathcal{H}_\infty$ control-based approaches, which do not make use of all the available information in their optimization for accuracy. Designing for transparency, as opposed to accuracy as existing approaches do, would achieve both accuracy and stability, while making use of all the dynamical information present in the idealized interconnection of the grid and device. The approach proposed in this paper employs model matching to formulate the PHIL problem as an $\mathcal{H}_\infty$ control problem using transparency as the explicit frequency-domain control objective. The approach is experimentally validated in a real-time resistive-load PHIL setup, and is found to achieve accuracy levels that are comparable or superior to those of an ITM-based interface.


[51] 2511.23322

Data-driven Reachability Verification with Probabilistic Guarantees under Koopman Spectral Uncertainty

Providing rigorous reachability guarantees for unknown complex systems is a crucial and challenging task. In this paper, we present a novel data-driven framework that addresses this challenge by leveraging Koopman operator theory. Instead of operating in the state space, the proposed method encodes model uncertainty from finite data directly into a Koopman spectral representation with quantifiable error bounds. Leveraging this spectral information, we systematically determine time intervals within which trajectories from the initial set are guaranteed, with a prescribed probability, to reach the target set. Finally, we demonstrate the efficacy of our framework in numerical examples.


[52] 2601.13236

Pixelwise Uncertainty Quantification of Accelerated MRI Reconstruction

Parallel imaging techniques reduce magnetic resonance imaging (MRI) scan time, but image quality degrades as the acceleration factor increases. In clinical practice, conservative acceleration factors are chosen because no mechanism exists to automatically assess the diagnostic quality of undersampled reconstructions. This work introduces a general framework for pixel-wise uncertainty quantification in parallel MRI reconstructions, enabling automatic identification of unreliable regions without access to any ground-truth reference image. Our method integrates conformal quantile regression with image reconstruction methods to estimate statistically rigorous pixel-wise uncertainty intervals. We trained and evaluated our model on Cartesian undersampled brain and knee data obtained from the fastMRI dataset using acceleration factors ranging from 2 to 10. An end-to-end Variational Network was used for image reconstruction. Quantitative experiments demonstrate strong agreement between predicted uncertainty maps and true reconstruction error. Using our method, the corresponding Pearson correlation coefficient was higher than 90% at acceleration levels at and above four-fold, whereas it dropped to less than 70% when the uncertainty was computed using a simpler heuristic notion (magnitude of the residual). Qualitative examples further show that the uncertainty maps based on quantile regression capture the magnitude and spatial distribution of reconstruction errors across acceleration factors, with regions of elevated uncertainty aligning with pathologies and artifacts. The proposed framework enables evaluation of reconstruction quality without access to fully-sampled ground-truth reference images. It represents a step toward adaptive MRI acquisition protocols that may be able to dynamically balance scan time and diagnostic reliability.
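
The calibration step of conformal quantile regression can be sketched as follows. This is the standard split-conformal recipe, not necessarily the exact procedure used in the paper; the function names are hypothetical, and each list entry is assumed to be one calibration pixel with a predicted lower quantile, a predicted upper quantile, and the true value.

```python
import math
from typing import Sequence, Tuple


def cqr_offset(q_lo: Sequence[float], q_hi: Sequence[float],
               y: Sequence[float], alpha: float) -> float:
    """Return the conformal correction for a target miscoverage alpha.

    The conformity score is positive when the true value falls outside
    the predicted band and negative when it falls inside.
    """
    scores = sorted(max(lo - t, t - hi) for lo, hi, t in zip(q_lo, q_hi, y))
    n = len(scores)
    k = math.ceil((n + 1) * (1.0 - alpha))  # rank of the calibrated quantile
    if k > n:
        # Too few calibration points for this alpha: no finite guarantee.
        return math.inf
    return scores[k - 1]


def calibrated_interval(lo: float, hi: float, offset: float) -> Tuple[float, float]:
    """Widen (or shrink, if offset < 0) a predicted band by the offset."""
    return lo - offset, hi + offset


if __name__ == "__main__":
    # Hypothetical calibration set of four pixels with band [0, 1].
    off = cqr_offset([0.0] * 4, [1.0] * 4, [0.5, 1.2, -0.1, 1.0], alpha=0.5)
    print(off, calibrated_interval(0.0, 1.0, off))
```

The width of the calibrated interval at each pixel is what would then serve as the uncertainty map.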


[53] 2602.14612

Event-Grounded Question Answering over Long Audio via Structured Retrieval

Answering natural-language questions over multi-hour audio requires both event recognition and temporal grounding. Current large audio-language models perform well on short clips, but are limited by context length, query-time cost, and weak temporal localization. We present LA-RAG (Long Audio-Retrieval Augmented Generation), a structured framework that converts continuous audio into timestamped event records using an open-vocabulary Audio Grounding Model (AGM), stores them in a SQL event database, and answers queries through intent-aware retrieval followed by LLM-based generation. LA-RAG supports offline grounding mode, where long recordings are pre-indexed for low-latency QA, and inference-time grounding mode, where query-conditioned grounding is performed for shorter open-ended clips. We create 24-hour Home-IoT and Industrial-IoT audio benchmarks and augment CASTELLA, a real-world audio moment retrieval dataset with QA pairs. In offline grounding mode, LA-RAG achieves 76.88% overall accuracy on Home-IoT and 71.10% on Industrial-IoT, with average query latencies below 0.6 seconds. In inference-time grounding mode, state-of-the-art LALMs achieve competitive event-detection accuracy on CASTELLA-QA but low temporal detection F1. We further show that LALMs augmented with our structured retrieval metadata achieve consistent temporal detection improvements, with F1 gains of 11-17% across baseline models with improved latency. These results show that explicit timestamped grounding and structured retrieval provide a practical complement to generative audio-language models for deployment-oriented long-audio QA.


[54] 2602.17901

MeDUET: Disentangled Unified Pretraining for 3D Medical Image Synthesis and Analysis

Self-supervised learning (SSL) and diffusion models have respectively advanced representation learning and generative modeling for high-dimensional 3D visual data, yet they are often developed as separate paradigms. Their unification remains challenging under multi-source heterogeneity, as anatomical content must be preserved for analysis while acquisition-related style varies across centers and affects synthesis. In this paper, we propose MeDUET, a 3D Medical image Disentangled UnifiEd PreTraining framework in the variational autoencoder latent space. MeDUET formulates unified pretraining as an empirical factor identifiability problem, aiming to learn domain-invariant content factors for anatomy and domain-specific style factors for appearance. To improve factor separation, MeDUET first uses token demixing with a standard adversarial domain regularizer to establish basic content-style specialization, and further introduces Mixed Factor Token Distillation and Swap-invariance Quadruplet Contrast to reduce mixed-region factor leakage and organize factor spaces with factor-wise invariance and discriminability. With these learned factors, MeDUET transfers effectively to both synthesis and analysis, yielding higher fidelity, faster convergence, and better controllability for synthesis, while achieving competitive or superior domain generalization and label efficiency on diverse datasets, tasks, and modalities. Overall, MeDUET shows that multi-source heterogeneity can serve as useful supervision, with disentanglement providing an effective interface for unifying 3D medical image synthesis and analysis. Our code is available at this https URL.


[55] 2604.12594

Optimal Battery Bidding under Decision-Dependent State-of-Charge Uncertainties

Lithium Iron Phosphate (LFP) Battery Energy Storage Systems (BESSs) are a key enabler of the energy transition. However, they are known to exhibit significant inaccuracies in the estimation of their State of Charge (SOC). Such estimation errors can directly impact the participation of BESSs in electricity markets. In this work, we demonstrate that neglecting SOC uncertainty in battery bidding can lead to significant delivery failures, including the inability to meet promised frequency reserves. To address this risk, we investigate bidding strategies that account for SOC uncertainty. We propose three constraint-tightening optimization approaches of increasing complexity: (i) a fixed-margin formulation, (ii) an adaptive-margin optimizer, and (iii) an uncertainty-aware optimization model. The latter explicitly accounts for the decision-dependent nature of the uncertainty. Numerical results demonstrate that while all three approaches robustify against SOC uncertainty, the uncertainty-aware formulation outperforms the others in maximizing revenue while ensuring reliable frequency reserve provision. This highlights the significance of treating SOC uncertainty as an endogenous process within the operational strategy.


[56] 2605.08668

Beyond Information Redundancy: Expanding Cross-Modal Knowledge Representation for Power Load Time Series Forecasting

Load forecasting is pivotal for stable power systems. Conventional uni-modal methods suffer from representation drift under data scarcity. While recent multi-modal approaches attempt to alleviate this, they exhibit severe information redundancy, merely recycling time series data via superficial intra-modal transformations. In this paper, we argue that the essence of multi-modal time series learning should expand representation manifolds via complementary cross-modal knowledge enrichment rather than duplicating redundant information, especially for few-shot scenarios prevalent in power systems. To this end, we propose KEMM-Net, a Knowledge-Enriched Multi-Modal Network for power load forecasting. KEMM-Net first constructs textual and visual embeddings to strengthen load time series representations from different knowledge perspectives. It then introduces a Partial Information Decomposition (PID)-guided cross-modal contrastive learning mechanism to achieve cross-modal semantic alignment and balance redundant, synergistic, and unique information for forecasting. Extensive experiments on real-world public datasets demonstrate that KEMM-Net consistently outperforms strong deep learning and multi-modal baselines, particularly in few-shot settings. Our code is available at this https URL.


[57] 2606.13544

Adaptive Turn-Taking for Real-time Multi-Party Voice Agents

Turn-taking in multi-party spoken conversations remains a fundamental challenge for voice-based agents, particularly under dynamic floor competition and varying user expectations. We propose ModeratorLM, a role-playing voice agent that conditions turn-taking behavior on an explicitly assigned role in multi-party settings. The system is built on a speech large language model operating in a chunk-wise streaming manner. We further introduce a reasoning-augmented variant that incorporates chain-of-thought reasoning over conversational context and the assigned role. We construct RolePlayConv, a large-scale synthetic dataset of spoken multi-party conversations with diverse assistant roles. Experiments on real-world meeting data and RolePlayConv show that turn-taking precision improves by over 40% and recall by more than 70%, while false-positive interruptions are substantially reduced compared to non-role-conditioned baselines.


[58] 2606.15105

Optimal Ground-to-Air Interception with Time-Varying Acceleration Bounds

This paper proposes novel optimal-control-based guidance laws for ground-to-air missiles with time-varying acceleration bounds. In such engagements, as the missile climbs in altitude, its acceleration bound decreases, which may lead to acceleration saturation and significant miss distances if not explicitly accounted for. The proposed guidance laws incorporate hard acceleration command constraints directly into a linear-quadratic optimal-control framework, in contrast to conventional unbounded or softly constrained approaches. Analytically based guidance laws are developed for linear zero-order and first-order strictly proper missile dynamics with arbitrary-order linear target dynamics. Unlike the constant hard-bound case with minimum-phase missile dynamics, time-varying acceleration command bounds permit an initial unsaturated interval in which the proposed guidance laws can anticipate future saturation and reshape the acceleration profile accordingly. This enables earlier maneuvers when the missile possesses greater low-altitude maneuverability, fundamentally altering the structure of the optimal solution. The proposed approach is evaluated in nonlinear simulations and compared with equivalent unbounded and softly constrained optimal guidance laws. The results demonstrate substantially improved interception performance under saturation, reduced tuning requirements compared to softly constrained guidance laws, and enhanced capability in challenging engagement scenarios.


[59] 2606.26716

Dual-Prior Guided Null-Space Learning with Mixture-of-Splines for Arbitrary Medical Slice Super-Resolution

Arbitrary slice super-resolution reconstructs isotropic volumes from anisotropic clinical acquisitions by synthesizing intermediate slices at arbitrary scales. However, treating this ill-posed inverse problem as unconstrained residual-based regression risks hallucinating anatomically implausible structures or altering the originally observed data. To address both concerns, this paper presents the Dual-Prior Null-space Learning (DP-NSL) framework, which reformulates the task as a constrained recovery process guided by two complementary priors. A Measurement-Consistent Projection (MCP) enforces a Deterministic Observation Prior: the reconstruction undergoes an exact orthogonal projection that reproduces every acquired slice with zero error, confining all learned details to the unobservable null space. Within this null space, a Mixture-of-Splines (MoS) module imposes a Geometric Continuity Prior by dynamically mixing B-spline experts of different analytic orders, allowing each anatomical region to be modeled with a content-aware level of continuity. To promote spatial coherence, a Local Spatial Consistency Decoder (LSCD) further injects local inductive bias. Experiments on three CT and one MRI benchmark show that DP-NSL outperforms existing approaches while strictly preserving measurement consistency. Code is available at this https URL.


[60] 2408.01859

Graph Unfolding and Sampling for Transitory Video Keyframe Selection via Gershgorin Disc Alignment

User-generated videos (UGVs) uploaded from mobile phones to social media sites like YouTube and TikTok are short and non-repetitive. We summarize a transitory UGV into several keyframes in linear time via fast graph sampling based on Gershgorin disc alignment (GDA). Specifically, we first model a sequence of $N$ frames in a UGV as an $M$-hop path graph $\mathcal{G}^o$ for $M \ll N$, where the similarity between two frames within $M$ time instants is encoded as a positive edge based on feature similarity. Towards efficient sampling, we then "unfold" $\mathcal{G}^o$ to a $1$-hop path graph $\mathcal{G}$, specified by a generalized graph Laplacian matrix $\mathcal{L}$, via one of two graph unfolding procedures with provable performance bounds. We show that maximizing the smallest eigenvalue $\lambda_{\min}(\mathbf{B})$ of a coefficient matrix $\mathbf{B} = \mathrm{diag}(\mathbf{h}) + \mu \mathcal{L}$, where $\mathbf{h}$ is the binary keyframe selection vector, is equivalent to minimizing a worst-case signal reconstruction error. We maximize instead the Gershgorin circle theorem (GCT) lower bound $\lambda^-_{\min}(\mathbf{B})$ by choosing $\mathbf{h}$ via a new fast graph sampling algorithm that iteratively aligns left-ends of Gershgorin discs for all graph nodes (frames). Experiments on multiple short video datasets show that our algorithm achieves comparable or better keyframe selection performance compared to state-of-the-art methods, at a substantially reduced complexity.
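
To make the Gershgorin lower bound concrete, the sketch below builds the coefficient matrix B = diag(h) + mu * L for a small 1-hop path graph and computes the left-end of each Gershgorin disc; the smallest left-end lower-bounds the smallest eigenvalue of B. It assumes a plain combinatorial Laplacian with hypothetical unit edge weights, and it does not reproduce the disc alignment or sampling algorithm itself.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def path_laplacian(weights: Sequence[float]) -> Matrix:
    """Combinatorial Laplacian of a 1-hop path graph with given edge weights."""
    n = len(weights) + 1
    lap = [[0.0] * n for _ in range(n)]
    for i, w in enumerate(weights):
        lap[i][i] += w
        lap[i + 1][i + 1] += w
        lap[i][i + 1] -= w
        lap[i + 1][i] -= w
    return lap


def coefficient_matrix(h: Sequence[float], mu: float, lap: Matrix) -> Matrix:
    """B = diag(h) + mu * L, with h the binary keyframe selection vector."""
    n = len(h)
    return [[(h[i] if i == j else 0.0) + mu * lap[i][j] for j in range(n)]
            for i in range(n)]


def gershgorin_left_ends(b: Matrix) -> List[float]:
    """Disc centre minus disc radius for every row of B."""
    n = len(b)
    return [b[i][i] - sum(abs(b[i][j]) for j in range(n) if j != i)
            for i in range(n)]


if __name__ == "__main__":
    lap = path_laplacian([1.0, 1.0])              # three frames, two edges
    b = coefficient_matrix([1.0, 0.0, 1.0], 0.5, lap)
    ends = gershgorin_left_ends(b)
    print(ends, min(ends))                        # min(ends) <= lambda_min(B)
```

With this unaligned matrix, each left-end reduces to the corresponding entry of h, so the bound is zero whenever any frame is left unselected. That looseness is what an alignment of disc left-ends would be expected to address.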


[61] 2411.19537

Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook

We survey deepfake generation and detection techniques, covering all deepfake media types: image, video, audio and multimodal content. We identify various kinds of deepfakes and construct taxonomies of deepfake generation and detection methods, illustrating the important groups of methods. Next, we gather datasets used for deepfake detection and provide updated rankings of the best performing detectors on the most popular datasets. In addition, we develop a novel multimodal benchmark to evaluate deepfake detectors on out-of-distribution content. The results indicate that state-of-the-art detectors fail to generalize to deepfakes generated by unseen generators. Our project page and new benchmark are available at this https URL.


[62] 2412.18032

Major Space Weather Risks Identified via Coupled Physics-Engineering-Economic Modeling

Space weather poses an important but under-quantified threat to society. While severe geomagnetic storms are recognized as potential global catastrophes, their socio-economic impacts remain poorly quantified. We present a novel physics-engineering-economic framework that links geophysical drivers to power grid geoelectric fields, transformer vulnerability, and macroeconomic consequences. Using the United States as an example, we estimate daily U.S. economic losses for a 250-year geomagnetic storm from transformer thermal heating of 2.04 billion USD (95 percent confidence interval: 1.86 to 2.22 billion USD), disrupting power for approximately 5.7 million people and 150,000 businesses. These estimates are conservative lower bounds, reflecting only transformer thermal heating effects and excluding voltage collapse, cascading failures, and restoration costs. The true societal risk is likely substantially higher. Nonetheless, the contribution is in providing the first nationwide end-to-end coupling from space physics to potential macroeconomic loss, with quantified uncertainties. Our results demonstrate that coupled socio-economic modeling of space weather is both feasible and essential, and the framework is scalable and transferable, offering a template for assessing space weather risk to critical infrastructure in other countries.


[63] 2505.06668

StableMotion: One-Step Motion Estimation with Diffusion Prior

We present StableMotion, a novel framework that leverages geometric and content priors from pretrained large-scale image diffusion models for motion estimation in single-image rectification tasks such as Stitched Image Rectangling (SIR) and Rolling Shutter Correction (RSC). Specifically, StableMotion takes a text-to-image Stable Diffusion (SD) model as its backbone and repurposes it as an image-to-motion estimator. To mitigate inconsistent outputs produced by diffusion models, we propose Adaptive Ensemble Strategy (AES), which consolidates multiple outputs into a cohesive, high-fidelity result. Additionally, we present Sampling Steps Disaster (SSD), a counterintuitive phenomenon in which increasing the number of sampling steps can lead to poorer outcomes, motivating our one-step inference design. StableMotion is evaluated on two image rectification tasks and delivers state-of-the-art performance on both, while also showing promising transferability through qualitative examples and no-reference evaluations on unseen SIR-OOD and real-captured RSC benchmarks. Supported by SSD, StableMotion achieves efficient one-step inference, offering over 100$\times$ speedup compared to previous diffusion model-based methods even when combined with the optional AES post-processing. Code and weights are available at this https URL.


[64] 2603.02794

An Interpretable, Controllable Time-Varying IIR Denoiser for On-Device Assistive Hearing

We present TVF (Time-Varying Filtering), an interpretable, low-latency speech enhancement model for real-time, on-device assistive hearing. A lightweight neural controller predicts, in real time, the coefficients of a differentiable cascade of 35 second-order IIR filters (biquads), so the model tracks non-stationary noise while keeping a fully interpretable processing chain: every spectral modification is an explicit, adjustable equalizer curve rather than an opaque "black-box" transform. Because the biquad cascade carries the signal processing, the controller can be made very small, driving the cascade with only 24k parameters at a 10.7 ms algorithmic latency, within hearing-aid budgets, and running entirely on-device so that audio never leaves the device. We also expose the suppression-versus-preservation trade-off as an explicit control: it can be set during training through the loss weighting, and adjusted at inference, with no retraining, by mixing the noisy input with the denoised output. On hearing-aid metrics (HASPI/HASQI) the 24k model stays within about 0.02 of DFNet3 (2.3M parameters, almost two orders of magnitude larger) while using about 29$\times$ fewer multiply-accumulates, although larger black-box models still lead on reference metrics such as PESQ. We present TVF as a proof of concept for a compact, interpretable, and controllable denoiser for on-device assistive hearing.
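
A minimal sketch of the two signal-path ingredients mentioned above, a cascade of second-order IIR sections and an inference-time mix between noisy input and denoised output, might look like this. The coefficients here are fixed and hypothetical; in the described model a neural controller would update them over time, which this sketch omits. The mixing weight `alpha` is likewise an assumed name.

```python
from typing import List, Sequence, Tuple

# One biquad section: (b0, b1, b2, a1, a2), with a0 normalised to 1.
Section = Tuple[float, float, float, float, float]


def biquad(x: Sequence[float], b0: float, b1: float, b2: float,
           a1: float, a2: float) -> List[float]:
    """Direct-form I second-order IIR filter with zero initial state."""
    y: List[float] = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def cascade(x: Sequence[float], sections: Sequence[Section]) -> List[float]:
    """Run the signal through each biquad section in series."""
    out = list(x)
    for coeffs in sections:
        out = biquad(out, *coeffs)
    return out


def mix(noisy: Sequence[float], denoised: Sequence[float],
        alpha: float) -> List[float]:
    """alpha = 1 keeps only the denoised signal, alpha = 0 only the input."""
    return [alpha * d + (1.0 - alpha) * n for n, d in zip(noisy, denoised)]


if __name__ == "__main__":
    noisy = [1.0, 0.0, 0.0, 0.0]
    sections = [(0.5, 0.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, -0.5, 0.0)]
    denoised = cascade(noisy, sections)
    print(mix(noisy, denoised, alpha=0.5))
```

Because the mix is a plain convex combination applied after filtering, changing `alpha` needs no retraining, which matches the kind of inference-time control the abstract describes.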


[65] 2603.15606

Saddle Point Evasion via Curvature-Regularized Gradient Dynamics

Nonconvex optimization underlies many modern machine learning and control tasks, where saddle points pose the dominant obstacle to reliable convergence in high-dimensional settings. Escaping these saddle points deterministically using continuous-time optimization remains an open challenge: gradient descent is blind to curvature, stochastic perturbation methods lack deterministic guarantees, and Newton-type approaches suffer from Hessian singularity. Adopting the perspective of viewing optimization algorithms as dynamical systems, we present Curvature-Regularized Gradient Dynamics (CRGD), which augments the objective with a smooth penalty on the negative Hessian eigenvalues, yielding an augmented cost that serves as an optimization Lyapunov function with user-selectable convergence rates to second-order stationary points. Numerical experiments confirm that CRGD converges to second-order stationary points, even in regimes where gradient descent fails.


[66] 2606.22649

MaRS: Robust Out-of-Distribution Detection via Mahalanobis Residual Scoring

Foundation models provide highly descriptive representations for medical images, yet their reliability degrades under distribution shifts arising from changes in patients, devices, or acquisition conditions. Reliable out-of-distribution (OOD) detection is therefore essential for safe deployment. Recent post-hoc detectors efficiently exploit frozen embeddings (e.g., kNN), whereas reconstruction-based OOD detection in latent feature space has seen limited adoption due to inconsistent performance. In this work, we show that the limitation of reconstruction-based methods in latent space does not stem from poor reconstruction quality, but from how reconstruction errors are scored. Standard L2 residual norms collapse the anisotropic residual structure, thereby suppressing informative deviations. To address this limitation, we introduce MaRS (Mahalanobis Residual Scoring), a label-free OOD detector that learns an in-distribution manifold using a lightweight autoencoder and measures deviation via a Mahalanobis distance on reconstruction residuals, yielding variance-aware OOD scores. Across three imaging modalities, multiple types of distribution shift, and different model families and scales, MaRS outperforms established confidence-, distance-, and reconstruction-based baselines, while remaining fully post-hoc and lightweight. The code is available at this https URL.
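
The contrast between L2 and Mahalanobis scoring of reconstruction residuals can be illustrated with a small sketch. It assumes the residuals (embedding minus autoencoder reconstruction) have already been computed, so the autoencoder itself is omitted. All function names, the ridge term, and the toy data are hypothetical, and this is not the released implementation.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def mean_and_cov(rows: Sequence[Sequence[float]],
                 ridge: float = 1e-6) -> Tuple[List[float], Matrix]:
    """Mean and ridge-regularised covariance of in-distribution residuals."""
    n, d = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / n
            + (ridge if i == j else 0.0) for j in range(d)] for i in range(d)]
    return mu, cov


def solve(a: Matrix, b: Sequence[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def mahalanobis_score(residual: Sequence[float], mu: Sequence[float],
                      cov: Matrix) -> float:
    """Squared Mahalanobis distance of a residual from the fitted statistics."""
    diff = [r - m_ for r, m_ in zip(residual, mu)]
    return sum(d * z for d, z in zip(diff, solve(cov, diff)))


def l2_score(residual: Sequence[float]) -> float:
    """Squared L2 norm of a residual, blind to its direction."""
    return sum(r * r for r in residual)


if __name__ == "__main__":
    # Toy residuals: large spread on axis 0, small spread on axis 1.
    train = [[2.0, 0.0], [-2.0, 0.0], [0.0, 0.5], [0.0, -0.5]]
    mu, cov = mean_and_cov(train)
    for r in ([1.0, 0.0], [0.0, 1.0]):
        print(l2_score(r), mahalanobis_score(r, mu, cov))
```

Both test residuals have the same L2 norm, yet the one pointing along the low-variance axis receives a much larger Mahalanobis score, which is the anisotropy that a plain L2 norm collapses.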


[67] 2606.25956

Pulmonary Embolism Risk Stratification from CTPA and Medical Records: Vascular Graphs Are Not All You Need

Risk stratification for pulmonary embolism (PE) is critical for clinical decision-making. Stratification guidelines are based on patient medical records, parameters measured from computed tomography pulmonary angiography (CTPA), and blood tests. However, blood tests are often missing in routine practice. This work studies whether state-of-the-art models can accurately classify risk stratification from only medical records and biomarkers extracted from CTPA images. We benchmark different approaches to combine medical records and cardiac biomarkers with rich pulmonary vascular information; we add vascular biomarkers to tabular models and apply graph neural networks (GNNs) on the vascular tree's intrinsic graph representation. We use a private dataset (n=353) with uniquely complete data for PE risk stratification. Our results show that, among global features, medical records and cardiac biomarkers are the most significant predictors, while vascular biomarkers do not further improve stratification. More surprisingly, GNNs on vascular graphs fail to outperform a strong tabular baseline on global features. We consider hypotheses, on both models and data, that could explain this suboptimal performance. Our investigation suggests that, counter-intuitively, vascular graphs might hold no discriminative information for PE risk stratification. Code is available from this https URL.