Cardiovascular diseases remain the leading cause of global mortality, emphasizing the critical need for efficient diagnostic tools such as electrocardiograms (ECGs). Recent advancements in deep learning, particularly transformers, have revolutionized ECG analysis by capturing detailed waveform features as well as global rhythm patterns. However, traditional transformers struggle to effectively capture the local morphological features that are critical for accurate ECG interpretation. We propose a novel Local-Global Attention ECG model (LGA-ECG) to address this limitation, integrating convolutional inductive biases with global self-attention mechanisms. Our approach extracts queries by averaging embeddings obtained from overlapping convolutional windows, enabling fine-grained morphological analysis, while simultaneously modeling global context through attention to keys and values derived from the entire sequence. Experiments conducted on the CODE-15 dataset demonstrate that LGA-ECG outperforms state-of-the-art models, and ablation studies validate the effectiveness of the local-global attention strategy. By capturing hierarchical temporal dependencies and morphological patterns in ECG signals, the design shows strong potential for robust, automated ECG classification in clinical deployment.
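To make the query-key asymmetry concrete, here is a minimal PyTorch sketch of a local-global attention layer in the spirit described: queries come from averaged overlapping convolutional windows, while keys and values span the whole sequence. All layer sizes, the window length, and the stride are illustrative assumptions, not the authors' LGA-ECG code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalGlobalAttention(nn.Module):
    def __init__(self, dim=64, window=16, stride=8, heads=4):
        super().__init__()
        # Local embeddings from a convolution, averaged over overlapping
        # windows (stride < window) to form the queries.
        self.conv = nn.Conv1d(dim, dim, kernel_size=7, padding=3)
        self.window, self.stride = window, stride
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, 2 * dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                                  # x: (B, T, D)
        local = self.conv(x.transpose(1, 2))               # (B, D, T)
        local = F.avg_pool1d(local, self.window, self.stride)
        q = self.q(local.transpose(1, 2))                  # queries: local morphology
        k, v = self.kv(x).chunk(2, dim=-1)                 # keys/values: whole sequence
        out, _ = self.attn(q, k, v)                        # global context per query
        return out

x = torch.randn(2, 256, 64)                                # toy embedded ECG sequence
print(LocalGlobalAttention()(x).shape)                     # -> torch.Size([2, 31, 64])
```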
We present SeizureFormer, a Transformer-based model for long-term seizure risk forecasting using interictal epileptiform activity (IEA) surrogate biomarkers and long episode (LE) biomarkers from responsive neurostimulation (RNS) systems. Unlike raw scalp EEG-based models, SeizureFormer leverages structured, clinically relevant features and integrates CNN-based patch embedding, multi-head self-attention, and squeeze-and-excitation blocks to model both short-term dynamics and long-term seizure cycles. Tested across five patients and multiple prediction windows (1 to 14 days), SeizureFormer achieved state-of-the-art performance with mean ROC AUC of 79.44 percent and mean PR AUC of 76.29 percent. Compared to statistical, machine learning, and deep learning baselines, it demonstrates enhanced generalizability and seizure risk forecasting performance under class imbalance. This work supports future clinical integration of interpretable and robust seizure forecasting tools for personalized epilepsy management.
Pinching antenna systems (PASS) have been proposed as a revolutionary flexible antenna technology that facilitates line-of-sight links via numerous low-cost pinching antennas with adjustable activation positions over waveguides. This letter proposes a two-timescale joint transmit and pinching beamforming design for maximizing the sum rate of a PASS-based downlink multi-user multiple-input single-output system. A primal-dual decomposition method is developed to decouple the two-timescale problem into two sub-problems: 1) a Karush-Kuhn-Tucker-guided dual learning-based approach is proposed to solve the short-term transmit beamforming design sub-problem; 2) the long-term pinching beamforming design sub-problem is tackled by adopting a stochastic successive convex approximation method. Simulation results demonstrate that the proposed two-timescale algorithm achieves a significant performance gain compared to other baselines.
Accurate prediction of non-dispatchable renewable energy sources is essential for grid stability and price prediction. Regional power supply is usually forecast indirectly, through bottom-up aggregation of plant-level forecasts that incorporate lagged power values and do not exploit the potential of spatially resolved data. This study presents a comprehensive methodology for predicting solar and wind power production at country scale in France, using machine learning models trained on spatially explicit weather data combined with spatial information about production site capacity. A dataset is built spanning 2012 to 2023, using daily power production data from RTE (the national grid operator) as the target variable, with daily weather data from ERA5, production site capacity and location, and electricity prices as input features. Three modeling approaches are explored to handle spatially resolved weather data: spatial averaging over the country, dimension reduction through principal component analysis, and a computer vision architecture that exploits complex spatial relationships. The study benchmarks state-of-the-art machine learning models as well as hyperparameter tuning approaches based on cross-validation methods on daily power production data. Results indicate that cross-validation tailored to time series is best suited to reach low error. We find that neural networks tend to outperform traditional tree-based models, which face challenges in extrapolation due to the increasing renewable capacity over time. Model performance ranges from 4% to 10% nRMSE for the midterm horizon, achieving error metrics similar to local models established at the single-plant level and highlighting the potential of these methods for regional power supply forecasting.
Cardiovascular diseases (CVDs) remain the leading cause of mortality worldwide, highlighting the critical need for efficient and accurate diagnostic tools. Electrocardiograms (ECGs) are indispensable in diagnosing various heart conditions; however, their manual interpretation is time-consuming and error-prone. In this paper, we propose xLSTM-ECG, a novel approach that leverages an extended Long Short-Term Memory (xLSTM) network for multi-label classification of ECG signals, using the PTB-XL dataset. To the best of our knowledge, this work represents the first design and application of xLSTM modules specifically adapted for multi-label ECG classification. Our method employs a Short-Time Fourier Transform (STFT) to convert time-series ECG waveforms into the frequency domain, thereby enhancing feature extraction. The xLSTM architecture is specifically tailored to address the complexities of 12-lead ECG recordings by capturing both local and global signal features. Comprehensive experiments on the PTB-XL dataset reveal that our model achieves strong multi-label classification performance, while additional tests on the Georgia 12-Lead dataset underscore its robustness and efficiency. This approach significantly improves ECG classification accuracy, thereby advancing clinical diagnostics and patient care. The code will be publicly available upon acceptance.
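As a sketch of the STFT front end described above (the sampling rate, window, and hop length below are assumed values, not the paper's settings):

```python
import numpy as np
from scipy.signal import stft

fs = 500                                   # assumed sampling rate (Hz)
ecg = np.random.randn(12, 10 * fs)         # toy 12-lead, 10 s recording

# Per-lead STFT: each lead becomes a (freq, time) magnitude map that the
# sequence model consumes instead of the raw waveform.
f, t, Z = stft(ecg, fs=fs, nperseg=128, noverlap=64, axis=-1)
features = np.abs(Z)                       # (12, n_freq, n_frames)
print(features.shape)
```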
Calibration is crucial for ensuring the performance of phased arrays, since amplitude-phase imbalance between elements results in significant performance degradation. While amplitude-only calibration methods offer advantages when phase measurements are impractical, conventional approaches face two key challenges: they typically require high-resolution phase shifters and remain susceptible to the phase errors inherent in these components. To overcome these limitations, we propose a Rotating element Harmonic Electric-field Vector (RHEV) strategy, which enables precise calibration through time modulation principles. The proposed technique functions as follows. Two 1-bit phase shifters are periodically phase-switched at the same frequency, each generating corresponding harmonics. By adjusting the relative delay between their modulation timings, the phase difference between the $+1$st harmonics produced by the two elements can be precisely controlled, utilizing the time-shift property of the Fourier transform. Furthermore, the $+1$st harmonic generated by sequential modulation of individual elements exhibits a linear relationship with the amplitude of the modulated element, enabling amplitude ambiguity resolution. The proposed RHEV-based calibration method generates phase shifts through relative timing delays rather than physical phase shifter adjustments, rendering it less susceptible to phase shift errors. Additionally, since the calibration process exclusively utilizes the $+1$st harmonic, which is produced solely by the modulated unit, the method performs consistently regardless of array size. Extensive numerical simulations and practical in-channel and over-the-air (OTA) calibration experiments demonstrate the effectiveness and distinct advantages of the proposed method.
Radio-frequency (RF) sensing enables long-range, high-resolution detection for applications such as radar and wireless communication. RF photonic sensing mitigates the bandwidth limitations and high transmission losses of electronic systems by transducing the detected RF signals into broadband optical carriers. However, these sensing systems remain limited by detector noise and Nyquist-rate sampling with analog-to-digital converters, particularly under low-power and high-data-rate conditions. To overcome these limitations, we introduce the micro-ring perceptron (MiRP) sensor, a physics-inspired AI framework that integrates a micro-ring (MiR) dynamics-based analog processor with a machine-learning-driven digital backend. By embedding the nonlinear optical dynamics of MiRs into an end-to-end architecture, MiRP sensing maps the input signal into a learned feature space for the subsequent digital neural network. The key idea is to encode the entire temporal structure of the incoming signal into each output sample, enabling effectively sub-Nyquist sampling without loss of task-relevant information. Evaluations on three target classification datasets demonstrate the performance advantages of MiRP sensing. For example, on MNIST, MiRP detection achieves $94\pm0.1$\% accuracy at $1/49$ the Nyquist rate for an input RF signal power of $1$~pW, compared to $11\pm0.4$\% for the conventional RF detection method. Our sensor framework thus provides a robust and efficient solution for detecting low-power, high-speed signals in real-world sensing applications.
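The sub-Nyquist idea can be illustrated with a toy stand-in for the analog front end. The leaky nonlinear integrator below is not a physical micro-ring model, merely a minimal system whose every output sample mixes in the signal's past, so heavy downsampling retains temporal information:

```python
import numpy as np

def toy_analog_frontend(u, leak=0.95):
    s, out = 0.0, []
    for x in u:
        s = leak * s + np.tanh(x + 0.5 * s)   # nonlinear memory of the past
        out.append(s)
    return np.array(out)

u = np.sin(2 * np.pi * 0.05 * np.arange(980))  # toy RF envelope
y = toy_analog_frontend(u)[::49]               # 1/49 "sub-Nyquist" readout
print(y.shape)                                 # 20 samples that still carry history
```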
Raman spectroscopy serves as a powerful and reliable tool for analyzing the chemical information of substances. The integration of Raman spectroscopy with deep learning methods enables rapid qualitative and quantitative analysis of materials. Most existing approaches adopt supervised learning methods. Although supervised learning has achieved satisfactory accuracy in spectral analysis, it is still constrained by costly and limited well-annotated spectral datasets for training. When spectral annotation is challenging or the amount of annotated data is insufficient, the performance of supervised learning in spectral material identification declines. In order to address the challenge of feature extraction from unannotated spectra, we propose a self-supervised learning paradigm for Raman Spectroscopy based on a Masked AutoEncoder, termed SMAE. SMAE does not require any spectral annotations during pre-training. By randomly masking and then reconstructing the spectral information, the model learns essential spectral features. The reconstructed spectra exhibit certain denoising properties, improving the signal-to-noise ratio (SNR) by more than twofold. Utilizing the network weights obtained from masked pre-training, SMAE achieves clustering accuracy of over 80% for 30 classes of isolated bacteria in a pathogenic bacterial dataset, demonstrating significant improvements compared to classical unsupervised methods and other state-of-the-art deep clustering methods. After fine-tuning the network with a limited amount of annotated data, SMAE achieves an identification accuracy of 83.90% on the test set, presenting competitive performance against the supervised ResNet (83.40%).
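A minimal sketch of the masked-reconstruction pre-training idea follows; the encoder/decoder sizes and the 75% mask ratio are illustrative assumptions rather than the SMAE architecture:

```python
import torch
import torch.nn as nn

class TinySMAE(nn.Module):
    def __init__(self, n_points=1000, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_points, hidden), nn.GELU())
        self.decoder = nn.Linear(hidden, n_points)

    def forward(self, spectra, mask_ratio=0.75):
        mask = (torch.rand_like(spectra) < mask_ratio).float()
        corrupted = spectra * (1 - mask)            # zero out masked points
        recon = self.decoder(self.encoder(corrupted))
        # Reconstruction loss only on the masked positions.
        loss = ((recon - spectra) ** 2 * mask).sum() / mask.sum()
        return loss, recon

model = TinySMAE()
spectra = torch.randn(8, 1000)                      # toy Raman spectra batch
loss, _ = model(spectra)
loss.backward()
print(float(loss))
```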
In recent years, non-intrusive load monitoring (NILM) technology has attracted considerable research attention owing to its unique advantage of using single-meter data to achieve accurate decomposition of device-level energy consumption. Cutting-edge methods based on machine learning and deep learning have achieved remarkable load decomposition accuracy by fusing time-frequency domain features. However, these methods generally suffer from high computational costs and large memory requirements, which are the main obstacles to their deployment on resource-constrained microcontroller units (MCUs). To address these challenges, this study proposes an innovative Dynamic Time Warping (DTW) algorithm in the time-frequency domain and systematically compares the performance of six machine learning techniques in home electricity scenarios. Through complete experimental validation on edge MCUs, the scheme achieves a recognition accuracy of 95%. This study also deeply optimizes the frequency-domain feature extraction process, reducing the running time by 55.55% and the storage overhead by about 34.6%. Algorithm performance will be further optimized in future work. Since eliminating the voltage transformer from the design can significantly reduce cost, subsequent research will focus on this direction, aiming to provide more cost-effective solutions for practical NILM applications and a solid theoretical foundation and feasible technical path for the design of efficient NILM systems in edge computing environments.
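For reference, the core DTW primitive named above is the textbook O(NM) recursion below (the paper's time-frequency-domain variant and MCU optimizations are not reproduced):

```python
import numpy as np

def dtw_distance(a, b):
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Best alignment ending at (i, j): match, insertion, or deletion.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

template = np.array([0., 1., 3., 2., 0.])      # stored appliance signature
window = np.array([0., 0., 1., 3., 3., 2., 0.])
print(dtw_distance(template, window))          # small distance => likely match
```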
Electroencephalogram (EEG) data is crucial for diagnosing mental health conditions but is costly and time-consuming to collect at scale. Synthetic data generation offers a promising solution to augment datasets for machine learning applications. However, generating high-quality synthetic EEG that preserves emotional and mental health signals remains challenging. This study proposes a method combining correlation analysis and random sampling to generate realistic synthetic EEG data. We first analyze interdependencies between EEG frequency bands using correlation analysis. Guided by this structure, we generate synthetic samples via random sampling. Samples with high correlation to real data are retained and evaluated through distribution analysis and classification tasks. A Random Forest model trained to distinguish synthetic from real EEG performs at chance level, indicating high fidelity. The generated synthetic data closely match the statistical and structural properties of the original EEG, with similar correlation coefficients and no significant differences in PERMANOVA tests. This method provides a scalable, privacy-preserving approach for augmenting EEG datasets, enabling more efficient model training in mental health research.
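A simplified sketch of the generate-and-filter loop could look as follows; the five-band feature layout and the 0.9 acceptance threshold are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
real = rng.normal(size=(200, 5))              # toy features: 5 EEG frequency bands
mu, cov = real.mean(axis=0), np.cov(real, rowvar=False)

kept = []
while len(kept) < 100:
    # Random sampling guided by the band-correlation structure of the real data.
    cand = rng.multivariate_normal(mu, cov)
    # Retain only candidates that correlate strongly with some real sample.
    r = np.corrcoef(real, cand[None, :])[:-1, -1]
    if r.max() > 0.9:
        kept.append(cand)

synthetic = np.array(kept)
print(synthetic.shape)
```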
The integration of satellite and terrestrial networks is crucial for enhancing the performance of next-generation communication systems. However, these networks are hindered by long-distance path loss and security risks in dense urban environments. In this work, we propose a satellite-terrestrial covert communication system assisted by an aerial active simultaneous transmitting and reflecting reconfigurable intelligent surface (AASTAR-RIS) to improve the channel capacity while ensuring transmission covertness. Specifically, we first derive the minimal detection error probability (DEP) under the worst-case condition that the Warden has perfect channel state information (CSI). Then, we formulate an AASTAR-RIS-assisted satellite-terrestrial covert communication optimization problem (ASCCOP) to maximize the sum of the fair channel capacity for all ground users while meeting the strict covert constraint, by jointly optimizing the trajectory and active beamforming of the AASTAR-RIS. Due to the challenges posed by the complex and high-dimensional state-action spaces as well as the need for efficient exploration in dynamic environments, we propose a generative deterministic policy gradient (GDPG) algorithm, a generative deep reinforcement learning (DRL) method, to solve the ASCCOP. Concretely, the generative diffusion model (GDM) is utilized as the policy representation of the algorithm to enhance the exploration process by generating diverse and high-quality samples through a series of denoising steps. Moreover, we incorporate an action gradient mechanism to accomplish the policy improvement of the algorithm, which refines promising state-action pairs through gradient ascent. Simulation results demonstrate that the proposed approach significantly outperforms important benchmarks.
This paper considers the distributed bandit convex optimization problem with time-varying constraints. In this problem, the global loss function is the average of all the local convex loss functions, which are unknown beforehand. Each agent iteratively makes its own decision subject to time-varying inequality constraints which can be violated but are fulfilled in the long run. For a uniformly jointly strongly connected time-varying directed graph, a distributed bandit online primal-dual projection algorithm with one-point sampling is proposed. We show that sublinear dynamic network regret and network cumulative constraint violation are achieved if the path-length of the benchmark also increases in a sublinear manner. In addition, an $\mathcal{O}({T^{3/4 + g}})$ static network regret bound and an $\mathcal{O}( {{T^{1 - {g}/2}}} )$ network cumulative constraint violation bound are established, where $T$ is the total number of iterations and $g \in ( {0,1/4} )$ is a trade-off parameter. Moreover, a reduced static network regret bound $\mathcal{O}( {{T^{2/3 + 4g /3}}} )$ is established for strongly convex local loss functions. Finally, a numerical example is presented to validate the theoretical results.
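The one-point sampling idea underlying such algorithms can be sketched in a few lines: with a single loss evaluation per round, $g = (d/\delta) f(x + \delta u) u$ is an unbiased estimate of the gradient of a smoothed version of $f$. The toy loss, step sizes, and projection radius below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
d, delta = 3, 0.1
f = lambda x: np.sum((x - 1.0) ** 2)           # toy local loss, unknown to the agent

x = np.zeros(d)
for t in range(1, 20001):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)                     # uniform direction on the sphere
    g = (d / delta) * f(x + delta * u) * u     # one-point gradient estimate
    x -= 0.005 / t ** 0.5 * g                  # diminishing step size
    nrm = np.linalg.norm(x)
    if nrm > 5.0:                              # projection onto a bounded set
        x *= 5.0 / nrm
print(np.round(x, 2))                          # noisy iterate drifting toward (1, 1, 1)
```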
This paper presents a flexible thin-film underwater transducer based on a mesoporous PVDF membrane embedded with piezoelectrically actuated microdomes. To enhance piezoelectric performance, ZnO nanoparticles were used as a sacrificial template to fabricate a sponge-like PVDF structure with increased $\beta$-phase content and improved mechanical compliance. The device was modeled using finite element analysis and optimized through parametric studies of dome geometry, film thickness, and dome size. Acoustic performance was evaluated through underwater testing, demonstrating high sound pressure level (SPL) output and reliable data transmission even at low drive voltages. The proposed transducer offers a lightweight, low-cost, and energy-efficient solution for short-range underwater communication in next-generation Ocean IoT systems.
Fluid antenna (FA) arrays are envisioned as a promising technology for next-generation communication systems, owing to their ability to dynamically control antenna locations. In this paper, we apply an FA array to boost the performance of over-the-air computation networks. Given that channel uncertainty negatively impacts not only the beamforming design but also the antenna location optimization, robust resource allocation is performed to minimize the mean squared error of transmitted messages. Block coordinate descent is adopted to decompose the formulated non-convex problem into three subproblems, which are iteratively solved until convergence. Numerical results show the benefits of the FA array and the necessity of robust resource allocation under channel uncertainty.
We establish Nash equilibrium learning -- convergence of the population state to a suitably defined Nash equilibrium set -- for a class of payoff dynamical mechanisms with a first-order modification. The first-order payoff modification can model aspects of the agents' bounded rationality, anticipatory or averaging terms in the payoff mechanism, or first-order Pad\'e approximations of delays. To obtain our main results, we apply a combination of two nonstandard system-theoretic passivity notions.
In the history of audio and acoustic signal processing, perceptual audio coding has certainly excelled as a bright success story by its ubiquitous deployment in virtually all digital media devices, such as computers, tablets, mobile phones, set-top-boxes, and digital radios. From a technology perspective, perceptual audio coding has undergone tremendous development from the first very basic perceptually driven coders (including the popular mp3 format) to today's full-blown integrated coding/rendering systems. This paper provides a historical overview of this research journey by pinpointing the pivotal development steps in the evolution of perceptual audio coding. Finally, it provides thoughts about future directions in this area.
This study performs a comprehensive evaluation of quantitative measurements extracted from automated deep-learning-based segmentation methods, beyond traditional Dice Similarity Coefficient assessments, focusing on six quantitative metrics: SUVmax, SUVmean, total lesion activity (TLA), tumor volume (TMTV), lesion count, and lesion spread. We analyzed 380 prostate-specific membrane antigen (PSMA) targeted [18F]DCFPyL PET/CT scans of patients with biochemical recurrence of prostate cancer, training the deep neural networks U-Net, Attention U-Net, and SegResNet with four loss functions: Dice Loss, Dice Cross Entropy, Dice Focal Loss, and our proposed L1-weighted Dice Focal Loss (L1DFL). Evaluations indicate that Attention U-Net paired with L1DFL achieved the strongest correlation with the ground truth (concordance correlation = 0.90-0.99 for SUVmax and TLA), whereas models employing the Dice Loss and the other two compound losses, particularly with SegResNet, underperformed. Equivalence testing (TOST, alpha = 0.05, Delta = 20%) confirmed high performance for SUV metrics, lesion count, and TLA, with L1DFL yielding the best performance. By contrast, tumor volume and lesion spread exhibited greater variability. Bland-Altman, Coverage Probability, and Total Deviation Index analyses further highlighted that our proposed L1DFL minimizes variability in the quantification of the ground truth clinical measures. The code is publicly available at: https://github.com/ObedDzik/pca_segment.git.
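For orientation, the standard Dice and focal components that such compound losses build on are sketched below; the authors' L1-weighted variant (L1DFL) is not reproduced:

```python
import torch

def dice_focal_loss(pred, target, gamma=2.0, eps=1e-6):
    p = torch.sigmoid(pred)
    inter = (p * target).sum()
    dice = 1 - (2 * inter + eps) / (p.sum() + target.sum() + eps)  # soft Dice term
    bce = torch.nn.functional.binary_cross_entropy_with_logits(
        pred, target, reduction="none")
    # Focal term: down-weight easy voxels via (1 - p_true)^gamma.
    focal = ((1 - torch.exp(-bce)) ** gamma * bce).mean()
    return dice + focal

pred = torch.randn(2, 1, 8, 8, 8)                # toy PET segmentation logits
target = (torch.rand(2, 1, 8, 8, 8) > 0.9).float()
print(float(dice_focal_loss(pred, target)))
```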
Our everyday auditory experience is shaped by the acoustics of the indoor environments in which we live. Room acoustics modeling is aimed at establishing mathematical representations of acoustic wave propagation in such environments. These representations are relevant to a variety of problems ranging from echo-aided auditory indoor navigation to restoring speech understanding in cocktail party scenarios. Many disciplines in science and engineering have recently witnessed a paradigm shift powered by deep learning (DL), and room acoustics research is no exception. The majority of deep, data-driven room acoustics models are inspired by DL-based speech and image processing, and hence lack the intrinsic space-time structure of acoustic wave propagation. More recently, DL-based models for room acoustics that include either geometric or wave-based information have delivered promising results, primarily for the problem of sound field reconstruction. In this review paper, we will provide an extensive and structured literature review on deep, data-driven modeling in room acoustics. Moreover, we position these models in a framework that allows for a conceptual comparison with traditional physical and data-driven models. Finally, we identify strengths and shortcomings of deep, data-driven room acoustics models and outline the main challenges for further research.
This paper presents discrete codebook synthesis methods for self-interference (SI) suppression in a mmWave device, designed to support full-duplex integrated sensing and communication (FD ISAC). We formulate an SINR maximization problem that optimizes the RX and TX codewords, aimed at suppressing the near-field SI signal while maintaining the beamforming gain in the far-field sensing directions. The formulation considers the practical constraints of discrete RX and TX codebooks with quantized phase settings, as well as a TX beamforming gain requirement in the specified communication direction. Under an alternating optimization framework, the RX and TX codewords are iteratively optimized, with one fixed while the other is optimized. When the TX codeword is fixed, we show that the RX codeword optimization problem can be formulated as an integer quadratic fractional programming (IQFP) problem. Using Dinkelbach's algorithm, we transform the problem into a sequence of subproblems in which the numerator and the denominator of the objective function are decoupled. These subproblems, subject to discrete constraints, are then efficiently solved by the spherical search (SS) method. This overall approach is referred to as FP-SS. When the RX codeword is fixed, the TX codeword optimization problem can similarly be formulated as an IQFP problem, whereas an additional TX beamforming constraint for communication needs to be considered. The problem is solved through Dinkelbach's transformation followed by the constrained spherical search (CSS), and we refer to this approach as FP-CSS. Finally, we integrate the FP-SS and FP-CSS methods into a joint RX-TX codebook design approach. Simulations show that the proposed FP-SS and FP-CSS achieve the same SI suppression performance as the corresponding exhaustive search method, but at much lower complexity, and that the alternating optimization framework achieves even better SI suppression performance.
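Dinkelbach's transformation, which both FP-SS and FP-CSS rely on, reduces a fractional objective $N(w)/D(w)$ to a sequence of parametrized subproblems $\max_w N(w) - \lambda D(w)$. The generic sketch below uses a toy 2-bit codebook and exhaustive inner search in place of the spherical search:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)); A = A @ A.T + np.eye(4)   # "signal" quadratic form
B = rng.normal(size=(4, 4)); B = B @ B.T + np.eye(4)   # "SI + noise" quadratic form
phases = np.exp(1j * np.pi / 2 * np.arange(4))          # 2-bit phase alphabet
codebook = [np.array(c) for c in product(phases, repeat=4)]

lam = 0.0
for _ in range(20):
    # Inner subproblem: exhaustive search here; the paper's spherical search
    # avoids this enumeration.
    w = max(codebook,
            key=lambda w: (w.conj() @ A @ w - lam * (w.conj() @ B @ w)).real)
    num = (w.conj() @ A @ w).real
    den = (w.conj() @ B @ w).real
    if abs(num - lam * den) < 1e-9:        # Dinkelbach convergence test
        break
    lam = num / den                        # update the ratio parameter
print(lam)                                 # converged objective ratio
```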
This paper introduces a novel deep learning-based user-side feedback reduction framework, termed self-nomination. The goal of self-nomination is to reduce the number of users (UEs) feeding back channel state information (CSI) to the base station (BS), by letting each UE decide whether to feed back based on its estimated likelihood of being scheduled and its potential contribution to precoding in a multiuser MIMO (MU-MIMO) downlink. Unlike SNR- or SINR-based thresholding methods, the proposed approach uses rich spatial channel statistics and learns nontrivial correlation effects that affect eventual MU-MIMO scheduling decisions. To train the self-nomination network under an average feedback constraint, we propose two different strategies: one based on direct optimization with gradient approximations, and another using policy gradient-based optimization with a stochastic Bernoulli policy to handle non-differentiable scheduling. The framework also supports proportional-fair scheduling by incorporating dynamic user weights. Numerical results confirm that the proposed self-nomination method significantly reduces CSI feedback overhead. Compared to baseline feedback methods, self-nomination can reduce feedback by as much as 65%, saving not only bandwidth but also allowing many UEs to avoid feedback altogether (and thus, potentially enter a sleep mode). Self-nomination achieves this significant savings with negligible reduction in sum-rate or fairness.
Traditional channel acquisition faces significant limitations due to ideal model assumptions and scalability challenges. A novel environment-aware paradigm, known as channel twinning, tackles these issues by constructing radio propagation environment semantics using a data-driven approach. Within channel twinning technology, the radio map is recognized as an effective region-specific model for learning the spatial distribution of channel information. However, most studies focus on static channel map construction, and only a few collect numerous channel samples and use deep learning for radio map prediction. In this paper, we develop a novel dynamic radio map twinning framework that requires only a substantially small dataset. Specifically, we present an innovative approach that employs dynamic mode decomposition (DMD) to model the evolution of the dynamic channel gain map as a dynamical system. We first interpret dynamic channel gain maps as spatio-temporal video stream data. The coarse-grained and fine-grained evolving modes are extracted from the stream data using a new ensemble DMD (Ens-DMD) algorithm. To mitigate the impact of noisy data, we design a median-based threshold mask technique to filter the noise artifacts of the twin maps. With the proposed DMD-based radio map twinning framework, numerical results demonstrate low-complexity reproduction and evolution of the channel gain maps. Furthermore, we consider four radio map twin performance metrics to confirm the superiority of our framework over the baselines.
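The decomposition at the heart of this framework is standard (exact) DMD; a minimal sketch on stacked map snapshots follows, without the paper's ensemble construction or median-based masking:

```python
import numpy as np

def dmd(X, r=2):
    # X: (n_pixels, n_snapshots) stacked channel-gain maps over time.
    X1, X2 = X[:, :-1], X[:, 1:]
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    U, s, Vh = U[:, :r], s[:r], Vh[:r]                  # rank-r truncation
    Atilde = U.conj().T @ X2 @ Vh.conj().T @ np.diag(1.0 / s)
    eigvals, W = np.linalg.eig(Atilde)                  # per-mode temporal evolution
    modes = X2 @ Vh.conj().T @ np.diag(1.0 / s) @ W     # spatial DMD modes
    return eigvals, modes

rng = np.random.default_rng(0)
t = np.arange(80)
p1, p2 = rng.normal(size=(2, 100))                      # two toy spatial patterns
X = np.outer(p1, np.cos(0.2 * t)) + np.outer(p2, np.sin(0.2 * t))
X += 0.01 * rng.normal(size=X.shape)                    # measurement noise
eigvals, _ = dmd(X, r=2)
print(np.abs(eigvals), np.angle(eigvals))               # |lambda|~1, angles ~ +/-0.2
```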
This paper proposes an optimization framework for distributed resource logistics system design to support future multimission space exploration. The performance and impact of distributed In-Situ Resource Utilization (ISRU) systems in facilitating space transportation are analyzed. The proposed framework considers technology trade studies, deployment strategy, facility location evaluation, and resource logistics after production for distributed ISRU systems. We develop piecewise linear sizing and cost estimation models based on economies of scale that can be easily integrated into network-based mission planning formulations. A case study on a multi-mission cislunar logistics campaign is conducted to demonstrate the value of the proposed method and evaluate key tradeoffs to compare the performance of distributed ISRU systems with traditional concentrated ISRU. Finally, a comprehensive sensitivity analysis is performed to assess the proposed system under varying conditions, comparing concentrated and distributed ISRU systems.
This paper explores the application of movable antenna (MA), a cutting-edge technology with the capability of altering antenna positions, in a symbiotic radio (SR) system enabled by reconfigurable intelligent surface (RIS). The goal is to fully exploit the capabilities of both MA and RIS, constructing a better transmission environment for the co-existing primary and secondary transmission systems. For both parasitic SR (PSR) and commensal SR (CSR) scenarios with the channel uncertainties experienced by all transmission links, we design a robust transmission scheme with the goal of maximizing the primary rate while ensuring the secondary transmission quality. To address the maximization problem with thorny non-convex characteristics, we propose an alternating optimization framework that utilizes the General S-Procedure, General Sign-Definiteness Principle, successive convex approximation (SCA), and simulated annealing (SA) improved particle swarm optimization (SA-PSO) algorithms. Numerical results validate that the CSR scenario significantly outperforms the PSR scenario in terms of primary rate, and also show that compared to the fixed-position antenna scheme, the proposed MA scheme can increase the primary rate by 1.62 bps/Hz and 2.37 bps/Hz for the PSR and CSR scenarios, respectively.
In this paper, we investigate an unmanned aerial vehicle (UAV)-enabled secure communication scenario in which a cluster of UAVs forms a virtual non-uniform linear array (NULA) to communicate with a base station (BS) in the presence of eavesdroppers (Eves). Our goal is to design the UAV topology, trajectory, and precoding to maximize the system channel capacity. To this end, we convert the original problem into an equivalent two-stage problem. Specifically, we first maximize the channel gain by meticulously designing the UAV topology. We then study the joint optimization of the trajectory and precoding to minimize the total transmit power while satisfying the constraints on providing quality-of-service (QoS) assurance to the BS, the leakage tolerance to Eves, the per-UAV transmit power, the initial/final locations, and the cylindrical no-fly zones. For the UAV topology design, we prove that the topology follows the Fekete-point distribution. The joint design of trajectory and precoding is formulated as a non-convex optimization problem that is generally intractable. The non-convex constraints are therefore converted into convex terms, and a double-loop search algorithm is proposed to solve the transmit power minimization problem. Random rotation offsets are introduced to produce a dynamic stochastic channel that enhances security. Numerical results demonstrate the superiority of the proposed method in improving capacity.
In this paper, we propose a hierarchical distributed timing architecture based on an ensemble of miniature atomic clocks. The goal is to ensure synchronized and accurate timing in a normal operating mode where Global Navigation Satellite System (GNSS) signals are available, as well as in an emergency operating mode during GNSS failures. At the lower level, the miniature atomic clocks employ a distributed control strategy that uses only local information to ensure synchronization in both modes. The resulting synchronized time or generated time scale has the best frequency stability, as measured by the Allan variance, over the short control period. In the upper layer, a supervisor controls the long-term behavior of the generated time scale. In the normal operating mode, the supervisor periodically anchors the generated time scale to the standard time based on GNSS signals, while in the emergency operating mode, it applies optimal floating control to reduce the divergence rate of the generated time scale, which is not observable from the measurable time difference between the miniature atomic clocks. This floating control aims to explicitly control the generated time scale to have the least Allan variance over the long control period. Finally, numerical examples are provided to demonstrate the effectiveness and feasibility of the architecture in high-precision, GNSS-resilient atomic timing.
This paper considers the problem of solving constrained reinforcement learning problems with anytime guarantees, meaning that the algorithmic solution returns a safe policy regardless of when it is terminated. Drawing inspiration from anytime constrained optimization, we introduce Reinforcement Learning-based Safe Gradient Flow (RL-SGF), an on-policy algorithm which employs estimates of the value functions and their respective gradients associated with the objective and safety constraints for the current policy, and updates the policy parameters by solving a convex quadratically constrained quadratic program. We show that if the estimates are computed with a sufficiently large number of episodes (for which we provide an explicit bound), safe policies are updated to safe policies with a probability higher than a prescribed tolerance. We also show that iterates asymptotically converge to a neighborhood of a KKT point, whose size can be arbitrarily reduced by refining the estimates of the value function and their gradients. We illustrate the performance of RL-SGF in a navigation example.
In conventional deep speaker embedding frameworks, the pooling layer aggregates all frame-level features over time and computes their mean and standard deviation statistics as inputs to subsequent segment-level layers. Such a statistics pooling strategy produces fixed-length representations from variable-length speech segments. However, this method treats different frame-level features equally and discards covariance information. In this paper, we propose the Semi-orthogonal parameter pooling of Covariance matrix (SoCov) method. SoCov pooling computes the covariance matrix from the self-attentive frame-level features and compresses it into a vector using semi-orthogonal parametric vectorization, which is then concatenated with the weighted standard deviation vector to form the input to the segment-level layers. The deep embedding based on SoCov is called the ``sc-vector''. The proposed sc-vector is compared to several different baselines on the SRE21 development and evaluation sets. The sc-vector system significantly outperforms the conventional x-vector system, with a relative reduction in EER of 15.5% on SRE21Eval. When using self-attentive deep features, SoCov helps to reduce EER on SRE21Eval by about 30.9% relative to the conventional ``mean + standard deviation'' statistics.
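An illustrative covariance pooling layer in this spirit is sketched below; the dimensions and the penalty pushing the projection toward semi-orthogonality are assumptions, not the authors' SoCov parameterization:

```python
import torch
import torch.nn as nn

class CovPooling(nn.Module):
    def __init__(self, feat_dim=64, out_dim=128):
        super().__init__()
        # Learned projection that compresses the flattened covariance.
        self.P = nn.Parameter(torch.randn(out_dim, feat_dim * feat_dim) * 0.01)

    def forward(self, frames):                  # frames: (B, T, D)
        mu = frames.mean(dim=1, keepdim=True)
        centered = frames - mu
        cov = centered.transpose(1, 2) @ centered / (frames.shape[1] - 1)
        return cov.flatten(1) @ self.P.t()      # (B, out_dim) pooled vector

    def ortho_penalty(self):                    # push P toward semi-orthogonality
        PPt = self.P @ self.P.t()
        return ((PPt - torch.eye(self.P.shape[0])) ** 2).sum()

pool = CovPooling()
frames = torch.randn(4, 200, 64)                # toy frame-level features
print(pool(frames).shape, float(pool.ortho_penalty()))
```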
The recently introduced discrete power-based control (Ruderman (2024b)) largely reduces the communication effort in the control loop when compensating for marginally damped or even slowly diverging output oscillations. The control commutates twice per oscillation period (at the amplitude peaks) and uses only the measured harmonic output. The power-based control scheme requires knowledge of the instantaneous frequency, amplitude, and bias parameters of the harmonic signal. This paper extends the power-based control with finite-time estimation of the biased harmonics (Ahmed et al. (2022)). An improved analytic calculation of the impulse weighting factor is also provided. The power-based oscillation control with online estimation of the harmonic parameters is evaluated experimentally on a fifth-order actuator system with a free-hanging load under gravity and measurement noise.
In modern large-scale systems with sensor networks and IoT devices, it is essential to collaboratively solve complex problems while utilizing network resources efficiently. In this paper, we present three distributed optimization algorithms that exhibit efficient communication among nodes. Our first algorithm is a simple quantized averaged gradient procedure for distributed optimization, which is shown to converge to a neighborhood of the optimal solution. Our second algorithm incorporates a novel event-triggered refinement mechanism, which refines the utilized quantization level to enhance the precision of the estimated optimal solution; it enables nodes to terminate their operation according to predefined performance guarantees. Our third algorithm is tailored to operate in environments where each message consists of only a few bits. It incorporates a novel event-triggered mechanism for adjusting the quantizer basis and quantization level, allowing nodes to collaboratively decide on operation termination based on predefined performance criteria. We analyze the three algorithms and establish their linear convergence. Finally, an application to distributed sensor fusion for target localization demonstrates their favorable performance compared to existing algorithms in the literature.
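The flavor of the first algorithm can be sketched as follows; the four-node ring, uniform mid-tread quantizer, and quadratic local losses are all toy assumptions:

```python
import numpy as np

def quantize(x, delta=0.05):
    return delta * np.round(x / delta)         # uniform mid-tread quantizer

# Ring of 4 nodes, each with local loss f_i(x) = (x - a_i)^2.
a = np.array([1.0, 2.0, 3.0, 4.0])             # optimum of the average loss: 2.5
x = np.zeros(4)
W = np.array([[.5, .25, 0, .25], [.25, .5, .25, 0],
              [0, .25, .5, .25], [.25, 0, .25, .5]])   # doubly stochastic mixing

for t in range(200):
    # Consensus on quantized neighbor values, then a local gradient step.
    x = W @ quantize(x) - 0.05 * 2 * (x - a)
print(x)   # each entry near 2.5, within a quantization-dependent neighborhood
```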
Optical wireless communication (OWC) is envisioned as a key enabler for immersive indoor data transmission in future wireless communication networks. However, multi-user interference management arises as a challenge in dense indoor OWC systems composed of multiple optical access points (APs) serving multiple users. In this paper, we propose a novel dual-function OWC system for communication and localization. Non-orthogonal multiple access (NOMA) with random linear network coding (RLNC) is designed for data transmission, where NOMA allows serving multiple users simultaneously through control of the power domain, and RLNC helps minimize errors that might occur during the signal processing phase. This setup is assisted by a light detection and localization system (LiDAL) that can passively obtain spatio-temporal indoor information on user presence and location for dynamic user grouping. The designed LiDAL system helps improve the estimation of channel state information (CSI) in realistic indoor network scenarios, where the CSI of indoor users might be noisy and/or highly correlated. We evaluate the performance of NOMA combined with RLNC by analyzing the probability of successful decoding compared to conventional NOMA and orthogonal schemes. In addition, we derive the Cramer-Rao Lower Bound (CRLB) to evaluate the accuracy of location estimation. The results show that the proposed RLNC-NOMA improves the probability of successful decoding and the overall system performance. The results also show the high accuracy of the unbiased location estimator and its role in reducing CSI imperfections, leading to high overall system performance.
The performance of irregular phased array architectures is assessed in the context of multi-user multiple-input multiple-output (MU-MIMO) communications operating beyond 100 GHz. Realizing half-wavelength-spaced planar phased arrays is challenging due to the conflict between the wavelength and the integrated circuit (IC) size at these frequencies, where the antenna dimensions are comparable to the IC size. Therefore, irregular array architectures such as thinned and clustered arrays are developed to mitigate the wavelength-IC size conflict. In thinned arrays, radiating elements are permanently deactivated, while in clustered arrays, neighboring elements are grouped into subarrays. Furthermore, the irregular arrays are integrated with hybrid beamforming architectures to manage the complexity introduced by fully digital beamforming, where a single radio frequency chain is connected per power amplifier. An optimization problem is formulated to determine the optimal arrangement of antenna elements, in which the trade-off between spectral efficiency (SE) and sidelobe level (SLL) can be tuned. Clustered array configurations are optimized by genetic algorithm and Algorithm-X based methodologies, where the former relies on a randomized search and the latter exploits brute-force search. Furthermore, a prototype array is designed on a printed circuit board (PCB) to verify the proposed methodologies through full-wave simulations. For a fair comparison, clustered arrays with groupings of two and four elements are compared with thinned arrays with half and quarter thinning ratios, respectively. The combination of hybrid and irregular array architectures leads to minimal or no performance degradation for hybrid fully connected architectures, but severe SE and SLL degradation for hybrid partially connected architectures.
This article investigates the application of pinching-antenna systems (PASS) in multiuser multiple-input single-output (MISO) communications. Two sum-rate maximization problems are formulated under minimum mean square error (MMSE) decoding, with and without successive interference cancellation (SIC). To address the joint optimization of pinching antenna locations and user transmit powers, a fractional programming-based approach is proposed. Numerical results validate the effectiveness of the proposed method and show that PASS can significantly enhance uplink sum-rate performance compared to conventional fixed-antenna designs.
The goal of many applications in the energy and transport sectors is to control turbulent flows. However, because of chaotic dynamics and high dimensionality, the control of turbulent flows is exceedingly difficult. Model-free reinforcement learning (RL) methods can discover optimal control policies by interacting with the environment, but they require full state information, which is often unavailable in experimental settings. We propose a data-assimilated model-based RL (DA-MBRL) framework for systems with partial observability and noisy measurements. Our framework employs a control-aware Echo State Network for data-driven prediction of the dynamics and integrates data assimilation with an Ensemble Kalman Filter for real-time state estimation. An off-policy actor-critic algorithm is employed to learn optimal control strategies from state estimates. The framework is tested on the Kuramoto-Sivashinsky equation, demonstrating its effectiveness in stabilizing a spatiotemporally chaotic flow from noisy and partial measurements.
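A minimal echo state network of the kind used as the surrogate model is sketched below; the reservoir size and spectral radius are illustrative, and the paper's control-aware input structure and EnKF coupling are omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
n_res, n_in = 200, 1
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # spectral radius < 1 (echo state)

def run(us):
    x, states = np.zeros(n_res), []
    for u in us:
        x = np.tanh(W @ x + W_in @ np.atleast_1d(u))  # reservoir update
        states.append(x.copy())
    return np.array(states)

u = np.sin(0.1 * np.arange(500))                  # toy observed signal
S = run(u)                                        # reservoir state trajectory
# Ridge-regression readout predicting the next observation (50-step washout).
S_w, y_w = S[50:-1], u[51:]
ridge = 1e-6
W_out = np.linalg.solve(S_w.T @ S_w + ridge * np.eye(n_res), S_w.T @ y_w)
print(np.mean((S_w @ W_out - y_w) ** 2))          # small one-step prediction error
```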
This paper investigates unmanned aerial vehicle (UAV)-mounted intelligent reflecting surfaces (IRS) to leverage the benefits of this technology for future communication networks, such as 6G. Key advantages include enhanced spectral and energy efficiency, expanded network coverage, and flexible deployment. One of the main challenges in employing UAV-mounted IRS (UMI) technology is the random fluctuation of hovering UAVs. Focusing on this challenge, this paper explores the capabilities of UMI with passive/active elements affected by UAV fluctuations in both the horizontal and vertical angles, considering the three-dimensional (3D) radiation pattern of the IRS. The relationship between UAV fluctuations and the IRS pattern is investigated by taking into account the random angular vibrations of UAVs. A tractable closed-form distribution function for the IRS pattern is derived using a linear approximation and by dividing the pattern into several sectors. In addition, closed-form expressions for the outage probability (OP) are obtained using the central limit theorem (CLT) and a Gamma approximation. The theoretical expressions are validated through Monte Carlo simulations. The findings indicate that the random fluctuations of hovering UAVs have a notable impact on the performance of UMI systems. To avoid link interruptions due to UAV instability, the IRS should utilize fewer elements, even though this decreases directivity. As a result, unlike terrestrial IRS, incorporating more elements into aerial IRS systems does not necessarily improve performance under UAV fluctuations. Numerical results show that the OP can be minimized by selecting the optimal number of IRS elements and using active elements.
Array structures based on the sum and difference co-arrays provide more degrees of freedom (DOF). However, since the growth of DOF is limited by a single case of sum and difference co-arrays, this paper aims to design a sparse linear array (SLA) with higher DOF by exploring different cases of second-order cumulants. We present a mathematical framework based on the second-order cumulant to devise a second-order extended co-array (SO-ECA) and define the redundancy of the SO-ECA. Based on the SO-ECA, a novel array is proposed, namely the low-redundancy sum and difference array (LR-SDA), which provides closed-form expressions for the sensor positions and enhances the DOF so as to resolve more signal sources in direction of arrival (DOA) estimation of non-circular (NC) signals. For the LR-SDA, the maximum DOF under a given number of total physical sensors can be derived, and the SO-ECA of the LR-SDA is hole-free. Further, the corresponding necessary and sufficient conditions of signal reconstruction for the LR-SDA are derived. Additionally, the redundancy and weight function of the LR-SDA are defined, and the lower bound of the redundancy for the LR-SDA is derived. The proposed LR-SDA achieves higher DOF and lower redundancy than existing DCAs designed based on sum and difference co-arrays. Numerical simulations are conducted to verify the superiority of the LR-SDA over other existing DCAs in terms of DOA estimation performance and enhanced DOF.
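The co-array construction underlying this design is easy to state in code; the toy sensor positions below are not the proposed LR-SDA geometry:

```python
import numpy as np

pos = np.array([0, 1, 4, 9, 11])                       # sensor positions (toy SLA)
diff = np.unique(pos[:, None] - pos[None, :])          # difference co-array lags
ssum = np.unique(pos[:, None] + pos[None, :])          # sum co-array lags
virtual = np.unique(np.concatenate([diff, ssum, -ssum]))
print("physical sensors:", len(pos))
print("virtual lags (DOF proxy):", len(virtual))       # far exceeds len(pos)
```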
The mean square error (MSE)-optimal estimator is known to be the conditional mean estimator (CME). This paper introduces a parametric channel estimation technique based on Bayesian estimation, which uses the estimated channel parameters to parameterize the well-known linear minimum mean square error (LMMSE) channel estimator. We first derive an asymptotic CME formulation that holds for a wide range of priors on the channel parameters. Based on this, we show that parametric Bayesian channel estimation is MSE-optimal for high signal-to-noise ratio (SNR) and/or long coherence intervals, i.e., when many noisy observations are provided within one coherence interval. Numerical simulations validate the derived formulations.
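The estimator being parameterized is the classic LMMSE filter $\hat{h} = C_h (C_h + \sigma^2 I)^{-1} y$ for $y = h + n$; a numerical sketch with an assumed exponential-correlation prior follows:

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho, sigma2 = 16, 0.9, 0.1
# Toy parametric prior: exponential spatial correlation with parameter rho.
C_h = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))

h = rng.multivariate_normal(np.zeros(n), C_h)          # true channel draw
y = h + np.sqrt(sigma2) * rng.normal(size=n)           # noisy observation
W = C_h @ np.linalg.inv(C_h + sigma2 * np.eye(n))      # LMMSE filter
h_hat = W @ y
print("LMMSE mse:", round(float(np.mean((h_hat - h) ** 2)), 4),
      "raw mse:", round(float(np.mean((y - h) ** 2)), 4))  # LMMSE typically lower
```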
Accurately forecasting sea ice concentration (SIC) in the Arctic is critical to global ecosystem health and navigation safety. However, current methods still face two challenges: 1) they rarely explore long-term feature dependencies in the frequency domain; 2) they can hardly preserve high-frequency details, so changes in the marginal area of the sea ice cannot be accurately captured. To this end, we present a Frequency-Compensated Network (FCNet) for Arctic SIC prediction on a daily basis. In particular, we design a dual-branch network with branches for frequency feature extraction and convolutional feature extraction. For frequency feature extraction, we design an adaptive frequency filter block that integrates trainable layers with Fourier-based filters. By adding frequency features, FCNet achieves refined prediction of edges and details. For convolutional feature extraction, we propose a high-frequency enhancement block to separate high- and low-frequency information. Moreover, high-frequency features are enhanced via channel-wise attention, and a temporal attention unit is employed for low-frequency feature extraction to capture long-range sea ice changes. Extensive experiments are conducted on a satellite-derived daily SIC dataset, and the results verify the effectiveness of the proposed FCNet. Our code and data will be made publicly available at: https://github.com/oucailab/FCNet .
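An adaptive frequency filter block in this spirit can be sketched as a trainable complex mask applied in the Fourier domain; the sizes are illustrative and this is not the authors' FCNet code:

```python
import torch
import torch.nn as nn

class AdaptiveFrequencyFilter(nn.Module):
    def __init__(self, h=64, w=64):
        super().__init__()
        # One learnable complex weight per rFFT frequency bin.
        self.weight = nn.Parameter(torch.ones(h, w // 2 + 1, dtype=torch.cfloat))

    def forward(self, x):                       # x: (B, C, H, W) SIC maps
        X = torch.fft.rfft2(x)                  # to the frequency domain
        X = X * self.weight                     # learned per-bin filtering
        return torch.fft.irfft2(X, s=x.shape[-2:])

x = torch.randn(2, 1, 64, 64)
print(AdaptiveFrequencyFilter()(x).shape)       # -> torch.Size([2, 1, 64, 64])
```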
The examination of chest X-ray images is a crucial component in detecting various thoracic illnesses. This study introduces a new image description generation model that integrates a Vision Transformer (ViT) encoder with cross-modal attention and a GPT-4-based transformer decoder. The ViT captures high-quality visual features from chest X-rays, which are fused with text data through cross-modal attention to improve the accuracy, context, and richness of image descriptions. The GPT-4 decoder transforms these fused features into accurate and relevant captions. The model was tested on the National Institutes of Health (NIH) and Indiana University (IU) Chest X-ray datasets. On the IU dataset, it achieved scores of 0.854 (B-1), 0.883 (CIDEr), 0.759 (METEOR), and 0.712 (ROUGE-L). On the NIH dataset, it achieved the best performance on all metrics: BLEU 1--4 (0.825, 0.788, 0.765, 0.752), CIDEr (0.857), METEOR (0.726), and ROUGE-L (0.705). This framework has the potential to enhance chest X-ray evaluation, assisting radiologists in more precise and efficient diagnosis.
This paper studies a passive source localization system, where a single base station (BS) is employed to estimate the positions and attitudes of multiple mobile stations (MSs). The BS and the MSs are equipped with uniform rectangular arrays, and the MSs are located in the near-field region of the BS array. To avoid the difficulty of tackling the problem directly based on the near-field signal model, we establish a subarray-wise far-field received signal model. In this model, the entire BS array is divided into multiple subarrays to ensure that each MS is in the far-field region of each BS subarray. By exploiting the angles of arrival (AoAs) of an MS antenna at different BS subarrays, we formulate the attitude and location estimation problem under the Bayesian inference framework. Based on the factor graph representation of the probabilistic problem model, a message passing algorithm named array partitioning based pose and location estimation (APPLE) is developed to solve this problem. An estimation-error lower bound is obtained as a performance benchmark of the proposed algorithm. Numerical results demonstrate that the proposed APPLE algorithm outperforms other baseline methods in the accuracy of position and attitude estimation.
Multiobject tracking provides situational awareness that enables new applications for modern convenience, applied ocean sciences, public safety, and homeland security. In many multiobject tracking applications, including radar and sonar tracking, after coherent prefiltering of the received signal, measurement data is typically structured in cells, where each cell represents, e.g., a different range and bearing value. While conventional detect-then-track (DTT) multiobject tracking approaches convert the cell-structured data within a detection phase into so-called point measurements in order to reduce the amount of data, track-before-detect (TBD) methods process the cell-structured data directly, avoiding a potential information loss. However, many TBD tracking methods are computationally intensive and suffer reduced tracking accuracy when objects interact, i.e., when they come into close proximity. We counteract these difficulties by introducing the concept of probabilistic object-to-cell contributions. Like many conventional DTT methods, our approach uses a probabilistic association of objects with data cells, and a new object contribution model with corresponding object contribution probabilities further associates cell contributions to objects that occupy the same data cell. Furthermore, to keep the computational complexity and filter runtimes low, we use an efficient Poisson multi-Bernoulli filtering approach in combination with belief propagation for fast probabilistic data association. We demonstrate numerically that our method achieves significantly increased tracking performance compared to state-of-the-art TBD tracking approaches, with performance differences particularly pronounced when multiple objects interact.
This paper introduces a Distributed Unknown Input Observer (D-UIO) design methodology that uses a technique called node-wise detectability decomposition to estimate the state of a discrete-time linear time-invariant (LTI) system in a distributed way, even in the presence of noisy measurements and unknown inputs. In the considered scenario, sensors are associated with nodes of an underlying communication graph. Each node has a limited scope, as it can only access local measurements and share data with its neighbors. The problem of designing the observer gains is divided into two separate sub-problems: (i) designing local output injection gains to mitigate the impact of measurement noise, and (ii) designing diffusive gains to compensate for the lack of information through a consensus protocol. A direct and computationally efficient synthesis strategy is formulated via linear matrix inequalities (LMIs) and solved by semidefinite programming. Finally, two simulation scenarios are presented to illustrate the effectiveness of the distributed observer when two different node-wise decompositions are adopted.
Reconfigurable intelligent surfaces (RISs) are emerging as key enablers of reliable industrial automation in the millimeter-wave (mmWave) band, particularly in environments with frequent line-of-sight (LoS) blockage. While prior works have largely focused on theoretical aspects, real-time validation under user mobility remains underexplored. In this work, we propose and experimentally evaluate a self-adaptive beamforming algorithm that enables RIS reconfiguration based on a low-rate feedback link from the mobile user equipment (UE) to the RIS controller. The algorithm maintains received signal power above a predefined threshold without requiring UE position knowledge. Using a hexagonal RIS with 127 elements operating at 23.8 GHz, we validate our approach in a semi-anechoic environment over a 60 cm × 100 cm observation area. The results demonstrate up to 24 dB gain in received power compared to the baseline with inactive RIS elements, highlighting the practical benefits of adaptive RIS control in mobile non-line-of-sight (NLoS) scenarios.
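A feedback-driven adaptation loop of this kind can be sketched as below: per-element 1-bit phase probing guided only by a scalar received-power report, with no UE position knowledge. The random channel and the probing schedule are toy assumptions, not the evaluated algorithm:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 127                                           # RIS elements, as in the prototype
h = rng.normal(size=N) + 1j * rng.normal(size=N)  # toy cascaded BS-RIS-UE channel

def rx_power(phases):                             # stands in for the UE feedback report
    return np.abs(np.sum(h * phases)) ** 2

phases = np.ones(N)                               # all elements start at 0 rad
p0 = rx_power(phases)
for sweep in range(5):                            # a few feedback-driven sweeps
    for i in range(N):
        trial = phases.copy()
        trial[i] *= -1                            # probe a 1-bit phase flip
        if rx_power(trial) > rx_power(phases):    # keep flips the UE reports as better
            phases = trial
print(round(10 * np.log10(rx_power(phases) / p0), 1), "dB gain over initial state")
```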
To provide safety guarantees for learning-based control systems, recent work has developed formal verification methods to apply after training ends. However, if the trained policy does not meet the specifications, or there is conservatism in the verification algorithm, establishing these guarantees may not be possible. Instead, this work proposes to perform verification throughout training to ultimately aim for policies whose properties can be evaluated throughout runtime with lightweight, relaxed verification algorithms. The approach is to use differentiable reachability analysis and incorporate new components into the loss function. Numerical experiments on a quadrotor model and unicycle model highlight the ability of this approach to lead to learned control policies that satisfy desired reach-avoid and invariance specifications.
Robust GNSS positioning in urban environments is still plagued by multipath effects, particularly due to the complex signal propagation induced by ubiquitous surfaces with varied radio-frequency reflectivities. Current 3D-Mapping-Aided (3DMA) GNSS techniques show great potential in mitigating multipath but face a critical trade-off between computational efficiency and modeling accuracy. Most approaches rely on offline maps that are outdated or oversimplified; real-time LiDAR-based reconstruction offers high accuracy but is problematic under low laser reflectivity; and camera-based 3DMA is a good candidate for balancing accuracy and efficiency, but current methods suffer from extremely low reconstruction speed, far from real-time multipath-mitigated navigation. This paper proposes an accelerated framework incorporating camera multi-view stereo (MVS) reconstruction and ray tracing. By hypothesizing on surface textures, an orthogonal visual feature fusion framework is proposed that robustly handles both texture-rich and texture-poor surfaces, lifting the reflectivity challenges in visual reconstruction. A polygonal surface modeling scheme is further integrated to accurately delineate complex building boundaries, enhancing reconstruction granularity. To avoid unnecessarily detailed reconstruction, reprojected point cloud multi-plane fitting and two complexity control strategies are proposed, improving multipath estimation speed. Experiments were conducted in Lujiazui, Shanghai, a typical multipath-prone district. The results show that the method achieves an average reconstruction accuracy of 2.4 meters in dense urban environments featuring glass curtain wall structures, a traditionally difficult case for reconstruction, and achieves a ray-tracing-based multipath correction rate of 30 image frames per second, 10 times faster than contemporary benchmarks.
This paper presents a performance analysis of the Non-Primary Channel Access (NPCA) mechanism, a new feature introduced in IEEE 802.11bn to enhance spectrum utilization in Wi-Fi networks. NPCA enables devices to contend for and transmit on the secondary channel when the primary channel is occupied by transmissions from an Overlapping Basic Service Set (OBSS). We develop a Continuous-Time Markov Chain (CTMC) model that captures the interactions among OBSSs in dense WLAN environments when NPCA is enabled, incorporating new NPCA-specific states and transitions. In addition to the analytical insights offered by the model, we conduct numerical evaluations and simulations to quantify NPCA's impact on throughput and channel access delay across various scenarios. Our results show that NPCA can significantly improve throughput and reduce access delays in favorable conditions for BSSs that support the mechanism. Moreover, NPCA helps mitigate the OBSS performance anomaly, where low-rate OBSS transmissions degrade network performance for all nearby devices. However, we also observe trade-offs: NPCA may increase contention on secondary channels, potentially reducing transmission opportunities for BSSs operating there.
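To make the modeling approach concrete, the toy CTMC below (states and rates are illustrative and far simpler than the paper's model) adds one NPCA state in which BSS 1 moves to the secondary channel while BSS 2 holds the primary; the stationary distribution then yields airtime/throughput-style metrics.

```python
import numpy as np

# States: 0 idle, 1 BSS1 on primary, 2 BSS2 on primary,
#         3 BSS2 on primary + BSS1 on secondary (the NPCA state).
lam1, lam2, mu = 1.0, 1.0, 2.0          # illustrative access/service rates
Q = np.array([
    [-(lam1 + lam2), lam1, lam2,          0.0   ],
    [ mu,           -mu,   0.0,           0.0   ],
    [ mu,            0.0, -(mu + lam1),   lam1  ],  # NPCA switch to secondary
    [ 0.0,           mu,   mu,           -2 * mu],
])
A = np.vstack([Q.T, np.ones(4)])        # solve pi @ Q = 0 with sum(pi) = 1
b = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
pi = np.linalg.lstsq(A, b, rcond=None)[0]
airtime_bss1 = pi[1] + pi[3]            # BSS1 transmits in states 1 and 3
print(pi, airtime_bss1)
```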
We train and deploy a quantized 1D convolutional neural network model to conduct speech recognition on a highly resource-constrained IoT edge device. This can be useful in various Internet of Things (IoT) applications, such as smart homes and ambient assisted living for the elderly and people with disabilities, to name a few. In this paper, we first create a new dataset with over one hour of audio data that enables our research and will be useful to future studies in this field. Second, we utilize the technologies provided by Edge Impulse to enhance our model's performance and achieve a high accuracy of up to 97% on our dataset. For validation, we implement our prototype on the Arduino Nano 33 BLE Sense microcontroller board. This microcontroller board is specifically designed for IoT and AI applications, making it an ideal choice for our target use case scenarios. While most existing research focuses on a limited set of keywords, our model can process 23 different keywords, enabling complex commands.
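A minimal sketch of this kind of pipeline (layer sizes, feature shapes, and the calibration data are illustrative assumptions, not the exact Edge Impulse model): a small Conv1D network over MFCC-style features, followed by post-training int8 quantization for the microcontroller.

```python
import numpy as np
import tensorflow as tf

N_MFCC_FRAMES, N_COEFFS, N_KEYWORDS = 49, 13, 23

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(N_MFCC_FRAMES, N_COEFFS)),
    tf.keras.layers.Conv1D(8, 3, activation="relu"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(N_KEYWORDS, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(x_train, y_train, ...)  # train on the MFCC features first

def representative_data():
    for _ in range(100):                 # calibration samples for int8 ranges
        yield [np.random.rand(1, N_MFCC_FRAMES, N_COEFFS).astype("float32")]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
open("kws_int8.tflite", "wb").write(converter.convert())
```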
Handling objects with unknown or changing masses is a common challenge in robotics, often leading to errors or instability if the control system cannot adapt in real time. In this paper, we present a novel approach that enables a six-degrees-of-freedom robotic manipulator to reliably follow waypoints while automatically estimating and compensating for unknown payload weight. Our method integrates an admittance control framework with a mass estimator, allowing the robot to dynamically update an excitation force to compensate for the payload mass. This strategy mitigates end-effector sagging and preserves stability when handling objects of unknown weights. We experimentally validated our approach in a challenging pick-and-place task on a shelf with a crossbar, demonstrating improved waypoint-reaching accuracy and more compliant motion compared to a baseline admittance-control scheme. By safely accommodating unknown payloads, our work enhances flexibility in robotic automation and represents a significant step forward in adaptive control for uncertain environments.
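A one-axis caricature of the idea (the gains and the first-order estimator are illustrative; the paper's 6-DoF formulation is richer): low-pass-estimate the payload mass from the vertical force residual and cancel its weight inside the admittance law so the end effector stops sagging. Here `force_stream` is a hypothetical iterable of measured vertical forces.

```python
G = 9.81
M, D, K = 2.0, 20.0, 100.0      # virtual admittance mass, damping, stiffness
DT, ALPHA = 0.002, 0.05         # control period (s) and estimator gain

def run(force_stream):
    """force_stream: measured vertical wrist forces (N); payload pulls negative."""
    m_hat, z, zd = 0.0, 0.0, 0.0
    for f_meas in force_stream:
        m_hat += ALPHA * ((-f_meas / G) - m_hat)   # low-pass payload-mass estimate
        f_comp = f_meas + m_hat * G                # cancel the estimated weight
        zdd = (f_comp - D * zd - K * z) / M        # admittance dynamics
        zd += zdd * DT
        z += zd * DT
        yield z     # compliant vertical offset to add to the commanded waypoint
```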
This paper explores the idea of using phonemes as a textual representation within a conventional multilingual simultaneous speech-to-speech translation pipeline, as opposed to the traditional reliance on text-based language representations. To investigate this, we trained an open-source sequence-to-sequence model on the WMT17 dataset in two formats: one using standard textual representation and the other employing phonemic representation. The performance of both approaches was assessed using the BLEU metric. Our findings show that the phonemic approach provides comparable quality while offering several advantages, including lower resource requirements and better suitability for low-resource languages.
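One plausible preprocessing step (the paper's exact grapheme-to-phoneme toolchain is not specified here) uses the open-source phonemizer package to map text to phoneme strings before tokenization:

```python
from phonemizer import phonemize  # pip install phonemizer (espeak backend required)

sentences = ["the cat sat on the mat", "machine translation is fun"]
ipa = phonemize(sentences, language="en-us", backend="espeak", strip=True)
print(ipa)  # IPA strings, then tokenized and fed to the same
            # sequence-to-sequence model as the textual baseline
```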
This paper presents the design and implementation of an AI vision-controlled orthotic hand exoskeleton to enhance rehabilitation and assistive functionality for individuals with hand mobility impairments. The system leverages a Google Coral Dev Board Micro with an Edge TPU to enable real-time object detection using a customized MobileNet\_V2 model trained on a six-class dataset. The exoskeleton autonomously detects objects, estimates proximity, and triggers pneumatic actuation for grasp-and-release tasks, eliminating the user-specific calibration required by traditional EMG-based systems. The design prioritizes compactness, featuring an internal 1300 mAh battery that achieves an 8-hour runtime. Experimental results demonstrate a 51 ms inference speed, a significant improvement over prior iterations, though challenges persist in model robustness under varying lighting conditions and object orientations. While the most recent YOLO model (YOLOv11) showed potential with 15.4 FPS performance, quantization issues hindered deployment. The prototype underscores the viability of vision-controlled exoskeletons for real-world assistive applications, balancing portability, efficiency, and real-time responsiveness, while highlighting future directions for model optimization and hardware miniaturization.
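The grasp logic reduces to a small event loop; in the sketch below, `camera`, `detect`, and the valve callbacks are hypothetical stand-ins for the Coral camera, Edge TPU inference, and pneumatic interfaces, and the pixel thresholds are invented for illustration.

```python
import time

GRASP_PX, RELEASE_PX = 120, 60     # illustrative proximity thresholds

def control_loop(camera, detect, valve_open, valve_close):
    """Detect the object, estimate proximity from bounding-box size,
    and toggle the pneumatic valve for grasp-and-release."""
    holding = False
    while True:
        frame = camera.read()
        det = detect(frame)                    # MobileNet_V2 on the Edge TPU
        if det is not None:
            h = det.box_height_px              # larger box = object is closer
            if not holding and h > GRASP_PX:   # close enough: grasp
                valve_open(); holding = True
            elif holding and h < RELEASE_PX:   # object withdrawn: release
                valve_close(); holding = False
        time.sleep(0.051)                      # ~51 ms inference budget
```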
Modern control algorithms require tuning of the square weight/penalty matrices appearing in quadratic functions/costs to improve performance and/or stability. Due to their simplicity in gain-tuning and in enforcing positive-definiteness, diagonal penalty matrices are used extensively in control methods such as the linear quadratic regulator (LQR), model predictive control, and Lyapunov-based control. In this paper, we propose an eigendecomposition approach to parameterize penalty matrices, allowing positive-definiteness with non-zero off-diagonal entries to be satisfied implicitly, which not only offers notable computational and implementation advantages but also broadens the class of achievable controls. We solve three control problems: 1) a variation of Zermelo's navigation problem, 2) minimum-energy spacecraft attitude control using both LQR and Lyapunov-based methods, and 3) minimum-fuel and minimum-time Lyapunov-based low-thrust trajectory design. Particle swarm optimization is used to optimize the decision variables that parameterize the penalty matrices. The results demonstrate improvements of up to 65% in the performance objective in the example problems when using the proposed method.
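The core trick can be shown in a few lines (a 3×3 case for illustration): build an orthogonal eigenvector matrix from the matrix exponential of a skew-symmetric matrix, and exponentiate the eigenvalue decision variables, so positive-definiteness holds for any unconstrained real parameters a particle swarm might propose.

```python
import numpy as np
from scipy.linalg import expm

def penalty_matrix(theta, log_eigs):
    """Parameterize a positive-definite Q = V diag(exp(log_eigs)) V^T, where
    the orthogonal V is the matrix exponential of a skew-symmetric matrix
    built from theta. Any real theta/log_eigs yields a valid Q."""
    S = np.zeros((3, 3))
    S[0, 1], S[0, 2], S[1, 2] = theta
    S -= S.T                        # skew-symmetric: S = -S^T
    V = expm(S)                     # orthogonal eigenvector matrix
    return V @ np.diag(np.exp(log_eigs)) @ V.T

Q = penalty_matrix(theta=[0.3, -0.1, 0.5], log_eigs=[0.0, 1.0, -0.5])
assert np.all(np.linalg.eigvalsh(Q) > 0)   # positive-definite by construction
```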
Data-driven model predictive control (MPC) has demonstrated significant potential for improving robot control performance in the presence of model uncertainties. However, existing approaches often require extensive offline data collection and computationally intensive training, limiting their ability to adapt online. To address these challenges, this paper presents a fast online adaptive MPC framework that leverages neural networks integrated with Model-Agnostic Meta-Learning (MAML). Our approach focuses on few-shot adaptation of residual dynamics - capturing the discrepancy between nominal and true system behavior - using minimal online data and gradient steps. By embedding these meta-learned residual models into a computationally efficient L4CasADi-based MPC pipeline, the proposed method enables rapid model correction, enhances predictive accuracy, and improves real-time control performance. We validate the framework through simulation studies on a Van der Pol oscillator, a Cart-Pole system, and a 2D quadrotor. Results show significant gains in adaptation speed and prediction accuracy over both nominal MPC and nominal MPC augmented with a freshly initialized neural network, underscoring the effectiveness of our approach for real-time adaptive robot control.
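A minimal MAML-style sketch of the residual-model meta-training (dimensions, step sizes, and the task sampler `sample_task` are placeholders, not the paper's setup): the inner loop takes a few differentiable gradient steps on a small support set, and the outer loss is evaluated with the adapted weights.

```python
import torch

def mlp(params, inp):
    """Tiny residual model r(x, u); parameters passed explicitly so the
    inner-loop update stays differentiable for the outer objective."""
    W1, b1, W2, b2 = params
    return torch.tanh(inp @ W1 + b1) @ W2 + b2

def sample_task():
    # Placeholder: random residual data standing in for support/query
    # batches collected online from a perturbed system.
    xs, xq, w = torch.randn(10, 3), torch.randn(10, 3), torch.randn(3, 2)
    return xs, xs @ w, xq, xq @ w

params = [(torch.randn(3, 64) * 0.1).requires_grad_(),
          torch.zeros(64, requires_grad=True),
          (torch.randn(64, 2) * 0.1).requires_grad_(),
          torch.zeros(2, requires_grad=True)]
meta_opt = torch.optim.Adam(params, lr=1e-3)
inner_lr, inner_steps = 1e-2, 2

for _ in range(1000):
    xs, ys, xq, yq = sample_task()
    fast = params
    for _ in range(inner_steps):                       # few-shot inner adaptation
        loss = ((mlp(fast, xs) - ys) ** 2).mean()
        grads = torch.autograd.grad(loss, fast, create_graph=True)
        fast = [p - inner_lr * g for p, g in zip(fast, grads)]
    meta_loss = ((mlp(fast, xq) - yq) ** 2).mean()     # outer loss, adapted weights
    meta_opt.zero_grad(); meta_loss.backward(); meta_opt.step()
```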
Cattle lameness, often caused by hoof injuries or interdigital dermatitis, leads to pain and significantly impacts essential physiological activities such as walking, feeding, and drinking. This study presents a deep learning-based model for detecting cattle lameness, sickness, or gait abnormalities using publicly available video data. The dataset consists of 50 unique videos from 40 individual cattle, recorded from various angles in both indoor and outdoor environments. Half of the dataset represents naturally walking (normal/non-lame) cattle, while the other half consists of cattle exhibiting gait abnormalities (lame). To enhance model robustness and generalizability, data augmentation was applied to the training data. The pre-processed videos were then classified using two deep learning models: ConvLSTM2D and 3D CNN. A comparative analysis of the results demonstrates strong classification performance. Specifically, the 3D CNN model achieved a video-level classification accuracy of 90%, with precision, recall, and F1-score of 90.9%, 90.9%, and 90.91%, respectively. The ConvLSTM2D model exhibited a slightly lower accuracy of 85%. This study highlights the effectiveness of directly applying classification models to learn spatiotemporal features from video data, offering an alternative to traditional multi-stage approaches that typically involve object detection, pose estimation, and feature extraction. Moreover, the findings demonstrate that the proposed deep learning models, particularly the 3D CNN, effectively classify and detect lameness in cattle while simplifying the processing pipeline.
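The 3D-CNN branch amounts to a compact spatiotemporal classifier; the skeleton below is illustrative (the clip shape, filter counts, and binary head are assumptions, not the paper's exact configuration).

```python
import tensorflow as tf

# Stacked Conv3D blocks over (frames, height, width, channels) clips,
# ending in a binary lame / non-lame prediction.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(16, 112, 112, 3)),
    tf.keras.layers.Conv3D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling3D(2),
    tf.keras.layers.Conv3D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling3D(2),
    tf.keras.layers.GlobalAveragePooling3D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```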
In the online non-stochastic control problem, an agent sequentially selects control inputs for a linear dynamical system when facing unknown and adversarially selected convex costs and disturbances. A common metric for evaluating control policies in this setting is policy regret, defined relative to the best-in-hindsight linear feedback controller. However, for general convex costs, this benchmark may be less meaningful since linear controllers can be highly suboptimal. To address this, we introduce an alternative, more suitable benchmark: the performance of the best fixed input. We show that this benchmark can be viewed as a natural extension of the standard benchmark used in online convex optimization and propose a novel online control algorithm that achieves sublinear regret with respect to this new benchmark. We also discuss the connections between our method and the original one proposed by Agarwal et al. in their seminal work introducing the online non-stochastic control problem, and compare the performance of both approaches through numerical simulations.
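To see why the best-fixed-input benchmark is natural, note that for a stable system a constant input u drives the state toward a steady state x_ss(u) = (I - A)^{-1} B u, so online gradient descent on costs evaluated at that steady state competes with the best fixed input. The toy below illustrates this; the system, costs, and step size are invented for illustration and this is not the paper's algorithm.

```python
import numpy as np

A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
G = np.linalg.inv(np.eye(2) - A) @ B       # u -> steady-state map x_ss = G u
eta, u = 0.05, np.zeros(1)
x = np.zeros(2)
for t in range(1000):
    target = np.array([np.sin(0.01 * t), 0.2])      # slowly varying cost center
    x_ss = G @ u
    # gradient of ||x_ss - target||^2 + 0.1 ||u||^2 with respect to u
    grad = 2 * G.T @ (x_ss - target) + 0.2 * u
    u = u - eta * grad                              # online gradient step
    x = A @ x + B @ u                               # actual trajectory under u_t
```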
In this research, a numerical analysis of nonlinear pulse propagation is carried out, mainly by solving the nonlinear Schrödinger equation using the split-step algorithm. In nonlinear media, dispersive effects exist simultaneously with nonlinear effects. The dependence of the refractive index on intensity gives rise to the optical Kerr effect, which narrows transmitted pulses by inducing self-phase modulation, while second-order group velocity dispersion causes the pulses to spread. In this project, group velocity dispersion is discussed first, followed by self-phase modulation. These individually detrimental effects are shown to combine beneficially for pulse propagation. A Gaussian pulse is studied and propagated by using it as the input to the nonlinear Schrödinger equation. The split-step algorithm is described in depth, with an explanation of each step and the relevant equations defining it.
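For reference, a compact symmetrized split-step Fourier sketch under one common sign convention, dA/dz = -i(β₂/2) ∂²A/∂T² + iγ|A|²A, with illustrative normalized parameters: dispersion is applied in half-steps in the frequency domain and the Kerr nonlinearity as a phase rotation in the time domain.

```python
import numpy as np

T0, beta2, gamma = 1.0, -1.0, 1.0        # normalized units, anomalous dispersion
nt, t_span, nz, dz = 2048, 40.0, 2000, 0.005

t = np.linspace(-t_span / 2, t_span / 2, nt, endpoint=False)
w = 2 * np.pi * np.fft.fftfreq(nt, d=t[1] - t[0])   # angular frequency grid
A = np.exp(-t**2 / (2 * T0**2))                     # input Gaussian pulse
half_disp = np.exp(0.5j * beta2 * w**2 * (dz / 2))  # half-step dispersion operator

for _ in range(nz):
    A = np.fft.ifft(half_disp * np.fft.fft(A))      # dispersion, half step
    A *= np.exp(1j * gamma * np.abs(A)**2 * dz)     # self-phase modulation, full step
    A = np.fft.ifft(half_disp * np.fft.fft(A))      # dispersion, half step
# np.abs(A)**2 now holds the output pulse shape after nz * dz of propagation
```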
Modeling path loss in indoor LoRaWAN technology deployments is inherently challenging due to structural obstructions, occupant density and activities, and fluctuating environmental conditions. This study proposes a two-stage approach to capture and analyze these complexities using an extensive dataset of 1,328,334 field measurements collected over six months in a single-floor office at the University of Siegen's Hoelderlinstrasse Campus, Germany. First, we implement a multiple linear regression model that includes traditional propagation metrics (distance, structural walls) and an extension with proposed environmental variables (relative humidity, temperature, carbon dioxide, particulate matter, and barometric pressure). Using analysis of variance, we demonstrate that adding these environmental factors can reduce unexplained variance by 42.32 percent. Second, we examine residual distributions by fitting five candidate probability distributions: Normal, Skew-Normal, Cauchy, Student's t, and Gaussian Mixture Models with one to five components. Our results show that a four-component Gaussian Mixture Model captures the residual heterogeneity of indoor signal propagation most accurately, significantly outperforming single-distribution approaches. Given the push toward ultra-reliable, context-aware communications in 6G networks, our analysis shows that environment-aware modeling can substantially improve LoRaWAN network design in dynamic indoor IoT deployments.
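The second stage maps directly onto standard tooling; in the minimal sketch below (the feature matrix and coefficients are synthetic placeholders for the real measurements), a linear model is fitted first, then one- to five-component Gaussian mixtures are compared on its residuals by BIC.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.mixture import GaussianMixture

# Placeholder features standing in for log-distance, wall counts, humidity,
# temperature, CO2, particulate matter, and pressure.
rng = np.random.default_rng(0)
X = rng.random((5000, 7))
y = X @ np.array([10, 3, 1, 0.5, 0.2, 0.1, 2.0]) + rng.standard_normal(5000)

resid = y - LinearRegression().fit(X, y).predict(X)
for k in range(1, 6):
    gmm = GaussianMixture(n_components=k, random_state=0).fit(resid.reshape(-1, 1))
    print(k, gmm.bic(resid.reshape(-1, 1)))   # lower BIC = preferred residual model
```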
The soft and deformable nature of soft continuum arms (SCAs) presents challenges in modeling and control due to their infinite degrees of freedom and nonlinear behavior. This work introduces a reinforcement learning (RL)-based framework for visual servoing tasks on SCAs with zero-shot sim-to-real transfer capabilities, demonstrated on a single-section pneumatic manipulator capable of bending and twisting. The framework decouples kinematics from mechanical properties using an RL kinematic controller for motion planning and a local controller for actuation refinement, leveraging minimal sensing with visual feedback. Trained entirely in simulation, the RL controller achieved a 99.8% success rate. When deployed on hardware, it achieved a 67% success rate in zero-shot sim-to-real transfer, demonstrating robustness and adaptability. This approach offers a scalable solution for SCAs in 3D visual servoing, with potential for further refinement and expanded applications.
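The decoupling amounts to a two-level servo loop; in the sketch below, `rl_policy`, `curvature_to_pressure`, and the `send` handle are hypothetical stand-ins, since the trained policy and hardware interfaces are specific to the paper's testbed.

```python
def servo_step(tip_px, target_px, rl_policy, curvature_to_pressure, send):
    """One visual-servoing step: the RL policy plans in kinematic (bend/twist)
    space from the pixel error; a local controller refines the actuation."""
    err = (target_px[0] - tip_px[0], target_px[1] - tip_px[1])
    bend, twist = rl_policy(err)                 # sim-trained kinematic action
    send(curvature_to_pressure(bend, twist))     # map to chamber pressures
```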
High-speed off-road autonomous driving presents unique challenges due to complex, evolving terrain characteristics and the difficulty of accurately modeling terrain-vehicle interactions. While dynamics models used in model-based control can be learned from real-world data, they often struggle to generalize to unseen terrain, making real-time adaptation essential. We propose a novel framework that combines a Kalman filter-based online adaptation scheme with meta-learned parameters to address these challenges. Offline meta-learning optimizes the basis functions along which adaptation occurs, as well as the adaptation parameters, while online adaptation dynamically adjusts the onboard dynamics model in real time for model-based control. We validate our approach through extensive experiments, including real-world testing on a full-scale autonomous off-road vehicle, demonstrating that our method outperforms baseline approaches in prediction accuracy, performance, and safety metrics, particularly in safety-critical scenarios. Our results underscore the effectiveness of meta-learned dynamics model adaptation, advancing the development of reliable autonomous systems capable of navigating diverse and unseen environments. Video is available at: https://youtu.be/cCKHHrDRQEA
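A scalar-output sketch of the online layer (the polynomial basis below is a fixed stand-in for the meta-learned features, and the noise levels are tuning knobs): a Kalman-filter/recursive-least-squares update of the residual-model weights theta in r(x, u) = Phi(x, u) · theta.

```python
import numpy as np

def basis(x, u):
    # Fixed polynomial features; in the paper these directions would come
    # from offline meta-learning.
    return np.array([1.0, x[0], x[1], u[0], x[0] * u[0], x[1] ** 2])

n = 6
theta = np.zeros(n)            # residual-model weights
P = 1e2 * np.eye(n)            # parameter covariance
q, r = 1e-4, 1e-2              # process / measurement noise (tuning knobs)

def kf_update(theta, P, x, u, resid_meas):
    """One scalar-measurement Kalman update of the residual weights."""
    phi = basis(x, u)
    P = P + q * np.eye(n)                    # random-walk parameter model
    s = phi @ P @ phi + r                    # innovation variance
    K = P @ phi / s                          # Kalman gain
    theta = theta + K * (resid_meas - phi @ theta)
    P = P - np.outer(K, phi) @ P
    return theta, P
```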