New articles on Electrical Engineering and Systems Science


[1] 2205.06315

Interface Networks for Failure Localization in Power Systems

Transmission power systems usually consist of interconnected sub-grids that are operated relatively independently. When a failure happens, it is desirable to localize its impact within the sub-grid where the failure occurs. This paper introduces three interface networks to connect sub-grids, achieving better failure localization while maintaining robust network connectivity. The proposed interface networks are validated with numerical experiments on the IEEE 118-bus test network under both DC and AC power flow models.


[2] 2205.06327

Image Gradient Decomposition for Parallel and Memory-Efficient Ptychographic Reconstruction

Ptychography is a popular microscopic imaging modality for many scientific discoveries and sets the record for highest image resolution. Unfortunately, the high image resolution for ptychographic reconstruction requires a significant amount of memory and computation, forcing many applications to compromise their image resolution in exchange for a smaller memory footprint and a shorter reconstruction time. In this paper, we propose a novel image gradient decomposition method that significantly reduces the memory footprint for ptychographic reconstruction by tessellating image gradients and diffraction measurements into tiles. In addition, we propose a parallel image gradient decomposition method that enables asynchronous point-to-point communications and parallel pipelining with minimal overhead on a large number of GPUs. Our experiments on a Titanate material dataset (PbTiO3) with 16632 probe locations show that our Gradient Decomposition algorithm reduces memory footprint by 51 times. In addition, it achieves time-to-solution within 2.2 minutes by scaling to 4158 GPUs with a super-linear speedup at 364% efficiency. This performance is 2.7 times more memory efficient, 9 times more scalable and 86 times faster than the state-of-the-art algorithm.


[3] 2205.06396

Learning Based User Scheduling in Reconfigurable Intelligent Surface Assisted Multiuser Downlink

A reconfigurable intelligent surface (RIS) is capable of intelligently manipulating the phases of the incident electromagnetic wave to improve the wireless propagation environment between the base-station (BS) and the users. This paper addresses the joint user scheduling, RIS configuration, and BS beamforming problem in an RIS-assisted downlink network with limited pilot overhead. We show that graph neural networks (GNN) with permutation invariant and equivariant properties can be used to appropriately schedule users and to design RIS configurations to achieve high overall throughput while accounting for fairness among the users. As compared to the conventional methodology of first estimating the channels and then optimizing the user schedule, RIS configuration and the beamformers, this paper shows that an optimized user schedule can be obtained directly from a very short set of pilots using a GNN, then the RIS configuration can be optimized using a second GNN, and finally the BS beamformers can be designed based on the overall effective channel. Numerical results show that the proposed approach can utilize the received pilots more efficiently than the conventional channel estimation based approach, and can generalize to systems with an arbitrary number of users.
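
The permutation invariance and equivariance mentioned in this abstract can be illustrated with a toy layer. The sketch below is not the paper's GNN: the function names, the scalar per-user features, and the weights `w_self` and `w_mean` are all hypothetical, chosen only to show that reordering the users reorders the outputs in the same way (equivariance) and leaves a mean readout unchanged (invariance).

```python
def equivariant_layer(features, w_self, w_mean):
    """Toy permutation-equivariant layer (hypothetical, not the paper's model).

    Each user's output mixes its own feature with the mean over all users,
    so permuting the input list permutes the output list identically.
    """
    mean = sum(features) / len(features)
    return [w_self * f + w_mean * mean for f in features]


def invariant_readout(features):
    """Toy permutation-invariant readout: the mean does not depend on order."""
    return sum(features) / len(features)


# Illustrative per-user features and an arbitrary reordering of the users.
users = [1.0, 2.0, 4.0]
order = [2, 0, 1]
shuffled = [users[i] for i in order]

out = equivariant_layer(users, w_self=0.5, w_mean=0.25)
out_shuffled = equivariant_layer(shuffled, w_self=0.5, w_mean=0.25)
```

Because the layer treats every user identically, the same weights apply to any number of users, which is one plausible reading of why such architectures can generalize across user counts.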


[4] 2205.06445

Personalized Adversarial Data Augmentation for Dysarthric and Elderly Speech Recognition

Despite the rapid progress of automatic speech recognition (ASR) technologies targeting normal speech, accurate recognition of dysarthric and elderly speech remains a highly challenging task to date. It is difficult to collect large quantities of such data for ASR system development due to the mobility issues often found among these users. To this end, data augmentation techniques play a vital role. In contrast to existing data augmentation techniques only modifying the speaking rate or overall shape of spectral contour, fine-grained spectro-temporal differences between dysarthric, elderly and normal speech are modelled using a novel set of speaker dependent (SD) generative adversarial networks (GAN) based data augmentation approaches in this paper. These flexibly allow both: a) temporal or speed perturbed normal speech spectra to be modified to be closer to those of an impaired speaker when parallel speech data is available; and b) for non-parallel data, the SVD decomposed normal speech spectral basis features to be transformed into those of a target elderly speaker before being re-composed with the temporal bases to produce the augmented data for state-of-the-art TDNN and Conformer ASR system training. Experiments are conducted on four tasks: the English UASpeech and TORGO dysarthric speech corpora; the English DementiaBank Pitt and Cantonese JCCOCC MoCA elderly speech datasets. The proposed GAN based data augmentation approaches consistently outperform the baseline speed perturbation method by up to 0.91% and 3.0% absolute (9.61% and 6.4% relative) WER reduction on the TORGO and DementiaBank data respectively. Consistent performance improvements are retained after applying LHUC based speaker adaptation.


[5] 2205.06450

A microstructure estimation Transformer inspired by sparse representation for diffusion MRI

Diffusion magnetic resonance imaging (dMRI) is an important tool in characterizing tissue microstructure based on biophysical models, which are complex and highly non-linear. Resolving microstructures with optimization techniques is prone to estimation errors and requires dense sampling in the q-space. Deep learning based approaches have been proposed to overcome these limitations. Motivated by the superior performance of the Transformer, in this work, we present a learning-based framework based on the Transformer, namely, a Microstructure Estimation Transformer with Sparse Coding (METSC) for dMRI-based microstructure estimation with downsampled q-space data. To take advantage of the Transformer while addressing its limitation in large training data requirements, we explicitly introduce an inductive bias (model bias) into the Transformer using a sparse coding technique to facilitate the training process. Thus, the METSC is composed of three stages: an embedding stage, a sparse representation stage, and a mapping stage. The embedding stage is a Transformer-based structure that encodes the signal to ensure the voxel is represented effectively. In the sparse representation stage, a dictionary is constructed by solving a sparse reconstruction problem that unfolds the Iterative Hard Thresholding (IHT) process. The mapping stage is essentially a decoder that computes the microstructural parameters from the output of the second stage, based on the weighted sum of normalized dictionary coefficients where the weights are also learned. We tested our framework on two dMRI models with downsampled q-space data, including the intravoxel incoherent motion (IVIM) model and the neurite orientation dispersion and density imaging (NODDI) model. The proposed method achieved up to 11.25-fold acceleration in scan time and outperformed the other state-of-the-art learning-based methods.
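
For readers unfamiliar with the Iterative Hard Thresholding (IHT) process named in this abstract, the following is a minimal sketch of the classical, non-learned iteration x <- H_s(x + step * A^T (y - A x)), where H_s keeps the s largest-magnitude entries. It is not the unfolded, trainable network described in the paper; the function names, the fixed step size, and the tiny identity dictionary are assumptions made purely for illustration.

```python
def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and set the rest to zero."""
    keep = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:s]
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def iht(A, y, s, step=1.0, iters=50):
    """Classical IHT for y ~ A x with at most s non-zeros in x.

    A is a list of rows. The step size is assumed small enough for the
    iteration to be stable (step = 1.0 is fine for the identity used below).
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = y - A x.
        residual = [y[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        # Gradient step direction A^T r.
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by projection onto s-sparse vectors.
        x = hard_threshold([x[j] + step * grad[j] for j in range(n)], s)
    return x


# Illustrative data: identity dictionary, two true coefficients plus small noise.
A = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
y = [0.1, 3.0, -0.05, -2.0]
x_hat = iht(A, y, s=2)
```

In an unfolded variant, each loop iteration would become one network layer with its own learned parameters; that part is deliberately left out here.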


[6] 2205.06465

A New Hybrid Multi-Objective Scheduling Model for Hierarchical Hub and Flexible Flow Shop Problems

Technologies and lifestyles have been increasingly geared toward consumerism in recent years. Accordingly, both price and delivery time matter most to the end customers of commercial enterprises, and the importance of an optimal delivery time is becoming increasingly evident. Scheduling can be used to optimize supply chains and production systems, which is one practical method for lowering costs and boosting productivity. This paper suggests a multi-objective scheduling model for hierarchical hub structures (HHS) with three levels of service. The factory and customers hub (second level) and central are on the first level in which the factory has a Flexible Flow Shop (FFS) environment. The non-central hub (third level) is responsible for the delivery of products made in the factory to customers. Customer nodes and factories are connected separately to the second level, and the non-central hubs are connected to the third level. The model's objective is to minimize transportation and production costs and product arrival times. To validate and evaluate the model, small instances have been solved and analyzed in detail with the weighted sum and ε-constraint methods. The two methods were then compared on the designed instances using the ideal mean distance (MID) metric. Because the problem is NP-hard, these methods become time-consuming on large-scale instances, so a meta-heuristic method was developed to solve the large-scale problem.
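
The weighted-sum and epsilon-constraint methods named in this abstract are two standard ways of turning a multi-objective problem into single-objective ones. The sketch below applies both to a hypothetical list of candidate schedules, each described only by a (cost, arrival_time) pair; the candidate values and function names are made up for illustration and are not taken from the paper's model.

```python
def weighted_sum(candidates, w_cost):
    """Pick the candidate minimizing w_cost * cost + (1 - w_cost) * time."""
    return min(candidates, key=lambda c: w_cost * c[0] + (1.0 - w_cost) * c[1])


def epsilon_constraint(candidates, max_time):
    """Minimize cost subject to time <= max_time; None if nothing is feasible."""
    feasible = [c for c in candidates if c[1] <= max_time]
    return min(feasible, key=lambda c: c[0]) if feasible else None


# Hypothetical (cost, arrival_time) pairs for three candidate schedules.
schedules = [(10.0, 5.0), (7.0, 8.0), (4.0, 12.0)]

cheap_focus = weighted_sum(schedules, w_cost=0.8)        # cost weighted heavily
fast_focus = weighted_sum(schedules, w_cost=0.2)         # time weighted heavily
bounded = epsilon_constraint(schedules, max_time=9.0)    # cheapest with time <= 9
```

Sweeping `w_cost` or `max_time` over a range traces out different trade-off points, which is how such methods are typically used to approximate a Pareto front on small instances.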


[7] 2205.06473

Joint Acoustic Echo Cancellation and Blind Source Extraction based on Independent Vector Extraction

We describe a joint acoustic echo cancellation (AEC) and blind source extraction (BSE) approach for multi-microphone acoustic frontends. The proposed algorithm blindly estimates AEC and beamforming filters by maximizing the statistical independence of a non-Gaussian source of interest and a stationary Gaussian background modeling interfering signals and residual echo. Double-talk-robust and fast-converging parameter updates are derived from a global maximum-likelihood objective function, resulting in a computationally efficient Newton-type update rule. Evaluation with simulated acoustic data confirms the benefit of the proposed joint AEC and beamforming filter estimation in comparison to updating both filters individually.


[8] 2205.06486

A Survey of Left Atrial Appendage Segmentation and Analysis in 3D and 4D Medical Images

Atrial fibrillation (AF) is a cardiovascular disease identified as one of the main risk factors for stroke. The majority of strokes due to AF are caused by clots originating in the left atrial appendage (LAA). LAA occlusion is an effective procedure for reducing stroke risk. Planning the procedure using pre-procedural imaging and analysis has shown benefits. The analysis is commonly done by manually segmenting the appendage on 2D slices. Automatic LAA segmentation methods could save an expert's time and provide insightful 3D visualizations and accurate automatic measurements to aid in medical procedures. Several semi- and fully-automatic methods for segmenting the appendage have been proposed. This paper provides a review of automatic LAA segmentation methods on 3D and 4D medical images, including CT, MRI, and echocardiogram images. We classify methods into heuristic and model-based methods, as well as into semi- and fully-automatic methods. We summarize and compare the proposed methods, evaluate their effectiveness, and present current challenges in the field and approaches to overcome them.


[9] 2205.06489

Joint Power Allocation and Beamformer for mmW-NOMA Downlink Systems by Deep Reinforcement Learning

The high demand for data rate in the next generation of wireless communication could be met by the Non-Orthogonal Multiple Access (NOMA) approach in the millimetre-wave (mmW) frequency band. Joint power allocation and beamforming is mandatory in mmW-NOMA systems and can be addressed by optimization approaches. To this end, we exploit a Deep Reinforcement Learning (DRL) approach, whose policy generation leads to an optimized sum-rate of users. An actor-critic scheme is utilized to measure the immediate reward and provide the new action to maximize the overall Q-value of the network. The immediate reward is defined based on the sum of the rates of two users, with the minimum guaranteed rate for each user and the sum of consumed power as the constraints. The simulation results show the superiority of the proposed approach over Time-Division Multiple Access (TDMA) and another NOMA optimized strategy in terms of sum-rate of users.


[10] 2205.06496

Application of NOMA in Vehicular Visible Light Communication Systems

In the context of an increasing interest toward reducing the number of traffic accidents and of associated victims, communication-based vehicle safety applications have emerged as one of the best solutions to enhance road safety. In this area, visible light communications (VLC) have a great potential for applications due to their relatively simple design for basic functioning, efficiency, and large geographical distribution. Vehicular Visible Light Communication (VVLC) is preferred as a vehicle-to-everything (V2X) communications scheme due to its highly secure, low-complexity, and radio frequency (RF) interference-free characteristics, which exploit the line-of-sight (LoS) propagation of visible light and the use of already existing vehicle light-emitting diodes (LEDs). This research addresses the application of the Non-Orthogonal Multiple Access (NOMA) technique in VLC-based Vehicle-to-Vehicle (V2V) communication. The proposed system is simulated in almost realistic conditions and the performance of the system is analyzed under different scenarios.


[11] 2205.06501

Robust Deep Neural Object Detection and Segmentation for Automotive Driving Scenario with Compressed Image Data

Deep neural object detection or segmentation networks are commonly trained with pristine, uncompressed data. However, in practical applications the input images are usually deteriorated by compression that is applied to efficiently transmit the data. Thus, we propose to add deteriorated images to the training process in order to increase the robustness of the two state-of-the-art networks Faster and Mask R-CNN. Throughout our paper, we investigate an autonomous driving scenario by evaluating the newly trained models on the Cityscapes dataset that has been compressed with the upcoming video coding standard Versatile Video Coding (VVC). When employing the models that have been trained with the proposed method, the weighted average precision of the R-CNNs can be increased by up to 3.68 percentage points for compressed input images, which corresponds to bitrate savings of nearly 48 %.


[12] 2205.06511

Analysis of Neural Image Compression Networks for Machine-to-Machine Communication

Video and image coding for machines (VCM) is an emerging field that aims to develop compression methods resulting in optimal bitstreams when the decoded frames are analyzed by a neural network. Several approaches already exist improving classic hybrid codecs for this task. However, neural compression networks (NCNs) have made enormous progress in coding images over the last years. Thus, it is reasonable to consider such NCNs when the information sink at the decoder side is a neural network as well. Therefore, we build up an evaluation framework analyzing the performance of four state-of-the-art NCNs when a Mask R-CNN is segmenting objects from the decoded image. The compression performance is measured by the weighted average precision for the Cityscapes dataset. Based on that analysis, we find that networks with leaky ReLU as non-linearity and training with SSIM as distortion criterion result in the highest coding gains for the VCM task. Furthermore, it is shown that the GAN-based NCN architecture achieves the best coding performance and even outperforms the recently standardized Versatile Video Coding (VVC) for the given scenario.


[13] 2205.06519

Evaluation of Video Coding for Machines without Ground Truth

In the emerging field of video coding for machines, video datasets with pristine video quality and high-quality annotations are required for a comprehensive evaluation. However, existing video datasets with detailed annotations are severely limited in size and video quality. Thus, current methods have to evaluate their codecs either on still images or on already compressed data. To mitigate this problem, we propose to apply an evaluation method based on pseudo ground-truth data, taken from the field of semantic segmentation, to the evaluation of video coding for machines. Through extensive evaluation, this paper shows that the proposed ground-truth-agnostic evaluation method results in an acceptable absolute measurement error below 0.7 percentage points on the Bjontegaard Delta Rate compared to using the true ground truth for mid-range bitrates. We evaluate on the three tasks of semantic segmentation, instance segmentation, and object detection. Lastly, we utilize the ground-truth-agnostic method to measure the coding performances of the VVC compared against HEVC on the Cityscapes sequences. This reveals that the coding position has a significant influence on the task performance.


[14] 2205.06540

Accelerometry-based classification of circulatory states during out-of-hospital cardiac arrest

Objective: During cardiac arrest treatment, a reliable detection of spontaneous circulation, usually performed by manual pulse checks, is both vital for patient survival and practically challenging. Methods: We developed a machine learning algorithm to automatically predict the circulatory state during cardiac arrest treatment from 4-second-long snippets of accelerometry and electrocardiogram data from real-world defibrillator records. The algorithm was trained based on 917 cases from the German Resuscitation Registry, for which ground truth labels were created through manual annotation by physicians. It uses a kernelized Support Vector Machine classifier based on 14 features, which partially reflect the correlation between accelerometry and electrocardiogram data. Results: On a test data set, the proposed algorithm exhibits an accuracy of 94.4 (93.6, 95.2)%, a sensitivity of 95.0 (93.9, 96.1)%, and a specificity of 93.9 (92.7, 95.1)%. Conclusion and significance: In application, the algorithm may be used to simplify retrospective annotation for quality management and, moreover, to support clinicians in assessing the circulatory state during cardiac arrest treatment.
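
As a reminder of how the three reported metrics relate to a binary confusion matrix, here is a short sketch. The counts used are hypothetical and were chosen only to make the arithmetic easy to follow; they are not the study's data, and the function name is an illustrative choice.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn: positive cases (e.g. circulation present) classified right/wrong.
    tn/fp: negative cases classified right/wrong.
    """
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # share of true positives that are detected
    specificity = tn / (tn + fp)   # share of true negatives that are rejected
    return accuracy, sensitivity, specificity


# Hypothetical counts, not taken from the paper.
acc, sens, spec = binary_metrics(tp=95, fn=5, tn=90, fp=10)
```

Sensitivity and specificity are computed per class, so they remain informative even when one circulatory state is much more frequent than the other, whereas accuracy alone can be dominated by the majority class.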


[15] 2205.06576

Distribution-Aware Graph Representation Learning for Transient Stability Assessment of Power System

The real-time transient stability assessment (TSA) plays a critical role in the secure operation of the power system. Although the classic numerical integration method, i.e. time-domain simulation (TDS), has been widely used in industry practice, it is inevitably trapped in a high computational complexity due to the high latitude sophistication of the power system. In this work, a data-driven power system estimation method is proposed to quickly predict the stability of the power system before TDS reaches the end of simulating time windows, which can reduce the average simulation time of stability assessment without loss of accuracy. As the topology of the power system is in the form of graph structure, graph neural network based representation learning is naturally suitable for learning the status of the power system. Motivated by observing the distribution information of crucial active power and reactive power on the power system's bus nodes, we thus propose a distribution-aware learning (DAL) module to explore an informative graph representation vector for describing the status of a power system. Then, TSA is re-defined as a binary classification task, and the stability of the system is determined directly from the resulting graph representation without numerical integration. Finally, we apply our method to the online TSA task. The case studies on the IEEE 39-bus system and Polish 2383-bus system demonstrate the effectiveness of our proposed method.


[16] 2205.06612

Event-Based Control for Synchronization of Stochastic Linear Systems with Application to Distributed Estimation

This paper studies the synchronization of stochastic linear systems which are subject to a general class of noises, in the sense that the noises are bounded in covariance but might be correlated with the states of agents and among each other. We propose an event-based control protocol for achieving the synchronization among agents in the mean square sense and theoretically analyze its performance by using a stochastic Lyapunov function, where the stability of $c$-martingales is particularly developed to handle the challenges brought by the general model of noises and the event-triggering mechanism. The proposed event-based synchronization algorithm is then applied to solve the problem of distributed estimation in sensor networks. Specifically, by losslessly decomposing the optimal Kalman filter, it is shown that the problem of distributed estimation can be resolved by using the algorithms designed for achieving the synchronization of stochastic linear systems. As such, an event-based distributed estimation algorithm is developed, where each sensor performs local filtering solely using its own measurement, together with the proposed event-based synchronization algorithm to fuse the local estimates of neighboring nodes. With the reduced communication frequency, the designed estimator is proved to be stable under the minimal requirements of network connectivity and collective system observability.


[17] 2205.06662

Polarization Tracking in the Presence of PDL and Fast Temporal Drift

In this paper, we analyze the effectiveness of polarization tracking algorithms in optical transmission systems suffering from fast state of polarization (SOP) rotations and polarization-dependent loss (PDL). While most of the gradient descent (GD)-based algorithms in the literature may require step size adjustment when the channel condition changes, we propose tracking algorithms that can perform similarly or better without parameter tuning. Numerical simulation results show higher robustness of the proposed algorithms to SOP and PDL drift compared to GD-based algorithms, making them promising candidates to be used in aerial fiber links where the SOP can potentially drift rapidly, and therefore becomes challenging to track.


[18] 2205.06676

VesNet-RL: Simulation-based Reinforcement Learning for Real-World US Probe Navigation

Ultrasound (US) is one of the most common medical imaging modalities since it is radiation-free, low-cost, and real-time. In freehand US examinations, sonographers often navigate a US probe to visualize standard examination planes with rich diagnostic information. However, reproducibility and stability of the resulting images often suffer from intra- and inter-operator variation. Reinforcement learning (RL), as an interaction-based learning method, has demonstrated its effectiveness in visual navigation tasks; however, RL is limited in terms of generalization. To address this challenge, we propose a simulation-based RL framework for real-world navigation of US probes towards the standard longitudinal views of vessels. A UNet is used to provide binary masks from US images; thereby, the RL agent trained on simulated binary vessel images can be applied in real scenarios without further training. To accurately characterize actual states, a multi-modality state representation structure is introduced to facilitate the understanding of environments. Moreover, considering the characteristics of vessels, a novel standard view recognition approach based on the minimum bounding rectangle is proposed to terminate the searching process. To evaluate the effectiveness of the proposed method, the trained policy is validated virtually on 3D volumes of a volunteer's in-vivo carotid artery, and physically on custom-designed gel phantoms using robotic US. The results demonstrate that the proposed approach can effectively and accurately navigate the probe towards the longitudinal view of vessels.


[19] 2205.06695

STAR-RIS-Assisted Hybrid NOMA mmWave Communication: Optimization and Performance Analysis

Simultaneously reflecting and transmitting reconfigurable intelligent surfaces (STAR-RIS) have recently emerged as a prominent technology that exploits the transmissive property of RIS to mitigate the half-space coverage limitation of conventional RIS operating on millimeter-wave (mmWave). In this paper, we study a downlink STAR-RIS-based multi-user multiple-input single-output (MU-MISO) mmWave hybrid non-orthogonal multiple access (H-NOMA) wireless network, where a sum-rate maximization problem has been formulated. The design of active and passive beamforming vectors, time and power allocation for H-NOMA is a highly coupled non-convex problem. To handle the problem, we propose an optimization framework based on alternating optimization (AO) that iteratively solves active and passive beamforming sub-problems. Channel correlations and channel strength-based techniques have been proposed for a specific case of two-user optimal clustering and decoding order assignment, respectively, for which analytical solutions to joint power and time allocation for H-NOMA have also been derived. Simulation results show that: 1) the proposed framework leveraging H-NOMA outperforms conventional OMA and NOMA to maximize the achievable sum-rate; 2) using the proposed framework, the supported number of clusters for the given design constraints can be increased considerably; 3) through STAR-RIS, the number of elements can be significantly reduced as compared to conventional RIS to ensure a similar quality-of-service (QoS).


[20] 2205.06727

Energy return on investment analysis of the 2035 Belgian energy system

Planning the defossilization of energy systems by facilitating high penetration of renewables and maintaining access to abundant and affordable primary energy resources is a nontrivial multi-objective problem encompassing economic, technical, environmental, and social aspects. However, so far, most long-term policies to decrease the carbon footprint of our societies consider the cost of the system as the leading indicator in the energy system models. To address this gap, we developed a new approach by adding the energy return on investment (EROI) in a whole-energy system model. We built the database with all EROI technologies and resources considered. This novel model is applied to the Belgian energy system in 2035 for several greenhouse gas emissions targets. However, moving away from fossil-based to carbon-neutral energy systems raises the issue of the uncertainty of low-carbon technologies and resource data. Thus, we conduct a global sensitivity analysis to identify the main parameters driving the variations in the EROI of the system. In this case study, the main results are threefold: (i) the EROI of the system decreases from 8.9 to 3.9 when greenhouse gas emissions are reduced by 5; (ii) the renewable fuels - mainly imported renewable gas - represent the largest share of the system primary energy mix due to the lack of endogenous renewable resources such as wind and solar; (iii) in the sensitivity analysis, the renewable fuels drive 67% of the variation of the EROI of the system for low greenhouse gas emissions scenarios. The decrease in the EROI of the system raises questions about meeting the climate targets without adverse socio-economic impact. Thus, accounting for other criteria in energy planning models that nuance the cost-based results is essential to guide policy-makers in addressing the challenges of the energy transition.


[21] 2205.06754

Slimmable Video Codec

Neural video compression has emerged as a novel paradigm combining trainable multilayer neural networks and machine learning, achieving competitive rate-distortion (RD) performances, but still remaining impractical due to heavy neural architectures, with large memory and computational demands. In addition, models are usually optimized for a single RD tradeoff. Recent slimmable image codecs can dynamically adjust their model capacity to gracefully reduce the memory and computation requirements, without harming RD performance. In this paper we propose a slimmable video codec (SlimVC), by integrating a slimmable temporal entropy model in a slimmable autoencoder. Despite a significantly more complex architecture, we show that slimming remains a powerful mechanism to control rate, memory footprint, computational cost and latency, all being important requirements for practical video compression.


[22] 2205.06774

Controlled Mobility for C-V2X Road Safety Reception Optimization

The use case of C-V2X for road safety requires real-time network connection and information exchange between vehicles. In order to improve the reliability and safety of the system, intelligent networked vehicles need to move cooperatively to achieve network optimization. In this paper, we use the C-V2X sidelink mode 4 abstraction and the regression results of C-V2X network level simulation to formulate the optimization of packet reception rate (PRR) with fairness in the road safety scenario. Under the optimization framework, we design a controlled mobility algorithm for the transmission node to adaptively adjust its position to maximize the aggregated PRR using only one-hop information. Simulation results show that the algorithm converges and improves the aggregated PRR and fairness for C-V2X mode broadcast messages.


[23] 2205.06776

Prototype Development and Validation of a Beam-Divergence Control System for Free-Space Laser Communications

Being able to dynamically control the transmitted-beam divergence can bring important advantages in free-space optical communications. Specifically, this technique can help to optimize the overall communications performance when the optimum laser-beam divergence is not fixed or known. This is the case in most realistic space laser communication systems, since the optimum beam divergence depends on multiple factors that can vary with time, such as the link distance, or cannot be accurately known, such as the actual pointing accuracy. Dynamic beam-divergence control allows the link performance to be optimized for every platform, scenario, and condition. NICT is currently working towards the development of a series of versatile lasercom terminals that can fit a variety of conditions, for which the adaptive control of the transmitted beam divergence is a key element. This manuscript presents a prototype of a beam-divergence control system designed and developed by NICT and Tamron to evaluate this technique and to be later integrated within the lasercom terminals. The basic design of the prototype is introduced as well as the first validation tests that demonstrate its performance.


[24] 2205.06808

High-Frequency Tunable Resistorless Memcapacitor Emulator and Application

In this paper, a new design is proposed for the realization of high-frequency memcapacitor emulators built with three OTAs. This paper also proposes the application of the memcapacitor as an amplitude modulator. Furthermore, applications of the memcapacitor as a filter, oscillator point attractor, and periodic doubler are also shown. The proposed circuits can be configured in both incremental and decremental topology. The proposed circuits and their applications show that the design is much simpler and can be utilized in both topologies. The performance of all the proposed circuits has been verified on Cadence Virtuoso Spectre using a standard 180 nm CMOS process. Furthermore, post-layout simulations and their comparison have been carried out.


[25] 2205.06306

Probabilistic Estimation of Chirp Instantaneous Frequency Using Gaussian Processes

We present a probabilistic approach for estimating a chirp signal and its instantaneous frequency function when the true forms of the chirp and instantaneous frequency are unknown. To do so, we represent them by joint cascading Gaussian processes governed by a non-linear stochastic differential equation, and estimate their posterior distribution by using stochastic filters and smoothers. The model parameters are determined via maximum likelihood estimation. Theoretical results show that the estimation method has a bounded mean squared error. Experiments show that the method outperforms a number of baseline methods on a synthetic model, and we also apply the method to analyse gravitational wave data.


[26] 2205.06392

Efficient Path Planning and Tracking for Multi-Modal Legged-Aerial Locomotion Using Integrated Probabilistic Road Maps (PRM) and Reference Governors (RG)

There have been several successful implementations of bio-inspired legged robots that can trot, walk, and hop robustly even in the presence of significant unplanned disturbances. Despite all of these accomplishments, practical control and high-level decision-making algorithms in multi-modal legged systems have been overlooked. In nature, animals such as birds impressively showcase multiple modes of mobility, including legged and aerial locomotion. They are capable of robust locomotion over large walls and through tight spaces, and can recover from unpredictable situations such as sudden gusts or slippery surfaces. Inspired by these animals' versatility and ability to combine legged and aerial mobility to negotiate their environment, our main goal is to design and control legged robots that integrate two completely different forms of locomotion, ground and aerial mobility, in a single platform. Our robot, the Husky Carbon, is being developed to integrate aerial and legged locomotion and to transform between legged and aerial mobility. This work utilizes a Reference Governor (RG) based on low-level control of Husky's dynamical model to maintain the efficiency of legged locomotion, and uses Probabilistic Road Maps (PRM) and 3D A* algorithms to generate an optimal path based on the energetic cost of transport for legged and aerial mobility.


[27] 2205.06395

Bang-Bang Control Of A Tail-less Morphing Wing Flight

Bats' dynamic morphing wings are known to be extremely high-dimensional, and they employ a combination of inertial dynamics and aerodynamic manipulations to showcase extremely agile maneuvers. Bats rely heavily on their highly flexible wings and are capable of dynamically morphing their wings to adjust the aerodynamic and inertial forces applied to them and to perform sharp banking turns. There are technical hardware and control challenges in copying the morphing wing flight capabilities of flying animals. This work focuses mainly on the modeling and control aspects of stable, tail-less, morphing wing flight. A classical control approach using bang-bang control is proposed to stabilize a bio-inspired morphing wing robot called Aerobat. Robot-environment interactions based on horseshoe vortex shedding and Wagner functions are derived to realistically evaluate the feasibility of the bang-bang control, which is then implemented on the robot in experiments to demonstrate first-time closed-loop stable flights of Aerobat.
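
For readers unfamiliar with the term, a bang-bang controller switches the actuator between its two extreme values according to the sign of a switching function. The following is a minimal sketch on a toy double integrator, not the Aerobat model or the authors' controller; the function names, the gain `k_v`, and all numeric values are assumptions chosen only for illustration.

```python
def bang_bang(switching_value, u_max=1.0):
    """Return +u_max or -u_max depending on the sign of the switching value."""
    return u_max if switching_value > 0 else -u_max


def simulate(x0=1.0, v0=0.0, target=0.0, dt=0.01, steps=2000, u_max=1.0, k_v=1.0):
    """Drive a toy double integrator (x'' = u) toward `target` with bang-bang input.

    Hypothetical plant and gains; this is not a morphing-wing model.
    Returns a list of (position, velocity, control) tuples, one per step.
    """
    x, v = x0, v0
    history = []
    for _ in range(steps):
        # Switching function: position error minus a velocity damping term.
        s = (target - x) - k_v * v
        u = bang_bang(s, u_max)
        # Semi-implicit Euler integration of the double integrator.
        v += u * dt
        x += v * dt
        history.append((x, v, u))
    return history


if __name__ == "__main__":
    final_x, final_v, _ = simulate()[-1]
    print(f"final position {final_x:.4f}, final velocity {final_v:.4f}")
```

The control signal only ever takes the values `+u_max` and `-u_max`; near the target it chatters between the two, which is the characteristic behaviour of this class of controller.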


[28] 2205.06407

Tensor Decompositions for Hyperspectral Data Processing in Remote Sensing: A Comprehensive Review

Owing to the rapid development of sensor technology, hyperspectral (HS) remote sensing (RS) imaging has provided a significant amount of spatial and spectral information for observing and analyzing the Earth's surface from a distance using data acquisition devices such as aircraft, spacecraft, and satellites. Recent advances in HS RS techniques offer opportunities to realize the full potential of various applications, while posing new challenges for efficiently processing and analyzing the enormous volume of acquired HS data. Because it preserves the inherent 3-D structure of HS data, tensor decomposition has attracted widespread attention and research in HS data processing tasks over the past decades. In this article, we aim to present a comprehensive overview of tensor decomposition, contextualized within five broad topics in HS data processing: HS restoration, compressed sensing, anomaly detection, super-resolution, and spectral unmixing. For each topic, we elaborate on the remarkable achievements of tensor decomposition models for HS RS, with a description of the existing methodologies and a representative set of experimental results. The remaining challenges and follow-up research directions are then outlined and discussed from the perspective of real HS RS practice and of tensor decomposition merged with advanced priors and even with deep neural networks. This article summarizes different tensor decomposition-based HS data processing methods and, for beginners, categorizes them into classes ranging from simple adoptions to complex combinations with other priors. We also expect this survey to suggest new lines of investigation and development trends for experienced researchers who already have some understanding of tensor decomposition and HS RS.


[29] 2205.06412

Optimal Order of Encoding for Gaussian MIMO Multi-Receiver Wiretap Channel

The Gaussian multiple-input multiple-output (MIMO) multi-receiver wiretap channel is studied in this paper. The base station broadcasts confidential messages to K intended users while keeping the messages secret from an eavesdropper. The capacity of this channel has already been characterized by applying dirty-paper coding and stochastic encoding. However, up to K factorial encoding orders may need to be enumerated to do so, which makes the problem intractable. We prove that there exists one optimal encoding order, which reduces the K factorial enumerations to a single encoding. The optimal encoding order is proved by forming a secrecy weighted sum rate (WSR) maximization problem. The optimal order is the same as that for the MIMO broadcast channel without a secrecy constraint; that is, the weights of the users' rates in the WSR maximization problem determine the optimal encoding order. Numerical results verify the optimal encoding order.


[30] 2205.06460

Blind Deconvolution with Non-smooth Regularization via Bregman Proximal DCAs

Blind deconvolution is a technique to recover an original signal without knowing the convolving filter. It is naturally formulated as the minimization of a quartic objective function under certain assumptions. Because its differentiable part does not have a Lipschitz continuous gradient, existing first-order methods are not theoretically supported. In this letter, we employ Bregman-based proximal methods, whose convergence is theoretically guaranteed under the $L$-smad property. We first reformulate the objective function as a difference of convex (DC) functions and apply the Bregman proximal DC algorithm (BPDCA). This DC decomposition satisfies the $L$-smad property. The method is extended to the BPDCA with extrapolation (BPDCAe) for faster convergence. When our regularizer has a sufficiently simple structure, each iteration is solved in closed form, and thus our algorithms solve large-scale problems efficiently. We also provide a stability analysis of the equilibria and demonstrate the proposed methods through numerical experiments on image deblurring. The results show that BPDCAe successfully recovered the original image and outperformed other existing algorithms.


[31] 2205.06471

Data-Driven Upper Bounds on Channel Capacity

We consider the problem of estimating an upper bound on the capacity of a memoryless channel with unknown channel law and continuous output alphabet. A novel data-driven algorithm is proposed that exploits the dual representation of capacity where the maximization over the input distribution is replaced with a minimization over a reference distribution on the channel output. To efficiently compute the required divergence maximization between the conditional channel and the reference distribution, we use a modified mutual information neural estimator that takes the channel input as an additional parameter. We evaluate our approach on different memoryless channels and show that the estimated upper bounds closely converge either to the channel capacity or to best-known lower bounds.
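
The dual representation mentioned above rests on a standard fact: for any reference distribution $Q$ on the channel output, the capacity satisfies $C \le \max_x D(P_{Y|X=x} \| Q)$, with equality for the best choice of $Q$. The following minimal sketch checks this on a toy discrete channel with a known transition matrix. It is not the authors' data-driven neural estimator, which targets unknown channel laws and continuous outputs; the function names and the example channel are assumptions for illustration only.

```python
import math


def kl_bits(p, q):
    """Kullback-Leibler divergence D(p || q) in bits for discrete distributions.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def dual_upper_bound(channel, q):
    """Upper bound on capacity: max over inputs x of D(P(.|x) || q).

    `channel` is a list of rows, one per input symbol, each row being the
    output distribution for that input. `q` is any reference output
    distribution; a better q gives a tighter bound.
    """
    return max(kl_bits(row, q) for row in channel)


def binary_entropy(p):
    """Binary entropy function in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


if __name__ == "__main__":
    # Toy example: binary symmetric channel with crossover probability 0.1.
    bsc = [[0.9, 0.1], [0.1, 0.9]]
    capacity = 1 - binary_entropy(0.1)
    tight = dual_upper_bound(bsc, [0.5, 0.5])  # uniform reference output
    loose = dual_upper_bound(bsc, [0.6, 0.4])  # mismatched reference output
    print(capacity, tight, loose)
```

For this symmetric channel the uniform reference distribution attains the capacity exactly, while a mismatched reference gives a strictly larger, and therefore looser, bound.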


[32] 2205.06515

An Information-theoretic Method for Collaborative Distributed Learning with Limited Communication

In this paper, we study the information transmission problem under the distributed learning framework, where each worker node is permitted to transmit only an $m$-dimensional statistic to improve the learning results of the target node. Specifically, we evaluate the corresponding expected population risk (EPR) under the regime of large sample sizes. We prove that the performance can be enhanced since the transmitted statistics contribute to estimating the underlying distribution under the mean square error measured by the EPR norm matrix. Accordingly, the transmitted statistics correspond to the eigenvectors of this matrix, and the desired transmission allocates these eigenvectors among the statistics such that the EPR is minimal. Moreover, we provide the analytical solution of the desired statistics for single-node and two-node transmission, where a geometrical interpretation is given to explain the eigenvector selection. For the general case, an efficient algorithm that can output the allocation solution is developed based on the node partitions.


[33] 2205.06637

Energy-Delay Minimization of Task Migration Based on Game Theory in MEC-assisted Vehicular Networks

Roadside units (RSUs), which have strong computing capability and are close to vehicle nodes, have been widely used to process delay-sensitive and computation-intensive tasks of vehicle nodes. However, due to their high mobility, vehicles may drive out of the coverage of RSUs before receiving the task processing results. In this paper, we propose a mobile edge computing-assisted vehicular network, where vehicles can offload their tasks to a nearby vehicle via a vehicle-to-vehicle (V2V) link or to a nearby RSU via a vehicle-to-infrastructure link. These tasks are also migrated over a V2V link or an infrastructure-to-infrastructure (I2I) link to avoid the scenario where the vehicles cannot receive the processed task from the RSUs. Considering the mutual interference between offloading tasks and migrating tasks on the same link, we construct a vehicle offloading decision-based game to minimize the computation overhead. We prove that the game always converges to a Nash equilibrium by exploiting the finite improvement property. We then propose a task migration (TM) algorithm that includes three task-processing methods and two task-migration methods. Based on the TM algorithm, a computation overhead minimization offloading (COMO) algorithm is presented. Extensive simulation results show that the proposed TM and COMO algorithms reduce the computation overhead and increase the success rate of task processing.


[34] 2205.06655

Unified Modeling of Multi-Domain Multi-Device ASR Systems

Modern Automatic Speech Recognition (ASR) systems often use a portfolio of domain-specific models in order to get high accuracy for distinct user utterance types across different devices. In this paper, we propose an innovative approach that integrates the different per-domain per-device models into a unified model, using a combination of domain embedding, domain experts, mixture of experts and adversarial training. We run careful ablation studies to show the benefit of each of these innovations in contributing to the accuracy of the overall unified model. Experiments show that our proposed unified modeling approach actually outperforms the carefully tuned per-domain models, giving relative gains of up to 10% over a baseline model with negligible increase in the number of parameters.


[35] 2205.06799

The ACM Multimedia 2022 Computational Paralinguistics Challenge: Vocalisations, Stuttering, Activity, & Mosquitoes

The ACM Multimedia 2022 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the Vocalisations and Stuttering Sub-Challenges, a classification on human non-verbal vocalisations and speech has to be made; the Activity Sub-Challenge aims at beyond-audio human activity recognition from smartwatch sensor data; and in the Mosquitoes Sub-Challenge, mosquitoes need to be detected. We describe the Sub-Challenges, baseline feature extraction, and classifiers based on the usual ComPaRE and BoAW features, the auDeep toolkit, and deep feature extraction from pre-trained CNNs using the DeepSpectRum toolkit; in addition, we add end-to-end sequential modelling, and a log-mel-128-BNN.