New articles on cs

[1] 2007.01299

Generating Adversarial Examples with Controllable Non-transferability

Adversarial attacks against Deep Neural Networks have been widely studied. One significant feature that makes such attacks particularly powerful is transferability, where the adversarial examples generated from one model can be effective against other similar models as well. Much work has been done to increase transferability. However, how to decrease transferability and craft malicious samples only for specific target models has not yet been explored. In this paper, we design novel attack methodologies to generate adversarial examples with controllable non-transferability. With these methods, an adversary can efficiently produce precise adversarial examples that attack a chosen set of target models while remaining benign to other models. The first method is Reversed Loss Function Ensemble, where the adversary can craft qualified examples from the gradients of a reversed loss function. This approach is effective for the white-box and gray-box settings. The second method is Transferability Classification: the adversary trains a transferability-aware classifier from the perturbations of adversarial examples. This classifier then guides the generation of non-transferable adversarial examples. This approach can be applied to the black-box scenario. Evaluation results demonstrate the effectiveness and efficiency of our proposed methods. This work opens up a new route for generating adversarial examples with new features and applications.

[2] 2007.01327

Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization

In distributed second order optimization, a standard strategy is to average many local estimates, each of which is based on a small sketch or batch of the data. However, the local estimates on each machine are typically biased, relative to the full solution on all of the data, and this can limit the effectiveness of averaging. Here, we introduce a new technique for debiasing the local estimates, which leads to both theoretical and empirical improvements in the convergence rate of distributed second order methods. Our technique has two novel components: (1) modifying standard sketching techniques to obtain what we call a surrogate sketch; and (2) carefully scaling the global regularization parameter for local computations. Our surrogate sketches are based on determinantal point processes, a family of distributions for which the bias of an estimate of the inverse Hessian can be computed exactly. Based on this computation, we show that when the objective being minimized is $l_2$-regularized with parameter $\lambda$ and individual machines are each given a sketch of size $m$, then to eliminate the bias, local estimates should be computed using a shrunk regularization parameter given by $\lambda^{\prime}=\lambda\cdot(1-\frac{d_{\lambda}}{m})$, where $d_{\lambda}$ is the $\lambda$-effective dimension of the Hessian (or, for quadratic problems, the data matrix).
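The shrinkage rule in this abstract can be checked numerically. The sketch below assumes the usual definition of the $\lambda$-effective dimension, $d_{\lambda}=\sum_i \sigma_i/(\sigma_i+\lambda)$ over the eigenvalues $\sigma_i$ of the unregularized Hessian; the function names and the guard on $m$ are illustrative choices, not taken from the paper.

```python
def effective_dimension(eigenvalues, lam):
    """lambda-effective dimension: sum of s / (s + lam) over Hessian eigenvalues s.

    Assumes the eigenvalues belong to the unregularized Hessian.
    """
    return sum(s / (s + lam) for s in eigenvalues)


def scaled_regularizer(eigenvalues, lam, m):
    """Shrunk local parameter lam' = lam * (1 - d_lam / m) for sketch size m."""
    d_lam = effective_dimension(eigenvalues, lam)
    # Guard added for this sketch only: keeps lam' strictly positive.
    if m <= d_lam:
        raise ValueError("sketch size m must exceed the effective dimension")
    return lam * (1.0 - d_lam / m)


eigs = [4.0, 1.0, 1.0, 0.0]          # hypothetical Hessian spectrum
d_lam = effective_dimension(eigs, 1.0)
lam_local = scaled_regularizer(eigs, 1.0, m=18)
```

With this spectrum the effective dimension is 0.8 + 0.5 + 0.5 + 0 = 1.8, so a sketch of size 18 shrinks the regularizer by a factor of 0.9.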

[3] 2007.01330

A priori and a posteriori error estimates for the quad-curl eigenvalue problem

In this paper, we propose a new family of $H(\mathrm{curl}^2)$-conforming elements for the quad-curl eigenvalue problem in 2D. The accuracy of this family is one order higher than that in [32]. We prove a priori and a posteriori error estimates. The a priori estimate of the eigenvalue with a convergence order $2(s-1)$ is obtained if the eigenvector $u\in H^{s+1}(\Omega)$. For the a posteriori estimate, by analyzing the associated source problem, we obtain lower and upper bounds for the eigenvector in an energy norm and an upper bound for the eigenvalues. Numerical examples are presented for validation.

[4] 2007.01346

Spectral Methods for Ranking with Scarce Data

Given a number of pairwise preferences of items, a common task is to rank all the items. Examples include pairwise movie ratings, New Yorker cartoon caption contests, and many other consumer preference tasks. What these settings have in common is two-fold: a scarcity of data (it may be costly to get comparisons for all the pairs of items) and additional feature information about the items (e.g., movie genre, director, and cast). In this paper we modify a popular and well-studied method for rank aggregation, RankCentrality, to account for few comparisons and to incorporate additional feature information. This method returns meaningful rankings even under scarce comparisons. Using diffusion-based methods, we incorporate feature information in a way that outperforms state-of-the-art methods in practice. We also provide improved sample complexity for RankCentrality in a variety of sampling schemes.
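For orientation, here is a minimal sketch of the standard RankCentrality procedure that this abstract builds on, not the modified, feature-aware variant the paper proposes. It builds a random walk that moves from item i to item j in proportion to how often j beat i, then takes the stationary distribution as the scores. The function name, the `wins` matrix layout, and the fixed iteration count are assumptions made for illustration.

```python
def rank_centrality(wins, iters=1000):
    """Score items from pairwise outcomes; wins[i][j] = times item i beat item j.

    Returns the stationary distribution of the comparison random walk,
    approximated by power iteration (assumes the comparison graph is connected).
    """
    n = len(wins)
    # Maximum number of distinct opponents any item has been compared with.
    d_max = max(
        sum(1 for j in range(n) if j != i and wins[i][j] + wins[j][i] > 0)
        for i in range(n)
    )
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = wins[i][j] + wins[j][i]
            if i != j and total > 0:
                # Move i -> j with the fraction of i-vs-j comparisons that j won.
                P[i][j] = wins[j][i] / total / d_max
        # Diagonal is still zero here, so sum(P[i]) is the off-diagonal mass.
        P[i][i] = 1.0 - sum(P[i])
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


# Hypothetical data: item 0 usually beats 1 and 2, item 1 usually beats 2.
wins = [[0, 3, 4],
        [1, 0, 3],
        [0, 1, 0]]
scores = rank_centrality(wins)
```

For this toy input the stationary distribution works out to (25/33, 7/33, 1/33), which orders the items 0, 1, 2 as expected.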

[5] 2007.01348

Efficient Neural Network Deployment for Microcontroller

Edge computing for neural networks is becoming important, especially for low-power applications and offline devices. TensorFlow Lite and PyTorch Mobile were released for this purpose, but they mainly support mobile devices rather than microcontrollers. Microcontroller support is an emerging area. There are many approaches to reducing network size and compute load, such as pruning, binarization, and layer manipulation, i.e., operator reordering. This paper explores and generalizes convolutional neural network deployment for microcontrollers with two novel optimization proposals offering memory savings and compute efficiency in 2D convolutions as well as fully connected layers. The first is in-place max-pooling, applicable when the stride is greater than or equal to the pooling kernel size. The second is to use ping-pong buffers between layers to reduce memory consumption significantly. The memory savings and performance will be compared with the CMSIS-NN framework developed for ARM Cortex-M CPUs. The final goal is to develop a tool that consumes a PyTorch model with trained network weights and turns it into an optimized inference engine (forward pass) in C/C++ for microcontrollers with kilobyte-level memory and limited compute capability.
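The in-place max-pooling idea can be illustrated with a short sketch. It is written in Python for readability rather than the C/C++ the paper targets, and it assumes a single-channel, row-major feature map stored in one flat buffer; the function name and buffer layout are hypothetical. Each pooled value is written back into the front of the same buffer, which is safe here because the output slot never lies beyond the first cell of the window just read, and always lies before every window still to be read.

```python
def maxpool2d_inplace(buf, height, width, kernel, stride):
    """Max-pool a row-major height x width map inside its own buffer.

    Returns (out_h, out_w); pooled values occupy buf[:out_h * out_w].
    """
    if stride < kernel:
        raise ValueError("this sketch only handles stride >= kernel")
    out_h = (height - kernel) // stride + 1
    out_w = (width - kernel) // stride + 1
    for r in range(out_h):
        for c in range(out_w):
            best = buf[r * stride * width + c * stride]
            for dr in range(kernel):
                for dc in range(kernel):
                    v = buf[(r * stride + dr) * width + (c * stride + dc)]
                    if v > best:
                        best = v
            # Slot r*out_w + c is at or before this window's first cell and
            # strictly before any later window, so no unread input is lost.
            buf[r * out_w + c] = best
    return out_h, out_w


feature_map = list(range(16))            # 4 x 4 map holding 0..15
shape = maxpool2d_inplace(feature_map, 4, 4, kernel=2, stride=2)
pooled = feature_map[:shape[0] * shape[1]]
```

No second buffer is allocated: the 4 x 4 input is reduced to a 2 x 2 result stored in the first four slots of the original list.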

[6] 2007.01350

Uncertainty Prediction for Deep Sequential Regression Using Meta Models

Generating high-quality uncertainty estimates for sequential regression, particularly deep recurrent networks, remains a challenging and open problem. Existing approaches often make restrictive assumptions (such as stationarity) yet still perform poorly in practice, particularly in the presence of real-world non-stationary signals and drift. This paper describes a flexible method that can generate symmetric and asymmetric uncertainty estimates, makes no assumptions about stationarity, and outperforms competitive baselines in both drift and non-drift scenarios. This work helps make sequential regression more effective and practical for use in real-world applications, and is a powerful new addition to the modeling toolbox for sequential uncertainty quantification in general.

[7] 2007.01359

Bayesian multilingual topic model for zero-shot cross-lingual topic identification

This paper presents a Bayesian multilingual topic model for learning language-independent document embeddings. Our model learns to represent the documents in the form of Gaussian distributions, thereby encoding the uncertainty in its covariance. We propagate the learned uncertainties through linear classifiers for zero-shot cross-lingual topic identification. Our experiments on 5-language Europarl and Reuters (MLDoc) corpora show that the proposed model outperforms multilingual word embedding and BiLSTM sentence encoder based systems by significant margins in the majority of the transfer directions. Moreover, our system, trained in under a single day on a single GPU with much less data, performs competitively with the state-of-the-art universal BiLSTM sentence encoder trained on 93 languages. Our experimental analysis shows that the amount of parallel data improves the overall performance of embeddings. Nonetheless, exploiting the uncertainties is always beneficial.

[8] 2007.01369

Low-Power Object Counting with Hierarchical Neural Networks

Deep Neural Networks (DNNs) can achieve state-of-the-art accuracy in many computer vision tasks, such as object counting. Object counting takes two inputs: an image and an object query and reports the number of occurrences of the queried object. To achieve high accuracy on such tasks, DNNs require billions of operations, making them difficult to deploy on resource-constrained, low-power devices. Prior work shows that a significant number of DNN operations are redundant and can be eliminated without affecting the accuracy. To reduce these redundancies, we propose a hierarchical DNN architecture for object counting. This architecture uses a Region Proposal Network (RPN) to propose regions-of-interest (RoIs) that may contain the queried objects. A hierarchical classifier then efficiently finds the RoIs that actually contain the queried objects. The hierarchy contains groups of visually similar object categories. Small DNNs are used at each node of the hierarchy to classify between these groups. The RoIs are incrementally processed by the hierarchical classifier. If the object in an RoI is in the same group as the queried object, then the next DNN in the hierarchy processes the RoI further; otherwise, the RoI is discarded. By using a few small DNNs to process each image, this method reduces the memory requirement, inference time, energy consumption, and number of operations with negligible accuracy loss when compared with the existing object counters.

[9] 2007.01374

Smt-Switch: a solver-agnostic C++ API for SMT Solving

This extended abstract describes work in progress on Smt-Switch, an open-source, solver-agnostic API for SMT solving. Smt-Switch provides an abstract interface, which can be implemented by different SMT solvers. Smt-Switch provides simple, uniform, and high-performance access to SMT solving for applications in areas such as automated reasoning, planning, and formal verification. The interface allows the user to create, traverse, and manipulate terms, as well as to dynamically dispatch queries to different underlying SMT solvers.

[10] 2007.01375

LSTFCoDel: CoDel with LSTF-Style Priority Queuing

Congestion control is vitally important in computer networks. Arising naturally from the bursty nature of Internet traffic, congestion plagues not only the network edge, but also the network core. Many remedies have been proposed to fight congestion; active queue management (AQM) is one such proposal. AQM seeks to prevent congestion by actively avoiding it. Some queuing disciplines such as Random Early Detection (RED) will prematurely drop a random packet (with some probability) when the queue nears capacity to signal the sender to back off. However, RED utilizes queue length as a mechanism to indicate congestion. On the other hand, the Controlled Delay (CoDel) queuing discipline uses queuing delay as an indication of congestion. The problem with both RED and CoDel is that they indiscriminately treat all packets the same. Normally implemented using a FIFO queue, CoDel simply enqueues and dequeues packets in a first-come, first-served manner. Priority queuing can be carefully utilized to selectively service packets utilizing the very same metric CoDel uses for AQM, queuing delay. That said, Least Slack Time First (LSTF), a multi-processor scheduling algorithm, employs priority scheduling, which, coincidentally, is also based on delay. In the context of computer networks, LSTF can be applied in the control plane or in the data plane. At the control plane, LSTF functions across the entire network, but in doing so requires all intermediary routers to implement it; LSTF also requires support at the packet level in terms of a slack entry. Within the data plane, LSTF can be implemented as a queuing mechanism based on delay spent in the router (just like CoDel AQM). This paper applies data plane level LSTF to CoDel AQM to enable delay-based packet classification within the confines of the CoDel AQM algorithm.

[11] 2007.01376

Improved bounds for noisy group testing with constant tests per item

The group testing problem is concerned with identifying a small set of infected individuals in a large population. At our disposal is a testing procedure that allows us to test several individuals together. In an idealized setting, a test is positive if and only if at least one infected individual is included and negative otherwise. Significant progress was made in recent years towards understanding the information-theoretic and algorithmic properties in this noiseless setting. In this paper, we consider a noisy variant of group testing where test results are flipped with certain probability, including the realistic scenario where sensitivity and specificity can take arbitrary values. Using a test design where each individual is assigned to a fixed number of tests, we derive explicit algorithmic bounds for two commonly considered inference algorithms and thereby improve on results by Scarlett & Cevher (SODA 2016) and Scarlett & Johnson (2020), providing the strongest performance guarantees currently proved for these noisy group testing models.
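The test design and noise model described in this abstract can be simulated in a few lines. The sketch below is an illustration under stated assumptions, not the paper's construction: each individual is placed in a fixed number of distinct tests chosen uniformly at random, a truly positive test is reported positive with probability `sensitivity`, and a truly negative test is reported negative with probability `specificity`. All function and parameter names are hypothetical, and no inference algorithm is included.

```python
import random


def constant_tests_design(n_items, n_tests, tests_per_item, rng):
    """Assign every item to `tests_per_item` distinct tests, uniformly at random."""
    return [rng.sample(range(n_tests), tests_per_item) for _ in range(n_items)]


def run_noisy_tests(design, infected, n_tests, sensitivity, specificity, rng):
    """Return observed outcomes (True = positive) after independent flips."""
    truth = [False] * n_tests
    for item in infected:
        for t in design[item]:
            truth[t] = True          # a test is truly positive if it holds any infected item
    observed = []
    for positive in truth:
        if positive:
            observed.append(rng.random() < sensitivity)    # may miss a true positive
        else:
            observed.append(rng.random() >= specificity)   # may raise a false alarm
    return observed


rng = random.Random(0)
design = constant_tests_design(n_items=20, n_tests=10, tests_per_item=3, rng=rng)
noiseless = run_noisy_tests(design, {0, 5}, 10, 1.0, 1.0, rng)
noisy = run_noisy_tests(design, {0, 5}, 10, 0.9, 0.95, rng)
```

Setting both sensitivity and specificity to 1 recovers the idealized noiseless setting, where a test is positive exactly when it contains an infected individual.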

[12] 2007.01377

DATE: Defense Against TEmperature Side-Channel Attacks in DVFS Enabled MPSoCs

Given the constant rise in the use of embedded devices in daily life, side channels remain a challenge to information flow control and security in such systems. One such important security flaw could be exploited through temperature side-channel attacks, where heat dissipation and propagation from the processing elements are observed over time in order to deduce security flaws. Our proposed methodology, DATE: Defense Against TEmperature side-channel attacks, takes a novel approach of reducing spatial and temporal thermal gradients, which makes the system more secure against temperature side-channel attacks and at the same time increases the reliability of the device in terms of lifespan. In this paper, we also introduce a new metric, Thermal-Security-in-Multi-Processors (TSMP), which is capable of quantifying the security against temperature side-channel attacks on computing systems. DATE is evaluated to be up to 139.24% more secure than the state-of-the-art for certain applications, while reducing thermal cycle by up to 67.42%.

[13] 2007.01379

Improving Event Detection using Contextual Word and Sentence Embeddings

The task of Event Detection (ED) is a subfield of Information Extraction (IE) that consists in recognizing event mentions in natural language texts. Several applications can take advantage of an ED system, including alert systems, text summarization, question-answering systems, and any system that needs to extract structured information about events from unstructured texts. ED is a complex task, which is hampered by two main challenges: the lack of a dataset large enough to train and test the developed models and the variety of event type definitions that exist in the literature. These problems make generalization hard to achieve, resulting in poor adaptation to different domains and targets. The main contribution of this paper is the design, implementation and evaluation of a recurrent neural network model for ED that combines several features. In particular, the paper makes the following contributions: (1) it uses BERT embeddings to define contextual word and contextual sentence embeddings as attributes, which to the best of our knowledge were never used before for the ED task; (2) the proposed model has the ability to use its first layer to learn good feature representations; (3) a new public dataset with a general definition of event; (4) an extensive empirical evaluation that includes (i) the exploration of different architectures and hyperparameters, (ii) an ablation test to study the impact of each attribute, and (iii) a comparison with a replication of a state-of-the-art model. The results offer several insights into the importance of contextual embeddings and indicate that the proposed approach is effective in the ED task, outperforming the baseline models.

[14] 2007.01380

Deep reinforcement learning driven inspection and maintenance planning under incomplete information and constraints

Determination of inspection and maintenance policies for minimizing long-term risks and costs in deteriorating engineering environments constitutes a complex optimization problem. Major computational challenges include the (i) curse of dimensionality, due to exponential scaling of state/action set cardinalities with the number of components; (ii) curse of history, related to exponentially growing decision-trees with the number of decision-steps; (iii) presence of state uncertainties, induced by inherent environment stochasticity and variability of inspection/monitoring measurements; (iv) presence of constraints, pertaining to stochastic long-term limitations, due to resource scarcity and other infeasible/undesirable system responses. In this work, these challenges are addressed within a joint framework of constrained Partially Observable Markov Decision Processes (POMDP) and multi-agent Deep Reinforcement Learning (DRL). POMDPs optimally tackle (ii)-(iii), combining stochastic dynamic programming with Bayesian inference principles. Multi-agent DRL addresses (i), through deep function parametrizations and decentralized control assumptions. Challenge (iv) is herein handled through proper state augmentation and Lagrangian relaxation, with emphasis on life-cycle risk-based constraints and budget limitations. The underlying algorithmic steps are provided, and the proposed framework is found to outperform well-established policy baselines and facilitate adept prescription of inspection and intervention actions, in cases where decisions must be made in the most resource- and risk-aware manner.

[15] 2007.01381

D-NetPAD: An Explainable and Interpretable Iris Presentation Attack Detector

An iris recognition system is vulnerable to presentation attacks, or PAs, where an adversary presents artifacts such as printed eyes, plastic eyes, or cosmetic contact lenses to circumvent the system. In this work, we propose an effective and robust iris PA detector called D-NetPAD based on the DenseNet convolutional neural network architecture. It demonstrates generalizability across PA artifacts, sensors and datasets. Experiments conducted on a proprietary dataset and a publicly available dataset (LivDet-2017) substantiate the effectiveness of the proposed method for iris PA detection. The proposed method results in a true detection rate of 98.58% at a false detection rate of 0.2% on the proprietary dataset and outperforms state-of-the-art methods on the LivDet-2017 dataset. We visualize intermediate feature distributions and fixation heatmaps using t-SNE plots and Grad-CAM, respectively, in order to explain the performance of D-NetPAD. Further, we conduct a frequency analysis to explain the nature of features being extracted by the network. The source code and trained model are available at

[16] 2007.01382

WattScale: A Data-driven Approach for Energy Efficiency Analytics of Buildings at Scale

Buildings consume over 40% of the total energy in modern societies, and improving their energy efficiency can significantly reduce our energy footprint. In this paper, we present WattScale, a data-driven approach to identify the least energy-efficient buildings from a large population of buildings in a city or a region. Unlike previous methods such as least-squares that use point estimates, WattScale uses Bayesian inference to capture the stochasticity in the daily energy usage by estimating the distribution of parameters that affect a building. Further, it compares them with similar homes in a given population. WattScale also incorporates a fault detection algorithm to identify the underlying causes of energy inefficiency. We validate our approach using ground truth data from different geographical locations, which showcases its applicability in various settings. WattScale has two execution modes -- (i) individual, and (ii) region-based, which we highlight using two case studies. For the individual execution mode, we present results from a city containing more than 10,000 buildings and show that more than half of the buildings are inefficient in one way or another, indicating a significant potential for energy improvement measures. Additionally, we provide the probable cause of inefficiency and find that 41%, 23.73%, and 0.51% of homes have poor building envelope, heating, and cooling system faults, respectively. For the region-based execution mode, we show that WattScale can be extended to millions of homes in the US due to the recent availability of representative energy datasets.

[17] 2007.01386

Posterior Model Adaptation With Updated Priors

Classification approaches based on the direct estimation and analysis of posterior probabilities will degrade if the original class priors begin to change. We prove that a unique (up to scale) solution is possible to recover the data likelihoods for a test example from its original class posteriors and dataset priors. Given the recovered likelihoods and a set of new priors, the posteriors can be re-computed using Bayes' Rule to reflect the influence of the new priors. The method is simple to compute and allows a dynamic update of the original posteriors.
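The update described in this abstract follows directly from Bayes' rule and can be sketched in a few lines. Dividing each original posterior by its original prior recovers the likelihood up to a common scale; multiplying by the new priors and renormalizing gives the adapted posteriors. The function name and the example numbers are illustrative assumptions, not taken from the paper.

```python
def adapt_posteriors(posteriors, old_priors, new_priors):
    """Re-weight class posteriors for a new set of class priors via Bayes' rule."""
    # Likelihoods up to a common scale: p(x | c) is proportional to p(c | x) / p(c).
    likelihoods = [p / q for p, q in zip(posteriors, old_priors)]
    # Apply the new priors, then renormalize so the result sums to one.
    unnormalized = [l * r for l, r in zip(likelihoods, new_priors)]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]


# A classifier trained with balanced classes reports 0.9 for class 0,
# but at test time class 0 is now rare (prior 0.1).
adapted = adapt_posteriors([0.9, 0.1], [0.5, 0.5], [0.1, 0.9])
```

In this example the strong evidence for class 0 is exactly offset by its low new prior, so the adapted posteriors come out equal at 0.5 each. Passing the original priors as the new priors returns the original posteriors unchanged.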

[18] 2007.01387

An Algebraic Approach for the Stability Analysis of BLDC Motor Controllers

This paper presents an algebraic technique to compute the maximum time-delay that can be accepted in the control loop of a Brushless DC Motor (BLDCM) speed controller before the closed-loop response becomes unstable. Using a recently proposed time-delay stability analysis methodology, we derive accurate stability conditions for the BLDCM speed controller. The results of applying the new method show that tuning the PI controller for very fast response, on the order of magnitude of the BLDCM mechanical time constant, causes the time-delay to significantly affect the system stability.

[19] 2007.01388

Learn Faster and Forget Slower via Fast and Stable Task Adaptation

Training Deep Neural Networks (DNNs) is still highly time-consuming and compute-intensive. It has been shown that adapting a pretrained model may significantly accelerate this process. With a focus on classification, we show that current fine-tuning techniques make the pretrained models catastrophically forget the transferred knowledge even before anything about the new task is learned. Such rapid knowledge loss undermines the merits of transfer learning and may result in a much slower convergence rate compared to when the maximum amount of knowledge is exploited. We investigate the source of this problem from different perspectives and to alleviate it, introduce Fast And Stable Task-adaptation (FAST), an easy to apply fine-tuning algorithm. The paper provides a novel geometric perspective on how the loss landscape of source and target tasks are linked in different transfer learning strategies. We empirically show that compared to prevailing fine-tuning practices, FAST learns the target task faster and forgets the source task slower. The code is available at

[20] 2007.01391

Secure Beamforming and Ergodic Secrecy Rate Analysis for Amplify-and-Forward Relay Networks with Wireless Powered Jammer

In this correspondence, we consider an amplify-and-forward relay network in which relayed information is overheard by an eavesdropper. In order to confound the eavesdropper, a wireless-powered jammer is also considered which harvests energy from a multiple-antenna source. We propose a new secure beamforming scheme in which the beamforming vector is a linear combination of the energy beamforming (EB) and information beamforming (IB) vectors. We also present a new closed-form solution for the proposed beamforming vector which is shown to achieve a higher secrecy rate as compared to the trivial EB and IB vectors. Moreover, a tight closed-form approximation for the ergodic secrecy rate is also derived for the asymptotic regime of a large number of antennas at the source. Finally, numerical examples and simulations are provided which validate our analytical results.

[21] 2007.01395

Scalable Comparative Visualization of Ensembles of Call Graphs

Optimizing the performance of large-scale parallel codes is critical for efficient utilization of computing resources. Code developers often explore various execution parameters, such as hardware configurations, system software choices, and application parameters, and are interested in detecting and understanding bottlenecks in different executions. They often collect hierarchical performance profiles represented as call graphs, which combine performance metrics with their execution contexts. The crucial task of exploring multiple call graphs together is tedious and challenging because of the many structural differences in the execution contexts and significant variability in the collected performance metrics (e.g., execution runtime). In this paper, we present an enhanced version of CallFlow to support the exploration of ensembles of call graphs using new types of visualizations, analysis, graph operations, and features. We introduce ensemble-Sankey, a new visual design that combines the strengths of resource-flow (Sankey) and box-plot visualization techniques. Whereas the resource-flow visualization can easily and intuitively describe the graphical nature of the call graph, the box plots overlaid on the nodes of Sankey convey the performance variability within the ensemble. Our interactive visual interface provides linked views to help explore ensembles of call graphs, e.g., by facilitating the analysis of structural differences, and identifying similar or distinct call graphs. We demonstrate the effectiveness and usefulness of our design through case studies on large-scale parallel codes.

[22] 2007.01397

Adaptive Braking for Mitigating Gradient Delay

Neural network training is commonly accelerated by using multiple synchronized workers to compute gradient updates in parallel. Asynchronous methods remove synchronization overheads and improve hardware utilization at the cost of introducing gradient delay, which impedes optimization and can lead to lower final model performance. We introduce Adaptive Braking (AB), a modification for momentum-based optimizers that mitigates the effects of gradient delay. AB dynamically scales the gradient based on the alignment of the gradient and the velocity. This can dampen oscillations along high curvature directions of the loss surface, stabilizing and accelerating asynchronous training. We show that applying AB on top of SGD with momentum enables training ResNets on CIFAR-10 and ImageNet-1k with delays $D \geq 32$ update steps with minimal drop in final test accuracy.

[23] 2007.01404

Crowdfunding for Design Innovation: Prediction Model with Critical Factors

Online reward-based crowdfunding campaigns have emerged as an innovative approach for validating demands, discovering early adopters, and seeking learning and feedback in the design processes of innovative products. However, crowdfunding campaigns for innovative products face a high degree of uncertainty and suffer from low success rates, which limits their value for design. To guide designers and innovators in running crowdfunding campaigns, this paper presents a data-driven methodology to build a prediction model with critical factors for crowdfunding success, based on public online crowdfunding campaign data. Specifically, the methodology filters 26 candidate factors in the Real-Win-Worth framework and identifies the critical ones via step-wise regression to predict the amount of crowdfunding. We demonstrate the methodology by deriving prediction models and identifying essential factors from 3D printer and smartwatch campaign data on Kickstarter and Indiegogo. The critical factors can guide campaign development, and the prediction model may evaluate the crowdfunding potential of innovations in context, to increase the chance of crowdfunding success of innovative products.

[24] 2007.01408

Creating a content delivery network for general science on the internet backbone using XCaches

A general problem faced by computing on the grid for opportunistic users is that delivering cycles is simpler than delivering data to those cycles. In this project we show how we integrated XRootD caches placed on the internet backbone to implement a content delivery network for general science workflows. We will show that for some workflows on different science domains like high energy physics, gravitational waves, and others the combination of data reuse from the workflows together with the use of caches increases CPU efficiency while decreasing network bandwidth use.

[25] 2007.01409

A (Slightly) Improved Approximation Algorithm for Metric TSP

For some $\epsilon > 10^{-36}$ we give a $3/2-\epsilon$ approximation algorithm for metric TSP.

[26] 2007.01410

Weighted estimates of the Cayley transform method for boundary value problems in a Banach space

We consider the boundary value problems (BVPs) for linear second-order ODEs with a strongly positive operator coefficient in a Banach space. The solutions are given in the form of infinite series by means of the Cayley transform of the operator, the Meixner type polynomials of the independent variable, the operator Green function and the Fourier series representation for the right-hand side of the equation. The approximate solution of each problem is a partial sum of N (or expressed through N) summands. We prove weighted error estimates depending on the discretization parameter N, the distance of the independent variable to the boundary points of the interval and some smoothness properties of the input data.

[27] 2007.01416

An order-adaptive compact approximation Taylor method for systems of conservation laws

We present a new family of high-order shock-capturing finite difference numerical methods for systems of conservation laws. These methods, called Adaptive Compact Approximation Taylor (ACAT) schemes, use centered $(2p + 1)$-point stencils, where $p$ may take values in $\{1, 2, \dots, P\}$ according to a new family of smoothness indicators in the stencils. The methods are based on a combination of a robust first order scheme and the Compact Approximate Taylor (CAT) methods of order $2p$, $p=1,2,\dots, P$, so that they are first order accurate near discontinuities and have order $2p$ in smooth regions, where $(2p +1)$ is the size of the biggest stencil in which large gradients are not detected. CAT methods, introduced in \cite{CP2019}, are an extension to nonlinear problems of the Lax-Wendroff methods in which the Cauchy-Kovalesky (CK) procedure is circumvented following the strategy introduced in \cite{ZBM2017} that allows one to compute time derivatives in a recursive way using high-order centered differentiation formulas combined with Taylor expansions in time. The expressions of ACAT methods for 1D and 2D systems of balance laws are given and the performance is tested in a number of test cases for several linear and nonlinear systems of conservation laws, including Euler equations for gas dynamics.

[28] 2007.01418

Learning Orientation Distributions for Object Pose Estimation

For robots to operate robustly in the real world, they should be aware of their uncertainty. However, most methods for object pose estimation return a single point estimate of the object's pose. In this work, we propose two learned methods for estimating a distribution over an object's orientation. Our methods take into account both the inaccuracies in the pose estimation and the object symmetries. Our first method, which regresses from deep learned features to an isotropic Bingham distribution, gives the best performance for orientation distribution estimation for non-symmetric objects. Our second method learns to compare deep features and generates a non-parametric histogram distribution. This method gives the best performance on objects with unknown symmetries, accurately modeling both symmetric and non-symmetric objects, without any requirement of symmetry annotation. We show that both of these methods can be used to augment an existing pose estimator. Our evaluation compares our methods to a large number of baseline approaches for uncertainty estimation across a variety of different types of objects.

[29] 2007.01419

Persistent Neurons

Most algorithms used in neural network (NN)-based learning tasks are strongly affected by the choice of initialization. Good initialization can avoid sub-optimal solutions and alleviate saturation during training. However, designing improved initialization strategies is a difficult task, and our understanding of good initialization is still very primitive. Here, we propose persistent neurons, a strategy that optimizes the learning trajectory using information from previously converged solutions. More precisely, we let the parameters explore new landscapes by penalizing the model for converging to the previous solutions under the same initialization. Specifically, we show that persistent neurons, under certain data distributions, are able to converge to better solutions, while initializations under popular frameworks find bad local minima. We further demonstrate that persistent neurons help improve the model's performance under both good and poor initializations. Moreover, we evaluate full and partial persistent models and show that they can be used to boost performance on a range of NN structures, such as AlexNet and residual neural networks. Saturation of activation functions during persistent training is also studied.

[30] 2007.01420

Learning Neural Networks with Competing Physics Objectives: An Application in Quantum Mechanics

Physics-guided Machine Learning (PGML) is an emerging field of research in machine learning (ML) that aims to harness the power of ML advances without ignoring the rich knowledge of physics underlying scientific phenomena. One of the promising directions in PGML is to modify the objective function of neural networks by adding physics-guided (PG) loss functions that measure the violation of physics objectives in the ANN outputs. Existing PGML approaches generally focus on incorporating a single physics objective as a PG loss, using constant trade-off parameters. However, in the presence of multiple physics objectives with competing non-convex PG loss terms, there is a need to adaptively tune the importance of competing PG loss terms during the process of neural network training. We present a novel approach to handle competing PG loss terms in the illustrative application of quantum mechanics, where the two competing physics objectives are minimizing the energy while satisfying the Schrodinger equation. We conducted a systematic evaluation of the effects of PG loss on the generalization ability of neural networks in comparison with several baseline methods in PGML. All the code and data used in this work is available at

[31] 2007.01423

Maximizing Cohesion and Separation in Graph Representation Learning: A Distance-aware Negative Sampling Approach

The objective of unsupervised graph representation learning (GRL) is to learn a low-dimensional space of node embeddings that reflect the structure of a given unlabeled graph. Existing algorithms for this task rely on negative sampling objectives that maximize the similarity in node embeddings at nearby nodes (referred to as "cohesion") by maintaining positive and negative corpora of node pairs. While positive samples are drawn from node pairs that co-occur in short random walks, conventional approaches construct the negative corpus by uniformly sampling random pairs, thus ignoring valuable information about structural dissimilarity among distant node pairs (referred to as "separation"). In this paper, we present a novel Distance-aware Negative Sampling (DNS) approach, which maximizes the separation of distant node pairs while maximizing cohesion at nearby node pairs by setting the negative sampling probability proportional to the pair-wise shortest distances. Our approach can be used in conjunction with any GRL algorithm, and we demonstrate its efficacy over baseline negative sampling methods on downstream node classification tasks across a number of benchmark datasets and GRL algorithms. All our code and datasets are available at \url{}.
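
As a rough illustration of the sampling rule described above (negative sampling probability proportional to pair-wise shortest distance), the following minimal sketch computes hop distances by breadth-first search on a toy graph and normalizes them into a sampling distribution. The function names, the plain normalization, and the handling of unreachable nodes (they are simply skipped) are assumptions made for this sketch, not details taken from the paper.

```python
import random
from collections import deque


def shortest_distances(adj, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def negative_sampling_probs(adj, source):
    """Probability of drawing each other node as a negative for `source`.

    Assumption: weights are the raw shortest-path distances, normalized to
    sum to one; unreachable nodes are left out.
    """
    dist = shortest_distances(adj, source)
    weights = {v: d for v, d in dist.items() if v != source}
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}


def sample_negatives(adj, source, k, rng=None):
    """Draw k negatives for `source`, with replacement."""
    rng = rng or random.Random(0)
    probs = negative_sampling_probs(adj, source)
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[v] for v in nodes], k=k)


# Toy path graph: 0 - 1 - 2 - 3
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
probs = negative_sampling_probs(adj, 0)  # farther nodes get more mass
```

On this path graph, node 3 (three hops from node 0) receives half of the probability mass, while the immediate neighbour receives one sixth, so distant pairs are pushed apart more often than under uniform sampling.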

[32] 2007.01428

Medial Axis Isoperimetric Profiles

Recently proposed as a stable means of evaluating geometric compactness, the isoperimetric profile of a planar domain measures the minimum perimeter needed to inscribe a shape with prescribed area varying from 0 to the area of the domain. While this profile has proven valuable for evaluating properties of geographic partitions, existing algorithms for its computation rely on aggressive approximations and are still computationally expensive. In this paper, we propose a practical means of approximating the isoperimetric profile and show that for domains satisfying a "thick neck" condition, our approximation is exact. For more general domains, we show that our bound is still exact within a conservative regime and is otherwise an upper bound. Our method is based on a traversal of the medial axis which produces efficient and robust results. We compare our technique with the state-of-the-art approximation to the isoperimetric profile on a variety of domains and show significantly tighter bounds than were previously achievable.

[33] 2007.01429

The Global Landscape of Neural Networks: An Overview

One of the major concerns for neural network training is that the non-convexity of the associated loss functions may lead to a bad landscape. The recent success of neural networks suggests that their loss landscape is not too bad, but what specific results do we know about the landscape? In this article, we review recent findings and results on the global landscape of neural networks. First, we point out that wide neural nets may have sub-optimal local minima under certain assumptions. Second, we discuss a few rigorous results on the geometric properties of wide networks such as "no bad basin", and some modifications that eliminate sub-optimal local minima and/or decreasing paths to infinity. Third, we discuss visualization and empirical explorations of the landscape for practical neural nets. Finally, we briefly discuss some convergence results and their relation to landscape results.

[34] 2007.01431

Graphs without gap-vertex-labellings: families and bounds

A proper labelling of a graph $G$ is a pair $({\pi},c_{\pi})$ in which ${\pi}$ is an assignment of numeric labels to some elements of $G$, and $c_{\pi}$ is a colouring induced by ${\pi}$ through some mathematical function over the set of labelled elements. In this work, we consider gap-vertex-labellings, in which the colour of a vertex is determined by a function considering the largest difference between the labels assigned to its neighbours. We present the first upper bound for the vertex-gap number of arbitrary graphs, which is the least number of labels required to properly label a graph. We investigate families of graphs which do not admit any gap-vertex-labelling, regardless of the number of labels. Furthermore, we introduce a novel parameter associated with this labelling and provide bounds for it for complete graphs ${K_n}$.
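
To make the induced colouring concrete, here is a small sketch that colours each vertex by the largest difference between its neighbours' labels and checks whether the result is a proper colouring. The treatment of vertices with fewer than two neighbours (a single neighbour passes on its label; an isolated vertex gets colour 1) is an assumption of this sketch, since the abstract does not spell out those cases, and the function names are hypothetical.

```python
from itertools import product


def induced_colours(adj, labels):
    """Colour each vertex by the gap (max - min) of its neighbours' labels.

    Assumption: a vertex with exactly one neighbour takes that neighbour's
    label, and an isolated vertex takes colour 1.
    """
    colours = {}
    for v, nbrs in adj.items():
        nbr_labels = [labels[u] for u in nbrs]
        if len(nbr_labels) >= 2:
            colours[v] = max(nbr_labels) - min(nbr_labels)
        elif len(nbr_labels) == 1:
            colours[v] = nbr_labels[0]
        else:
            colours[v] = 1
    return colours


def is_proper(adj, colours):
    """True when every pair of adjacent vertices has different colours."""
    return all(colours[u] != colours[v] for u in adj for v in adj[u])


# Path 0 - 1 - 2: one labelling that works and one that does not.
path = {0: [1], 1: [0, 2], 2: [1]}
good = induced_colours(path, {0: 1, 1: 2, 2: 2})  # colours 2, 1, 2
bad = induced_colours(path, {0: 1, 1: 2, 2: 3})   # colours 2, 2, 2

# Complete graph K4: exhaustive search over labels 1..3 finds no proper labelling.
k4 = {v: [u for u in range(4) if u != v] for v in range(4)}
k4_has_proper = any(
    is_proper(k4, induced_colours(k4, dict(enumerate(ls))))
    for ls in product(range(1, 4), repeat=4)
)
```

Under this rule the search on $K_4$ fails for a structural reason: the two vertices holding the middle labels both see the overall smallest and largest labels among their neighbours, so they always receive the same colour.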

[35] 2007.01434

In Search of Lost Domain Generalization

The goal of domain generalization algorithms is to predict well on distributions different from those seen during training. While a myriad of domain generalization algorithms exist, inconsistencies in experimental conditions -- datasets, architectures, and model selection criteria -- render fair and realistic comparisons difficult. In this paper, we are interested in understanding how useful domain generalization algorithms are in realistic settings. As a first step, we realize that model selection is non-trivial for domain generalization tasks. Contrary to prior work, we argue that domain generalization algorithms without a model selection strategy should be regarded as incomplete. Next, we implement DomainBed, a testbed for domain generalization including seven multi-domain datasets, nine baseline algorithms, and three model selection criteria. We conduct extensive experiments using DomainBed and find that, when carefully implemented, empirical risk minimization shows state-of-the-art performance across all datasets. Looking forward, we hope that the release of DomainBed, along with contributions from fellow researchers, will streamline reproducible and rigorous research in domain generalization.

[36] 2007.01435

A variational formulation for motion design of adaptive compliant structures

Adaptive structures are characterized by their ability to adjust their geometrical and other properties to changing loads or requirements during service. This contribution deals with a method for the design of quasi-static motions of structures between two prescribed geometrical configurations that are optimal with regard to a specified quality function while taking large deformations into account. It is based on a variational formulation and the solution by two finite element discretizations, the spatial discretization (the standard finite element mesh) and an additional discretization of the deformation path or trajectory. For the investigations, an exemplary objective function, the minimization of the internal energy, integrated along the deformation path, is used. The method for motion design presented herein uses the Newton-Raphson method as a second order optimization algorithm and allows for analytical sensitivity analysis. The proposed method is verified and its properties are investigated by benchmark examples including rigid body motions, instability phenomena and determination of inextensible deformations of shells.

[37] 2007.01441

Joint Frequency- and Image-Space Learning for Fourier Imaging

We propose a neural network layer structure that combines frequency and image feature representations for robust Fourier image reconstruction. Our work is motivated by the challenges in magnetic resonance imaging (MRI) where the acquired signal is a corrupted Fourier transform of the desired image. The proposed layer structure enables both correction of artifacts native to the frequency-space and manipulation of image-space representations to reconstruct coherent image structures. This is in contrast to the current deep learning approaches for image reconstruction that manipulate data solely in the frequency-space or solely in the image-space. We demonstrate the advantages of the proposed joint learning on three diverse tasks including image reconstruction from undersampled acquisitions, motion correction, and image denoising in brain MRI. Unlike purely image based and purely frequency based architectures, the proposed joint model produces consistently high quality output images. The resulting joint frequency- and image-space feature representations promise to significantly improve modeling and reconstruction of images acquired in the frequency-space. Our code is available at

[38] 2007.01442

Multi-Agent Low-Dimensional Linear Bandits

We study a multi-agent stochastic linear bandit with side information, parameterized by an unknown vector $\theta^* \in \mathbb{R}^d$. The side information consists of a finite collection of low-dimensional subspaces, one of which contains $\theta^*$. In our setting, agents can collaborate to reduce regret by sending recommendations across a communication graph connecting them. We present a novel decentralized algorithm, where agents communicate subspace indices with each other, and each agent plays a projected variant of LinUCB on the corresponding (low-dimensional) subspace. Through a combination of collaborative best subspace identification, and per-agent learning of an unknown vector in the corresponding low-dimensional subspace, we show that the per-agent regret is much smaller than the case when agents do not communicate. By collaborating to identify the subspace containing $\theta^*$, we show that each agent effectively solves an easier instance of the linear bandit (compared to the case of no collaboration), thus leading to the reduced per-agent regret. We finally complement these results through simulations.

[39] 2007.01445

Minimizing Convex Functions with Integral Minimizers

Given a separation oracle $\mathsf{SO}$ for a convex function $f$ that has an integral minimizer inside a box with radius $R$, we show how to efficiently find a minimizer of $f$ using at most $O(n (n + \log(R)))$ calls to $\mathsf{SO}$. When the set of minimizers of $f$ has integral extreme points, our algorithm outputs an integral minimizer of $f$. This improves upon the previously best oracle complexity of $O(n^2 (n + \log(R)))$ obtained by an elegant application of [Frank and Tardos, Combinatorica 1987] due to Dadush. We conjecture that our oracle complexity is tight up to constant factors. Our result immediately implies a strongly polynomial algorithm for the Submodular Function Minimization problem that makes at most $O(n^3)$ calls to an evaluation oracle. This improves upon the previously best $O(n^3 \log^2(n))$ oracle complexity for strongly polynomial algorithms given in [Lee, Sidford and Wong, FOCS 2015] and [Dadush, V{\'e}gh and Zambelli, SODA 2018], and an exponential time algorithm with oracle complexity $O(n^3 \log(n))$ given in the former work, answering two open problems posed therein. Our result is achieved by an application of the LLL algorithm [Lenstra, Lenstra and Lov\'asz, Math. Ann. 1982] for the shortest lattice vector problem. We show how an approximately shortest vector of a certain lattice can be used to reduce the dimension of the problem, and how the oracle complexity of such a procedure is advantageous compared with the method that uses the Frank-Tardos framework. Our analysis of the oracle complexity is based on a potential function that captures simultaneously the size of the search set and the density of the lattice. To achieve the $O(n^2)$ term in the oracle complexity, technical ingredients from convex geometry are applied.

[40] 2007.01447

Improved Preterm Prediction Based on Optimized Synthetic Sampling of EHG Signal

Preterm labor is the leading cause of neonatal morbidity and mortality and has attracted research efforts from many scientific areas. The inter-relationship between uterine contraction and the underlying electrical activities makes the uterine electrohysterogram (EHG) a promising direction for preterm detection and prediction. Due to the scarcity of EHG signals, especially those of preterm patients, synthetic algorithms are applied to create artificial preterm samples in order to remove prediction bias towards term, at the expense of reduced feature effectiveness in machine-learning-based automatic preterm detection. To address this problem, we quantify the effect of synthetic samples (balance coefficient) on features' effectiveness, and form a general performance metric by utilizing multiple feature scores with relevant weights that describe their contributions to class separation. Combined with the activation/inactivation functions that characterize the effect of the abundance of training samples on term and preterm prediction precision, we obtain an optimal sample balance coefficient that trades off the effect of synthetic samples in removing bias towards the majority class against the side effect of reducing features' importance. Substantial improvement in prediction precision is achieved in a set of numerical tests on the publicly available TPEHG database, which verifies the effectiveness of the proposed method.

[41] 2007.01455

Text-based Emotion Aware Recommender

We extend the concept, illustrated in a previous paper, of using an active user's emotion embeddings and movies' emotion embeddings to evaluate a Recommender's top-N recommendation list, so that it encompasses the emotional features of a film as a component of building Emotion Aware Recommender Systems. Using textual movie metadata, we develop a comparative platform that consists of five recommenders based on content-based and collaborative filtering algorithms. We then apply the movie emotion embeddings, obtained by classifying the emotional features of movie overviews with the Tweets Emotion Classifier that we have developed, to add an emotional dimension of embeddings to the Recommender. The Emotion Aware Recommender's top-N recommendation list shows intriguing results that are quite different from those of its peers. We reckon that the Emotion Aware Recommender's top-N list, which matches the active user's emotional profile, is useful for providing serendipitous recommendations and remedying the cold start problem commonly present in Recommender Systems.

[42] 2007.01458

Confidence-Aware Learning for Deep Neural Networks

Despite the power of deep neural networks for a wide range of tasks, an overconfident prediction issue has limited their practical use in many safety-critical applications. Many recent works have been proposed to mitigate this issue, but most of them require either additional computational costs in training and/or inference phases or customized architectures to output confidence estimates separately. In this paper, we propose a method of training deep neural networks with a novel loss function, named Correctness Ranking Loss, which regularizes class probabilities explicitly to be better confidence estimates in terms of ordinal ranking according to confidence. The proposed method is easy to implement and can be applied to the existing architectures without any modification. Also, it has almost the same computational costs for training as conventional deep classifiers and outputs reliable predictions by a single inference. Extensive experimental results on classification benchmark datasets indicate that the proposed method helps networks to produce well-ranked confidence estimates. We also demonstrate that it is effective for the tasks closely related to confidence estimation, out-of-distribution detection and active learning.

[43] 2007.01459

A New Theoretical Framework of Pyramid Markov Processes for Blockchain Selfish Mining

In this paper, we provide a new theoretical framework of pyramid Markov processes to solve some open and fundamental problems of blockchain selfish mining. To this end, we first describe a more general blockchain selfish mining model with both a two-block leading competitive criterion and a new economic incentive, and establish a pyramid Markov process to express the dynamic behavior of selfish mining from the perspectives of both the consensus protocol and the economic incentive. Then we show that the pyramid Markov process is stable and so is the blockchain, and that its stationary probability vector is matrix-geometric with an explicitly representable rate matrix. Furthermore, we use the stationary probability vector to analyze the waste of computational resources due to the generation of many orphan (or stale) blocks. Next, we set up a pyramid Markov reward process to investigate the long-run average profits of the honest and dishonest mining pools, respectively. Specifically, we show that the long-run average profits are multivariate linear, so that we can measure the improvement in mining efficiency of the dishonest mining pool compared to the honest mining pool. As a by-product, we build three approximative Markov processes when the system states are described as the block-number difference of two forked block branches. Also, by using their special cases with no network latency, we can further provide some useful interpretation for both the Markov chain (Figure 1) and the revenue analysis ((1) to (3)) of the seminal work by Eyal and Sirer (2014). Finally, we use some numerical examples to verify the correctness and computability of our theoretical results. We hope that the methodology and results developed in this paper shed light on blockchain selfish mining and stimulate a series of promising future research.

[44] 2007.01462

Rigorous Quantum Formulation of Parity-Time Symmetric Coupled Resonators

A rigorous quantum formulation of the Parity-Time (PT) symmetry phenomenon in the RF/microwave regime is presented for coupled coil resonators with lumped elements. Each coil resonator is described by the lumped-element model, which consists of an inductor (L), a resistor (R) and a capacitor (C). A rigorous quantum Hamiltonian for the coupled LRC coil resonator system is derived through two successive transforms of the original basis. The first basis transform rotates the original basis such that the off-diagonal terms of the governing matrix of the equation system of the coupled coil resonators reduce to constants. A second basis transform then yields the quantum Hamiltonian, including the diagonal effective complex frequencies and the off-diagonal coupling terms, together with the transformed basis. With the obtained quantum Hamiltonian, the eigenvalues and eigenvectors of the coupled coil resonators can be computed in the usual way. Finally, numerical simulation verifies the correctness of the theory. The quantum formulation of the coupled coil resonators can provide better guidelines for designing PT-symmetric systems.

[45] 2007.01463

The prolonged service time at non-dedicated servers in a pooling system

In this paper, we investigate the effect of the prolonged service time at the non-dedicated servers in a pooling system on the system performance. We consider the two-server loss model with exponential interarrival and service times. We show that if the ratio of the mean service time at the dedicated server and the mean prolonged service time at the non-dedicated server exceeds a certain threshold, pooling would become unfavourable. In particular, the threshold is explicitly provided. Moreover, when the degree of the prolonged service time is pre-specified, we show that the pooling system with prolonged service time at non-dedicated servers is not preferred when the work load in the system is greater than a certain threshold.

[46] 2007.01464

Anatomy-Aware Siamese Network: Exploiting Semantic Asymmetry for Accurate Pelvic Fracture Detection in X-ray Images

Visual cues that treat bilaterally symmetric anatomies as normal findings are widely used in clinical practice to disambiguate subtle abnormalities in medical images. So far, inadequate research attention has been paid to effectively emulating this practice in CAD methods. In this work, we exploit semantic anatomical symmetry or asymmetry analysis in a complex CAD scenario, i.e., anterior pelvic fracture detection in trauma PXRs, where semantically pathological (referred to as fracture) and non-pathological (e.g., pose) asymmetries both occur. Visually subtle yet pathologically critical fracture sites can be missed even by experienced clinicians when limited diagnosis time is permitted in emergency care. We propose a novel fracture detection framework that builds upon a Siamese network enhanced with a spatial transformer layer to holistically analyze symmetric image features. Image features are spatially formatted to encode bilaterally symmetric anatomies. A new contrastive feature learning component in our Siamese network is designed to make the deep image features more salient with respect to the underlying semantic asymmetries (caused by pelvic fracture occurrences). Our proposed method has been extensively evaluated on 2,359 PXRs from unique patients (the largest study to date) and achieves an area under the ROC curve of 0.9771. This is the highest among state-of-the-art fracture detection methods, with improved clinical indications.

[47] 2007.01465

Deep-PowerX: A Deep Learning-Based Framework for Low-Power Approximate Logic Synthesis

This paper aims at integrating three powerful techniques, namely Deep Learning, Approximate Computing, and Low Power Design, into a strategy to optimize logic at the synthesis level. We utilize advances in deep learning to guide an approximate logic synthesis engine to minimize the dynamic power consumption of a given digital CMOS circuit, subject to a predetermined error rate at the primary outputs. Our framework, Deep-PowerX, focuses on replacing or removing gates on a technology-mapped network and uses a Deep Neural Network (DNN) to predict error rates at primary outputs of the circuit when a specific part of the netlist is approximated. The primary goal of Deep-PowerX is to reduce the dynamic power, whereas area reduction serves as a secondary objective. Using the said DNN, Deep-PowerX is able to reduce the exponential time complexity of standard approximate logic synthesis to linear time. Experiments are done on numerous open source benchmark circuits. Results show significant reduction in power and area by up to 1.47 times and 1.43 times compared to exact solutions and by up to 22% and 27% compared to state-of-the-art approximate logic synthesis tools, while having orders of magnitude lower run-time.

[48] 2007.01466

Task-agnostic Temporally Consistent Facial Video Editing

Recent research has witnessed the advances in facial image editing tasks. For video editing, however, previous methods either simply apply transformations frame by frame or utilize multiple frames in a concatenated or iterative fashion, which leads to noticeable visual flickers. In addition, these methods are confined to dealing with one specific task at a time without any extensibility. In this paper, we propose a task-agnostic temporally consistent facial video editing framework. Based on a 3D reconstruction model, our framework is designed to handle several editing tasks in a more unified and disentangled manner. The core design includes a dynamic training sample selection mechanism and a novel 3D temporal loss constraint that fully exploits both image and video datasets and enforces temporal consistency. Compared with the state-of-the-art facial image editing methods, our framework generates video portraits that are more photo-realistic and temporally smooth.

[49] 2007.01471

An adaptive multiresolution ultra-weak discontinuous Galerkin method for nonlinear Schrodinger equations

This paper develops a high order adaptive scheme for solving nonlinear Schrodinger equations. The solutions to such equations often exhibit solitary wave and local structures, which makes adaptivity essential in improving the simulation efficiency. Our scheme uses the ultra-weak discontinuous Galerkin (DG) formulation and belongs to the framework of adaptive multiresolution schemes. Various numerical experiments are presented to demonstrate the excellent capability of capturing the soliton waves and the blow-up phenomenon.

[50] 2007.01472

Increasing Trustworthiness of Deep Neural Networks via Accuracy Monitoring

Inference accuracy of deep neural networks (DNNs) is a crucial performance metric, but can vary greatly in practice subject to actual test datasets and is typically unknown due to the lack of ground truth labels. This has raised significant concerns with trustworthiness of DNNs, especially in safety-critical applications. In this paper, we address trustworthiness of DNNs by using post-hoc processing to monitor the true inference accuracy on a user's dataset. Concretely, we propose a neural network-based accuracy monitor model, which only takes the deployed DNN's softmax probability output as its input and directly predicts if the DNN's prediction result is correct or not, thus leading to an estimate of the true inference accuracy. The accuracy monitor model can be pre-trained on a dataset relevant to the target application of interest, and only needs to actively label a small portion (1% in our experiments) of the user's dataset for model transfer. For estimation robustness, we further employ an ensemble of monitor models based on the Monte-Carlo dropout method. We evaluate our approach on different deployed DNN models for image classification and traffic sign detection over multiple datasets (including adversarial samples). The result shows that our accuracy monitor model provides a close-to-true accuracy estimation and outperforms the existing baseline methods.
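
The estimation step described above can be sketched as follows: each monitor outputs, per sample, a probability that the deployed DNN's prediction is correct; the ensemble's probabilities are averaged, thresholded, and the fraction of samples judged correct is the accuracy estimate. The monitor outputs here are made-up numbers standing in for trained monitor models, and the 0.5 threshold and function name are assumptions of this sketch.

```python
def estimate_accuracy(correct_probs_per_monitor, threshold=0.5):
    """Estimate inference accuracy from an ensemble of monitor outputs.

    correct_probs_per_monitor[m][i] is monitor m's probability that the
    deployed DNN's prediction on sample i is correct.
    """
    n_monitors = len(correct_probs_per_monitor)
    n_samples = len(correct_probs_per_monitor[0])
    # Average the ensemble members' probabilities for each sample.
    mean_probs = [
        sum(monitor[i] for monitor in correct_probs_per_monitor) / n_monitors
        for i in range(n_samples)
    ]
    # A sample counts as "predicted correct" when the mean reaches the threshold.
    predicted_correct = [p >= threshold for p in mean_probs]
    return sum(predicted_correct) / n_samples


# Hypothetical outputs of two monitors on four unlabeled samples.
monitor_outputs = [
    [0.9, 0.2, 0.7, 0.6],
    [0.8, 0.4, 0.5, 0.3],
]
estimated = estimate_accuracy(monitor_outputs)
```

In this toy run, two of the four samples are judged correct, so the estimated accuracy is 0.5 without using any ground-truth labels.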

[51] 2007.01475

ODE-CNN: Omnidirectional Depth Extension Networks

Omnidirectional 360$^\circ$ cameras are proliferating rapidly for autonomous robots since they significantly enhance perception by widening the field of view (FoV). However, corresponding 360$^\circ$ depth sensors, which are also critical for the perception system, are still difficult or expensive to obtain. In this paper, we propose a low-cost 3D sensing system that combines an omnidirectional camera with a calibrated projective depth camera, where the depth from the limited FoV can be automatically extended to the rest of the recorded omnidirectional image. To accurately recover the missing depths, we design an omnidirectional depth extension convolutional neural network (ODE-CNN), in which a spherical feature transform layer (SFTL) is embedded at the end of the feature encoding layers, and a deformable convolutional spatial propagation network (D-CSPN) is appended at the end of the feature decoding layers. The former resamples the neighborhood of each pixel in omnidirectional coordinates to projective coordinates, which reduces the difficulty of feature learning, and the latter automatically finds a proper context to align the structures in the estimated depths with the reference image via a CNN, which significantly improves the visual quality. Finally, we demonstrate the effectiveness of the proposed ODE-CNN on the popular 360D dataset and show that ODE-CNN significantly outperforms other state-of-the-art (SoTA) methods (a 33% relative reduction in depth error).

[52] 2007.01476

Interactive Knowledge Distillation

Knowledge distillation is a standard teacher-student learning framework to train a light-weight student network under the guidance of a well-trained large teacher network. As an effective teaching strategy, interactive teaching has been widely employed at school to motivate students, in which teachers not only provide knowledge but also give constructive feedback to students upon their responses, to improve their learning performance. In this work, we propose an InterActive Knowledge Distillation (IAKD) scheme to leverage the interactive teaching strategy for efficient knowledge distillation. In the distillation process, the interaction between teacher and student networks is implemented by a swapping-in operation: randomly replacing the blocks in the student network with the corresponding blocks in the teacher network. In this way, we directly involve the teacher's powerful feature transformation ability to substantially boost the student's performance. Experiments with typical settings of teacher-student networks demonstrate that the student networks trained by our IAKD achieve better performance than those trained by conventional knowledge distillation methods on diverse image classification datasets.
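
The swapping-in operation can be pictured with a framework-free sketch in which each network is a list of blocks (plain callables here) and every student block is independently replaced by its teacher counterpart with some probability. The per-block swap probability, the one-to-one block alignment, and the function names are assumptions of this sketch rather than details given in the abstract.

```python
import random


def build_hybrid(student_blocks, teacher_blocks, swap_prob, rng):
    """Randomly replace student blocks with the corresponding teacher blocks."""
    if len(student_blocks) != len(teacher_blocks):
        raise ValueError("student and teacher must have the same number of blocks")
    return [
        t if rng.random() < swap_prob else s
        for s, t in zip(student_blocks, teacher_blocks)
    ]


def forward(blocks, x):
    """Run the input through the blocks in order."""
    for block in blocks:
        x = block(x)
    return x


# Toy "blocks": simple arithmetic stands in for network layers.
student = [lambda x: x + 1, lambda x: x * 2]
teacher = [lambda x: x + 10, lambda x: x * 3]

rng = random.Random(0)
never_swapped = forward(build_hybrid(student, teacher, 0.0, rng), 1)   # pure student
always_swapped = forward(build_hybrid(student, teacher, 1.0, rng), 1)  # pure teacher
mixed = build_hybrid(student, teacher, 0.5, rng)                       # random mix
```

With a swap probability of 0 the hybrid is the student itself, and with 1 it is the teacher; intermediate values give the mixed forward passes through which the student blocks would be trained.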

[53] 2007.01480

RSAC: Regularized Subspace Approximation Classifier for Lightweight Continuous Learning

Continuous learning seeks to learn from data that arrives over time. While prior works have demonstrated several possible solutions, these approaches require excessive training time as well as memory usage. This is impractical for applications where time and storage are constrained, such as edge computing. In this work, a novel training algorithm, regularized subspace approximation classifier (RSAC), is proposed to achieve lightweight continuous learning. RSAC contains a feature reduction module and a classifier module with regularization. Extensive experiments show that RSAC is more efficient than prior continuous learning works and outperforms these works in various experimental settings.

[54] 2007.01481

Ordinary Facet Angles of a Stroked Path Tessellated by Uniform Tangent Angle Steps Are Bounded by Twice the Step Angle

We explain geometrically why ordinary facet angles of a stroked path tessellated from uniform tangent angle steps are bounded by twice the step angle. This fact means---excluding a small number of extraordinary facet angles straddling offset cusps---our polar stroking method bounds the facet angle size to less than $2 \theta$ where $\theta$ is the tangent step angle.

[55] 2007.01483

A decentralized framework for simultaneous calibration, localization and mapping with multiple LiDARs

LiDAR is playing an increasingly essential role in autonomous driving vehicles for object detection, self-localization and mapping. A single LiDAR frequently suffers from hardware failure (e.g., temporary loss of connection) due to the harsh vehicle environment (e.g., temperature, vibration, etc.), or performance degradation due to the lack of sufficient geometry features, especially for solid-state LiDARs with small field of view (FoV). To improve the system robustness and performance in self-localization and mapping, we develop a decentralized framework for simultaneous calibration, localization and mapping with multiple LiDARs. Our proposed framework is based on an extended Kalman filter (EKF), but is specially formulated for decentralized implementation. Such an implementation could potentially distribute the intensive computation among smaller computing devices or resources dedicated for each LiDAR and remove the single point of failure problem. This decentralized formulation is then implemented on an unmanned ground vehicle (UGV) carrying 5 low-cost LiDARs and moving at $1.3$ m/s in urban environments. Experiment results show that the proposed method can successfully and simultaneously estimate the vehicle state (i.e., pose and velocity) and all LiDAR extrinsic parameters. The localization accuracy is up to 0.2% on the two datasets we collected. To share our findings, contribute to the community, and enable readers to verify our work, we will release all our source code and hardware design blueprints on our GitHub.

[56] 2007.01486

Learning to Prune in Training via Dynamic Channel Propagation

In this paper, we propose a novel network training mechanism called "dynamic channel propagation" to prune neural networks during the training period. In particular, we select a specific group of channels in each convolutional layer to participate in the forward propagation at training time according to the significance level of each channel, which is defined as channel utility. The utility values of all selected channels are updated simultaneously with the error back-propagation process and change adaptively. Furthermore, when the training ends, channels with high utility values are retained whereas those with low utility values are discarded. Hence, our proposed scheme trains and prunes neural networks simultaneously. We empirically evaluate our novel training scheme on various representative benchmark datasets and advanced convolutional neural network (CNN) architectures, including VGGNet and ResNet. The experiment results verify the superior performance and robust effectiveness of our approach.
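
The selection step above can be sketched in a few lines. This is only an illustration under assumptions: the top-k rule, the function names `select_channels` and `mask_channels`, and the example utility values are hypothetical, and the abstract does not specify how utilities are updated during back-propagation, so that part is left out.

```python
from typing import List, Sequence


def select_channels(utilities: Sequence[float], keep: int) -> List[int]:
    """Return the indices of the `keep` channels with the highest utility.

    Assumed rule: a plain top-k over the current utility values.
    """
    if not 0 <= keep <= len(utilities):
        raise ValueError("keep must be between 0 and the number of channels")
    ranked = sorted(range(len(utilities)),
                    key=lambda i: utilities[i], reverse=True)
    return sorted(ranked[:keep])


def mask_channels(activations: Sequence[float],
                  selected: Sequence[int]) -> List[float]:
    """Zero out every channel that was not selected for this forward pass."""
    chosen = set(selected)
    return [a if i in chosen else 0.0 for i, a in enumerate(activations)]


# Hypothetical per-channel utilities for one convolutional layer.
utilities = [0.2, 0.9, 0.1, 0.5]
selected = select_channels(utilities, keep=2)
masked = mask_channels([1.0, 2.0, 3.0, 4.0], selected)
```

Applying the same selection once training has finished corresponds to the final pruning step, in which low-utility channels are discarded.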

[57] 2007.01488

On the Relation between Quality-Diversity Evaluation and Distribution-Fitting Goal in Text Generation

The goal of text generation models is to fit the underlying real probability distribution of text. For performance evaluation, quality and diversity metrics are usually applied. However, it is still not clear to what extent the quality-diversity evaluation reflects the distribution-fitting goal. In this paper, we try to reveal this relation with a theoretical approach. We prove that under certain conditions, a linear combination of quality and diversity constitutes a divergence metric between the generated distribution and the real distribution. We also show that the commonly used BLEU/Self-BLEU metric pair fails to match any divergence metric, and we therefore propose CR/NRR as a substitute quality/diversity metric pair.

[58] 2007.01491

Self-Supervised GAN Compression

Deep learning's success has led to larger and larger models to handle more and more complex tasks; trained models can contain millions of parameters. These large models are compute- and memory-intensive, which makes it a challenge to deploy them with minimized latency, throughput, and storage requirements. Some model compression methods have been successfully applied to image classification and detection or language models, but there has been very little work compressing generative adversarial networks (GANs) performing complex tasks. In this paper, we show that a standard model compression technique, weight pruning, cannot be applied to GANs using existing methods. We then develop a self-supervised compression technique which uses the trained discriminator to supervise the training of a compressed generator. We show that this framework has a compelling performance to high degrees of sparsity, can be easily applied to new tasks and models, and enables meaningful comparisons between different pruning granularities.

[59] 2007.01493

On Symbolically Encoding the Behavior of Random Forests

Recent work has shown that the input-output behavior of some machine learning systems can be captured symbolically using Boolean expressions or tractable Boolean circuits, which facilitates reasoning about the behavior of these systems. While most of the focus has been on systems with Boolean inputs and outputs, we address systems with discrete inputs and outputs, including ones with discretized continuous variables as in systems based on decision trees. We also focus on the suitability of encodings for computing prime implicants, which have recently played a central role in explaining the decisions of machine learning systems. We show some key distinctions with encodings for satisfiability, and propose an encoding that is sound and complete for the given task.

[60] 2007.01496

Few-Shot Semantic Segmentation Augmented with Image-Level Weak Annotations

Despite the great progress made by deep neural networks in the semantic segmentation task, traditional neural network-based methods typically suffer from a shortage of large amounts of pixel-level annotations. Recent progress in few-shot semantic segmentation tackles the issue by utilizing only a few pixel-level annotated examples. However, these few-shot approaches cannot easily be applied to utilize image-level weak annotations, which can easily be obtained and considerably improve performance in the semantic segmentation task. In this paper, we advance the few-shot segmentation paradigm towards a scenario where image-level annotations are available to help the training process of a few pixel-level annotations. Specifically, we propose a new framework to learn the class prototype representation in the metric space by integrating image-level annotations. Furthermore, a soft masked average pooling strategy is designed to handle distractions in image-level annotations. Extensive empirical results on PASCAL-5i show that our method can achieve 5.1% and 8.2% increases of mIoU score for one-shot settings with pixel-level and scribble annotations, respectively.

[61] 2007.01498

Temporal-Logic-Based Reward Shaping for Continuing Learning Tasks

In continuing tasks, average-reward reinforcement learning may be a more appropriate problem formulation than the more common discounted reward formulation. As usual, learning an optimal policy in this setting typically requires a large amount of training experiences. Reward shaping is a common approach for incorporating domain knowledge into reinforcement learning in order to speed up convergence to an optimal policy. However, to the best of our knowledge, the theoretical properties of reward shaping have thus far only been established in the discounted setting. This paper presents the first reward shaping framework for average-reward learning and proves that, under standard assumptions, the optimal policy under the original reward function can be recovered. In order to avoid the need for manual construction of the shaping function, we introduce a method for utilizing domain knowledge expressed as a temporal logic formula. The formula is automatically translated to a shaping function that provides additional reward throughout the learning process. We evaluate the proposed method on three continuing tasks. In all cases, shaping speeds up the average-reward learning rate without any reduction in the performance of the learned policy compared to relevant baselines.

[62] 2007.01499

A Competence-aware Curriculum for Visual Concepts Learning via Question Answering

Humans can progressively learn visual concepts from easy to hard questions. To mimic this efficient learning ability, we propose a competence-aware curriculum for visual concept learning in a question-answering manner. Specifically, we design a neural-symbolic concept learner for learning the visual concepts and a multi-dimensional Item Response Theory (mIRT) model for guiding the learning process with an adaptive curriculum. The mIRT effectively estimates the concept difficulty and the model competence at each learning step from accumulated model responses. The estimated concept difficulty and model competence are further utilized to select the most profitable training samples. Experimental results on CLEVR show that with a competence-aware curriculum, the proposed method achieves state-of-the-art performances with superior data efficiency and convergence speed. Specifically, the proposed model only uses 40% of training data and converges three times faster compared with other state-of-the-art methods.

[63] 2007.01500

Self-supervised Neural Architecture Search

Neural Architecture Search (NAS) has been used recently to achieve improved performance in various tasks and most prominently in image classification. Yet, current search strategies rely on large labeled datasets, which limit their usage in the case where only a smaller fraction of the data is annotated. Self-supervised learning has shown great promise in training neural networks using unlabeled data. In this work, we propose a self-supervised neural architecture search (SSNAS) that allows finding novel network models without the need for labeled data. We show that such a search leads to comparable results to supervised training with a "fully labeled" NAS and that it can improve the performance of self-supervised learning. Moreover, we demonstrate the advantage of the proposed approach when the number of labels in the search is relatively small.

[64] 2007.01502

DICE: Automatic Emulation of DMA Input Channels for Dynamic Firmware Analysis

Microcontroller-based embedded devices are at the core of Internet-of-Things and Cyber-Physical Systems. The security of these devices is of paramount importance. Among the approaches to securing embedded devices, dynamic firmware analysis has gained great attention lately, thanks to its offline nature and low false-positive rates. However, regardless of the analysis and emulation techniques used, existing dynamic firmware analyzers share a major limitation, namely the inability to handle firmware using DMA. This severely limits the types of devices supported and the firmware code coverage. We present DICE, a drop-in solution for firmware analyzers to emulate DMA input channels and generate or manipulate DMA inputs. DICE is designed to be hardware-independent, and compatible with common MCU firmware and embedded architectures. DICE identifies DMA input channels as the firmware writes the source and destination DMA transfer pointers into the DMA controller. Then DICE manipulates the input transferred through DMA on behalf of the firmware analyzer. We integrated DICE into the firmware analyzer P2IM (Cortex-M architecture) and a PIC32 emulator (MIPS M4K/M-Class architecture). We evaluated it on 83 benchmarks and sample firmware, representing 9 different DMA controllers from 5 different vendors. DICE detected 33 out of 37 DMA input channels, with 0 false positives. It correctly supplied DMA inputs to 21 out of 22 DMA buffers, which previous firmware analyzers cannot achieve due to the lack of DMA emulation. DICE's overhead is fairly low: it adds 3.4% on average to P2IM execution time. We also fuzz-tested 7 real-world firmware using DICE and compared the results with the original P2IM. DICE uncovered tremendously more execution paths (as much as 79X) and found 5 unique previously-unknown bugs that are unreachable without DMA emulation. All our source code and dataset are publicly available.

[65] 2007.01503

Mathematical Perspective of Machine Learning

We take a closer look at some theoretical challenges of Machine Learning as a function approximation, gradient descent as the default optimization algorithm, limitations of fixed length and width networks and a different approach to RNNs from a mathematical perspective.

[66] 2007.01504

A Similarity Inference Metric for RGB-Infrared Cross-Modality Person Re-identification

RGB-Infrared (IR) cross-modality person re-identification (re-ID), which aims to search an IR image in an RGB gallery or vice versa, is a challenging task due to the large discrepancy between IR and RGB modalities. Existing methods address this challenge typically by aligning feature distributions or image styles across modalities, whereas the very useful similarities among gallery samples of the same modality (i.e. intra-modality sample similarities) are largely neglected. This paper presents a novel similarity inference metric (SIM) that exploits the intra-modality sample similarities to circumvent the cross-modality discrepancy targeting optimal cross-modality image matching. SIM works by successive similarity graph reasoning and mutual nearest-neighbor reasoning that mine cross-modality sample similarities by leveraging intra-modality sample similarities from two different perspectives. Extensive experiments over two cross-modality re-ID datasets (SYSU-MM01 and RegDB) show that SIM achieves significant accuracy improvement but with little extra training as compared with the state-of-the-art.

[67] 2007.01507

Towards Robust Deep Learning with Ensemble Networks and Noisy Layers

In this paper we provide an approach for deep learning that protects against adversarial examples in image classification-type networks. The approach relies on two mechanisms: 1) a mechanism that increases robustness at the expense of accuracy, and 2) a mechanism that improves accuracy but does not always increase robustness. We show that an approach combining the two mechanisms can provide protection against adversarial examples while retaining accuracy. We formulate potential attacks on our approach and provide experimental results to demonstrate the effectiveness of our approach.

[68] 2007.01510

MIRA: Leveraging Multi-Intention Co-click Information in Web-scale Document Retrieval using Deep Neural Networks

We study the problem of deep recall models in industrial web search: given a user query, retrieve hundreds of the most relevant documents from billions of candidates. The common framework is to train two encoding models based on neural embedding, which learn the distributed representations of queries and documents separately and match them in the latent semantic space. However, all the existing encoding models only leverage the information of the document itself, which is often not sufficient in practice when matching with query terms, especially for hard tail queries. In this work we aim to leverage the additional information for each document from its co-click neighbours to help document retrieval. The challenges include how to effectively extract information and eliminate noise when involving co-click information in a deep model while meeting the demands of billion-scale data size for real-time online inference. To handle the noise in co-click relations, we first propose a web-scale Multi-Intention Co-click document Graph (MICG), which builds the co-click connections between documents on the click intention level rather than on the document level. Then we present an encoding framework, MIRA, based on BERT and graph attention networks, which leverages a two-factor attention mechanism to aggregate neighbours. To meet the online latency requirements, we only involve neighbour information on the document side, which avoids the time-consuming query neighbour search in real-time serving. We conduct extensive offline experiments on both a public dataset and a private web-scale dataset from two major commercial search engines, demonstrating the effectiveness and scalability of the proposed method compared with several baselines. A further case study reveals that co-click relations mainly help improve web search quality in two respects: key concept enhancement and query term complementation.

[69] 2007.01513

Joint Beam Training and Data Transmission Design for Covert Millimeter-Wave Communication

Covert communication prevents legitimate transmission from being detected by a warden while maintaining a certain covert rate at the intended user. Prior works have considered the design of covert communication over conventional low-frequency bands, but few works so far have explored the higher-frequency millimeter-wave (mmWave) spectrum. The directional nature of mmWave communication makes it attractive for covert transmission. However, how to establish such a directional link in a covert manner in the first place remains a significant challenge. In this paper, we consider a covert mmWave communication system, where legitimate parties Alice and Bob adopt a beam training approach for directional link establishment. Accounting for the training overhead, we develop a new design framework that jointly optimizes beam training duration, training power and data transmission power to maximize the effective throughput of the Alice-Bob link while ensuring the covertness constraint at warden Willie is met. We further propose a dual-decomposition successive convex approximation algorithm to solve the problem efficiently. Numerical studies demonstrate interesting tradeoffs among the key design parameters considered and also the necessity of joint design of beam training and data transmission for covert mmWave communication.

[70] 2007.01514

Three-dimensional Human Tracking of a Mobile Robot by Fusion of Tracking Results of Two Cameras

This paper proposes a process that uses two cameras to obtain three-dimensional (3D) information of a target object for human tracking. Results of human detection and tracking from two cameras are integrated to obtain the 3D information. OpenPose is used for human detection. In general stereo camera processing, a range image of the entire scene is acquired as precisely as possible, and then the range image is processed. However, there are problems such as incorrect matching and the computational cost of the calibration process. A new stereo vision framework is proposed to cope with these problems. The effectiveness of the proposed framework and the method is verified through target-tracking experiments.

[71] 2007.01516

Deep interpretability for GWAS

Genome-Wide Association Studies are typically conducted using linear models to find genetic variants associated with common diseases. In these studies, association testing is done on a variant-by-variant basis, possibly missing out on non-linear interaction effects between variants. Deep networks can be used to model these interactions, but they are difficult to train and interpret on large genetic datasets. We propose a method that uses the gradient-based deep interpretability technique named DeepLIFT to show that known diabetes genetic risk factors can be identified using deep models, along with possibly novel associations.

[72] 2007.01519

Overall Evaluations on Benefits of Influence When Disturbed by Rivals

Influence maximization (IM) is a representative and classic problem that has been studied extensively. The most important application derived from the IM problem is viral marketing. Taking the role of a promoter, we want to benefit from the influence diffusion in a given social network, where each influenced (activated) user is associated with a benefit. However, there is often competing information initiated by our rivals diffusing in the same social network at the same time. Consider a scenario in which a user is influenced by both our information and our rivals' information. Here, the benefit from this user should be weakened to a certain degree. How can the degree of weakening be quantified? Based on that, we propose the overall evaluations on benefits of influence (OEBI) problem. We prove the objective function of the OEBI problem is not monotone, not submodular, and not supermodular. Fortunately, we can decompose this objective function into the difference of two submodular functions and adopt a modular-modular procedure to approximate it with a data-dependent approximation guarantee. Because the exact objective value is difficult to compute, we design a group of unbiased estimators by exploiting the idea of reverse influence sampling, which can improve time efficiency significantly without losing the approximation ratio. Finally, numerical experiments on real datasets verify the effectiveness of our approaches in terms of both performance and efficiency.

[73] 2007.01520

First Steps: Latent-Space Control with Semantic Constraints for Quadruped Locomotion

Traditional approaches to quadruped control frequently employ simplified, hand-derived models. This significantly reduces the capability of the robot since its effective kinematic range is curtailed. In addition, kinodynamic constraints are often non-differentiable and difficult to implement in an optimisation approach. In this work, these challenges are addressed by framing quadruped control as optimisation in a structured latent space. A deep generative model captures a statistical representation of feasible joint configurations, whilst complex dynamic and terminal constraints are expressed via high-level, semantic indicators and represented by learned classifiers operating upon the latent space. As a consequence, complex constraints are rendered differentiable and evaluated an order of magnitude faster than analytical approaches. We validate the feasibility of locomotion trajectories optimised using our approach both in simulation and on a real-world ANYmal quadruped. Our results demonstrate that this approach is capable of generating smooth and realisable trajectories. To the best of our knowledge, this is the first time latent space control has been successfully applied to a complex, real robot platform.

[74] 2007.01522

Dueling Deep Q-Network for Unsupervised Inter-frame Eye Movement Correction in Optical Coherence Tomography Volumes

In optical coherence tomography (OCT) volumes of retina, the sequential acquisition of the individual slices makes this modality prone to motion artifacts, misalignments between adjacent slices being the most noticeable. Any distortion in OCT volumes can bias structural analysis and influence the outcome of longitudinal studies. On the other hand, presence of speckle noise that is characteristic of this imaging modality, leads to inaccuracies when traditional registration techniques are employed. Also, the lack of a well-defined ground truth makes supervised deep-learning techniques ill-posed to tackle the problem. In this paper, we tackle these issues by using deep reinforcement learning to correct inter-frame movements in an unsupervised manner. Specifically, we use dueling deep Q-network to train an artificial agent to find the optimal policy, i.e. a sequence of actions, that best improves the alignment by maximizing the sum of reward signals. Instead of relying on the ground-truth of transformation parameters to guide the rewarding system, for the first time, we use a combination of intensity based image similarity metrics. Further, to avoid the agent bias towards speckle noise, we ensure the agent can see retinal layers as part of the interacting environment. For quantitative evaluation, we simulate the eye movement artifacts by applying 2D rigid transformations on individual B-scans. The proposed model achieves an average of 0.985 and 0.914 for normalized mutual information and correlation coefficient, respectively. We also compare our model with elastix intensity based medical image registration approach, where significant improvement is achieved by our model for both noisy and denoised volumes.

[75] 2007.01524

Domain Adaptation without Source Data

Domain adaptation assumes that samples from source and target domains are freely accessible during a training phase. However, such an assumption is rarely plausible in real cases and possibly causes data-privacy issues, especially when the label of the source domain can be a sensitive attribute as an identifier. To avoid accessing source data which may contain sensitive information, we introduce source data-free domain adaptation (SFDA). Our key idea is to leverage a pre-trained model from the source domain and progressively update the target model in a self-learning manner. We observe that target samples with lower self-entropy measured by the pre-trained source model are more likely to be classified correctly. From this, we select the reliable samples with the self-entropy criterion and define these as class prototypes. We then assign pseudo labels for every target sample based on the similarity score with class prototypes. Further, to reduce the uncertainty from the pseudo labeling process, we propose set-to-set distance-based filtering which does not require any tunable hyperparameters. Finally, we train the target model with the filtered pseudo labels with regularization from the pre-trained source model. Surprisingly, without direct usage of labeled source samples, our SFDA outperforms conventional domain adaptation methods on benchmark datasets. Our code is publicly available at
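
The entropy-based prototype selection and pseudo-labeling steps described above can be sketched as follows. This is a toy illustration, not the authors' implementation: the function names, the `per_class` parameter and the use of cosine similarity are assumptions (the abstract only mentions a "similarity score"), and the set-to-set distance-based filtering and the final training step are omitted.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def self_entropy(probs: Vector) -> float:
    """Shannon entropy (natural log) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_prototypes(features: Sequence[Vector], probs: Sequence[Vector],
                      per_class: int) -> Dict[int, List[Vector]]:
    """Keep the `per_class` lowest-entropy samples of each predicted class."""
    by_class: Dict[int, list] = {}
    for feat, p in zip(features, probs):
        label = max(range(len(p)), key=lambda i: p[i])
        by_class.setdefault(label, []).append((self_entropy(p), feat))
    return {c: [f for _, f in sorted(items, key=lambda it: it[0])[:per_class]]
            for c, items in by_class.items()}


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; an assumed choice of similarity score."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def pseudo_label(feature: Vector, prototypes: Dict[int, List[Vector]]) -> int:
    """Assign the class whose closest prototype is most similar."""
    return max(prototypes,
               key=lambda c: max(cosine(feature, p) for p in prototypes[c]))


# Hypothetical target features and source-model softmax outputs.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
probs = [[0.99, 0.01], [0.6, 0.4], [0.02, 0.98], [0.45, 0.55]]

prototypes = select_prototypes(features, probs, per_class=1)
labels = [pseudo_label(f, prototypes) for f in [[0.8, 0.3], [0.2, 0.7]]]
```

In this toy data the confident predictions (0.99 and 0.98) become the prototypes, while the uncertain samples are labeled by their similarity to those prototypes.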

[76] 2007.01528

On-The-Fly Information Retrieval Augmentation for Language Models

Here we experiment with the use of information retrieval as an augmentation for pre-trained language models. The text corpus used in information retrieval can be viewed as a form of episodic memory which grows over time. By augmenting GPT 2.0 with information retrieval we achieve a zero-shot 15% relative reduction in perplexity on the Gigaword corpus without any re-training. We also validate our IR augmentation on an event co-reference task.

[77] 2007.01530

FPnew: An Open-Source Multi-Format Floating-Point Unit Architecture for Energy-Proportional Transprecision Computing

The slowdown of Moore's law and the power wall necessitate a shift towards finely tunable precision (a.k.a. transprecision) computing to reduce energy footprint. Hence, we need circuits capable of performing floating-point operations on a wide range of precisions with high energy-proportionality. We present FPnew, a highly configurable open-source transprecision floating-point unit (TP-FPU) capable of supporting a wide range of standard and custom FP formats. To demonstrate the flexibility and efficiency of FPnew in general-purpose processor architectures, we extend the RISC-V ISA with operations on half-precision, bfloat16, and an 8-bit FP format, as well as SIMD vectors and multi-format operations. Integrated into a 32-bit RISC-V core, our TP-FPU can speed up execution of mixed-precision applications by 1.67x w.r.t. an FP32 baseline, while maintaining end-to-end precision and reducing system energy by 37%. We also integrate FPnew into a 64-bit RISC-V core, supporting five FP formats on scalars or 2, 4, or 8-way SIMD vectors. For this core, we measured the silicon manufactured in Globalfoundries 22FDX technology across a wide voltage range from 0.45V to 1.2V. The unit achieves leading-edge measured energy efficiencies between 178 Gflop/sW (on FP64) and 2.95 Tflop/sW (on 8-bit mini-floats), and a performance between 3.2 Gflop/s and 25.3 Gflop/s.

[78] 2007.01533

Finding Densest $k$-Connected Subgraphs

Dense subgraph discovery is an important graph-mining primitive with a variety of real-world applications. One of the most well-studied optimization problems for dense subgraph discovery is the densest subgraph problem, where given an edge-weighted undirected graph $G=(V,E,w)$, we are asked to find $S\subseteq V$ that maximizes the density $d(S)$, i.e., half the weighted average degree of the induced subgraph $G[S]$. This problem can be solved exactly in polynomial time and well-approximately in almost linear time. However, a densest subgraph has a structural drawback, namely, the subgraph may not be robust to vertex/edge failure. Indeed, a densest subgraph may not be well-connected, which implies that the subgraph may be disconnected by removing only a few vertices/edges within it. In this paper, we provide an algorithmic framework to find a dense subgraph that is well-connected in terms of vertex/edge connectivity. Specifically, we introduce the following problems: given a graph $G=(V,E,w)$ and a positive integer/real $k$, we are asked to find $S\subseteq V$ that maximizes the density $d(S)$ under the constraint that $G[S]$ is $k$-vertex/edge-connected. For both problems, we propose polynomial-time (bicriteria and ordinary) approximation algorithms, using classic Mader's theorem in graph theory and its extensions.
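
As a small worked example of the density objective defined above, $d(S) = w(E(S))/|S|$, the following sketch computes the density of an induced subgraph from an edge list. The function name `density` and the example graph are illustrative assumptions; the sketch does not implement any of the approximation algorithms from the paper.

```python
from typing import Hashable, Iterable, Tuple

Edge = Tuple[Hashable, Hashable, float]  # (u, v, weight), undirected


def density(edges: Iterable[Edge], subset: Iterable[Hashable]) -> float:
    """Total edge weight inside `subset` divided by its number of vertices.

    This equals half the weighted average degree of the induced subgraph.
    """
    s = set(subset)
    if not s:
        raise ValueError("subset must be non-empty")
    inside = sum(w for u, v, w in edges if u in s and v in s)
    return inside / len(s)


# A unit-weight triangle a-b-c with a lighter pendant edge c-d.
edges = [("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0), ("c", "d", 0.5)]

triangle = density(edges, {"a", "b", "c"})        # 3.0 / 3
whole_graph = density(edges, {"a", "b", "c", "d"})  # 3.5 / 4
```

Here the triangle is denser than the whole graph, and it is also 2-vertex-connected, whereas the whole graph is disconnected by removing the single vertex `c`.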

[79] 2007.01536

Cross-layer Path Selection in Multi-path Transport Protocol for Mobile Devices

MPTCP is a new transport protocol that enables mobile devices to use multiple physical paths simultaneously through several network interfaces, such as WiFi and Cellular. However, wireless path capacities change frequently in mobile environments, causing challenges for path selection. For example, WiFi-associated paths often become poor as devices move away, since WiFi has intermittent connectivity caused by the short signal coverage and stochastic interference. MPTCP's native decision based on hysteretic TCP-layer estimation will miss the real switching point of wireless quality, which may accumulate packets on the broken path and cause serious packet reinjection. Through analyzing a unique dataset in the wild, we quantitatively study the impact of MAC-layer factors on the aggregated performance of MPTCP. We then propose a decision tree approach for cross-layer path selection that dynamically decides which path carries the incoming packets according to the previously learned schemes. A prototype of the path selection system named SmartPS, which proactively probes the wireless environments, is realized and deployed in Linux and Android. Evaluation results demonstrate that our SmartPS can efficiently utilize the faster path, with goodput improvements of up to 29%.

[80] 2007.01541

A fast direct solver for nonlocal operators in wavelet coordinates

In this article, we consider fast direct solvers for nonlocal operators. The pivotal idea is to combine a wavelet representation of the system matrix, yielding a quasi-sparse matrix, with the nested dissection ordering scheme. The latter drastically reduces the fill-in during the factorization of the system matrix by means of a Cholesky decomposition or an LU decomposition, respectively. This way, we end up with the exact inverse of the compressed system matrix with only a moderate increase of the number of nonzero entries in the matrix. To illustrate the efficacy of the approach, we conduct numerical experiments for different highly relevant applications of nonlocal operators: We consider (i) the direct solution of boundary integral equations in three spatial dimensions, issuing from the polarizable continuum model, (ii) a parabolic problem for the fractional Laplacian in integral form and (iii) the fast simulation of Gaussian random fields.

[81] 2007.01542

Strategies for Using Proximal Policy Optimization in Mobile Puzzle Games

While traditionally a labour-intensive task, the testing of game content is progressively becoming more automated. Among the many directions in which this automation is taking shape, automatic play-testing is one of the most promising, thanks also to advancements in many supervised and reinforcement learning (RL) algorithms. However, these types of algorithms, while extremely powerful, often suffer in production environments due to issues with reliability and transparency in their training and usage. In this research work we investigate and evaluate strategies to apply the popular RL method Proximal Policy Optimization (PPO) in a casual mobile puzzle game, with a specific focus on improving its reliability in training and generalization during game playing. We have implemented and tested a number of different strategies against a real-world mobile puzzle game (Lily's Garden from Tactile Games). We isolated the conditions that lead to a failure in either training or generalization during testing and we identified a few strategies to ensure a more stable behaviour of the algorithm in this game genre.

[82] 2007.01544

A Conceptual Framework for Externally-influenced Agents: An Assisted Reinforcement Learning Review

A long-term goal of reinforcement learning agents is to be able to perform tasks in complex real-world scenarios. The use of external information is one way of scaling agents to more complex problems. However, there is a general lack of collaboration or interoperability between different approaches using external information. In this work, we propose a conceptual framework and taxonomy for assisted reinforcement learning, aimed at fostering such collaboration by classifying and comparing various methods that use external information in the learning process. The proposed taxonomy details the relationship between the external information source and the learner agent, highlighting the process of information decomposition, structure, retention, and how it can be used to influence agent learning. As well as reviewing state-of-the-art methods, we identify current streams of reinforcement learning that use external information in order to improve the agent's performance and its decision-making process. These include heuristic reinforcement learning, interactive reinforcement learning, learning from demonstration, transfer learning, and learning from multiple sources, among others. These streams of reinforcement learning operate with the shared objective of scaffolding the learner agent. Lastly, we discuss further possibilities for future work in the field of assisted reinforcement learning systems.

[83] 2007.01546

Multiple Expert Brainstorming for Domain Adaptive Person Re-identification

The best performing deep neural models are often ensembles of multiple base-level networks; nevertheless, ensemble learning for domain adaptive person re-ID remains unexplored. In this paper, we propose a multiple expert brainstorming network (MEB-Net) for domain adaptive person re-ID, opening up a promising direction for the model ensemble problem under unsupervised conditions. MEB-Net adopts a mutual learning strategy, where multiple networks with different architectures are pre-trained within a source domain as expert models equipped with specific features and knowledge, while the adaptation is then accomplished through brainstorming (mutual learning) among expert models. MEB-Net accommodates the heterogeneity of experts learned with different architectures and enhances the discrimination capability of the adapted re-ID model by introducing a regularization scheme regarding the authority of experts. Extensive experiments on large-scale datasets (Market-1501 and DukeMTMC-reID) demonstrate the superior performance of MEB-Net over state-of-the-art methods.

[84] 2007.01547

Descending through a Crowded Valley -- Benchmarking Deep Learning Optimizers

Choosing the optimizer is among the most crucial decisions of deep learning engineers, and it is not an easy one. The growing literature now lists literally hundreds of optimization methods. In the absence of clear theoretical guidance and conclusive empirical evidence, the decision is often made according to personal anecdotes. In this work, we aim to replace these anecdotes, if not with evidence, then at least with heuristics. To do so, we perform an extensive, standardized benchmark of more than a dozen particularly popular deep learning optimizers while giving a concise overview of the wide range of possible choices. Analyzing almost 35 000 individual runs, we contribute the following three points: (i) Optimizer performance varies greatly across tasks. (ii) We observe that evaluating multiple optimizers with default parameters works approximately as well as tuning the hyperparameters of a single, fixed optimizer. (iii) While we cannot identify an individual optimization method clearly dominating across all tested tasks, we identify a significantly reduced subset of specific algorithms and parameter choices that generally provided competitive results in our experiments. This subset includes popular favorites and some less well-known contenders. We have open-sourced all our experimental results, making them available to use as well-tuned baselines when evaluating novel optimization methods and therefore reducing the necessary computational effort.

[85] 2007.01548

Multiple Instance-Based Video Anomaly Detection using Deep Temporal Encoding-Decoding

In this paper, we propose a weakly supervised deep temporal encoding-decoding solution for anomaly detection in surveillance videos using multiple instance learning. The proposed approach uses both abnormal and normal video clips during the training phase, which is developed in the multiple instance framework, where we treat a video as a bag and video clips as instances in the bag. Our main contribution lies in the proposed novel approach to consider temporal relations between video instances. We treat video instances (clips) as sequential visual data rather than independent instances. We employ a deep temporal and encoder network that is designed to capture spatial-temporal evolution of video instances over time. We also propose a new loss function that is smoother than similar loss functions recently presented in the computer vision literature, and therefore enjoys faster convergence and improved tolerance to local minima during the training phase. The proposed temporal encoding-decoding approach with modified loss is benchmarked against the state-of-the-art in simulation studies. The results show that the proposed method performs similarly to or better than the state-of-the-art solutions for anomaly detection in video surveillance applications.

[86] 2007.01549

PointTrack++ for Effective Online Multi-Object Tracking and Segmentation

Multiple-object tracking and segmentation (MOTS) is a novel computer vision task that aims to jointly perform multiple object tracking (MOT) and instance segmentation. In this work, we present PointTrack++, an effective on-line framework for MOTS, which remarkably extends our recently proposed PointTrack framework. To begin with, PointTrack adopts an efficient one-stage framework for instance segmentation, and learns instance embeddings by converting compact image representations into unordered 2D point clouds. Compared with PointTrack, our proposed PointTrack++ offers three major improvements. Firstly, in the instance segmentation stage, we adopt a semantic segmentation decoder trained with focal loss to improve the instance selection quality. Secondly, to further boost the segmentation performance, we propose a data augmentation strategy that copies and pastes instances into training images. Finally, we introduce a better training strategy in the instance association stage to improve the distinguishability of learned instance embeddings. The resulting framework achieves state-of-the-art performance on the 5th BMTT MOTChallenge.

[87] 2007.01550

Segment as Points for Efficient Online Multi-Object Tracking and Segmentation

Current multi-object tracking and segmentation (MOTS) methods follow the tracking-by-detection paradigm and adopt convolutions for feature extraction. However, as affected by the inherent receptive field, convolution based feature extraction inevitably mixes up the foreground features and the background features, resulting in ambiguities in the subsequent instance association. In this paper, we propose a highly effective method for learning instance embeddings based on segments by converting the compact image representation to un-ordered 2D point cloud representation. Our method generates a new tracking-by-points paradigm where discriminative instance embeddings are learned from randomly selected points rather than images. Furthermore, multiple informative data modalities are converted into point-wise representations to enrich point-wise features. The resulting online MOTS framework, named PointTrack, surpasses all the state-of-the-art methods including 3D tracking methods by large margins (5.4% higher MOTSA and 18 times faster over MOTSFusion) with the near real-time speed (22 FPS). Evaluations across three datasets demonstrate both the effectiveness and efficiency of our method. Moreover, based on the observation that current MOTS datasets lack crowded scenes, we build a more challenging MOTS dataset named APOLLO MOTS with higher instance density. Both APOLLO MOTS and our codes are publicly available at

[88] 2007.01555

MQT-TZ: Secure MQTT Broker for Biomedical Signal Processing on the Edge

Physical health records belong to healthcare providers, but the information contained within belongs to each patient. Increasingly, health-related data is being acquired by wearables and other IoT devices, following the ever-growing trend of the "Quantified Self". Even though data protection regulations (e.g., GDPR) encourage the usage of privacy-preserving processing techniques, most of the current IoT infrastructure was not originally conceived for such purposes. One of the most used communication protocols, MQTT, is a lightweight publish-subscribe protocol commonly used in Edge and IoT applications. In MQTT, the broker must process data in clear text, hence exposing a large attack surface for a malicious agent to steal or tamper with this health-related data. In this paper, we introduce MQT-TZ, a secure MQTT broker leveraging Arm TrustZone, a popular Trusted Execution Environment (TEE). We define a mutual TLS-based handshake and a two-layer encryption for end-to-end security using the TEE as a trusted proxy. We provide a quantitative evaluation of our open-source PoC on streaming ECGs in real time and highlight the trade-offs.

[89] 2007.01556

Surrogate-assisted Particle Swarm Optimisation for Evolving Variable-length Transferable Blocks for Image Classification

Deep convolutional neural networks have demonstrated promising performance on image classification tasks, but the manual design process becomes more and more complex due to the fast depth growth and the increasingly complex topologies of convolutional neural networks. As a result, neural architecture search has emerged to automatically design convolutional neural networks that outperform handcrafted counterparts. However, the computational cost is immense, e.g. 22,400 GPU-days and 2,000 GPU-days for two outstanding neural architecture search works named NAS and NASNet, respectively, which motivates this work. A new effective and efficient surrogate-assisted particle swarm optimisation algorithm is proposed to automatically evolve convolutional neural networks. This is achieved by proposing a novel surrogate model, a new method of creating a surrogate dataset and a new encoding strategy to encode variable-length blocks of convolutional neural networks, all of which are integrated into a particle swarm optimisation algorithm to form the proposed method. The proposed method shows its effectiveness by achieving competitive error rates of 3.49% on the CIFAR-10 dataset, 18.49% on the CIFAR-100 dataset, and 1.82% on the SVHN dataset. The convolutional neural network blocks are efficiently learned by the proposed method from CIFAR-10 within 3 GPU-days due to the acceleration achieved by the surrogate model and the surrogate dataset to avoid the training of 80.1% of convolutional neural network blocks represented by the particles. Without any further search, the evolved blocks from CIFAR-10 can be successfully transferred to CIFAR-100 and SVHN, which exhibits the transferability of the block learned by the proposed method.

[90] 2007.01560

GRANDPA: a Byzantine Finality Gadget

Classic Byzantine fault-tolerant consensus protocols forfeit liveness in the face of asynchrony in order to preserve safety, whereas most deployed blockchain protocols forfeit safety in order to remain live. In this work, we achieve the best of both worlds by proposing a novel abstraction called the finality gadget. A finality gadget allows for transactions to always optimistically commit but informs the clients that these transactions might be unsafe. As a result, a blockchain can execute transactions optimistically and only commit them after they have been sufficiently and provably audited. In this work, we formally model the finality gadget abstraction, prove that it is impossible to solve it deterministically in full asynchrony (even though it is stronger than consensus), and provide a partially synchronous protocol which is currently securing a major blockchain. This way, we show that the protocol designer can decouple safety and liveness in order to speed up recovery from failures. We believe that there can be other types of finality gadgets that provide weaker safety (e.g., probabilistic) in order to gain more efficiency, and this can depend on the probability that the network is not in synchrony.

[91] 2007.01561

Users' Concern for Privacy in Context-Aware Reasoning Systems

Context-aware reasoning systems allow drawing sophisticated inferences about users' behaviour and physiological condition, by aggregating data from seemingly unrelated sources. We conducted a general population online survey to evaluate users' concern about the privacy of data gathered by these systems. We found that people are more concerned about third parties accessing data gathered by environmental sensors as compared to physiological sensors. Participants also indicated greater concern about unfamiliar third parties (e.g., private companies) as opposed to familiar third parties (e.g., relatives). We further found that these concerns are predicted and (to a lesser degree) causally affected by people's beliefs about how much can be inferred from these types of data, as well as by their background in computer science.

[92] 2007.01562

An Edge Computing-based Photo Crowdsourcing Framework for Real-time 3D Reconstruction

Image-based three-dimensional (3D) reconstruction uses a set of photos to build a 3D model and can be widely used in many emerging applications such as augmented reality (AR) and disaster recovery. Most existing 3D reconstruction methods require a mobile user to walk around the target area and reconstruct objects with a hand-held camera, which is inefficient and time-consuming. To meet the requirements of delay-intensive and resource-hungry applications in 5G, we propose an edge computing-based photo crowdsourcing (EC-PCS) framework in this paper. The main objective is to collect a set of representative photos from ubiquitous mobile and Internet of Things (IoT) devices at the network edge for real-time 3D model reconstruction, with network resource and monetary cost considerations. Specifically, we first propose a photo pricing mechanism that jointly considers freshness, resolution and data size. Then, we design a novel photo selection scheme to dynamically select a set of photos with the required target coverage and the minimum monetary cost. We prove the NP-hardness of this problem, and develop an efficient greedy-based approximation algorithm to obtain a near-optimal solution. Moreover, an optimal network resource allocation scheme is presented in order to minimize the maximum uploading delay of the selected photos to the edge server. Finally, a 3D reconstruction algorithm and a 3D model caching scheme are performed by the edge server in real time. Extensive experimental results based on real-world datasets demonstrate the superior performance of our EC-PCS system over the existing mechanisms.
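
The abstract above does not spell out its greedy-based approximation algorithm, so the following is only a minimal sketch of the classic greedy heuristic for weighted set cover, which matches the stated goal (required target coverage at low monetary cost). The photo identifiers, prices, and the representation of coverage as sets of integer target cells are all hypothetical and chosen for illustration.

```python
from typing import Dict, List, Set, Tuple


def greedy_photo_selection(
    photos: Dict[str, Tuple[float, Set[int]]],
    targets: Set[int],
) -> Tuple[List[str], float]:
    """Select photos until every target cell is covered.

    photos maps a (hypothetical) photo id to (price, covered target cells).
    At each step the photo with the lowest price per newly covered cell wins.
    This is the textbook weighted set-cover heuristic, not the paper's method.
    """
    uncovered = set(targets)
    chosen: List[str] = []
    total_cost = 0.0
    while uncovered:
        best_id = None
        best_ratio = float("inf")
        for photo_id, (price, cells) in photos.items():
            if photo_id in chosen:
                continue
            gain = len(cells & uncovered)  # cells this photo would newly cover
            if gain == 0:
                continue
            ratio = price / gain
            if ratio < best_ratio:
                best_id, best_ratio = photo_id, ratio
        if best_id is None:
            raise ValueError("the available photos cannot cover all targets")
        price, cells = photos[best_id]
        chosen.append(best_id)
        total_cost += price
        uncovered -= cells
    return chosen, total_cost
```

On a toy instance such as `{"a": (3.0, {1, 2, 3}), "b": (1.0, {1}), "c": (1.5, {2, 3}), "d": (5.0, {1, 2, 3, 4})}` with targets `{1, 2, 3, 4}`, the heuristic picks `c`, `b`, `d` for a total of 7.5, while `d` alone would cost 5.0. The gap illustrates why such a scheme is an approximation rather than an exact solver.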

[93] 2007.01563

Correction of BDFk for fractional Feynman-Kac equation with Lévy flight

In this work, we present the correction formulas of the $k$-step BDF convolution quadrature at the starting $k-1$ steps for the fractional Feynman-Kac equation with Lévy flight. The desired $k$th-order convergence rate can be achieved with nonsmooth data. Based on the idea of [Jin, Li, and Zhou, SIAM J. Sci. Comput., 39 (2017), A3129--A3152], we provide a detailed convergence analysis for the correction BDF$k$ scheme. Numerical experiments with a spectral method are given to illustrate the effectiveness of the presented method. To the best of our knowledge, this is the first proof of convergence and numerical verification for the space fractional evolution equation with correction BDF$k$.

[94] 2007.01568

Identification and Remediation of Self-Admitted Technical Debt in Issue Trackers

Technical debt refers to taking shortcuts to achieve short-term goals, which might negatively influence software maintenance in the long term. There is increasing attention on technical debt that is admitted by developers in source code comments (termed self-admitted technical debt, or SATD), but SATD in issue trackers is relatively unexplored. We performed a case study in which we manually examined 500 issues from two open source projects (i.e. Hadoop and Camel), which contained 152 SATD items. We found that: 1) eight types of technical debt are identified in issues, namely architecture, build, code, defect, design, documentation, requirement, and test debt; 2) developers identify technical debt in issues at three different points in time, and a small part is identified by its creators; 3) the majority of technical debt is paid off, 4) mostly by those who identified it or created it; 5) the median time and average time to repay technical debt are 25.0 and 872.3 hours respectively.

[95] 2007.01570

Scaling Graph Neural Networks with Approximate PageRank

Graph neural networks (GNNs) have emerged as a powerful approach for solving many network mining tasks. However, learning on large graphs remains a challenge - many recently proposed scalable GNN approaches rely on an expensive message-passing procedure to propagate information through the graph. We present the PPRGo model which utilizes an efficient approximation of information diffusion in GNNs resulting in significant speed gains while maintaining state-of-the-art prediction performance. In addition to being faster, PPRGo is inherently scalable, and can be trivially parallelized for large datasets like those found in industry settings. We demonstrate that PPRGo outperforms baselines in both distributed and single-machine training environments on a number of commonly used academic graphs. To better analyze the scalability of large-scale graph learning methods, we introduce a novel benchmark graph with 12.4 million nodes, 173 million edges, and 2.8 million node features. We show that training PPRGo from scratch and predicting labels for all nodes in this graph takes under 2 minutes on a single machine, far outpacing other baselines on the same graph. We discuss the practical application of PPRGo to solve large-scale node classification problems at Google.
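
The abstract does not say how the approximate PageRank vectors are computed. As a hedged illustration of what an efficient approximation of information diffusion can look like, here is a minimal sketch of the well-known push-style approximation of personalized PageRank. The function name, the adjacency-list input, and the defaults for the teleport probability `alpha` and tolerance `eps` are assumptions made for this example, not details taken from PPRGo.

```python
from collections import defaultdict, deque
from typing import Dict, List


def approximate_ppr(
    adj: Dict[int, List[int]],
    source: int,
    alpha: float = 0.15,
    eps: float = 1e-4,
) -> Dict[int, float]:
    """Push-based approximate personalized PageRank from one source node.

    p holds the estimate, r the residual mass still to be distributed.
    A node is pushed while its residual is at least eps * degree, so each
    push touches only the node's neighbours and the result stays sparse.
    """
    p: Dict[int, float] = defaultdict(float)
    r: Dict[int, float] = defaultdict(float)
    r[source] = 1.0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        degree = len(adj[u])
        if degree == 0 or r[u] < eps * degree:
            continue  # nothing worth pushing from this node
        residual = r[u]
        p[u] += alpha * residual  # keep a fraction alpha at u
        r[u] = 0.0
        share = (1.0 - alpha) * residual / degree
        for v in adj[u]:  # spread the rest evenly over the neighbours
            r[v] += share
            if r[v] >= eps * len(adj[v]):
                queue.append(v)
    return dict(p)
```

Because every push only reads one node's neighbour list, the cost depends on `alpha` and `eps` rather than on the size of the whole graph, which is the property that makes this family of methods attractive for large graphs.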

[96] 2007.01571

Domain Adaptive Object Detection via Asymmetric Tri-way Faster-RCNN

Conventional object detection models inevitably suffer a performance drop when domain disparity exists. Unsupervised domain adaptive object detection has recently been proposed to reduce the disparity between domains, where the source domain is label-rich while the target domain is label-agnostic. Existing models follow a parameter-shared siamese structure for adversarial domain alignment, which, however, easily leads to the collapse and out-of-control risk of the source domain and has a negative impact on feature adaptation. The main reason is that the labeling unfairness (asymmetry) between source and target makes the parameter sharing mechanism unable to adapt. Therefore, in order to avoid the source domain collapse risk caused by parameter sharing, we propose an asymmetric tri-way Faster-RCNN (ATF) for domain adaptive object detection. Our ATF model has two distinct merits: 1) An ancillary net supervised by source labels is deployed to learn ancillary target features and simultaneously preserve the discrimination of the source domain, which enhances the structural discrimination (object classification vs. bounding box regression) of domain alignment. 2) The asymmetric structure consisting of a chief net and an independent ancillary net essentially overcomes the source collapse risk caused by parameter sharing. The adaptation safety of the proposed ATF detector is guaranteed. Extensive experiments on a number of datasets, including Cityscapes, Foggy-cityscapes, KITTI, Sim10k, Pascal VOC, Clipart and Watercolor, demonstrate the SOTA performance of our method.

[97] 2007.01575

Ground Truth Free Denoising by Optimal Transport

We present a learned unsupervised denoising method for arbitrary types of data, which we explore on images and one-dimensional signals. The training is solely based on samples of noisy data and examples of noise, which -- critically -- do not need to come in pairs. We only need the assumption that the noise is independent and additive (although we describe how this can be extended). The method rests on a Wasserstein Generative Adversarial Network setting, which utilizes two critics and one generator.

[98] 2007.01580

On the Similarity between the Laplace and Neural Tangent Kernels

Recent theoretical work has shown that massively overparameterized neural networks are equivalent to kernel regressors that use Neural Tangent Kernels (NTK). Experiments show that these kernel methods perform similarly to real neural networks. Here we show that NTK for fully connected networks is closely related to the standard Laplace kernel. We show theoretically that for normalized data on the hypersphere both kernels have the same eigenfunctions and their eigenvalues decay polynomially at the same rate, implying that their Reproducing Kernel Hilbert Spaces (RKHS) include the same sets of functions. This means that both kernels give rise to classes of functions with the same smoothness properties. The two kernels differ for data off the hypersphere, but experiments indicate that when data is properly normalized these differences are not significant. Finally, we provide experiments on real data comparing NTK and the Laplace kernel, along with a larger class of $\gamma$-exponential kernels. We show that these perform almost identically. Our results suggest that much insight about neural networks can be obtained from analysis of the well-known Laplace kernel, which has a simple closed form.
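
For readers unfamiliar with the kernels named above, the sketch below uses the common parameterization $k(x,y)=\exp(-c\,\|x-y\|^{\gamma})$, in which $\gamma=1$ gives the Laplace kernel and $\gamma=2$ the Gaussian kernel. The scale parameter `c` and the function names are assumptions for illustration; the paper's exact normalization may differ.

```python
import math
from typing import Sequence


def gamma_exponential_kernel(
    x: Sequence[float],
    y: Sequence[float],
    c: float = 1.0,
    gamma: float = 1.0,
) -> float:
    """exp(-c * ||x - y||**gamma) with the Euclidean norm."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-c * distance ** gamma)


def laplace_kernel(x: Sequence[float], y: Sequence[float], c: float = 1.0) -> float:
    """Laplace kernel: the gamma = 1 member of the family above."""
    return gamma_exponential_kernel(x, y, c=c, gamma=1.0)
```

For two orthogonal unit vectors the distance is $\sqrt{2}$, so the Laplace kernel evaluates to $\exp(-\sqrt{2})$ and the $\gamma=2$ kernel to $\exp(-2)$ when `c = 1`.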

[99] 2007.01587

Privacy Threats Against Federated Matrix Factorization

Matrix Factorization has been very successful in practical recommendation applications and e-commerce. Due to data shortage and stringent regulations, it can be hard to collect sufficient data to build performant recommender systems for a single company. Federated learning provides the possibility to bridge the data silos and build machine learning models without compromising privacy and security. Participants sharing common users or items collaboratively build a model over data from all the participants. There have been some works exploring the application of federated learning to recommender systems and the privacy issues in collaborative filtering systems. However, the privacy threats in federated matrix factorization are not studied. In this paper, we categorize federated matrix factorization into three types based on the partition of feature space and analyze privacy threats against each type of federated matrix factorization model. We also discuss privacy-preserving approaches. As far as we are aware, this is the first study of privacy threats of the matrix factorization method in the federated learning framework.

[100] 2007.01594

Adaptive Graph Encoder for Attributed Graph Embedding

Attributed graph embedding, which learns vector representations from graph topology and node features, is a challenging task for graph analysis. Recently, methods based on graph convolutional networks (GCNs) have made great progress on this task. However, existing GCN-based methods have three major drawbacks. Firstly, our experiments indicate that the entanglement of graph convolutional filters and weight matrices harms both performance and robustness. Secondly, we show that the graph convolutional filters in these methods turn out to be special cases of generalized Laplacian smoothing filters, but they do not preserve optimal low-pass characteristics. Finally, the training objectives of existing algorithms usually recover the adjacency matrix or feature matrix, which is not always consistent with real-world applications. To address these issues, we propose Adaptive Graph Encoder (AGE), a novel attributed graph embedding framework. AGE consists of two modules: (1) To better alleviate the high-frequency noise in the node features, AGE first applies a carefully designed Laplacian smoothing filter. (2) AGE employs an adaptive encoder that iteratively strengthens the filtered features for better node embeddings. We conduct experiments using four public benchmark datasets to validate AGE on node clustering and link prediction tasks. Experimental results show that AGE consistently and considerably outperforms state-of-the-art graph embedding methods on these tasks.
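
To make the idea of a Laplacian smoothing filter concrete, here is a minimal sketch of the generalized form $H = I - kL$ applied repeatedly to one scalar feature per node, with $L$ the symmetrically normalized Laplacian of the graph with self-loops. The step size `k`, the number of `steps`, and the use of self-loops are assumptions for illustration, not the specific design chosen in AGE.

```python
import math
from typing import List, Sequence


def laplacian_smooth(
    neighbors: Sequence[Sequence[int]],
    features: Sequence[float],
    k: float = 0.5,
    steps: int = 2,
) -> List[float]:
    """Apply H = I - k * L to the features `steps` times.

    L = I - D^{-1/2} (A + I) D^{-1/2}, so one application computes
    (1 - k) * x_i + k * sum_j x_j / sqrt(d_i * d_j) over i's neighbours
    and i itself, where d counts the self-loop.
    """
    n = len(neighbors)
    degree = [len(neighbors[i]) + 1 for i in range(n)]  # +1 for the self-loop
    current = [float(v) for v in features]
    for _ in range(steps):
        smoothed = []
        for i in range(n):
            aggregate = current[i] / degree[i]  # self-loop term
            for j in neighbors[i]:
                aggregate += current[j] / math.sqrt(degree[i] * degree[j])
            smoothed.append((1.0 - k) * current[i] + k * aggregate)
        current = smoothed
    return current
```

On a 4-cycle, a spike `[1, 0, 0, 0]` is spread toward its neighbours while a constant signal passes through unchanged, which is the low-pass behaviour the abstract refers to. Setting `k = 1` in this sketch gives the renormalized filter commonly associated with GCNs.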

[101] 2007.01595

LOL: Lidar-Only Odometry and Localization in 3D Point Cloud Maps

In this paper we deal with the problem of odometry and localization for Lidar-equipped vehicles driving in urban environments, where a premade target map exists to localize against. In our problem formulation, to correct the accumulated drift of the Lidar-only odometry we apply a place recognition method to detect geometrically similar locations between the online 3D point cloud and the a priori offline map. In the proposed system, we integrate a state-of-the-art Lidar-only odometry algorithm with a recently proposed 3D point segment matching method by complementing their advantages. Also, we propose additional enhancements in order to reduce the number of false matches between the online point cloud and the target map, and to refine the position estimation error whenever a good match is detected. We demonstrate the utility of the proposed LOL system on several Kitti datasets of different lengths and environments, where the relocalization accuracy and the precision of the vehicle's trajectory were significantly improved in every case, while still being able to maintain real-time performance.

[102] 2007.01597

Living without Beth and Craig: Explicit Definitions and Interpolants in the Guarded Fragment

The guarded fragment of FO fails to have the Craig Interpolation Property (CIP) and the Projective Beth Definability Property (PBDP). Thus, not every valid implication between guarded formulas has a guarded interpolant, and not every implicitly definable relation has an explicit guarded definition. In this article, we show that nevertheless the existence of guarded interpolants and explicit definitions is decidable. Moreover, it is 3ExpTime-complete in general, and 2ExpTime-complete if the arity of relation symbols is bounded by a constant. Deciding the existence of guarded interpolants and explicit definitions is thus one exponential harder than validity in the guarded fragment.

[103] 2007.01598

Weakly Supervised Temporal Action Localization with Segment-Level Labels

Temporal action localization presents a trade-off between test performance and annotation-time cost. Fully supervised methods achieve good performance with time-consuming boundary annotations. Weakly supervised methods with cheaper video-level category label annotations result in worse performance. In this paper, we introduce a new segment-level supervision setting: segments are labeled when annotators observe actions happening in them. We incorporate this segment-level supervision along with a novel localization module in the training. Specifically, we devise a partial segment loss, regarded as a loss sampling, to learn integral action parts from labeled segments. Since the labeled segments are only parts of actions, the model tends to overfit as training progresses. To tackle this problem, we first obtain a similarity matrix from discriminative features guided by a sphere loss. Then, a propagation loss is devised based on the matrix to act as a regularization term, allowing implicit propagation to unlabeled segments during training. Experiments validate that our method can outperform the video-level supervision methods with almost the same annotation time.

[104] 2007.01599

An Autonomous Free Airspace En-route Controller using Deep Reinforcement Learning Techniques

Air traffic control is becoming a more and more complex task due to the increasing number of aircraft. Current air traffic control methods are not suitable for managing this increased traffic. Autonomous air traffic control is deemed a promising alternative. In this paper an air traffic control model is presented that guides an arbitrary number of aircraft across a three-dimensional, unstructured airspace while avoiding conflicts and collisions. This is done utilizing the power of graph based deep learning approaches. These approaches offer significant advantages over current approaches to this task, such as invariance to the input ordering of aircraft and the ability to easily cope with a varying number of aircraft. Results acquired using these approaches show that the air traffic control model performs well on realistic traffic densities; it is capable of managing the airspace by avoiding 100% of potential collisions and preventing 89.8% of potential conflicts.

[105] 2007.01601

Scalar auxiliary variable finite element scheme for the parabolic-parabolic Keller-Segel model

We describe and analyze a finite element numerical scheme for the parabolic-parabolic Keller-Segel model. The scalar auxiliary variable method is used to retrieve the monotonic decay of the energy associated with the system at the discrete level. This method relies on the interpretation of the Keller-Segel model as a gradient flow. The resulting numerical scheme is efficient and easy to implement. We show the existence of a unique non-negative solution and that a modified discrete energy is obtained due to the use of the SAV method. We also prove the convergence of the discrete solutions to the ones of the weak form of the continuous Keller-Segel model.

[106] 2007.01605

Regulation conform DLT-operable payment adapter based on trustless - justified trust combined generalized state channels

Open technologies, decentralized computation and intelligent applications enable the third-generation web, Web 3.0, thereby digitizing whole industries. The emerging Economy of Things (EoT) will be based on software agents running on peer-to-peer trustless networks that require a programmable, regulation conform means of payment. We give an overview of current solutions that differ in their fundamental values and technological possibilities, such as privately issued stablecoins, DLT-issued electronic money and genuine cryptocurrencies. Based on this analysis, we present the concept of justified trust and propose to combine the strengths of the crypto-based, decentralized trustless elements with established and well-regulated means of payment, based on this concept, via a secure external re-balancing interface. Combining the advantages, e.g. lightweight, trustless, efficient high-frequency micro state transfers on the one hand, and ease of use, wide adoption and accepted alignment with a multitude of regulatory requirements on the other hand, while neither leading to lock-in with any of the proposed solutions, nor undermining the basic principles of the crypto-movement or unnecessarily reinforcing the banking system, provides a synergy and the necessary flexibility for further evolution alongside the regulatory framework. This offers a regulation conform transitional solution that can be implemented in the short term and that enables companies to place their decentralized business operations in a regulated environment. The contribution of our work is twofold: First, we illustrate and discuss different DLT-operable means of payment. Second, our research proposes a novel hybrid payment solution by interfacing trustless with justified trust combined generalized state channels.

[107] 2007.01610

Logical Separability of Incomplete Data under Ontologies

Finding a logical formula that separates positive and negative examples given in the form of labeled data items is fundamental in applications such as concept learning, reverse engineering of database queries, and generating referring expressions. In this paper, we investigate the existence of a separating formula for incomplete data in the presence of an ontology. Both for the ontology language and the separation language, we concentrate on first-order logic and three important fragments thereof: the description logic $\mathcal{ALCI}$, the guarded fragment, and the two-variable fragment. We consider several forms of separability that differ in the treatment of negative examples and in whether or not they admit the use of additional helper symbols to achieve separation. We characterize separability in a model-theoretic way, compare the separating power of the different languages, and determine the computational complexity of separability as a decision problem.

[108] 2007.01612

Online learning in MDPs with linear function approximation and bandit feedback

We consider an online learning problem where the learner interacts with a Markov decision process in a sequence of episodes, where the reward function is allowed to change between episodes in an adversarial manner and the learner only gets to observe the rewards associated with its actions. We allow the state space to be arbitrarily large, but we assume that all action-value functions can be represented as linear functions in terms of a known low-dimensional feature map, and that the learner has access to a simulator of the environment that allows generating trajectories from the true MDP dynamics. Our main contribution is developing a computationally efficient algorithm that we call MDP-LinExp3, and proving that its regret is bounded by $\widetilde{\mathcal{O}}\big(H^2 T^{2/3} (dK)^{1/3}\big)$, where $T$ is the number of episodes, $H$ is the number of steps in each episode, $K$ is the number of actions, and $d$ is the dimension of the feature map. We also show that the regret can be improved to $\widetilde{\mathcal{O}}\big(H^2 \sqrt{TdK}\big)$ under much stronger assumptions on the MDP dynamics. To our knowledge, MDP-LinExp3 is the first provably efficient algorithm for this problem setting.

[109] 2007.01618

Balanced Symmetric Cross Entropy for Large Scale Imbalanced and Noisy Data

Deep convolutional neural networks have attracted much attention in large-scale visual classification tasks and achieve significant performance improvements over traditional visual analysis methods. In this paper, we explore many kinds of deep convolutional neural network architectures for a large-scale product recognition task whose data are heavily class-imbalanced and noisily labeled, making it more challenging. Extensive experiments show that PNASNet achieves the best performance among a variety of convolutional architectures. Together with ensemble techniques and a negative learning loss for noisily labeled data, we further improve the model performance on online test data. Finally, our proposed method achieves 0.1515 mean top-1 error on online test data.

[110] 2007.01620

Team voyTECH: User Activity Modeling with Boosting Trees

This paper describes our winning solution for the ECML-PKDD ChAT Discovery Challenge 2020. We show that whether or not a Twitch user has subscribed to a channel can be well predicted by modeling user activity with boosting trees. We introduce the connection between target-encodings and boosting trees in the context of high-cardinality categoricals and find that modeling user activity is more powerful than direct modeling of content when encoded properly and combined with a suitable optimization approach.
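
As a rough illustration of what a target encoding of a high-cardinality categorical can look like, the following sketch replaces each category with a smoothed mean of the target. The additive smoothing scheme, the function names, and the toy data are assumptions made for illustration; the abstract does not state which encoding variant the authors use.

```python
from collections import defaultdict


def fit_target_encoding(categories, targets, smoothing=10.0):
    """Map each category to a smoothed mean of its targets.

    Assumption: additive smoothing toward the global mean, which is one
    common variant and not necessarily the one used in the paper.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    encoding = {
        category: (sums[category] + smoothing * global_mean)
        / (counts[category] + smoothing)
        for category in counts
    }
    return encoding, global_mean


def apply_target_encoding(categories, encoding, global_mean):
    """Encode categories; unseen categories fall back to the global mean."""
    return [encoding.get(category, global_mean) for category in categories]


# Hypothetical toy data: channel identifiers and a binary "subscribed" label.
channels = ["a", "a", "b", "c", "c", "c"]
subscribed = [1, 1, 0, 0, 1, 0]
encoding, prior = fit_target_encoding(channels, subscribed, smoothing=1.0)
features = apply_target_encoding(["a", "b", "unseen"], encoding, prior)
```

The resulting numeric column could then be passed to a gradient boosting model in place of the raw categorical. In practice the encoding is usually fitted out-of-fold to avoid leaking the target into the features.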

[111] 2007.01623

Hedging using reinforcement learning: Contextual $k$-Armed Bandit versus $Q$-learning

The construction of replication strategies for contingent claims in the presence of risk and market friction is a key problem of financial engineering. In real markets, continuous replication, such as in the model of Black, Scholes and Merton, is not only unrealistic but it is also undesirable due to high transaction costs. Over the last decades stochastic optimal-control methods have been developed to balance between effective replication and losses. More recently, with the rise of artificial intelligence, temporal-difference Reinforcement Learning, in particular variations of $Q$-learning in conjunction with Deep Neural Networks, have attracted significant interest. From a practical point of view, however, such methods are often relatively sample inefficient, hard to train and lack performance guarantees. This motivates the investigation of a stable benchmark algorithm for hedging. In this article, the hedging problem is viewed as an instance of a risk-averse contextual $k$-armed bandit problem, for which a large body of theoretical results and well-studied algorithms are available. We find that the $k$-armed bandit model naturally fits to the $P\&L$ formulation of hedging, providing for a more accurate and sample efficient approach than $Q$-learning and reducing to the Black-Scholes model in the absence of transaction costs and risks.

[112] 2007.01625

Complex Network Construction for Interactive Image Segmentation using Particle Competition and Cooperation: A New Approach

In the interactive image segmentation task, the Particle Competition and Cooperation (PCC) model is fed with a complex network, which is built from the input image. In the network construction phase, a weight vector is needed to define the importance of each element in the feature set, which consists of color and location information of the corresponding pixels, thus demanding a specialist's intervention. The present paper proposes the elimination of the weight vector through modifications in the network construction phase. The proposed model and the reference model, without the use of a weight vector, were compared using 151 images extracted from the Grabcut dataset, the PASCAL VOC dataset and the Alpha matting dataset. Each model was applied 30 times to each image to obtain an error average. These simulations resulted in an error rate of only 0.49% when classifying pixels with the proposed model while the reference model had an error rate of 3.14%. The proposed method also presented less error variation in the diversity of the evaluated images, when compared to the reference model.

[113] 2007.01627

Neumann networks: differential programming for supervised learning with missing values

The presence of missing values makes supervised learning much more challenging. Indeed, previous work has shown that even when the response is a linear function of the complete data, the optimal predictor is a complex function of the observed entries and the missingness indicator. As a result, the computational or sample complexities of consistent approaches depend on the number of missing patterns, which can be exponential in the number of dimensions. In this work, we derive the analytical form of the optimal predictor under a linearity assumption and various missing data mechanisms including Missing at Random (MAR) and self-masking (Missing Not At Random). Based on a Neumann series approximation of the optimal predictor, we propose a new principled architecture, named Neumann networks. Their originality and strength come from the use of a new type of non-linearity: the multiplication by the missingness indicator. We provide an upper bound on the Bayes risk of Neumann networks, and show that they have good predictive accuracy with both a number of parameters and a computational complexity independent of the number of missing data patterns. As a result they scale well to problems with many features, and remain statistically efficient for medium-sized samples. Moreover, we show that, contrary to procedures using EM or imputation, they are robust to the missing data mechanism, including difficult MNAR settings such as self-masking.
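
For readers unfamiliar with the Neumann series mentioned above, the sketch below shows the generic idea only: when the spectral radius of $I - A$ is below 1, $A^{-1} b = \sum_{k \ge 0} (I - A)^k b$, so a truncated sum approximates the solution of $A x = b$ without an explicit inverse. This is not the Neumann network architecture itself; the function names and the small test matrix are assumptions chosen for illustration.

```python
def mat_vec(matrix, vector):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def neumann_solve(matrix, rhs, num_terms):
    """Approximate A^{-1} b by sum_{k=0}^{num_terms-1} (I - A)^k b.

    Converges only when the spectral radius of (I - A) is below 1.
    """
    term = list(rhs)   # current term (I - A)^k b, starting at k = 0
    total = list(rhs)  # running partial sum
    for _ in range(num_terms - 1):
        a_term = mat_vec(matrix, term)
        term = [t - at for t, at in zip(term, a_term)]  # apply (I - A)
        total = [s + t for s, t in zip(total, term)]
    return total


# Illustrative matrix: the eigenvalues of I - A are 0.2 and -0.1.
A = [[1.0, 0.2], [0.1, 0.9]]
b = [1.0, 2.0]
x_approx = neumann_solve(A, b, num_terms=50)
residual = [ax - bi for ax, bi in zip(mat_vec(A, x_approx), b)]
```

Each extra term costs one matrix-vector product, which is why a fixed-depth truncation maps naturally onto a layered architecture.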

[114] 2007.01629

A Discrete Probabilistic Approach to Dense Flow Visualization

Dense flow visualization is a popular visualization paradigm. Traditionally, the various models and methods in this area use a continuous formulation, resting upon the solid foundation of functional analysis. In this work, we examine a discrete formulation of dense flow visualization. From probability theory, we derive a similarity matrix that measures the similarity between different points in the flow domain, leading to the discovery of a whole new class of visualization models. Using this matrix, we propose a novel visualization approach consisting of the computation of spectral embeddings, i.e., characteristic domain maps, defined by particle mixture probabilities. These embeddings are scalar fields that give insight into the mixing processes of the flow on different scales. The approach of spectral embeddings is already well studied in image segmentation, and we see that spectral embeddings are connected to Fourier expansions and frequencies. We showcase the utility of our method using different 2D and 3D flows.

[115] 2007.01634

Social distancing with the Optimal Steps Model

With the Covid-19 pandemic an urgent need to simulate social distancing arises. The Optimal Steps Model (OSM) is a pedestrian locomotion model that operationalizes an individual's need for personal space. We present new parameter values for personal space in the Optimal Steps Model to simulate social distancing in the pedestrian dynamics simulator Vadere. Our approach is pragmatic. We consider two use cases: in the first we demand that a set social distance must never be violated. In the second the social distance must be kept only on average. For each use case we conduct simulation studies in a typical bottleneck scenario and measure contact times, that is, violations of the social distance rule. We derive rules of thumb for suitable parameter choices as a function of the desired social distance. We test the rules of thumb for the social distances 1.5 m and 2.0 m and observe that the new parameter values indeed lead to the desired social distancing. Thus, the rules of thumb will quickly enable Vadere users to conduct their own studies without understanding the intricacies of the OSM implementation and without extensive parameter adjustment.

[116] 2007.01637

Active learning of timed automata with unobservable resets

Active learning of timed languages is concerned with the inference of timed automata from observed timed words. The agent can query for the membership of words in the target language, or propose a candidate model and verify its equivalence to the target. The major difficulty of this framework is the inference of clock resets, central to the dynamics of timed automata, but not directly observable. Interesting first steps have already been made by restricting to the subclass of event-recording automata, where clock resets are tied to observations. In order to advance towards learning of general timed automata, we generalize this method to a new class, called reset-free event-recording automata, where some transitions may reset no clocks. This offers the same challenges as generic timed automata while keeping the simpler framework of event-recording automata for the sake of readability. Central to our contribution is the notion of invalidity, and the algorithm and data structures to deal with it, allowing on-the-fly detection and pruning of reset hypotheses that contradict observations, a key to any efficient active-learning procedure for generic timed automata.

[117] 2007.01647

Learning intuitive physics and one-shot imitation using state-action-prediction self-organizing maps

Human learning and intelligence work differently from the supervised pattern recognition approach adopted in most deep learning architectures. Humans seem to learn rich representations by exploration and imitation, build causal models of the world, and use both to flexibly solve new tasks. We suggest a simple but effective unsupervised model which develops such characteristics. The agent learns to represent the dynamical physical properties of its environment by intrinsically motivated exploration, and performs inference on this representation to reach goals. For this, a set of self-organizing maps which represent state-action pairs is combined with a causal model for sequence prediction. The proposed system is evaluated in the cartpole environment. After an initial phase of playful exploration, the agent can execute kinematic simulations of the environment's future, and use those for action planning. We demonstrate its performance on a set of several related, but different one-shot imitation tasks, which the agent flexibly solves in an active inference style.

[118] 2007.01648

Fast Arithmetic Hardware Library For RLWE-Based Homomorphic Encryption

In this work, we propose an open-source, first-of-its-kind, arithmetic hardware library with a focus on accelerating the arithmetic operations involved in Ring Learning with Errors (RLWE)-based somewhat homomorphic encryption (SHE). We design and implement a hardware accelerator consisting of submodules like Residue Number System (RNS), Chinese Remainder Theorem (CRT), NTT-based polynomial multiplication, modulo inverse, modulo reduction, and all the other polynomial and scalar operations involved in SHE. For all of these operations, wherever possible, we include a hardware-cost efficient serial and a fast parallel implementation in the library. A modular and parameterized design approach helps in easy customization and also provides flexibility to extend these operations for use in most homomorphic encryption applications that fit well into emerging FPGA-equipped cloud architectures. Using the submodules from the library, we prototype a hardware accelerator on FPGA. The evaluation of this hardware accelerator shows a speed up of approximately 4200x and 2950x to evaluate a homomorphic multiplication and addition respectively when compared to an existing software implementation.
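
To make the roles of the RNS and CRT submodules concrete, here is a small software sketch of the underlying arithmetic: a large integer is split into residues modulo pairwise coprime moduli, multiplication proceeds independently per residue, and the CRT recombines the result. The moduli and function names are illustrative assumptions and say nothing about the parameters or interfaces of the hardware library.

```python
from math import prod


def to_rns(value, moduli):
    """Represent an integer by its residues modulo each modulus."""
    return [value % m for m in moduli]


def rns_mul(a_residues, b_residues, moduli):
    """Multiply two RNS numbers channel by channel (no carries between channels)."""
    return [(a * b) % m for a, b, m in zip(a_residues, b_residues, moduli)]


def from_rns(residues, moduli):
    """Recombine residues with the Chinese Remainder Theorem.

    Assumes the moduli are pairwise coprime; the result is unique modulo
    their product.
    """
    big_m = prod(moduli)
    total = 0
    for residue, m in zip(residues, moduli):
        partial = big_m // m
        # pow(partial, -1, m) is the modular inverse (Python 3.8+).
        total += residue * partial * pow(partial, -1, m)
    return total % big_m


# Toy moduli chosen for readability; real RLWE parameters are far larger.
moduli = [13, 17, 19]  # product 4199
a, b = 123, 31
product_rns = rns_mul(to_rns(a, moduli), to_rns(b, moduli), moduli)
recovered = from_rns(product_rns, moduli)
```

Because each residue channel is independent, the per-channel multiplications can run in parallel, which is the property that makes this representation attractive for hardware.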

[119] 2007.01649

Stabilizing of a Class of Underactuated Euler Lagrange System Using an Approximate Model

The Controlled Lagrangian energy shaping method is a well-known approach to stabilizing under-actuated Euler-Lagrange (EL) systems. In this approach, to construct a control rule, some nonlinear, nonhomogeneous partial differential equations (PDEs), which are called matching conditions, must be solved. In this paper, a method is proposed to obtain an approximate solution of these matching conditions for a class of under-actuated EL systems. To develop the method, the potential energy matching condition is transformed to a set of linear PDEs using an approximation of inertia matrices. The assignable potential energy function and the controlled inertia matrix are then both constructed as a common solution of these PDEs. Afterwards, the gyroscopic and dissipative forces are found as the solution of the kinetic energy matching condition. Finally, the control rule is constructed by adding the energy shaping rule and additional dissipation injection to provide asymptotic stability. A stability analysis of the closed-loop system under the control rule derived with the proposed method is also given. To demonstrate the success of the proposed method, the stability problem of the inverted pendulum on a cart is considered.

[120] 2007.01652

Generating Informative Dialogue Responses with Keywords-Guided Networks

Recently, open-domain dialogue systems have attracted growing attention. Most of them use the sequence-to-sequence (Seq2Seq) architecture to generate responses. However, traditional Seq2Seq-based open-domain dialogue models tend to generate generic and safe responses, which are less informative, unlike human responses. In this paper, we propose a simple but effective keywords-guided Sequence-to-Sequence model (KW-Seq2Seq) which uses keywords information as guidance to generate open-domain dialogue responses. Specifically, KW-Seq2Seq first uses a keywords decoder to predict some topic keywords, and then generates the final response under the guidance of them. Extensive experiments demonstrate that the KW-Seq2Seq model produces more informative, coherent and fluent responses, yielding substantive gain in both automatic and human evaluation metrics.

[121] 2007.01653

Analytic solution of system of singular nonlinear differential equations with Neumann-Robin boundary conditions arising in astrophysics

In this paper, we propose a new approach for the approximate analytic solution of a system of Lane-Emden-Fowler type equations with Neumann-Robin boundary conditions. The algorithm is based on Green's function and the homotopy analysis method. This approach depends on constructing Green's function before establishing the recursive scheme for the approximate analytic solution of the equivalent system of integral equations. Unlike the Adomian decomposition method (ADM) \cite{singh2020solving}, the present method contains adjustable parameters to control the convergence of the approximate series solution. Convergence and error estimation of the present method are provided under quite general conditions. Several examples are considered to demonstrate the accuracy of the current algorithm. Computational results reveal that the proposed approach produces better results as compared to some existing iterative methods.

[122] 2007.01658

Playing with Words at the National Library of Sweden -- Making a Swedish BERT

This paper introduces the Swedish BERT ("KB-BERT") developed by the KBLab for data-driven research at the National Library of Sweden (KB). Building on recent efforts to create transformer-based BERT models for languages other than English, we explain how we used KB's collections to create and train a new language-specific BERT model for Swedish. We also present the results of our model in comparison with existing models - chiefly that produced by the Swedish Public Employment Service, Arbetsförmedlingen, and Google's multilingual M-BERT - where we demonstrate that KB-BERT outperforms these in a range of NLP tasks from named entity recognition (NER) to part-of-speech tagging (POS). Our discussion highlights the difficulties that continue to exist given the lack of training data and testbeds for smaller languages like Swedish. We release our model for further exploration and research here: .

[123] 2007.01666

Mapping Flows in Bipartite Networks

Mapping network flows provides insight into the organization of networks, but even though many real networks are bipartite, no method for mapping flows takes advantage of the bipartite structure. What do we miss by discarding this information and how can we use it to understand the structure of bipartite networks better? The map equation models network flows with a random walk and exploits the information-theoretic duality between compression and finding regularities to detect communities in networks. However, it does not use the fact that random walks in bipartite networks alternate between node types, information worth 1 bit. To make some or all of this information available to the map equation, we developed a coding scheme that remembers node types at different rates. We explored the community landscape of bipartite real-world networks from no node-type information to full node-type information and found that using node types at a higher rate generally leads to deeper community hierarchies and a higher resolution. The corresponding compression of network flows exceeds the amount of extra information provided. Consequently, taking advantage of the bipartite structure increases the resolution and reveals more network regularities.

[124] 2007.01667

Reading Comprehension in Czech via Machine Translation and Cross-lingual Transfer

Reading comprehension is a well-studied task, with huge training datasets in English. This work focuses on building reading comprehension systems for Czech, without requiring any manually annotated Czech training data. First of all, we automatically translated SQuAD 1.1 and SQuAD 2.0 datasets to Czech to create training and development data, which we release at this http URL. We then trained and evaluated several BERT and XLM-RoBERTa baseline models. However, our main focus lies in cross-lingual transfer models. We report that an XLM-RoBERTa model trained on English data and evaluated on Czech achieves very competitive performance, only approximately 2 percentage points worse than a model trained on the translated Czech data. This result is extremely good, considering the fact that the model has not seen any Czech data during training. The cross-lingual transfer approach is very flexible and provides reading comprehension in any language for which we have enough monolingual raw texts.

[125] 2007.01669

Gaussian Process Regression with Local Explanation

Gaussian process regression (GPR) is a fundamental model used in machine learning. Owing to its accurate prediction with uncertainty and versatility in handling various data structures via kernels, GPR has been successfully used in various applications. However, in GPR, how the features of an input contribute to its prediction cannot be interpreted. Herein, we propose GPR with local explanation, which reveals the feature contributions to the prediction of each sample, while maintaining the predictive performance of GPR. In the proposed model, both the prediction and explanation for each sample are performed using an easy-to-interpret locally linear model. The weight vector of the locally linear model is assumed to be generated from multivariate Gaussian process priors. The hyperparameters of the proposed models are estimated by maximizing the marginal likelihood. For a new test sample, the proposed model can predict the values of its target variable and weight vector, as well as their uncertainties, in a closed form. Experimental results on various benchmark datasets verify that the proposed model can achieve predictive performance comparable to those of GPR and superior to that of existing interpretable models, and can achieve higher interpretability than them, both quantitatively and qualitatively.

[126] 2007.01671

Few-Shot Microscopy Image Cell Segmentation

Automatic cell segmentation in microscopy images works well with the support of deep neural networks trained with full supervision. Collecting and annotating images, though, is not a sustainable solution for every new microscopy database and cell type. Instead, we assume that we can access a plethora of annotated image data sets from different domains (sources) and a limited number of annotated image data sets from the domain of interest (target), where each domain denotes not only different image appearance but also a different type of cell segmentation problem. We pose this problem as meta-learning where the goal is to learn a generic and adaptable few-shot learning model from the available source domain data sets and cell segmentation tasks. The model can be afterwards fine-tuned on the few annotated images of the target domain that contains different image appearance and different cell type. In our meta-learning training, we propose the combination of three objective functions to segment the cells, move the segmentation results away from the classification boundary using cross-domain tasks, and learn an invariant representation between tasks of the source domains. Our experiments on five public databases show promising results from 1- to 10-shot meta-learning using standard segmentation neural network architectures.

[127] 2007.01673

On girth and the parameterized complexity of token sliding and token jumping

In the Token Jumping problem we are given a graph $G = (V,E)$ and two independent sets $S$ and $T$ of $G$, each of size $k \geq 1$. The goal is to determine whether there exists a sequence of $k$-sized independent sets in $G$, $\langle S_0, S_1, \ldots, S_\ell \rangle$, such that for every $i$, $|S_i| = k$, $S_i$ is an independent set, $S = S_0$, $S_\ell = T$, and $|S_i \Delta S_{i+1}| = 2$. In other words, if we view each independent set as a collection of tokens placed on a subset of the vertices of $G$, then the problem asks for a sequence of independent sets which transforms $S$ to $T$ by individual token jumps which maintain the independence of the sets. This problem is known to be PSPACE-complete on very restricted graph classes, e.g., planar bounded degree graphs and graphs of bounded bandwidth. A closely related problem is the Token Sliding problem, where instead of allowing a token to jump to any vertex of the graph we instead require that a token slides along an edge of the graph. Token Sliding is also known to be PSPACE-complete on the aforementioned graph classes. We investigate the parameterized complexity of both problems on several graph classes, focusing on the effect of excluding certain cycles from the input graph. In particular, we show that both Token Sliding and Token Jumping are fixed-parameter tractable on $C_4$-free bipartite graphs when parameterized by $k$. For Token Jumping, we in fact show that the problem admits a polynomial kernel on $\{C_3,C_4\}$-free graphs. In the case of Token Sliding, we also show that the problem admits a polynomial kernel on bipartite graphs of bounded degree. We complement these positive results by showing that, for any constant $p \geq 4$, both problems are W[1]-hard on $C_\ell$-free graphs, where $4 \leq \ell \leq p$, and Token Sliding remains W[1]-hard even on bipartite graphs.
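
The definition above can be turned directly into a checker for a proposed reconfiguration sequence. The sketch below verifies the stated conditions for Token Jumping, and additionally that each move follows an edge for Token Sliding. It only validates a given sequence; it is not one of the parameterized algorithms from the paper, and the function names and example graph are assumptions for illustration.

```python
from itertools import combinations


def is_independent(edges, vertices):
    """Return True if no two of the given vertices are joined by an edge."""
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset(p) not in edge_set for p in combinations(vertices, 2))


def is_valid_token_jumping_sequence(edges, sequence, k):
    """Check |S_i| = k, independence of every S_i, and |S_i delta S_{i+1}| = 2.

    The caller is expected to pass S as sequence[0] and T as sequence[-1].
    """
    sets = [frozenset(s) for s in sequence]
    if not sets:
        return False
    if any(len(s) != k or not is_independent(edges, s) for s in sets):
        return False
    return all(len(a ^ b) == 2 for a, b in zip(sets, sets[1:]))


def is_valid_token_sliding_sequence(edges, sequence, k):
    """As above, but the moved token must travel along an edge of the graph."""
    if not is_valid_token_jumping_sequence(edges, sequence, k):
        return False
    edge_set = {frozenset(e) for e in edges}
    sets = [frozenset(s) for s in sequence]
    return all((a ^ b) in edge_set for a, b in zip(sets, sets[1:]))


# Example graph: the path 1 - 2 - 3 - 4 - 5, with k = 2 tokens.
path_edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
jump = [{1, 3}, {3, 5}]                          # one jump from vertex 1 to 5
slide = [{1, 3}, {1, 4}, {2, 4}, {2, 5}, {3, 5}]  # every move follows an edge
```

On this path, the single jump is a valid Token Jumping sequence but not a valid Token Sliding sequence, since vertices 1 and 5 are not adjacent; the longer sequence reaches the same target by sliding only.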

[128] 2007.01682

Improving auto-encoder novelty detection using channel attention and entropy minimization

Novelty detection is an important research area that mainly addresses the problem of separating inliers, which usually consist of normal samples, from outliers, which are composed of abnormal samples. We focus on the role of the auto-encoder in novelty detection and further improve the performance of auto-encoder-based methods through two main contributions. First, we introduce an attention mechanism into novelty detection. Under the action of the attention mechanism, the auto-encoder can pay more attention to the representation of inlier samples through adversarial training. Second, we constrain the expression of the latent space by information entropy. Experimental results on three public datasets show that the proposed method is promising for novelty detection.

[129] 2007.01686

Sublinear Explicit Incremental Planar Voronoi Diagrams

A data structure is presented that explicitly maintains the graph of a Voronoi diagram of $N$ point sites in the plane or the dual graph of a convex hull of points in three dimensions while allowing insertions of new sites/points. Our structure supports insertions in $\tilde O (N^{3/4})$ expected amortized time, where $\tilde O$ suppresses polylogarithmic terms. This is the first result to achieve sublinear time insertions; previously it was shown by Allen et al. that $\Theta(\sqrt{N})$ amortized combinatorial changes per insertion could occur in the Voronoi diagram but a sublinear-time algorithm was only presented for the special case of points in convex position.

[130] 2007.01688

Online publication of court records: circumventing the privacy-transparency trade-off

The open data movement is leading to the massive publishing of court records online, increasing transparency and accessibility of justice, and to the design of legal technologies building on the wealth of legal data available. However, the sensitive nature of legal decisions also raises important privacy issues. Current practices solve the resulting privacy versus transparency trade-off by combining access control with (manual or semi-manual) text redaction. In this work, we claim that current practices are insufficient for coping with massive access to legal data (restrictive access control policies are detrimental to openness and to utility while text redaction is unable to provide sound privacy protection) and advocate for an integrative approach that could benefit from the latest developments of the privacy-preserving data publishing domain. We present a thorough analysis of the problem and of the current approaches, and propose a straw man multimodal architecture paving the way to a full-fledged privacy-preserving legal data publishing system.

[131] 2007.01691

The Review Unmanned Surface Vehicle Path Planning: Based on Multi-modality Constraint

The essence of path planning problems is the multi-modality constraint. However, most of the current literature has not mentioned this issue. This paper introduces the research progress of path planning based on the multi-modality constraint. Research on path planning under multi-modality constraints can be classified into three stages in terms of its basic ingredients (such as shape, kinematics and dynamics): Route Planning, Trajectory Planning and Motion Planning. The paper then reviews the research methods and classical algorithms, especially those applied to the Unmanned Surface Vehicle (USV), in every stage. Finally, the paper points out some existing problems in every stage and offers suggestions for future research.

[132] 2007.01696

Channel Compression: Rethinking Information Redundancy among Channels in CNN Architecture

Model compression and acceleration are attracting increasing attention due to the demand for embedded devices and mobile applications. Research on efficient convolutional neural networks (CNNs) aims at removing feature redundancy by decomposing or optimizing the convolutional calculation. In this work, feature redundancy is assumed to exist among channels in CNN architectures, which provides some leeway to boost calculation efficiency. Aiming at channel compression, a novel convolutional construction named compact convolution is proposed to embrace the progress in spatial convolution, channel grouping and pooling operation. Specifically, the depth-wise separable convolution and the point-wise interchannel operation are utilized to efficiently extract features. Unlike existing channel compression methods, which usually introduce considerable learnable weights, the proposed compact convolution can reduce feature redundancy with no extra parameters. With the point-wise interchannel operation, compact convolutions implicitly squeeze the channel dimension of feature maps. To explore the rules on reducing channel redundancy in neural networks, a comparison is made among different point-wise interchannel operations. Moreover, compact convolutions are extended to tackle multiple tasks, such as acoustic scene classification, sound event detection and image classification. The extensive experiments demonstrate that our compact convolution not only exhibits high effectiveness in several multimedia tasks, but also can be efficiently implemented by benefiting from parallel computation.

[133] 2007.01698

Safe Reinforcement Learning with Mixture Density Network: A Case Study in Autonomous Highway Driving

This paper presents a safe reinforcement learning system for automated driving that benefits from multimodal future trajectory predictions. We propose a safety system that consists of two safety components: a heuristic safety and a learning-based safety. The heuristic safety module is based on common driving rules. On the other hand, the learning-based safety module is a data-driven safety rule that learns safety patterns from driving data. Specifically, it utilizes mixture density recurrent neural networks (MD-RNN) for multimodal future trajectory predictions to accelerate the learning progress. Our simulation results demonstrate that the proposed safety system outperforms previously reported results in terms of average reward and number of collisions.

[134] 2007.01704

Passive Quadrupedal Gait Synchronization for Extra Robotic Legs Using a Dynamically Coupled Double Rimless Wheel Model

The Extra Robotic Legs (XRL) system is a robotic augmentation worn by a human operator consisting of two articulated robot legs that walk with the operator and help bear a heavy backpack payload. It is desirable for the Human-XRL quadruped system to walk with the rear legs leading the front by 25% of the gait period, minimizing the energy lost from foot impacts while maximizing balance stability. Unlike quadrupedal robots, the XRL cannot command the human's limbs to coordinate quadrupedal locomotion. Using a pair of Rimless Wheel models, it is shown that the systems coupled with a spring and damper converge to the desired 25% phase difference. A Poincaré return map was generated using numerical simulation to examine the convergence properties for different coupler design parameters and initial conditions. The Dynamically Coupled Double Rimless Wheel system was physically realized with a spring and dashpot chosen from the theoretical results, and initial experiments indicate that the desired synchronization properties may be achieved within several steps using this set of passive components alone.

[135] 2007.01705

Hybrid Open-Loop Closed-Loop Control of Coupled Human-Robot Balance During Assisted Stance Transition with Extra Robotic Legs

A new approach to the human-robot shared control of the Extra Robotic Legs (XRL) wearable augmentation system is presented. The XRL system consists of two extra legs that bear the entirety of its backpack payload, as well as some of the human operator's weight. The XRL system must support its own balance and assist the operator stably while allowing them to move in selected directions. In some directions of the task space the XRL must constrain the human motion with position feedback for balance, while in other directions the XRL must have no position feedback, so that the human can move freely. Here, we present a Hybrid Open-Loop / Closed-Loop Control Architecture for mixing the two control modes in a systematic manner. The system is reduced to individual joint feedback control that is simple to implement and reliable against failure. The method is applied to the XRL system that assists a human in conducting a nuclear waste decommissioning task. A prototype XRL system has been developed and demonstrated with a simulated human performing the transition from standing to crawling and back again while coupled to the prototype XRL system.

[136] 2007.01708

Fault Diagnosis of the 10MW Floating Offshore Wind Turbine Benchmark: a Mixed Model and Signal-based Approach

Floating Offshore Wind Turbines (FOWTs) operate in the harsh marine environment with limited accessibility and maintainability. Not only are failures more likely to occur than in land-based turbines, but corrective maintenance is also more expensive. In the present study, a mixed model and signal-based Fault Diagnosis (FD) architecture is developed to detect and isolate critical faults in FOWTs. More specifically, a model-based scheme is developed to detect and isolate the faults associated with the turbine system. It is based on a fault detection and approximation estimator and fault isolation estimators, with time-varying adaptive thresholds to guarantee against false alarms. In addition, a signal-based scheme is established, within the proposed architecture, for detecting and isolating two representative mooring line faults. For the purpose of verification, a 10MW FOWT benchmark is developed and its operating conditions, which contain predefined faults, are simulated by extending the high-fidelity simulator. Based on it, the effectiveness of the proposed architecture is illustrated. In addition, the advantages and limitations are discussed by comparing its fault detection to the results delivered by other approaches. Results show that the proposed architecture has the best performance in detecting and isolating the critical faults in FOWTs under diverse operating conditions.

[137] 2007.01709

Many-Sorted Hybrid Modal Languages

We continue our investigation into hybrid polyadic multi-sorted logic with a focus on expressivity related to the operational and axiomatic semantics of programming languages, and relations with first-order logic. We identify a fragment of the full logic, for which we prove sound and complete deduction and we show that it is powerful enough to represent both the programs and their semantics in a uniform way. Although weaker than other hybrid systems previously developed, this system is expected to have better computational properties. Finally, we provide a standard translation from full hybrid many-sorted logic to first-order logic.

[138] 2007.01711

Synergistic saliency and depth prediction for RGB-D saliency detection

Depth information available from an RGB-D camera can be useful in segmenting salient objects when figure/ground cues from RGB channels are weak. This has motivated the development of several RGB-D saliency datasets and algorithms that use all four channels of the RGB-D data for both training and inference. Unfortunately, existing RGB-D saliency datasets are small, leading to overfitting and poor generalization. Here we demonstrate a system for RGB-D saliency detection that makes effective joint use of large RGB saliency datasets with hand-labelled saliency ground truth, together with smaller RGB-D saliency datasets without saliency ground truth. This novel prediction-guided cross-refinement network is trained to jointly estimate both saliency and depth, allowing mutual refinement between feature representations tuned for the two respective tasks. An adversarial stage resolves domain shift between RGB and RGB-D saliency datasets, allowing representations for saliency and depth estimation to be aligned on either. Critically, our system does not require saliency ground truth for the RGB-D datasets, making it easier to expand these datasets for training, and does not require the D channel for inference, allowing the method to be used for the much broader range of applications where only RGB data are available. Evaluation on seven RGB-D datasets demonstrates that, without using hand-labelled saliency ground truth for RGB-D datasets and using only the RGB channels of these datasets at inference, our system achieves performance that is comparable to state-of-the-art methods that use hand-labelled saliency maps for RGB-D data at training and use the depth channels of these datasets at inference.

[139] 2007.01713

Towards the Adoption of OMG Standards in the Development of SOA-Based IoT Systems

A common feature of the Internet of Things (IoT) is its high heterogeneity regarding network protocols, data formats, and hardware and software platforms. Aiming to deal with such a degree of heterogeneity, several frameworks have applied Model-Driven Development (MDD) to build IoT applications. From the software architecture viewpoint, the literature has shown that the Service-Oriented Architecture (SOA) is a promising style to address the interoperability of entities composing these solutions. Some features of IoT make it challenging to analyze the impact of design decisions on the behavior of SOA-based IoT applications. Thus, it is a key requirement to simulate the model to verify whether the system performs as expected before its implementation. Although the literature has identified that the SOA style is suitable for addressing interoperability, existing modeling languages do not consider SOA elements as first-class citizens when designing IoT applications. Furthermore, although existing MDD frameworks provide modeling languages comprising well-defined syntax, they lack execution semantics and thus are not suitable for model execution and analysis. This work aims at addressing these issues by introducing IoTDraw. The framework provides a fully OMG-compliant executable modeling language for SOA-based IoT systems; thus, its specifications can be implemented by any tool implementing OMG standards.

[140] 2007.01719

Ensemble Regression Models for Software Development Effort Estimation: A Comparative Study

As demand for computer software continually increases, software scope and complexity become higher than ever. The software industry is in real need of accurate estimates for projects under development. Software development effort estimation is one of the main processes in software project management. However, overestimation and underestimation may cause the software industry losses. This study determines which technique has better effort prediction accuracy and proposes combined techniques that could provide better estimates. Eight different ensemble models for effort estimation were compared with each other based on their predictive accuracy under the Mean Absolute Residual (MAR) criterion and on statistical tests. The results indicate that the proposed ensemble models, besides delivering high efficiency compared with their counterparts, produce the best responses for software project effort estimation. Therefore, the ensemble models proposed in this study will help project managers working on the development of quality software.
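
As a brief illustration of the Mean Absolute Residual (MAR) criterion named in the abstract above, the following Python sketch computes the mean of the absolute differences between actual and predicted effort. The function name and the sample values are hypothetical and are not taken from the study.

```python
def mean_absolute_residual(actual, predicted):
    """Return the mean of |actual_i - predicted_i| over all projects."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one project is required")
    residuals = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(residuals) / len(residuals)


# Hypothetical effort values in person-hours (illustrative only).
actual_effort = [100.0, 250.0, 400.0]
predicted_effort = [110.0, 240.0, 430.0]
mar = mean_absolute_residual(actual_effort, predicted_effort)
print(f"MAR = {mar:.2f} person-hours")
```

A lower MAR means the predictions are closer to the actual effort on average, which is how competing models would be ranked under this criterion.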

[141] 2007.01721

Smartphone Security Behavioral Scale: A New Psychometric Measurement for Smartphone Security

Despite widespread use of smartphones, there is no measurement standard targeted at smartphone security behaviors. In this paper we translate a well-known cybersecurity behavioral scale into the smartphone domain and show that we can improve on this translation by following an established psychometrics approach surveying 1011 participants. We design a new 14-item Smartphone Security Behavioral Scale (SSBS) exhibiting high reliability and good fit to a two-component behavioural model based on technical versus social protection strategies. We then demonstrate how SSBS can be applied to measure the influence of mental health issues (MHIs) on smartphone security behavior intentions. We found significant correlations that predict SSBS profiles from three types of MHIs. Conversely, we are able to predict the presence of MHIs using SSBS profiles. We obtain prediction AUCs of 72.1% for Internet addiction, 75.8% for depression and 66.2% for insomnia.

[142] 2007.01722

Learning Utilities and Equilibria in Non-Truthful Auctions

In non-truthful auctions, agents' utility for a strategy depends on the strategies of the opponents and also the prior distribution over their private types; the set of Bayes Nash equilibria generally has an intricate dependence on the prior. Using the First Price Auction as our main demonstrating example, we show that $\tilde O(n / \epsilon^2)$ samples from the prior with $n$ agents suffice for an algorithm to learn the interim utilities for all monotone bidding strategies. As a consequence, this number of samples suffices for learning all approximate equilibria. We give an almost matching (up to polylog factors) lower bound on the sample complexity for learning utilities. We also consider settings where agents must pay a search cost to discover their own types. Drawing on a connection between this setting and the first price auction, discovered recently by Kleinberg et al. (2016), we show that $\tilde O(n / \epsilon^2)$ samples suffice for utilities and equilibria to be estimated in a near welfare-optimal descending auction in this setting. En route, we improve the sample complexity bound, recently obtained by Guo et al. (2019), for the Pandora's Box problem, which is a classical model for sequential consumer search.
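
To make the notion of learning interim utilities from samples concrete, here is a minimal Python sketch of a plain Monte Carlo estimate for a first price auction. It is not the algorithm from the paper: the function name, the tie-breaking rule (ties count as losses) and the example bidding strategy are all assumptions made for illustration.

```python
def estimate_interim_utility(value, bid, opponent_type_samples, opponent_strategy):
    """Empirical interim utility of one bidder in a first price auction.

    Each sample is a tuple holding the opponents' types drawn from the prior.
    The bidder wins when its bid strictly exceeds every opponent bid
    (ties count as losses, a simplifying assumption), and a winner
    pays its own bid.
    """
    if not opponent_type_samples:
        raise ValueError("at least one sample is required")
    wins = 0
    for types in opponent_type_samples:
        highest_opponent_bid = max(opponent_strategy(t) for t in types)
        if bid > highest_opponent_bid:
            wins += 1
    win_rate = wins / len(opponent_type_samples)
    return (value - bid) * win_rate


def half_value(t):
    """Hypothetical monotone bidding strategy: bid half of one's type."""
    return t / 2.0


# Four hand-picked samples of a single opponent's type (illustrative only).
samples = [(0.2,), (0.6,), (0.9,), (0.4,)]
utility = estimate_interim_utility(0.5, 0.25, samples, half_value)
print(utility)
```

With more samples from the prior, the empirical win rate approaches the true winning probability, which is the quantity the sample complexity bounds in the abstract control.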

[143] 2007.01724

Deep Fence Estimation using Stereo Guidance and Adversarial Learning

People capture memorable images of events and exhibits that are often occluded by a wire mesh, loosely termed a fence. Recent works on fence removal have limited performance due to the difficulty of initial fence segmentation. This work aims to accurately segment the fence using a novel fence guidance mask (FM) generated from a stereo image pair. This binary guidance mask contains deterministic cues about the structure of the fence and is given as additional input to the deep fence estimation model. We also introduce a directional connectivity loss (DCL), which is used alongside adversarial loss to precisely detect thin wires. Experimental results obtained on real-world scenarios demonstrate the superiority of the proposed method over state-of-the-art techniques.

[144] 2007.01733

Probabilistic Soft Type Assignment

We model randomized complexity classes in the style of Implicit Computational Complexity. We introduce PSTA, a probabilistic version of STA, the type-theoretical counterpart of Soft Linear Logic. PSTA is a type assignment for an extension of Simpson's Linear Lambda Calculus and its surface reduction, where Linear additives express random choice. Linear additives are weaker than the usual ones; they allow for duplications harmlessly affecting the computational cost of normalization. PSTA is sound and complete w.r.t. probabilistic polynomial time functions and characterizes the probabilistic complexity classes PP and BPP, the latter slightly less implicitly than PP.

[145] 2007.01736

A global-in-time domain decomposition method for the coupled nonlinear Stokes and Darcy flows

We study a decoupling iterative algorithm based on domain decomposition for the time-dependent nonlinear Stokes-Darcy model, in which different time steps can be used in the flow region and in the porous medium. The coupled system is formulated as a space-time interface problem based on the interface condition for mass conservation. The nonlinear interface problem is then solved by a nested iteration approach which involves, at each Newton iteration, the solution of a linearized interface problem and, at each Krylov iteration, parallel solution of time-dependent linearized Stokes and Darcy problems. Consequently, local discretizations in time (and in space) can be used to efficiently handle multiphysics systems of coupled equations evolving at different temporal scales. Numerical results with nonconforming time grids are presented to illustrate the performance of the proposed method.

[146] 2007.01738

Video Prediction via Example Guidance

In video prediction tasks, one major challenge is to capture the multi-modal nature of future contents and dynamics. In this work, we propose a simple yet effective framework that can efficiently predict plausible future states. The key insight is that the potential distribution of a sequence could be approximated with analogous ones in a repertoire drawn from the training pool, namely, expert examples. By further incorporating a novel optimization scheme into the training procedure, plausible predictions can be sampled efficiently from the distribution constructed from the retrieved examples. Meanwhile, our method could be seamlessly integrated with existing stochastic predictive models; significant enhancement is observed with comprehensive experiments in both quantitative and qualitative aspects. We also demonstrate the generalization ability to predict the motion of an unseen class, i.e., without access to corresponding data during the training phase.

[147] 2007.01751

Assessing and Improving Cybersecurity Maturity for SMEs: Standardization aspects

SMEs constitute a very large part of the economy in every country and they play an important role in economic growth and social development. SMEs are frequent targets of cybersecurity attacks similar to large enterprises. However, unlike large enterprises, SMEs mostly have limited capabilities regarding cybersecurity practices. Given the increasing cybersecurity risks and the large impact that the risks may bring to the SMEs, assessing and improving the cybersecurity capabilities is crucial for SMEs for sustainability. This research aims to provide an approach for SMEs for assessing and improving their cybersecurity capabilities by integrating key elements from existing industry standards.

[148] 2007.01753

A Complete List of All Convex Polyhedra Made by Gluing Regular Pentagons

We give a complete description of all convex polyhedra whose surface can be constructed from several congruent regular pentagons by folding and gluing them edge to edge. Our method of determining the graph structure of the polyhedra from a gluing is of independent interest and can be used in other similar settings.

[149] 2007.01754

Differentiable Causal Discovery from Interventional Data

Discovering causal relationships in data is a challenging task that involves solving a combinatorial problem for which the solution is not always identifiable. A new line of work reformulates the combinatorial problem as a continuous constrained optimization one, enabling the use of different powerful optimization techniques. However, methods based on this idea do not yet make use of interventional data, which can significantly alleviate identifiability issues. In this work, we propose a neural network-based method for this task that can leverage interventional data. We illustrate the flexibility of the continuous-constrained framework by taking advantage of expressive neural architectures such as normalizing flows. We show that our approach compares favorably to the state of the art in a variety of settings, including perfect and imperfect interventions for which the targeted nodes may even be unknown.

[150] 2007.01755

Multi-Label Image Recognition with Multi-Class Attentional Regions

Multi-label image recognition is a practical and challenging task compared to single-label image classification. However, previous works may be suboptimal because of a great number of object proposals or complex attentional region generation modules. In this paper, we propose a simple but efficient two-stream framework to recognize multi-category objects from global image to local regions, similar to how human beings perceive objects. To bridge the gap between global and local streams, we propose a multi-class attentional region module which aims to make the number of attentional regions as small as possible and keep the diversity of these regions as high as possible. Our method can efficiently and effectively recognize multi-class objects with an affordable computation cost and a parameter-free region localization module. Over three benchmarks on multi-label image classification, we create new state-of-the-art results with a single model only using image semantics without label dependency. In addition, the effectiveness of the proposed method is extensively demonstrated under different factors such as global pooling strategy, input size and network architecture.

[151] 2007.01758

Collaborative Learning for Faster StyleGAN Embedding

The latent code of the recently popular model StyleGAN has learned disentangled representations thanks to the multi-layer style-based generator. Embedding a given image back to the latent space of StyleGAN enables a wide range of interesting semantic image editing applications. Although previous works are able to yield impressive inversion results based on an optimization framework, they suffer from low efficiency. In this work, we propose a novel collaborative learning framework that consists of an efficient embedding network and an optimization-based iterator. On one hand, with the progress of training, the embedding network gives a reasonable latent code initialization for the iterator. On the other hand, the updated latent code from the iterator in turn supervises the embedding network. In the end, high-quality latent code can be obtained efficiently with a single forward pass through our embedding network. Extensive experiments demonstrate the effectiveness and efficiency of our work.

[152] 2007.01760

Explainable Deep One-Class Classification

Deep one-class classification variants for anomaly detection learn a mapping that concentrates nominal samples in feature space causing anomalies to be mapped away. Because this transformation is highly non-linear, finding interpretations poses a significant challenge. In this paper we present an explainable deep one-class classification method, Fully Convolutional Data Description (FCDD), where the mapped samples are themselves also an explanation heatmap. FCDD yields competitive detection performance and provides reasonable explanations on common anomaly detection benchmarks with CIFAR-10 and ImageNet. On MVTec-AD, a recent manufacturing dataset offering ground-truth anomaly maps, FCDD meets the state of the art in an unsupervised setting, and outperforms its competitors in a semi-supervised setting. Finally, using FCDD's explanations we demonstrate the vulnerability of deep one-class classification models to spurious image features such as image watermarks.

[153] 2007.01761

The Lack of Shared Understanding of Non-Functional Requirements in Continuous Software Engineering: Accidental or Essential?

Building shared understanding of requirements is key to ensuring downstream software activities are efficient and effective. However, in continuous software engineering (CSE) some lack of shared understanding is an expected, and essential, part of a rapid feedback learning cycle. At the same time, there is a key trade-off with avoidable costs, such as rework, that come from accidental gaps in shared understanding. This trade-off is even more challenging for non-functional requirements (NFRs), which have significant implications for product success. Comprehending and managing NFRs is especially difficult in small, agile organizations. How such organizations manage shared understanding of NFRs in CSE is understudied. We conducted a case study of three small organizations scaling up CSE to further understand and identify factors that contribute to lack of shared understanding of NFRs, and its relationship to rework. Our in-depth analysis identified 41 NFR-related software tasks as rework due to a lack of shared understanding of NFRs. Of these 41 tasks, 78% were due to avoidable (accidental) lack of shared understanding of NFRs. Using a mixed-methods approach we identify factors that contribute to lack of shared understanding of NFRs, such as the lack of domain knowledge, rapid pace of change, and cross-organizational communication problems. We also identify recommended strategies to mitigate lack of shared understanding through more effective management of requirements knowledge in such organizations. We conclude by discussing the complex relationship between shared understanding of requirements, rework, and CSE.

[154] 2007.01764

Disentangled Graph Collaborative Filtering

Learning informative representations of users and items from the interaction data is of crucial importance to collaborative filtering (CF). Present embedding functions exploit user-item relationships to enrich the representations, evolving from a single user-item instance to the holistic interaction graph. Nevertheless, they largely model the relationships in a uniform manner, while neglecting the diversity of user intents in adopting the items, which could be to pass time, for interest, or to shop for others such as family members. Such a uniform approach to modeling user interests easily results in suboptimal representations, failing to model diverse relationships and disentangle user intents in representations. In this work, we pay special attention to user-item relationships at the finer granularity of user intents. We hence devise a new model, Disentangled Graph Collaborative Filtering (DGCF), to disentangle these factors and yield disentangled representations. Specifically, by modeling a distribution over intents for each user-item interaction, we iteratively refine the intent-aware interaction graphs and representations. Meanwhile, we encourage independence of different intents. This leads to disentangled representations, effectively distilling information pertinent to each intent. We conduct extensive experiments on three benchmark datasets, and DGCF achieves significant improvements over several state-of-the-art models like NGCF, DisenGCN, and MacridVAE. Further analyses offer insights into the advantages of DGCF on the disentanglement of user intents and interpretability of representations. Our codes are available in

[155] 2007.01769

End-to-end Interpretable Learning of Non-blind Image Deblurring

Non-blind image deblurring is typically formulated as a linear least-squares problem regularized by natural priors on the corresponding sharp picture's gradients, which can be solved, for example, using a half-quadratic splitting method with Richardson fixed-point iterations for its least-squares updates and a proximal operator for the auxiliary variable updates. We propose to precondition the Richardson solver using approximate inverse filters of the (known) blur and natural image prior kernels. Using convolutions instead of a generic linear preconditioner allows extremely efficient parameter sharing across the image, and leads to significant gains in accuracy and/or speed compared to classical FFT and conjugate-gradient methods. More importantly, the proposed architecture is easily adapted to learning both the preconditioner and the proximal operator using CNN embeddings. This yields a simple and efficient algorithm for non-blind image deblurring which is fully interpretable, can be learned end to end, and whose accuracy matches or exceeds the state of the art, quite significantly, in the non-uniform case.
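
The preconditioned Richardson iteration mentioned in the abstract above has the generic form x_{k+1} = x_k + C (b - A x_k), where C approximates the inverse of A. The Python sketch below shows this update on a tiny dense system. It is an illustration under stated assumptions only: the paper preconditions with approximate inverse filters applied as convolutions, whereas this example swaps in a simple diagonal (Jacobi) preconditioner, and the matrix, right-hand side and function names are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def preconditioned_richardson(A, b, C, iterations=50):
    """Iterate x <- x + C (b - A x), starting from the zero vector."""
    x = [0.0] * len(b)
    for _ in range(iterations):
        residual = [bi - ai for bi, ai in zip(b, matvec(A, x))]
        correction = matvec(C, residual)
        x = [xi + ci for xi, ci in zip(x, correction)]
    return x


# Hypothetical 2x2 system; C is the inverse of the diagonal of A
# (a Jacobi preconditioner, used here as a stand-in).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
C = [[1.0 / 4.0, 0.0], [0.0, 1.0 / 3.0]]
x = preconditioned_richardson(A, b, C)
print(x)  # approaches the exact solution [1/11, 7/11]
```

The iteration converges when the spectral radius of I - CA is below one, so a better approximation of the inverse by C means fewer iterations are needed for a given accuracy.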

[156] 2007.01771

Learning Expectation of Label Distribution for Facial Age and Attractiveness Estimation

Facial attribute (e.g., age and attractiveness) estimation performance has been greatly improved by using convolutional neural networks. However, existing methods have an inconsistency between the training objectives and the evaluation metric, so they may be suboptimal. In addition, these methods always adopt image classification or face recognition models with a large number of parameters, which carry expensive computation cost and storage overhead. In this paper, we first analyze the essential relationship between two state-of-the-art methods (Ranking-CNN and DLDL) and show that the Ranking method is in fact learning label distribution implicitly. This result thus unifies, for the first time, two existing popular state-of-the-art methods into the DLDL framework. Second, in order to alleviate the inconsistency and reduce resource consumption, we design a lightweight network architecture and propose a unified framework which can jointly learn facial attribute distribution and regress attribute value. The effectiveness of our approach has been demonstrated on both facial age and attractiveness estimation tasks. Our method achieves new state-of-the-art results using a single model with 36$\times$(6$\times$) fewer parameters and 2.6$\times$(2.1$\times$) faster inference speed on facial age (attractiveness) estimation. Moreover, our method can achieve results comparable to the state of the art even though the number of parameters is further reduced to 0.9M (3.8MB disk storage).

[157] 2007.01773

Supervisory Controller Synthesis for Non-terminating Processes is an Obliging Game

We present a new algorithm to solve the supervisory control problem over non-terminating processes modeled as $\omega$-regular automata. A solution to the problem was obtained by Thistle in 1995, which uses complex manipulations of automata. This algorithm is notoriously hard to understand and, to the best of our knowledge, has never been implemented. We show a new solution to the problem through a reduction to reactive synthesis. A naive, and incorrect, approach reduces the supervisory control problem to a reactive synthesis problem that asks for a control strategy which ensures the given specification if the plant behaves in accordance with its liveness properties. This is insufficient. A correct control strategy might not fulfill the specification but force the plant to invalidate its liveness property. To prevent such solutions, supervisory control additionally requires that the controlled system is non-conflicting: any finite word compliant with the supervisor should be extendable to a word satisfying the plant's liveness properties. To capture this additional requirement, our solution goes through obliging games instead. An obliging game has two requirements: a strong winning condition as in reactive synthesis and a weak winning condition. A strategy is winning if it satisfies the strong condition and additionally, every partial play can be extended to satisfy the weak condition. Obliging games can be reduced to $\omega$-regular reactive synthesis, for which symbolic algorithms exist. We reduce supervisor synthesis to obliging games. The strong condition is an implication: if the plant behaves in accordance with its liveness properties, the specification should also hold. The weak condition is the plant's liveness property.

[158] 2007.01777

Interpretable Sequence Classification Via Prototype Trajectory

We propose a novel interpretable recurrent neural network (RNN) model, called ProtoryNet, in which we introduce a new concept of prototype trajectories. Motivated by the prototype theory in modern linguistics, ProtoryNet makes a prediction by finding the most similar prototype for each sentence in a text sequence and feeding an RNN backbone with the proximity of each of the sentences to the prototypes. The RNN backbone then captures the temporal pattern of the prototypes, to which we refer as prototype trajectories. The prototype trajectories enable intuitive, fine-grained interpretation of how the model reached the final prediction, resembling the process of how humans analyze paragraphs. Experiments conducted on multiple public data sets reveal that the proposed method not only is more interpretable but also is more accurate than the current state-of-the-art prototype-based method. Furthermore, we report a survey result indicating that human users find ProtoryNet more intuitive and easier to understand, compared to the other prototype-based methods.

[159] 2007.01779

The combined basic LP and affine IP relaxation for promise VCSPs on infinite domains

Convex relaxations have been instrumental in the solvability of constraint satisfaction problems (CSPs), as well as in three different generalisations of CSPs: valued CSPs, infinite-domain CSPs, and most recently promise CSPs. In this work, we extend an existing tractability result to the three generalisations of CSPs combined: we give a sufficient condition for the combined basic linear programming and affine integer programming relaxation for exact solvability of promise valued CSPs over infinite domains. This extends a result of Brakensiek and Guruswami [SODA'20] for promise (non-valued) CSPs (on finite domains).

[160] 2007.01780

Visual Question Answering as a Multi-Task Problem

Visual Question Answering (VQA) is a highly complex problem set, relying on many sub-problems to produce reasonable answers. In this paper, we present the hypothesis that Visual Question Answering should be viewed as a multi-task problem, and provide evidence to support this hypothesis. We demonstrate this by reformatting two commonly used Visual Question Answering datasets, COCO-QA and DAQUAR, into a multi-task format and training two baseline networks on these reformatted datasets, with one designed specifically to eliminate other possible causes for performance changes as a result of the reformatting. Though the networks demonstrated in this paper do not achieve strongly competitive results, we find that the multi-task approach to Visual Question Answering results in increases in performance of 5-9% against the single-task formatting, and that the networks reach convergence much faster than in the single-task case. Finally, we discuss possible reasons for the observed difference in performance, and perform additional experiments which rule out causes not associated with the learning of the dataset as a multi-task problem.

[161] 2007.01787

Evaluating Uncertainty Estimation Methods on 3D Semantic Segmentation of Point Clouds

Deep learning models are extensively used in various safety-critical applications. Hence these models, along with being accurate, need to be highly reliable. One way of achieving this is by quantifying uncertainty. Bayesian methods for uncertainty quantification (UQ) have been extensively studied for Deep Learning models applied on images but have been less explored for 3D modalities such as point clouds, often used for Robots and Autonomous Systems. In this work, we evaluate three uncertainty quantification methods, namely Deep Ensembles, MC-Dropout and MC-DropConnect, on the DarkNet21Seg 3D semantic segmentation model and comprehensively analyze the impact of various parameters, such as the number of models in ensembles or forward passes and the drop probability values, on task performance and uncertainty estimate quality. We find that Deep Ensembles outperform the other methods in both performance and uncertainty metrics, by a margin of 2.4% in terms of mIOU and 1.3% in terms of accuracy, while providing reliable uncertainty for decision making.
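As a rough illustration of how an ensemble (or repeated stochastic forward passes) can be turned into an uncertainty score, the sketch below averages the per-member class probabilities for one point and computes the entropy of that average. Predictive entropy is a common choice but is an assumption here; the abstract does not say which uncertainty measures the authors use, and the function name and toy inputs are hypothetical.

```python
import math

def predictive_entropy(member_probs):
    """Average class probabilities over ensemble members (or dropout passes)
    and return (mean_probs, entropy_of_mean) for a single input point."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    mean = [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]
    # Entropy of the averaged distribution; zero-probability classes contribute nothing.
    entropy = -sum(p * math.log(p) for p in mean if p > 0)
    return mean, entropy

# Hypothetical toy inputs: two members, two classes.
agreeing = [[1.0, 0.0], [1.0, 0.0]]      # members agree
disagreeing = [[1.0, 0.0], [0.0, 1.0]]   # members disagree

_, low = predictive_entropy(agreeing)
_, high = predictive_entropy(disagreeing)
```

When the members agree, the averaged distribution is peaked and the entropy is low; disagreement flattens the average and raises the entropy, which is what makes the score usable as an uncertainty signal.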

[162] 2007.01788

TICO-19: the Translation Initiative for Covid-19

The COVID-19 pandemic is the worst pandemic to strike the world in over a century. Crucial to stemming the tide of the SARS-CoV-2 virus is communicating to vulnerable populations the means by which they can protect themselves. To this end, the collaborators forming the Translation Initiative for COvid-19 (TICO-19) have made test and development data available to AI and MT researchers in 35 different languages in order to foster the development of tools and resources for improving access to information about COVID-19 in these languages. In addition to 9 high-resourced, "pivot" languages, the team is targeting 26 lesser resourced languages, in particular languages of Africa, South Asia and South-East Asia, whose populations may be the most vulnerable to the spread of the virus. The same data is translated into all of the languages represented, meaning that testing or development can be done for any pairing of languages in the set. Further, the team is converting the test and development data into translation memories (TMXs) that can be used by localizers from and to any of the languages.

[163] 2007.01789

Mapping Datasets to Object Storage System

Access libraries such as ROOT and HDF5 allow users to interact with datasets using high level abstractions, like coordinate systems and associated slicing operations. Unfortunately, the implementations of access libraries are based on outdated assumptions about storage system interfaces and are generally unable to fully benefit from modern fast storage devices. The situation is getting worse with rapidly evolving storage devices such as non-volatile memory and ever larger datasets. This project explores distributed dataset mapping infrastructures that can integrate and scale out existing access libraries using Ceph's extensible object model, avoiding re-implementation or even modifications of these access libraries as much as possible. These programmable storage extensions coupled with our distributed dataset mapping techniques enable: 1) access library operations to be offloaded to storage system servers, 2) the independent evolution of access libraries and storage systems and 3) full use of the existing load balancing, elasticity, and failure management of distributed storage systems like Ceph. They also create more opportunities to conduct server-local optimizations specific to storage servers. For example, storage servers might include local key/value stores combined with chunk stores that require different optimizations than a local file system. As storage servers evolve to support new storage devices like non-volatile memory, these server-local optimizations can be implemented while minimizing disruptions to applications. We will report progress on the means by which distributed dataset mapping can be abstracted over particular access libraries, including access libraries for ROOT data, and how we address some of the challenges revolving around data partitioning and composability of access operations.

[164] 2007.01790

Harnessing Wireless Channels for Scalable and Privacy-Preserving Federated Learning

Wireless connectivity is instrumental in enabling scalable federated learning (FL), yet wireless channels bring challenges for model training, in which channel randomness perturbs each worker's model update while multiple workers' updates incur significant interference under limited bandwidth. To address these challenges, in this work we formulate a novel constrained optimization problem, and propose an FL framework harnessing wireless channel perturbations and interference for improving privacy, bandwidth-efficiency, and scalability. The resultant algorithm is coined analog federated ADMM (A-FADMM) based on analog transmissions and the alternating direction method of multipliers (ADMM). In A-FADMM, all workers upload their model updates to the parameter server (PS) using a single channel via analog transmissions, during which all models are perturbed and aggregated over-the-air. This not only saves communication bandwidth, but also hides each worker's exact model update trajectory from any eavesdropper including the honest-but-curious PS, thereby preserving data privacy against model inversion attacks. We formally prove the convergence and privacy guarantees of A-FADMM for convex functions under time-varying channels, and numerically show the effectiveness of A-FADMM under noisy channels and stochastic non-convex functions, in terms of convergence speed and scalability, as well as communication bandwidth and energy efficiency.

[165] 2007.01791

Towards an Intelligent Data Delivery Service

The ATLAS Event Streaming Service (ESS) at the LHC is an approach to preprocess and deliver data for Event Service (ES) that has implemented a fine-grained approach for ATLAS event processing. The ESS allows one to asynchronously deliver only the input events required by ES processing, with the aim of decreasing data traffic over WAN and improving overall data processing throughput. A prototype of ESS was developed to deliver streaming events to fine-grained ES jobs. Based on it, an intelligent Data Delivery Service (iDDS) is under development to decouple the "cold format" and the processing format of the data, which also opens the opportunity to include the production systems of other HEP experiments. Here we first present the ESS model view and its motivations for the iDDS system. We then present the iDDS schema, architecture and applications.

[166] 2007.01793

CacheNet: A Model Caching Framework for Deep Learning Inference on the Edge

The success of deep neural networks (DNN) in machine perception applications such as image classification and speech recognition comes at the cost of high computation and storage complexity. Inference of uncompressed large scale DNN models can only run in the cloud with extra communication latency back and forth between cloud and end devices, while compressed DNN models achieve real-time inference on end devices at the price of lower predictive accuracy. In order to have the best of both worlds (latency and accuracy), we propose CacheNet, a model caching framework. CacheNet caches low-complexity models on end devices and high-complexity (or full) models on edge or cloud servers. By exploiting temporal locality in streaming data, a high cache hit rate and consequently shorter latency can be achieved with no or only marginal decrease in prediction accuracy. Experiments on CIFAR-10 and FVG have shown CacheNet is 58-217% faster than baseline approaches that run inference tasks on end devices or edge servers alone.

[167] 2007.01795

Approval-Based Committee Voting: Axioms, Algorithms, and Applications

Approval-based committee (ABC) rules are voting rules that output a fixed-size subset of candidates, a so-called committee. ABC rules select committees based on dichotomous preferences, i.e., a voter either approves or disapproves a candidate. This simple type of preferences makes ABC rules widely suitable for practical use. In this survey, we summarize the current understanding of ABC rules from the viewpoint of computational social choice. The main focus is on axiomatic analysis, algorithmic results, and relevant applications.
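To make the setting concrete, here is a minimal sketch of what is probably the simplest ABC rule, multiwinner approval voting (AV), which selects the k candidates with the most approvals. The abstract does not single out any particular rule, so the choice of AV, the alphabetical tie-breaking, and the toy ballots are all assumptions made for illustration.

```python
from collections import Counter

def approval_voting_committee(candidates, ballots, k):
    """Return a size-k committee under multiwinner approval voting (AV).

    candidates: iterable of candidate names
    ballots:    list of sets; each set holds the candidates one voter approves
    k:          committee size
    Ties are broken alphabetically (an assumption, not part of the rule itself).
    """
    scores = Counter({c: 0 for c in candidates})
    for ballot in ballots:
        for c in ballot:
            scores[c] += 1
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return set(ranked[:k])

# Hypothetical toy election with dichotomous (approve / not approve) preferences.
candidates = ["a", "b", "c", "d"]
ballots = [{"a", "b"}, {"a", "c"}, {"a", "b"}, {"d"}]
committee = approval_voting_committee(candidates, ballots, k=2)
```

AV looks only at individual approval counts, so a cohesive minority (the single voter approving only "d" above) can end up with no representative.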

[168] 2007.01799

Transfer Function Models for Cylindrical MC Channels with Diffusion and Laminar Flow

The analysis and design of advection-diffusion based molecular communication (MC) systems in cylindrical environments is of particular interest for applications such as micro-fluidics and targeted drug delivery in blood vessels. Therefore, the accurate modeling of the corresponding MC channel is of high importance. The propagation of particles in these systems is caused by a combination of diffusion and flow with a parabolic velocity profile, i.e., laminar flow. The propagation characteristics of the particles can be categorized into three different regimes: The flow dominant regime where the influence of diffusion on the particle transport is negligible, the dispersive regime where diffusion has a much stronger impact than flow, and the mixed regime where both effects are important. For the limiting regimes, i.e., the flow dominant and dispersive regimes, there are well-known solutions and approximations for particle transport. In contrast, there is no general analytical solution for the mixed regime, and instead, approximations, numerical techniques, and particle based simulations have been employed. In this paper, we develop a general model for the advection-diffusion problem in cylindrical environments which provides an analytical solution applicable in all regimes. The modeling procedure is based on a transfer function approach and the main focus lies on the incorporation of laminar flow into the analytical model. The properties of the proposed model are analyzed by numerical evaluation for different scenarios including the uniform and point release of particles. We provide a comparison with particle based simulations and the well-known solutions for the limiting regimes to demonstrate the validity of the proposed analytical model.

[169] 2007.01800

Exploration and Discovery of the COVID-19 Literature through Semantic Visualization

We are developing semantic visualization techniques in order to enhance exploration and enable discovery over large datasets of complex networks of relations. Semantic visualization is a method of enabling exploration and discovery over large datasets of complex networks by exploiting the semantics of the relations in them. This involves (i) NLP to extract named entities, relations and knowledge graphs from the original data; (ii) indexing the output and creating representations for all relevant entities and relations that can be visualized in many different ways, e.g., as tag clouds, heat maps, graphs, etc.; (iii) applying parameter reduction operations to the extracted relations, creating "relation containers", or functional entities that can also be visualized using the same methods, allowing the visualization of multiple relations, partial pathways, and exploration across multiple dimensions. Our hope is that this will enable the discovery of novel inferences over relations in complex data that otherwise would go unnoticed. We have applied this to analysis of the recently released CORD-19 dataset.

[170] 2007.01806

Deep learning for scene recognition from visual data: a survey

The use of deep learning techniques has exploded during the last few years, resulting in a direct contribution to the field of artificial intelligence. This work aims to be a review of the state-of-the-art in scene recognition with deep learning models from visual data. Scene recognition is still an emerging field in computer vision, which has been addressed from a single image and dynamic image perspective. We first give an overview of available datasets for image and video scene recognition. Later, we describe ensemble techniques introduced by research papers in the field. Finally, we give some remarks on our findings and discuss what we consider challenges in the field and future lines of research. This paper aims to be a future guide for model selection for the task of scene recognition.

[171] 2007.01807

Continuously Indexed Domain Adaptation

Existing domain adaptation focuses on transferring knowledge between domains with categorical indices (e.g., between datasets A and B). However, many tasks involve continuously indexed domains. For example, in medical applications, one often needs to transfer disease analysis and prediction across patients of different ages, where age acts as a continuous domain index. Such tasks are challenging for prior domain adaptation methods since they ignore the underlying relation among domains. In this paper, we propose the first method for continuously indexed domain adaptation. Our approach combines traditional adversarial adaptation with a novel discriminator that models the encoding-conditioned domain index distribution. Our theoretical analysis demonstrates the value of leveraging the domain index to generate invariant features across a continuous range of domains. Our empirical results show that our approach outperforms the state-of-the-art domain adaptation methods on both synthetic and real-world medical datasets.

[172] 2007.01811

JAMPI: efficient matrix multiplication in Spark using Barrier Execution Mode

The new barrier mode in Apache Spark allows embedding distributed deep learning training as a Spark stage to simplify the distributed training workflow. In Spark, a task in a stage does not depend on any other tasks in the same stage, and hence it can be scheduled independently. However, several algorithms require more sophisticated inter-task communications, similar to the MPI paradigm. By combining distributed message passing (using asynchronous network IO), OpenJDK's new auto-vectorization and Spark's barrier execution mode, we can add non-map/reduce based algorithms, such as Cannon's distributed matrix multiplication to Spark. We document an efficient distributed matrix multiplication using Cannon's algorithm, which improves significantly on the performance of the existing MLlib implementation. Used within a barrier task, the algorithm described herein results in an up to 24 percent performance increase on a 10,000x10,000 square matrix with a significantly lower memory footprint. Applications of efficient matrix multiplication include, among others, accelerating the training and implementation of deep convolutional neural network based workloads, and thus such efficient algorithms can play a ground-breaking role in faster, more efficient execution of even the most complicated machine learning tasks.
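The communication pattern of Cannon's algorithm can be sketched in a single process: blocks of A are skewed and shifted left along rows of a q x q processor grid, blocks of B are skewed and shifted up along columns, and each grid cell accumulates a local block product at every step. The code below is only a simulation of that shifting scheme under stated assumptions (square matrices, size divisible by q, plain Python lists); it does not use Spark, message passing, or any part of the JAMPI implementation.

```python
def reference_matmul(A, B):
    """Plain triple-loop product, used only to check the simulation."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def split_blocks(M, q):
    """Split an n x n matrix into a q x q grid of (n/q) x (n/q) blocks."""
    b = len(M) // q
    return [[[row[j * b:(j + 1) * b] for row in M[i * b:(i + 1) * b]]
             for j in range(q)] for i in range(q)]

def block_multiply_add(C, A, B):
    """In-place C += A @ B for small dense blocks."""
    for i in range(len(A)):
        for j in range(len(B[0])):
            C[i][j] += sum(A[i][k] * B[k][j] for k in range(len(B)))

def cannon(A, B, q):
    """Simulate Cannon's algorithm on a q x q grid of virtual processors."""
    n = len(A)
    assert n % q == 0, "matrix size must be divisible by the grid size"
    b = n // q
    Ab, Bb = split_blocks(A, q), split_blocks(B, q)
    # Initial skew: block row i of A moves left by i, block column j of B moves up by j.
    Ab = [[Ab[i][(j + i) % q] for j in range(q)] for i in range(q)]
    Bb = [[Bb[(i + j) % q][j] for j in range(q)] for i in range(q)]
    Cb = [[[[0] * b for _ in range(b)] for _ in range(q)] for _ in range(q)]
    for _ in range(q):
        # Every virtual processor multiplies the two blocks it currently holds.
        for i in range(q):
            for j in range(q):
                block_multiply_add(Cb[i][j], Ab[i][j], Bb[i][j])
        # Shift A blocks one step left and B blocks one step up (with wraparound).
        Ab = [[Ab[i][(j + 1) % q] for j in range(q)] for i in range(q)]
        Bb = [[Bb[(i + 1) % q][j] for j in range(q)] for i in range(q)]
    # Reassemble the block grid into a full matrix.
    return [[Cb[i // b][j // b][i % b][j % b] for j in range(n)] for i in range(n)]

# Hypothetical 4 x 4 inputs on a 2 x 2 grid.
A = [[r * 4 + c for c in range(4)] for r in range(4)]
B = [[(r + 2 * c) % 5 for c in range(4)] for r in range(4)]
C = cannon(A, B, q=2)
```

In a distributed setting each grid cell would be a separate task that exchanges blocks only with its neighbours, which is why the tasks must be scheduled together rather than independently.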

[173] 2007.01813

AVP-SLAM: Semantic Visual Mapping and Localization for Autonomous Vehicles in the Parking Lot

Autonomous valet parking is a specific application for autonomous vehicles. In this task, vehicles need to navigate in narrow, crowded and GPS-denied parking lots. Accurate localization ability is of great importance. Traditional visual-based methods suffer from tracking loss due to texture-less regions, repeated structures, and appearance changes. In this paper, we exploit robust semantic features to build the map and localize vehicles in parking lots. Semantic features include guide signs, parking lines, speed bumps, etc., which typically appear in parking lots. Compared with traditional features, these semantic features are long-term stable and robust to perspective and illumination changes. We adopt four surround-view cameras to increase the perception range. Assisted by an IMU (Inertial Measurement Unit) and wheel encoders, the proposed system generates a global visual semantic map. This map is further used to localize vehicles at the centimeter level. We analyze the accuracy and recall of our system and compare it against other methods in real experiments. Furthermore, we demonstrate the practicability of the proposed system by the autonomous parking application.

[174] 2007.01814

DynNet: Physics-based neural architecture design for linear and nonlinear structural response modeling and prediction

Data-driven models for predicting dynamic responses of linear and nonlinear systems are of great importance due to their wide application from probabilistic analysis to inverse problems such as system identification and damage diagnosis. In this study, a physics-based recurrent neural network model is designed that is able to learn the dynamics of linear and nonlinear multiple degrees of freedom systems given a ground motion. The model is able to estimate a complete set of responses, including displacement, velocity, acceleration, and internal forces. Compared to the most advanced counterparts, this model requires a smaller number of trainable variables while the accuracy of predictions is higher for long trajectories. In addition, the architecture of the recurrent block is inspired by differential equation solver algorithms and it is expected that this approach yields more generalized solutions. In the training phase, we propose multiple novel techniques to dramatically accelerate the learning process using smaller datasets, such as hardsampling, utilization of trajectory loss function, and implementation of a trust-region approach. Numerical case studies are conducted to examine the strength of the network to learn different nonlinear behaviors. It is shown that the network is able to capture different nonlinear behaviors of dynamic systems with very high accuracy and with no need for prior information or very large datasets.

[175] 2007.01815

Computing maximally-permissive strategies in acyclic timed automata

Timed automata are a convenient mathematical model for modelling and reasoning about real-time systems. While they provide a powerful way of representing timing aspects of such systems, timed automata assume arbitrary precision and zero-delay actions; in particular, a state might be declared reachable in a timed automaton, but impossible to reach in the physical system it models. In this paper, we consider permissive strategies as a way to overcome this problem: such strategies propose intervals of delays instead of single delays, and aim at reaching a target state whichever delay actually takes place. We develop an algorithm for computing the optimal permissiveness (and an associated maximally-permissive strategy) in acyclic timed automata and games.

[176] 2007.01816

Sherman-Morrison-Woodbury Identity for Tensors

In linear algebra, the Sherman-Morrison-Woodbury identity says that the inverse of a rank-$k$ correction of some matrix can be computed by doing a rank-$k$ correction to the inverse of the original matrix. This identity is crucial for accelerating the matrix inverse computation when the matrix involves a correction. Many scientific and engineering applications have to deal with this matrix inverse problem after updating the matrix, e.g., sensitivity analysis of linear systems, covariance matrix update in the Kalman filter, etc. However, there is no similar identity for tensors. In this work, we first derive the Sherman-Morrison-Woodbury identity for invertible tensors. Since not all tensors are invertible, we further generalize the Sherman-Morrison-Woodbury identity to tensors with the Moore-Penrose generalized inverse by utilizing orthogonal projection of the correction tensor part into the original tensor and its Hermitian tensor. Using this newly established Sherman-Morrison-Woodbury identity for tensors, we can perform sensitivity analysis for multilinear systems by deriving the normalized upper bound for the solution of a multilinear system. Several numerical examples are also presented to demonstrate how the normalized error upper bounds are affected by the perturbation degree of the tensor coefficients.
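For readers unfamiliar with the matrix version, the rank-1 special case (the Sherman-Morrison formula) already shows the idea: $(A + uv^T)^{-1} = A^{-1} - \frac{A^{-1}uv^TA^{-1}}{1 + v^TA^{-1}u}$. The sketch below checks this for ordinary matrices in exact rational arithmetic. It illustrates only the classical matrix identity, not the tensor generalization derived in the paper, and the example matrix and vectors are arbitrary assumptions.

```python
from fractions import Fraction

def matvec(M, x):
    """Matrix-vector product for lists of lists."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def matmul(A, B):
    """Matrix-matrix product for lists of lists."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def sherman_morrison(A_inv, u, v):
    """Return (A + u v^T)^{-1} given A^{-1}, via a rank-1 correction of A^{-1}."""
    Ainv_u = matvec(A_inv, u)                    # A^{-1} u
    vT_Ainv = matvec(list(zip(*A_inv)), v)       # entries of the row vector v^T A^{-1}
    denom = 1 + sum(vi * ai for vi, ai in zip(v, Ainv_u))
    if denom == 0:
        raise ValueError("A + u v^T is singular")
    n = len(u)
    return [[A_inv[i][j] - Ainv_u[i] * vT_Ainv[j] / denom for j in range(n)]
            for i in range(n)]

# Diagonal A, so that A^{-1} is known in closed form (arbitrary example values).
diag = [Fraction(2), Fraction(3), Fraction(5)]
A = [[diag[i] if i == j else Fraction(0) for j in range(3)] for i in range(3)]
A_inv = [[1 / diag[i] if i == j else Fraction(0) for j in range(3)] for i in range(3)]
u = [Fraction(1), Fraction(2), Fraction(0)]
v = [Fraction(0), Fraction(1), Fraction(1)]

updated = [[A[i][j] + u[i] * v[j] for j in range(3)] for i in range(3)]
updated_inv = sherman_morrison(A_inv, u, v)
identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
product = matmul(updated, updated_inv)   # should equal the identity matrix
```

The update costs a few matrix-vector products instead of a fresh inversion, which is the source of the speed-up mentioned in the abstract.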

[177] 2007.01818

Image-based Vehicle Re-identification Model with Adaptive Attention Modules and Metadata Re-ranking

Vehicle Re-identification is a challenging task due to intra-class variability and inter-class similarity across non-overlapping cameras. To tackle these problems, recently proposed methods require additional annotation to extract more features for false positive image exclusion. In this paper, we propose a model powered by adaptive attention modules that requires fewer label annotations but still outperforms the previous models. We also include a re-ranking method that takes into account the importance of metadata feature embeddings. The proposed method is evaluated on the CVPR AI City Challenge 2020 dataset and achieves an mAP of 37.25% in Track 2.

[178] 2007.01819

Addressing the interpretability problem for deep learning using many valued quantum logic

Deep learning models are widely used for various industrial and scientific applications. Even though these models have achieved considerable success in recent years, there exists a lack of understanding of the rationale behind decisions made by such systems in the machine learning community. This problem of interpretability is further aggravated by the increasing complexity of such models. This paper utilizes concepts from machine learning, quantum computation and quantum field theory to demonstrate how a many valued quantum logic system naturally arises in a specific class of generative deep learning models called Convolutional Deep Belief Networks. It provides a robust theoretical framework for constructing deep learning models equipped with the interpretability of many valued quantum logic systems without compromising their computing efficiency.

[179] 2007.01820

A Machine Learning Pipeline Stage for Adaptive Frequency Adjustment

A machine learning (ML) design framework is proposed for adaptively adjusting clock frequency based on propagation delay of individual instructions. A random forest model is trained to classify propagation delays in real time, utilizing current operation type, current operands, and computation history as ML features. The trained model is implemented in Verilog as an additional pipeline stage within a baseline processor. The modified system is experimentally tested at the gate level in 45 nm CMOS technology, exhibiting a speedup of 70% and energy reduction of 30% with coarse-grained ML classification. A speedup of 89% is demonstrated with finer granularities with 15.5% reduction in energy consumption.

[180] 2007.01821

Regularization of the movement of a material point along a flat trajectory: application to robotics problems

The control problem of the working tool movement along a predefined trajectory is considered. The integral of kinetic energy and weighted inertia forces for the whole period of motion is considered as a cost functional. The trajectory is assumed to be planar and defined in advance. The problem is reduced to a system of ordinary differential equations of the fourth order. Numerical examples of solving the problem for movement along straight, circular and elliptical trajectories are given.

[181] 2007.01822

Construction of weak solutions to compressible Navier--Stokes equations with general inflow/outflow boundary conditions via a numerical approximation

The construction of weak solutions to compressible Navier-Stokes equations via a numerical method (including a rigorous proof of the convergence) is in short supply, and so far available only for one numerical scheme, suggested in Karper [Numer. Math., 125(3):441-510, 2013], for the no-slip boundary conditions and the isentropic pressure with adiabatic coefficient $\gamma>3$. Here we consider the same problem for general nonzero inflow-outflow boundary conditions, which is definitely a more appropriate setting from the point of view of applications, but which is essentially more involved as far as the existence of weak solutions is concerned. There are a few recent proofs of existence of weak solutions in this setting, but none of them is performed via a numerical method. The goal of this paper is to fill this gap. The existence of weak solutions on the continuous level requires several tools of functional and harmonic analysis and differential geometry whose numerical counterparts are not known. Our main strategy therefore consists in rewriting the numerical scheme in its variational form modulo remainders and applying and/or adapting to the new variational formulation the tools developed in the theoretical analysis. In addition to the result, which is new, the synergy between numerical and theoretical analysis is the main originality of the present paper.

[182] 2007.01823

WordPress on AWS: a Communication Framework

Every organization needs to communicate with its audience, and social media is an attractive and inexpensive way to maintain dialogic communication. About 1/3 of the Internet web pages are powered by WordPress, and about a million companies have moved their IT infrastructure to the AWS cloud. Together, AWS and WordPress offer an attractive, effective and inexpensive way for companies, both large and small, to maintain their presence on the web.

[183] 2007.01830

Fractional Covers of Hypergraphs with Bounded Multi-Intersection

Fractional (hyper-)graph theory is concerned with the specific problems that arise when fractional analogues of otherwise integer-valued (hyper-)graph invariants are considered. The focus of this paper is on fractional edge covers of hypergraphs. Our main technical result generalizes and unifies previous conditions under which the size of the support of fractional edge covers is bounded independently of the size of the hypergraph itself. This allows us to extend previous tractability results for checking if the fractional hypertree width of a given hypergraph is $\leq k$ for some constant $k$. We also show how our results translate to fractional vertex covers.

[184] 2007.01833

PsychFM: Predicting your next gamble

There is a sudden surge of interest in modeling human behavior due to its vast and diverse applications, which include modeling public policies, economic behavior and consumer behavior. Most human behavior can itself be modeled as a choice prediction problem. Prospect theory is a theoretical model that tries to explain the anomalies in choice prediction. These theories perform well in terms of explaining the anomalies but they lack precision. Since the behavior is person dependent, there is a need to build a model that predicts choices on a per-person basis. Looking at the average person's choice may not necessarily throw light on a particular person's choice. Modeling the gambling problem on a per-person basis will help in recommendation systems and related areas. A novel hybrid model, namely the psychological factorisation machine (PsychFM), has been proposed that involves concepts from machine learning as well as psychological theories. It outperforms the popular existing models, namely random forest and factorisation machines, on the benchmark dataset CPC-18. Finally, the efficacy of the proposed hybrid model has been verified by comparing with the existing models.

[185] 2007.01837

LOOC: Localize Overlapping Objects with Count Supervision

Acquiring count annotations generally requires less human effort than point-level and bounding box annotations. Thus, we propose the novel problem setup of localizing objects in dense scenes under this weaker supervision. We propose LOOC, a method to Localize Overlapping Objects with Count supervision. We train LOOC by alternating between two stages. In the first stage, LOOC learns to generate pseudo point-level annotations in a semi-supervised manner. In the second stage, LOOC uses a fully-supervised localization method that trains on these pseudo labels. The localization method is used to progressively improve the quality of the pseudo labels. We conducted experiments on popular counting datasets. For localization, LOOC achieves a strong new baseline in the novel problem setup where only count supervision is available. For counting, LOOC outperforms current state-of-the-art methods that only use count as their supervision. Code is available at:

[186] 2007.01839

Expected Eligibility Traces

The question of how to determine which states and actions are responsible for a certain outcome is known as the credit assignment problem and remains a central research question in reinforcement learning and artificial intelligence. Eligibility traces enable efficient credit assignment to the recent sequence of states and actions experienced by the agent, but not to counterfactual sequences that could also have led to the current state. In this work, we introduce expected eligibility traces. Expected traces make it possible, with a single update, to update states and actions that could have preceded the current state, even if they did not do so on this occasion. We discuss when expected traces provide benefits over classic (instantaneous) traces in temporal-difference learning, and show that sometimes substantial improvements can be attained. We provide a way to smoothly interpolate between instantaneous and expected traces by a mechanism similar to bootstrapping, which ensures that the resulting algorithm is a strict generalisation of TD($\lambda$). Finally, we discuss possible extensions and connections to related ideas, such as successor features.
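For context, the classic (instantaneous) trace that the abstract contrasts against can be sketched as tabular TD($\lambda$) with accumulating traces: every visited state keeps a decaying eligibility, and each TD error is credited to all states in proportion to their current trace. This is the standard baseline algorithm only, not the expected-trace method proposed in the paper; the function name, the episode encoding, and the three-state chain are assumptions made for illustration.

```python
def td_lambda_episode(values, episode, alpha=0.1, gamma=1.0, lam=0.9):
    """Run one episode of tabular TD(lambda) with accumulating traces.

    values:  list of state-value estimates, updated in place
    episode: list of (state, reward, next_state) transitions;
             next_state is None when the episode terminates
    """
    traces = [0.0] * len(values)
    for state, reward, next_state in episode:
        next_value = 0.0 if next_state is None else values[next_state]
        td_error = reward + gamma * next_value - values[state]
        # Decay all traces, then mark the state that was just visited.
        traces = [gamma * lam * e for e in traces]
        traces[state] += 1.0
        # A single TD error updates every state with a non-zero trace.
        for s in range(len(values)):
            values[s] += alpha * td_error * traces[s]
    return values

# Hypothetical three-state chain 0 -> 1 -> 2 -> terminal, reward 1 at the end.
chain = [(0, 0.0, 1), (1, 0.0, 2), (2, 1.0, None)]
with_traces = td_lambda_episode([0.0, 0.0, 0.0], chain, lam=0.9)
without_traces = td_lambda_episode([0.0, 0.0, 0.0], chain, lam=0.0)
```

With `lam=0.9` the final reward also raises the estimates of the two earlier states in the same episode, whereas with `lam=0.0` only the last state changes. In both cases credit reaches only states that were actually visited, which is the limitation expected traces are meant to address.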

[187] 2007.01851

Swoosh! Rattle! Thump! -- Actions that Sound

Truly intelligent agents need to capture the interplay of all their senses to build a rich physical understanding of their world. In robotics, we have seen tremendous progress in using visual and tactile perception; however, we have often ignored a key sense: sound. This is primarily due to the lack of data that captures the interplay of action and sound. In this work, we perform the first large-scale study of the interactions between sound and robotic action. To do this, we create the largest available sound-action-vision dataset with 15,000 interactions on 60 objects using our robotic platform Tilt-Bot. By tilting objects and allowing them to crash into the walls of a robotic tray, we collect rich four-channel audio information. Using this data, we explore the synergies between sound and action and present three key insights. First, sound is indicative of fine-grained object class information, e.g., sound can differentiate a metal screwdriver from a metal wrench. Second, sound also contains information about the causal effects of an action, i.e. given the sound produced, we can predict what action was applied to the object. Finally, object representations derived from audio embeddings are indicative of implicit physical properties. We demonstrate that on previously unseen objects, audio embeddings generated through interactions can predict forward models 24% better than passive visual embeddings. Project videos and data are at

[188] 2007.01852

Language-agnostic BERT Sentence Embedding

We adapt multilingual BERT to produce language-agnostic sentence embeddings for 109 languages. The state-of-the-art for numerous monolingual and multilingual NLP tasks is masked language model (MLM) pretraining followed by task specific fine-tuning. While English sentence embeddings have been obtained by fine-tuning a pretrained BERT model, such models have not been applied to multilingual sentence embeddings. Our model combines masked language model (MLM) and translation language model (TLM) pretraining with a translation ranking task using bi-directional dual encoders. The resulting multilingual sentence embeddings improve average bi-text retrieval accuracy over 112 languages to 83.7%, well above the 65.5% achieved by the prior state-of-the-art on Tatoeba. Our sentence embeddings also establish new state-of-the-art results on BUCC and UN bi-text retrieval.

[189] 2007.01855

Trace-Norm Adversarial Examples

White box adversarial perturbations are sought via iterative optimization algorithms, most often by minimizing an adversarial loss on an $l_p$ neighborhood of the original image, the so-called distortion set. Constraining the adversarial search with different norms results in disparately structured adversarial examples. Here we explore several distortion sets with structure-enhancing algorithms. These structures, although pervasive in optimization, are new for adversarial examples and pose, for instance, a challenge for theoretical adversarial certification, which again provides only $l_p$ certificates. Because adversarial robustness is still an empirical field, defense mechanisms should also reasonably be evaluated against differently structured attacks. Besides, these structured adversarial perturbations may allow for larger distortion sizes than their $l_p$ counterparts while remaining imperceptible or perceptible as natural slight distortions of the image. Finally, they allow some control on the generation of the adversarial perturbation, like (localized) blurriness.

[190] 2007.01309

Learning-based Defect Recognition for Quasi-Periodic Microscope Images

The detailed control of crystalline material defects is a crucial process, as they affect properties of the material that may be detrimental or beneficial for the final performance of a device. Defect analysis on the sub-nanometer scale is enabled by high-resolution transmission electron microscopy (HRTEM), where the identification of defects is currently carried out based on human expertise. However, the process is tedious, highly time-consuming and, in some cases, can yield ambiguous results. Here we propose a semi-supervised machine learning method that assists in the detection of lattice defects from atomic resolution microscope images. It involves a convolutional neural network that classifies image patches as defective or non-defective, a graph-based heuristic that chooses one non-defective patch as a model, and finally an automatically generated convolutional filter bank, which highlights symmetry breaking such as stacking faults, twin defects and grain boundaries. Additionally, a variance filter is suggested to segment amorphous regions and beam defects. The algorithm is tested on III-V/Si crystalline materials and successfully evaluated against different metrics, showing promising results even for extremely small data sets. By combining the data-driven classification generality, robustness and speed of deep learning with the effectiveness of image filters in segmenting faulty symmetry arrangements, we provide a valuable open-source tool to the microscopist community that can streamline future HRTEM analyses of crystalline materials.

[191] 2007.01321

Optimal control of mean field equations with monotone coefficients and applications in neuroscience

We are interested in the optimal control problem associated with certain quadratic cost functionals depending on the solution $X=X^\alpha$ of the stochastic mean-field type evolution equation in $\mathbb R^d$ $dX_t=b(t,X_t,\mathcal L(X_t),\alpha_t)dt+\sigma(t,X_t,\mathcal L(X_t),\alpha_t)dW_t,$ $X_0\sim \mu$ given, under assumptions that cover a system of FitzHugh-Nagumo neuron networks, and where for practical purposes the control $\alpha_t$ is deterministic. To do so, we assume that we are given a drift coefficient that satisfies a one-sided Lipschitz condition, and that the dynamics is subject to a (convex) level set constraint of the form $\pi(X_t)\leq0$. The mathematical treatment we propose follows the lines of the recent monograph of Carmona and Delarue for similar control problems with Lipschitz coefficients. After addressing the existence of minimizers via a martingale approach, we show a maximum principle and then numerically investigate a gradient algorithm for the approximation of the optimal control.

[192] 2007.01332

Meta-Learning Stationary Stochastic Process Prediction with Convolutional Neural Processes

Stationary stochastic processes (SPs) are a key component of many probabilistic models, such as those for off-the-grid spatio-temporal data. They enable the statistical symmetry of underlying physical phenomena to be leveraged, thereby aiding generalization. Prediction in such models can be viewed as a translation equivariant map from observed data sets to predictive SPs, emphasizing the intimate relationship between stationarity and equivariance. Building on this, we propose the Convolutional Neural Process (ConvNP), which endows Neural Processes (NPs) with translation equivariance and extends convolutional conditional NPs to allow for dependencies in the predictive distribution. The latter enables ConvNPs to be deployed in settings which require coherent samples, such as Thompson sampling or conditional image completion. Moreover, we propose a new maximum-likelihood objective to replace the standard ELBO objective in NPs, which conceptually simplifies the framework and empirically improves performance. We demonstrate the strong performance and generalization capabilities of ConvNPs on 1D regression, image completion, and various tasks with real-world spatio-temporal data.

[193] 2007.01334

Multi-agent Planning for thermalling gliders using multi level graph-search

This paper solves a path planning problem for a group of gliders. The gliders are tasked with visiting a set of interest points. The gliders have limited range but are able to increase their range by visiting special points called thermals. The problem addressed in this paper is that of path planning for the gliders such that the total number of interest points visited by the gliders is maximized. This is referred to as the multi-agent problem. The problem is solved by first decomposing it into several single-agent problems. In a single-agent problem a set of interest points is allocated to a single glider. This problem is solved by planning a path which maximizes the number of visited interest points from the allocated set. This is achieved through a uniform cost graph search, as shown in our earlier work. The multi-agent problem now consists of determining the best allocation (of interest points) for each glider. Two ways of solving this problem are presented: a brute force search approach, as shown in earlier work, and a Branch&Bound type graph search. The Branch&Bound approach is the main contribution of the paper. This approach is proven to be optimal and shown to be faster than the brute force search using simulations.

[194] 2007.01335

Clustering of Electromagnetic Showers and Particle Interactions with Graph Neural Networks in Liquid Argon Time Projection Chambers Data

Liquid Argon Time Projection Chambers (LArTPCs) are a class of detectors that produce high resolution images of charged particles within their sensitive volume. In these images, the clustering of distinct particles into superstructures is of central importance to the current and future neutrino physics program. Electromagnetic (EM) activity typically exhibits spatially detached fragments of varying morphology and orientation that are challenging to efficiently assemble using traditional algorithms. Similarly, particles that are spatially removed from each other in the detector may originate from a common interaction. Graph Neural Networks (GNNs) were developed in recent years to find correlations between objects embedded in an arbitrary space. GNNs are first studied with the goal of predicting the adjacency matrix of EM shower fragments and to identify the origin of showers, i.e. primary fragments. On the PILArNet public LArTPC simulation dataset, the algorithm developed in this paper achieves a shower clustering accuracy characterized by a mean adjusted Rand index (ARI) of 97.8 % and a primary identification accuracy of 99.8 %. It yields a relative shower energy resolution of $(4.1+1.4/\sqrt{E (\text{GeV})})\,\%$ and a shower direction resolution of $(2.1/\sqrt{E(\text{GeV})})^{\circ}$. The optimized GNN is then applied to the related task of clustering particle instances into interactions and yields a mean ARI of 99.2 % for an interaction density of $\sim\mathcal{O}(1)\,m^{-3}$.

[195] 2007.01349

Networks with Growth and Preferential Attachment: Modeling and Applications

In this article we presented a brief study of the main network models with growth and preferential attachment. Such models are interesting because they present several characteristics of real systems. We started with the classical model proposed by Barabasi and Albert: nodes are added to the network connecting preferably to other nodes that are more connected. We also presented models that consider more representative elements from social perspectives, such as the homophily between the vertices or the fitness that each node has to build connections. Furthermore, we showed a version of these models including the Euclidean distance between the nodes as a preferential attachment rule. Our objective is to investigate the basic properties of these networks, such as distribution of connectivity, degree correlation, shortest path, and cluster coefficient, and how these characteristics are affected by the preferential attachment rules. Finally, we also provided a comparison of these synthetic networks with real ones. We found that characteristics such as homophily, fitness and geographic distance are significant preferential attachment rules for modeling real networks. These rules can change the form of the degree distribution of these synthetic network models and make them more suitable to model real networks.
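
The growth rule described above (new nodes prefer to link to already well-connected nodes) can be sketched in a few lines. This is a minimal illustration only: the seed graph, the number of links per new node `m`, and the function names are assumptions made for the example, and the homophily, fitness and distance variants are not covered.

```python
import random

def barabasi_albert(n_nodes, m, seed=0):
    """Grow a graph where each new node links to m distinct existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Assumed seed graph: a complete graph on m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Every node appears here once per incident edge, so a uniform draw
    # from this list is a degree-proportional draw over nodes.
    endpoints = [v for e in edges for v in e]
    for new in range(m + 1, n_nodes):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges

def degrees(edges, n_nodes):
    deg = [0] * n_nodes
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

edges = barabasi_albert(n_nodes=2000, m=2)
deg = degrees(edges, 2000)
```

Inspecting `deg` shows the typical outcome of this rule: most nodes stay near the minimum degree `m`, while a few early nodes accumulate many links.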

[196] 2007.01356

Decoder-free Robustness Disentanglement without (Additional) Supervision

Adversarial Training (AT) is proposed to alleviate the adversarial vulnerability of machine learning models by extracting only robust features from the input, which, however, inevitably leads to severe accuracy reduction as it discards the non-robust yet useful features. This motivates us to preserve both robust and non-robust features and separate them with disentangled representation learning. Our proposed Adversarial Asymmetric Training (AAT) algorithm can reliably disentangle robust and non-robust representations without additional supervision on robustness. Empirical results show that our method not only successfully preserves accuracy by combining the two representations, but also achieves much better disentanglement than previous work.

[197] 2007.01357

Efficient computation and analysis of distributional Shapley values

Distributional data Shapley value (DShapley) has been recently proposed as a principled framework to quantify the contribution of individual data points in machine learning. DShapley develops the foundational game theory concept of Shapley values into a statistical framework and can be applied to identify data points that are useful (or harmful) to a learning algorithm. Estimating DShapley is computationally expensive, however, and this can be a major challenge to using it in practice. Moreover, there has been little mathematical analysis of how this value depends on data characteristics. In this paper, we derive the first analytic expressions for DShapley for the canonical problems of linear regression and non-parametric density estimation. These analytic forms provide new algorithms to compute DShapley that are several orders of magnitude faster than the previous state-of-the-art. Furthermore, our formulas are directly interpretable and provide quantitative insights into how the value varies for different types of data. We demonstrate the efficacy of our DShapley approach on multiple real and synthetic datasets.

[198] 2007.01367

Lecture Notes on Control System Theory and Design

This is a collection of the lecture notes of the three authors for a first-year graduate course on control system theory and design (ECE 515, formerly ECE 415) at the ECE Department of the University of Illinois at Urbana-Champaign. This is a fundamental course on the modern theory of dynamical systems and their control, and builds on a first-level course in control that emphasizes frequency-domain methods (such as the course ECE 486, formerly ECE 386, at UIUC). The emphasis in this graduate course is on state space techniques, and it encompasses modeling, analysis (of structural properties of systems, such as stability, controllability, and observability), synthesis (of observers/compensators and controllers) subject to design specifications, and optimization. Accordingly, this set of lecture notes is organized in four parts, with each part dealing with one of the issues identified above. Concentration is on linear systems, with nonlinear systems covered only in some specific contexts, such as stability and dynamic optimization. Both continuous-time and discrete-time systems are covered, with the former, however, in much greater depth than the latter. The main objective of this course is to teach the student some fundamental principles within a solid conceptual framework, that will enable her/him to design feedback loops compatible with the information available on the "states" of the system to be controlled, and by taking into account considerations such as stability, performance, energy conservation, and even robustness. A second objective is to familiarize her/him with the available modern computational, simulation, and general software tools that facilitate the design of effective feedback loops.

[199] 2007.01383

Deep Interactive Learning: An Efficient Labeling Approach for Deep Learning-Based Osteosarcoma Treatment Response Assessment

Osteosarcoma is the most common malignant primary bone tumor. Standard treatment includes pre-operative chemotherapy followed by surgical resection. The response to treatment, as measured by the ratio of necrotic tumor area to overall tumor area, is a known prognostic factor for overall survival. This assessment is currently done manually by pathologists looking at glass slides under the microscope, which may not be reproducible due to its subjective nature. Convolutional neural networks (CNNs) can be used for automated segmentation of viable and necrotic tumor on osteosarcoma whole slide images. One bottleneck for supervised learning is that large amounts of accurate annotations are required for training, which is a time-consuming and expensive process. In this paper, we describe Deep Interactive Learning (DIaL) as an efficient labeling approach for training CNNs. After an initial labeling step is done, annotators only need to correct mislabeled regions from previous segmentation predictions to improve the CNN model until satisfactory predictions are achieved. Our experiments show that our CNN model trained by only 7 hours of annotation using DIaL can successfully estimate ratios of necrosis within the expected inter-observer variation rate for a non-standardized manual surgical pathology task.

[200] 2007.01394

Robust Linear Regression: Optimal Rates in Polynomial Time

We obtain a robust and computationally efficient estimator for Linear Regression that achieves statistically optimal convergence rate under mild distributional assumptions. Concretely, we assume our data is drawn from a $k$-hypercontractive distribution and an $\epsilon$-fraction is adversarially corrupted. We then describe an estimator that converges to the optimal least-squares minimizer for the true distribution at a rate proportional to $\epsilon^{2-2/k}$, when the noise is independent of the covariates. We note that no such estimator was known prior to our work, even with access to unbounded computation. The rate we achieve is information-theoretically optimal and thus we resolve the main open question in Klivans, Kothari and Meka [COLT'18]. Our key insight is to identify an analytic condition relating the distribution over the noise and covariates that completely characterizes the rate of convergence, regardless of the noise model. In particular, we show that when the moments of the noise and covariates are negatively-correlated, we obtain the same rate as independent noise. Further, when the condition is not satisfied, we obtain a rate proportional to $\epsilon^{2-4/k}$, and again match the information-theoretic lower bound. Our central technical contribution is to algorithmically exploit independence of random variables in the "sum-of-squares" framework by formulating it as a polynomial identity.

[201] 2007.01401

Hybrid deep learning architecture for general disruption prediction across tokamaks

In this letter, we present a new disruption prediction algorithm based on Deep Learning that effectively allows knowledge transfer from existing devices to new ones, while predicting disruptions using very limited disruptive data from the new devices. Future fusion reactors will need to run disruption-free or with very few unmitigated disruptions. The algorithm presented in this letter achieves high predictive accuracy on C-Mod, DIII-D and EAST tokamaks with limited hyperparameter tuning. Through numerical experiments, we show that good accuracy (AUC=0.959) is achieved on EAST predictions by including a small number of disruptive discharges, thousands of non-disruptive discharges from EAST, and combining this with more than a thousand discharges from DIII-D and C-Mod. This holds true for all permutations of the three devices. This cross-machine data-driven study finds that non-disruptive data is machine-specific while disruptions are machine-independent.

[202] 2007.01413

Wearable Respiration Monitoring: Interpretable Inference with Context and Sensor Biomarkers

Breathing rate (BR), minute ventilation (VE), and other respiratory parameters are essential for real-time patient monitoring in many acute health conditions, such as asthma. The clinical standard for measuring respiration, namely Spirometry, is hardly suitable for continuous use. Wearables can track many physiological signals, like ECG and motion, yet not respiration. Deriving respiration from other modalities has become an area of active research. In this work, we infer respiratory parameters from wearable ECG and wrist motion signals. We propose a modular and generalizable classification-regression pipeline to utilize available context information, such as physical activity, in learning context-conditioned inference models. Morphological and power domain novel features from the wearable ECG are extracted to use with these models. Exploratory feature selection methods are incorporated in this pipeline to discover application-specific interpretable biomarkers. Using data from 15 subjects, we evaluate two implementations of the proposed pipeline: for inferring BR and VE. Each implementation compares generalized linear model, random forest, support vector machine, Gaussian process regression, and neighborhood component analysis as contextual regression models. Permutation, regularization, and relevance determination methods are used to rank the ECG features to identify robust ECG biomarkers across models and activities. This work demonstrates the potential of wearable sensors not only in continuous monitoring, but also in designing biomarker-driven preventive measures.

[203] 2007.01433

Two-dimensional thermal finite element model of directed energy deposition: matching melt pool temperature profile with pyrometer measurement

An open source two-dimensional (2D) thermal finite element (FE) model of the Directed Energy Deposition (DED) process is developed using the Python-based FEniCS framework. The model incrementally deposits material ahead of the laser focus point according to the geometry of the part. The laser heat energy is supplied by a Gaussian-distributed heat source while the phase change is represented by increased heat capacity around the solidus-liquidus temperature range. Experimental validation of the numerical model is performed by matching with the melt pool temperature measurements taken by a dual wavelength pyrometer during the build process of a box-shaped Ti-6Al-4V part with large geometrical voids. Effects of large geometrical voids on the melt pool shape and maximum melt pool temperature are examined. Both the numerical and experimental data show an increase in the melt pool size and temperature during deposition above large voids. The trailing edge of the melt pool's temperature profile obtained using the developed numerical model closely matches pyrometer measurements.

[204] 2007.01444

Generative Modeling for Atmospheric Convection

To improve climate modeling, we need a better understanding of multi-scale atmospheric dynamics--the relationship between large-scale environment and small-scale storm formation, morphology and propagation--as well as superior stochastic parameterization of convective organization. We analyze raw output from ~6 million instances of explicitly simulated convection spanning all global geographic regimes of convection in the tropics, focusing on the vertical velocities extracted every 15 minutes from ~4 hundred thousand separate instances of a storm-permitting moist turbulence model embedded within a multi-scale global model of the atmosphere. Generative modeling techniques applied on high-resolution climate data for representation learning hold the potential to drive next-generation parameterization and breakthroughs in understanding of convection and storm development. To that end, we design and implement a specialized Variational Autoencoder (VAE) to perform structural replication, dimensionality reduction and clustering on these cloud-resolving vertical velocity outputs. Our VAE reproduces the structure of disparate classes of convection, successfully capturing both their magnitude and variances. This VAE thus provides a novel way to perform unsupervised grouping of convective organization in multi-scale simulations of the atmosphere in a physically sensible manner. The success of our VAE in structural emulation, learning physical meaning in convective transitions and anomalous vertical velocity field detection may help set the stage for developing generative models for stochastic parameterization that might one day replace explicit convection calculations.

[205] 2007.01448

From Fear to Hate: How the Covid-19 Pandemic Sparks Racial Animus in the United States

We estimate the effect of the Coronavirus (Covid-19) pandemic on racial animus, as measured by Google searches and Twitter posts including a commonly used anti-Asian racial slur. Our empirical strategy exploits the plausibly exogenous variation in the timing of the first Covid-19 diagnosis across regions in the United States. We find that the first local diagnosis leads to an immediate increase in racist Google searches and Twitter posts, with the latter mainly coming from existing Twitter users posting the slur for the first time. This increase could indicate a rise in future hate crimes, as we document a strong correlation between the use of the slur and anti-Asian hate crimes using historic data. Moreover, we find that the rise in the animosity is directed at Asians rather than other minority groups and is stronger on days when the connection between the disease and Asians is more salient, as proxied by President Trump's tweets mentioning China and Covid-19 at the same time. In contrast, the negative economic impact of the pandemic plays little role in the initial increase in racial animus. Our results suggest that de-emphasizing the connection between the disease and a particular racial group can be effective in curbing current and future racial animus.

[206] 2007.01452

Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks

This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs), which can be used to analyze neural network training. In this framework, a DNN is represented by probability measures and functions over its features (that is, the function values of the hidden units over the training data) in the continuous limit, instead of the neural network parameters as most existing studies have done. This new representation overcomes the degenerate situation where all the hidden units essentially have only one meaningful hidden unit in each middle layer, and further leads to a simpler representation of DNNs, for which the training objective can be reformulated as a convex optimization problem via suitable re-parameterization. Moreover, we construct a non-linear dynamics called neural feature flow, which captures the evolution of an over-parameterized DNN trained by Gradient Descent. We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures. Furthermore, we show, for Res-Net, when the neural feature flow process converges, it reaches a global minimal solution under suitable conditions. Our analysis leads to the first global convergence proof for over-parameterized neural network training with more than $3$ layers in the mean-field regime.

[207] 2007.01457

A generalized stochastic control problem of bounded noise process under ambiguity arising in biological management

The objectives and contributions of this paper are mathematical and numerical analyses of a stochastic control problem of bounded population dynamics under ambiguity, an important but not well-studied problem, focusing on the optimality equation as a nonlinear degenerate parabolic partial integro-differential equation (PIDE). The ambiguity comes from lack of knowledge on the continuous and jump noises in the dynamics, and its optimization appears as nonlinear and nonlocal terms in the PIDE. Assuming a strong dynamic programming principle for continuous value functions, we characterize its solutions from both viscosity and distribution viewpoints. Numerical computations focusing on an ergodic case are presented as well to complement the mathematical analysis.

[208] 2007.01494

Variance reduction for Riemannian non-convex optimization with batch size adaptation

Variance reduction techniques are popular in accelerating gradient descent and stochastic gradient descent for optimization problems defined on both Euclidean spaces and Riemannian manifolds. In this paper, we further improve on existing variance reduction methods for non-convex Riemannian optimization, including R-SVRG and R-SRG/R-SPIDER, with batch size adaptation. We show that this strategy can achieve lower total complexities for optimizing both general non-convex and gradient dominated functions under both finite-sum and online settings. As a result, we also provide a simpler convergence analysis for R-SVRG and improve complexity bounds for R-SRG under the finite-sum setting. Specifically, we prove that R-SRG achieves the same near-optimal complexity as R-SPIDER without requiring a small step size. Empirical experiments on a variety of tasks demonstrate the effectiveness of the proposed adaptive batch size scheme.

[209] 2007.01506

Symbiotic Radio: Cognitive Backscattering Communications for Future Wireless Networks

The heterogeneous wireless services and exponentially growing traffic call for novel spectrum- and energy-efficient wireless communication technologies. In this paper, a new technique, called symbiotic radio (SR), is proposed to exploit the benefits and address the drawbacks of cognitive radio (CR) and ambient backscattering communications (AmBC), leading to mutualism spectrum sharing and highly reliable backscattering communications. In particular, the secondary transmitter (STx) in SR transmits messages to the secondary receiver (SRx) over the RF signals originating from the primary transmitter (PTx) based on cognitive backscattering communications; thus the secondary system shares not only the radio spectrum, but also the power and infrastructure with the primary system. In return, the secondary transmission provides beneficial multipath diversity to the primary system; therefore the two systems form mutualism spectrum sharing. More importantly, joint decoding is exploited at SRx to achieve highly reliable backscattering communications. To exploit the full potential of SR, in this paper, we address three fundamental tasks in SR: (1) enhancing the backscattering link via active load; (2) achieving highly reliable communications through joint decoding; and (3) capturing PTx's RF signals using reconfigurable intelligent surfaces. Emerging applications, design challenges and open research problems will also be discussed.

[210] 2007.01525

Dynamic Equilibria in Time-Varying Networks

Predicting selfish behavior in public environments by considering Nash equilibria is a central concept of game theory. For the dynamic traffic assignment problem modeled by a flow over time game, in which every particle tries to reach its destination as fast as possible, the dynamic equilibria are called Nash flows over time. So far, this model has only been considered for networks in which each arc is equipped with a constant capacity, limiting the outflow rate, and with a transit time, determining the time it takes for a particle to traverse the arc. However, real-world traffic networks can be affected by temporal changes, for example, caused by construction works or special speed zones during some time period. To model these traffic scenarios appropriately, we extend the flow over time model by time-dependent capacities and time-dependent transit times. Our first main result is the characterization of the structure of Nash flows over time. Similar to the static-network model, the strategies of the particles in dynamic equilibria can be characterized by specific static flows, called thin flows with resetting. The second main result is the existence of Nash flows over time, which we show in a constructive manner by extending a flow over time step by step by these thin flows.

[211] 2007.01534

Mode Decomposition for Homogeneous Symmetric Operators

Finding latent structures in data is drawing increasing attention in broad and diverse fields such as fluid dynamics, signal processing, and machine learning. In this work, we formulate Dynamic Mode Decomposition (DMD) for two types of dynamical systems. The first is a system derived from a $\gamma$-homogeneous operator ($\gamma\neq 1$). The second is a system that can be represented by a symmetric operator. Regarding the first type, dynamical systems derived from $\gamma$-homogeneous operators with $\gamma\in[0,1)$ reach the steady state in finite time. This inherently contradicts the DMD model, which can be seen as an exponential data fitting algorithm. Therefore, the induced DMD operator leads to artifacts in the decomposition. We show certain cases where the DMD does not even exist. For homogeneous systems ($\gamma\neq 1$), we suggest a time rescaling that solves this conflict and show that DMD can perfectly restore the dynamics even for nonlinear flows. For dynamics derived from a symmetric operator, we expect the eigenvalues of the DMD to be real. This requirement is embedded in a variant of the DMD algorithm, termed Symmetric DMD (SDMD). With these adaptations, we formulate a closed form solution of DMD for dynamics $u_t = P(u)$, $u(t=0)=u_0$, where $P$ is a nonlinear $\gamma$-homogeneous operator, when the initial condition $u_0$ satisfies the nonlinear eigenvalue problem $P(u_0)=\lambda u_0$ ($u_0$ is a nonlinear eigenfunction with respect to the operator $P$). We show experimentally that, for such systems, for any initial condition, SDMD achieves lower mean square error for the spectrum estimation. Finally, we formulate a discrete decomposition related to nonlinear eigenfunctions of $\gamma$-homogeneous operators.
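
To make the conflict between exponential fitting and finite-time extinction concrete, the sketch below uses the scalar special case of the least-squares fit that underlies DMD: it looks for a single number $a$ with $u_{k+1}\approx a\,u_k$. The two test flows, the step size, and the function name are choices made for this illustration, not taken from the paper, and the time rescaling proposed by the authors is not implemented.

```python
import math

def scalar_dmd(snapshots):
    """Least-squares fit of u[k+1] ~ a * u[k]; returns (a, residual norm)."""
    x, y = snapshots[:-1], snapshots[1:]
    a = sum(p * q for p, q in zip(x, y)) / sum(p * p for p in x)
    residual = math.sqrt(sum((q - a * p) ** 2 for p, q in zip(x, y)))
    return a, residual

dt, steps = 0.1, 30

# Linear flow u_t = -2u, u(0) = 1: the snapshots are exactly exponential.
linear = [math.exp(-2.0 * k * dt) for k in range(steps)]

# 0-homogeneous flow u_t = -sign(u), u(0) = 1: the solution max(1 - t, 0)
# reaches the steady state at t = 1 and stays there.
finite_time = [max(1.0 - k * dt, 0.0) for k in range(steps)]

a_lin, r_lin = scalar_dmd(linear)
a_fin, r_fin = scalar_dmd(finite_time)
```

For the linear flow the fit recovers $a=e^{-2\,dt}$ with a residual at round-off level, whereas for the finite-time flow no single decay factor reproduces the snapshots and a clearly nonzero residual remains.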

[212] 2007.01543

Online Supervised Acoustic System Identification exploiting Prelearned Local Affine Subspace Models

In this paper we present a novel algorithm for improved block-online supervised acoustic system identification in adverse noise scenarios by exploiting prior knowledge about the space of Room Impulse Responses (RIRs). The method is based on the assumption that the variability of the unknown RIRs is controlled by only a few physical parameters, describing, e.g., source position movements, and thus is confined to a low-dimensional manifold which is modelled by a union of affine subspaces. The offsets and bases of the affine subspaces are learned in advance from training data by unsupervised clustering followed by Principal Component Analysis. We propose denoising the parameter update of any supervised adaptive filter by projecting it onto an optimal affine subspace which is selected based on a novel computationally efficient approximation of the associated evidence. The proposed method significantly improves the system identification performance of state-of-the-art algorithms in adverse noise scenarios.
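
The projection step described above can be sketched as follows, assuming each affine subspace is given by an offset and an orthonormal basis (as PCA would provide). Choosing the subspace by smallest distance is only a placeholder for the evidence-based selection mentioned in the abstract, and the three-dimensional toy vectors and function names are hypothetical.

```python
def project_affine(x, offset, basis):
    """Orthogonal projection of x onto {offset + sum_i a_i * basis[i]}.
    Assumes the basis vectors are orthonormal."""
    centered = [xi - oi for xi, oi in zip(x, offset)]
    proj = list(offset)
    for b in basis:
        coeff = sum(ci * bi for ci, bi in zip(centered, b))
        proj = [pi + coeff * bi for pi, bi in zip(proj, b)]
    return proj

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def denoise(x, subspaces):
    """Project x onto every subspace and keep the closest projection.
    Distance-based selection stands in for the evidence criterion."""
    candidates = [project_affine(x, off, basis) for off, basis in subspaces]
    return min(candidates, key=lambda p: sq_dist(x, p))

# Two toy one-dimensional affine subspaces in R^3 (hypothetical values).
subspaces = [
    ([0.0, 0.0, 1.0], [[1.0, 0.0, 0.0]]),  # line through (0, 0, 1) along e1
    ([0.0, 5.0, 0.0], [[0.0, 0.0, 1.0]]),  # line through (0, 5, 0) along e3
]
noisy = [2.0, 0.3, 1.1]
clean = denoise(noisy, subspaces)
```

Here the noisy vector lies close to the first line, so the components orthogonal to that line are removed while the coordinate along it is kept.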

[213] 2007.01578

volesti: Volume Approximation and Sampling for Convex Polytopes in R

Sampling from high-dimensional distributions and volume approximation of convex bodies are fundamental operations that appear in optimization, finance, engineering and machine learning. In this paper we present volesti, a C++ package with an R interface that provides efficient, scalable algorithms for volume estimation and for uniform and Gaussian sampling from convex polytopes. volesti scales to hundreds of dimensions, efficiently handles three different types of polyhedra, and provides sampling routines not previously available in R. We demonstrate the power of volesti by solving several challenging problems using the R language.

[214] 2007.01579

Noise-Robust Adaptation Control for Supervised System Identification Exploiting A Noise Dictionary

We present a noise-robust adaptation control strategy for block-online supervised acoustic system identification by exploiting a noise dictionary. The proposed algorithm takes advantage of the pronounced spectral structure which characterizes many types of interfering noise signals. We model the noisy observations by a linear Gaussian Discrete Fourier Transform-domain state space model whose parameters are estimated by an online generalized Expectation-Maximization algorithm. Unlike all other state-of-the-art approaches, we propose modeling the covariance matrix of the observation probability density function by a dictionary model. We propose to learn the noise dictionary from training data, which can be gathered either offline or online whenever the system is not excited, while we infer the activations continuously. The proposed algorithm represents a novel machine-learning-based approach to noise-robust adaptation control for challenging online supervised acoustic system identification applications characterized by high-level and non-stationary interfering noise signals.

[215] 2007.01592

Prediction of Spatial Point Processes: Regularized Method with Out-of-Sample Guarantees

A spatial point process can be characterized by an intensity function which predicts the number of events that occur across space. In this paper, we develop a method to infer predictive intensity intervals by learning a spatial model using a regularized criterion. We prove that the proposed method exhibits out-of-sample prediction performance guarantees which, unlike standard estimators, are valid even when the spatial model is misspecified. The method is demonstrated using synthetic as well as real spatial data.

[216] 2007.01593

Deep image prior for 3D magnetic particle imaging: A quantitative comparison of regularization techniques on Open MPI dataset

Magnetic particle imaging (MPI) is an imaging modality exploiting the nonlinear magnetization behavior of (super-)paramagnetic nanoparticles to obtain a space- and often also time-dependent concentration of a tracer consisting of these nanoparticles. MPI has a continuously increasing number of potential medical applications. One prerequisite for successful performance in these applications is a proper solution to the image reconstruction problem. More classical methods from inverse problems theory, as well as novel approaches from the field of machine learning, have the potential to deliver high-quality reconstructions in MPI. We investigate a novel reconstruction approach based on a deep image prior, which builds on representing the solution by a deep neural network. Novel approaches, as well as variational and iterative regularization techniques, are compared quantitatively in terms of peak signal-to-noise ratios and structural similarity indices on the publicly available Open MPI dataset.

[217] 2007.01628

HDR-GAN: HDR Image Reconstruction from Multi-Exposed LDR Images with Large Motions

Synthesizing high dynamic range (HDR) images from multiple low dynamic range (LDR) exposures in dynamic scenes is challenging. There are two major problems caused by the large motions of foreground objects. One is the severe misalignment among the LDR images. The other is the missing content due to the over-/under-saturated regions caused by the moving objects, which may not be easily compensated for by the multiple LDR exposures. Thus, the HDR generation model must be able to properly fuse the LDR images and restore the missing details without introducing artifacts. To address these two problems, we propose in this paper a novel GAN-based model, HDR-GAN, for synthesizing HDR images from multi-exposed LDR images. To the best of our knowledge, this work is the first GAN-based approach for fusing multi-exposed LDR images for HDR reconstruction. By incorporating adversarial learning, our method is able to produce faithful information in the regions with missing content. In addition, we propose a novel generator network, with a reference-based residual merging block for aligning large object motions in the feature domain, and a deep HDR supervision scheme for eliminating artifacts of the reconstructed HDR images. Experimental results demonstrate that our model achieves state-of-the-art reconstruction performance over the prior HDR methods on diverse scenes.

[218] 2007.01659

Diagnostic Uncertainty Calibration: Towards Reliable Machine Predictions in Medical Domain

Label disagreement between human experts is a common issue in the medical domain and poses unique challenges in the evaluation and learning of classification models. In this work, we extend metrics for probability prediction, including calibration, i.e., the reliability of predictive probability, to adapt to such a situation. We further formalize the metrics for higher-order statistics, including inter-rater disagreement, in a unified way, which enables us to assess the quality of distributional uncertainty. In addition, we propose a novel post-hoc calibration method that equips trained neural networks with calibrated distributions over class probability estimates. With a large-scale medical imaging application, we show that our approach significantly improves the quality of uncertainty estimates in multiple metrics.

[219] 2007.01668

Security Limitations of Classical-Client Delegated Quantum Computing

Secure delegated quantum computing allows a computationally weak client to outsource an arbitrary quantum computation to an untrusted quantum server in a privacy-preserving manner. One of the promising candidates to achieve classical delegation of quantum computation is classical-client remote state preparation ($RSP_{CC}$), where a client remotely prepares a quantum state using a classical channel. However, the privacy loss incurred by employing $RSP_{CC}$ as a sub-module is unclear. In this work, we investigate this question using the Constructive Cryptography framework by Maurer and Renner (ICS'11). We first identify the goal of $RSP_{CC}$ as the construction of ideal RSP resources from classical channels and then reveal the security limitations of using $RSP_{CC}$. First, we uncover a fundamental relationship between constructing ideal RSP resources (from classical channels) and the task of cloning quantum states. Any classically constructed ideal RSP resource must leak to the server the full classical description (possibly in an encoded form) of the generated quantum state, even if we target computational security only. As a consequence, we find that the realization of common RSP resources, without weakening their guarantees drastically, is impossible due to the no-cloning theorem. Second, the above result does not rule out that a specific $RSP_{CC}$ protocol can replace the quantum channel at least in some contexts, such as the Universal Blind Quantum Computing (UBQC) protocol of Broadbent et al. (FOCS '09). However, we show that the resulting UBQC protocol cannot maintain its proven composable security as soon as $RSP_{CC}$ is used as a subroutine. Third, we show that replacing the quantum channel of the above UBQC protocol by the $RSP_{CC}$ protocol QFactory of Cojocaru et al. (Asiacrypt '19) preserves the weaker, game-based security of UBQC.

[220] 2007.01684

New Classes of Quantum Codes Associated with Surface Maps

If the cyclic sequences of face types at all vertices in a map are the same, then the map is said to be a semi-equivelar map. In particular, a semi-equivelar map is equivelar if the faces are the same type. Homological quantum codes represent a subclass of topological quantum codes. In this article, we introduce thirteen new classes of quantum codes. These codes are associated with the following: (i) equivelar maps of type $[k^k]$, (ii) equivelar maps on the double torus along with the covering of the maps, and (iii) semi-equivelar maps on the surface of Euler characteristic $-1$, along with their covering maps. The encoding rate of the class of codes associated with the maps in (i) is such that $\frac{k}{n}\rightarrow 1$ as $n\rightarrow\infty$, and for the remaining classes of codes, the encoding rate is $\frac{k}{n}\rightarrow \alpha$ as $n\rightarrow \infty$ with $\alpha< 1$.

[221] 2007.01702

Fast Computation of Electromagnetic Wave Propagation and Scattering for Quasi-cylindrical Geometry

The cylindrical Taylor Interpolation through FFT (TI-FFT) algorithm for computation of the near-field and far-field in quasi-cylindrical geometry is introduced. The modal expansion coefficients of the vector potentials ${\bf F}$ and ${\bf A}$ within the context of the cylindrical harmonics (TE and TM modes) can be expressed in closed form through the cylindrical addition theorem. For the quasi-cylindrical geometry, the modal expansion coefficients can be evaluated through FFT with the help of the Taylor Interpolation (TI) technique. The near-field on any arbitrary cylindrical surface can be obtained through the Inverse Fourier Transform (IFT). The far-field can be obtained through the Near-Field Far-Field (NF-FF) transform. The cylindrical TI-FFT algorithm has the advantages of $\mathcal{O}\left(N \log_2 N\right)$ computational complexity for an $N = N_\phi \times N_z$ computational grid, a low sampling rate (large sampling spacing), and no singularity problem.

[222] 2007.01720

Qualitative Analysis of Monte Carlo Dropout

In this report, we present a qualitative analysis of the Monte Carlo (MC) dropout method for measuring model uncertainty in neural network (NN) models. We first consider the sources of uncertainty in NNs, and briefly review Bayesian Neural Networks (BNNs), the group of Bayesian approaches to tackling uncertainties in NNs. After presenting the mathematical formulation of MC dropout, we proceed to suggest potential benefits and associated costs of using MC dropout in typical NN models, with the results from our experiments.
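
As a rough illustration of the MC dropout idea referred to above, the sketch below keeps dropout active at prediction time, runs several stochastic forward passes, and uses the spread of the outputs as an uncertainty estimate. The tiny two-layer network, its random weights, and the function name `mc_dropout_predict` are assumptions for illustration only and do not reproduce the report's experiments.

```python
import numpy as np

def mc_dropout_predict(x, W1, W2, drop_prob, num_samples, rng):
    """Predictive mean and variance from stochastic forward passes.

    Dropout is applied to the hidden layer on every pass (inverted
    dropout, so activations are rescaled by 1 / keep probability).
    """
    keep = 1.0 - drop_prob
    preds = []
    for _ in range(num_samples):
        hidden = np.maximum(x @ W1, 0.0)            # ReLU hidden layer
        mask = rng.random(hidden.shape) < keep      # fresh mask per pass
        preds.append((hidden * mask / keep) @ W2)
    preds = np.stack(preds)
    return preds.mean(axis=0), preds.var(axis=0)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 16))   # hypothetical untrained weights
W2 = rng.normal(size=(16, 1))
x = rng.normal(size=(5, 3))     # five hypothetical inputs

mean, var = mc_dropout_predict(x, W1, W2, drop_prob=0.5, num_samples=100, rng=rng)
print(mean.ravel(), var.ravel())
```

The cost noted in the abstract is visible here: every prediction requires `num_samples` forward passes instead of one.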

[223] 2007.01759

Topology Optimization of Heat Exchangers

A method for density-based topology optimization of heat exchangers with two fluids is proposed. The goal of the optimization process is to maximize the heat transfer from one fluid to the other, under maximum pressure drop constraints for each of the fluid flows. A single design variable is used to describe the physical fields. The solid interface and the fluid domains are generated using an erosion-dilation based identification technique, which guarantees well-separated fluids, as well as a minimum wall thickness between them. Under the assumption of laminar steady flow, the two fluids are modelled separately, but in the entire computational domain using the Brinkman penalization technique for ensuring negligible velocities outside of the respective fluid subdomains. The heat transfer is modelled using the convection-diffusion equation, where the convection is driven by both fluid flows. A stabilized finite element discretization is used to solve the governing equations. Results are presented for two different problems: a two-dimensional example illustrating and verifying the methodology; and a three-dimensional example inspired by shell-and-tube heat exchangers. The optimized designs for both cases show an improved heat transfer compared to the baseline designs. For the shell-and-tube case, the full freedom topology optimization approach is shown to yield performance improvements of up to 113% under the same pressure drop.

[224] 2007.01792

Almost Affinely Disjoint Subspaces

In this work, we introduce a natural notion concerning finite vector spaces. A family of $k$-dimensional subspaces of $\mathbb{F}_q^n$ is called almost affinely disjoint if any $(k+1)$-dimensional subspace containing a subspace from the family non-trivially intersects with only a few subspaces from the family. The central question discussed in the paper is the polynomial growth (in $q$) of the maximal cardinality of these families given the parameters $k$ and $n$. For the cases $k=1$ and $k=2$, optimal families are constructed. For other settings, we find lower and upper bounds on the polynomial growth. Additionally, some connections with problems in coding theory are shown.

[225] 2007.01834

Universality of the Bottleneck Distance for Extended Persistence Diagrams

The extended persistence diagram is an invariant of piecewise linear functions, introduced by Cohen-Steiner, Edelsbrunner, and Harer. The bottleneck distance was introduced by the same authors as an extended pseudometric on the set of extended persistence diagrams, which is stable under perturbations of the function. We address the question of whether the bottleneck distance is the largest possible stable distance, providing an affirmative answer.

[226] 2007.01836

Pretrained Semantic Speech Embeddings for End-to-End Spoken Language Understanding via Cross-Modal Teacher-Student Learning

Spoken language understanding is typically based on pipeline architectures including speech recognition and natural language understanding steps. As a result, these components are optimized independently of each other and the overall system suffers from error propagation. In this paper, we propose a novel training method that enables pretrained contextual embeddings such as BERT to process acoustic features. In particular, we extend it with an encoder of pretrained speech recognition systems in order to construct end-to-end spoken language understanding systems. Our proposed method is based on the teacher-student framework across speech and text modalities that aligns the acoustic and the semantic latent spaces. Experimental results on three benchmark datasets show that our system reaches the pipeline architecture performance without using any training data and outperforms it after fine-tuning with only a few examples.