This paper experimentally demonstrates the effectiveness of an anticipative car-following algorithm in reducing energy use of gasoline engine and electric Connected and Automated Vehicles (CAV), without sacrificing safety and traffic flow. We propose a Vehicle-in-the-Loop (VIL) testing environment in which experimental CAVs driven on a track interact with surrounding virtual traffic in real-time. We explore the energy savings when following city and highway drive cycles, as well as in emergent highway traffic created from microsimulations. Model predictive control handles high level velocity planning and benefits from communicated intentions of a preceding CAV or estimated probable motion of a preceding human driven vehicle. A combination of classical feedback control and data-driven nonlinear feedforward control of pedals achieve acceleration tracking at the low level. The controllers are implemented in ROS and energy is measured via calibrated OBD-II readings. We report up to 30% improved energy economy compared to realistically calibrated human driver car-following without sacrificing following headway.

As data from IoT (Internet of Things) sensors become ubiquitous, state-of-the-art machine learning algorithms face many challenges on directly using sensor data. To overcome these challenges, methods must be designed to learn directly from sensors without manual annotations. This paper introduces Sensory Time-cue for Unsupervised Meta-learning (STUM). Different from traditional learning approaches that either heavily depend on labels or on time-independent feature extraction assumptions, such as Gaussian distribution features, the STUM system uses time relation of inputs to guide the feature space formation within and across modalities. The fact that STUM learns from a variety of small tasks may put this method in the camp of Meta-Learning. Different from existing Meta-Learning approaches, STUM learning tasks are composed within and across multiple modalities based on time-cue co-exist with the IoT streaming data. In an audiovisual learning example, because consecutive visual frames usually comprise the same object, this approach provides a unique way to organize features from the same object together. The same method can also organize visual object features with the object's spoken-name features together if the spoken name is presented with the object at about the same time. This cross-modality feature organization may further help the organization of visual features that belong to similar objects but acquired at different location and time. Promising results are achieved through evaluations.

The California Consumer Privacy Act (CCPA)---which began enforcement on July 1, 2020---grants California users the affirmative right to opt-out of the sale of their personal information. In this work, we perform a manual analysis of the top 500 U.S. websites and classify how each site implements this new requirement. We find that the vast majority of sites that implement opt-out mechanisms do so with a Do Not Sell link rather than with a privacy banner, and that many of the linked opt-out controls exhibit features such as nudging and indirect mechanisms (e.g., fillable forms). We then perform a pair of user studies with 4357 unique users (recruited from Google Ads and Amazon Mechanical Turk) in which we observe how users interact with different opt-out mechanisms and evaluate how the implementation choices we observed---exclusive use of links, prevalent nudging, and indirect mechanisms---affect the rate at which users exercise their right to opt-out of sale. We find that these design elements significantly deter interactions with opt-out mechanisms (including reducing the opt-out rate for users who are uncomfortable with the sale of their information) and that they reduce users' awareness of their ability to opt-out. Our results demonstrate the importance of regulations that provide clear implementation requirements in order empower users to exercise their privacy rights.

This paper surveys the field of transfer learning in the problem setting of Reinforcement Learning (RL). RL has been the key solution to sequential decision-making problems. Along with the fast advance of RL in various domains. including robotics and game-playing, transfer learning arises as an important technique to assist RL by leveraging and transferring external expertise to boost the learning process. In this survey, we review the central issues of transfer learning in the RL domain, providing a systematic categorization of its state-of-the-art techniques. We analyze their goals, methodologies, applications, and the RL frameworks under which these transfer learning techniques would be approachable. We discuss the relationship between transfer learning and other relevant topics from an RL perspective and also explore the potential challenges as well as future development directions for transfer learning in RL.

This paper proposes a novel coordinating mechanism between synchronous generators (SGs) and wind turbines (WTs) based on doubly-fed induction generators (DFIGs) for enhanced primary frequency regulation. WTs are urged to participate on frequency regulation, specially if wind power penetration keeps increasing. WTs control support is possible, but it is transient due to the WTs lack of energy storage. This drawback can result in either a further delayed response from the governors of SGs or further frequency decay when WTs support is over. The proposed coordination attempt to tackle this issue. An artificial neural network (ANN) is used to obtain an optimal coordination signal to improve frequency response. As a proof of concept, the proposed coordination is tested on a 9-bus test system that includes a wind farm with 5 WTs. Simulation results show that frequency nadir is reduced in about 22\% and rates of change of the system frequency (RoCoF) in about 29.5\%. Further work is needed to validate this concept in large-scale systems, but the development and results obtained so far are promising to strengthen power systems.

We present decentralized collision avoidance algorithms for quadrotor swarms operating under uncertain state estimation. Our approach exploits the differential flatness property and feedforward linearization to approximate the quadrotor dynamics and reciprocal collision avoidance. We account for the uncertainty in position and velocity by formulating the collision constraints as chance constraints, which describe a set of velocities that avoid collisions with a specified confidence level. We present two different methods for formulating and solving the chance constraint: our first method assumes a Gaussian noise distribution. Our second method is its extension to the non-Gaussian case by using a Gaussian Mixture Model (GMM). We reformulate the linear chance constraints into equivalent deterministic constraints on mean and covariance. Subsequently, the deterministic constraints are introduced in the MPC framework to compute a local collision-free trajectory for each quadrotor. We evaluate the proposed algorithm in simulations on benchmark scenarios and highlight its benefits over prior methods. We observe that both the Gaussian and non-Gaussian methods provide improved collision avoidance performance over the deterministic method. Further, the non-Gaussian method results in a relatively shorter path length compared to Gaussian formulations. On average, the Gaussian method requires ~5ms ms to compute a local collision-free trajectory, while our non-Gaussian method is computationally more expensive and requires ~7ms ms on average in the presence of 4 agents.

In this paper we introduce a novel, unified, open-source model interpretability library for PyTorch [12]. The library contains generic implementations of a number of gradient and perturbation-based attribution algorithms, also known as feature, neuron and layer importance algorithms, as well as a set of evaluation metrics for these algorithms. It can be used for both classification and non-classification models including graph-structured models built on Neural Networks (NN). In this paper we give a high-level overview of supported attribution algorithms and show how to perform memory-efficient and scalable computations. We emphasize that the three main characteristics of the library are multimodality, extensibility and ease of use. Multimodality supports different modality of inputs such as image, text, audio or video. Extensibility allows adding new algorithms and features. The library is also designed for easy understanding and use. Besides, we also introduce an interactive visualization tool called Captum Insights that is built on top of Captum library and allows sample-based model debugging and visualization using feature importance metrics.

Comparison Lift is an experimentation-as-a-service (EaaS) application for testing online advertising audiences and creatives at JD.com. Unlike many other EaaS tools that focus primarily on fixed sample A/B testing, Comparison Lift deploys a custom bandit-based experimentation algorithm. The advantages of the bandit-based approach are two-fold. First, it aligns the randomization induced in the test with the advertiser's goals from testing. Second, by adapting experimental design to information acquired during the test, it reduces substantially the cost of experimentation to the advertiser. Since launch in May 2019, Comparison Lift has been utilized in over 1,500 experiments. We estimate that utilization of the product has helped increase click-through rates of participating advertising campaigns by 46% on average. We estimate that the adaptive design in the product has generated 27% more clicks on average during testing compared to a fixed sample A/B design. Both suggest significant value generation and cost savings to advertisers from the product.

Over the last decade, time series motif discovery has emerged as a useful primitive for many downstream analytical tasks, including clustering, classification, rule discovery, segmentation, and summarization. In parallel, there has been an increased understanding that Dynamic Time Warping (DTW) is the best time series similarity measure in a host of settings. Surprisingly however, there has been virtually no work on using DTW to discover motifs. The most obvious explanation of this is the fact that both motif discovery and the use of DTW can be computationally challenging, and the current best mechanisms to address their lethargy are mutually incompatible. In this work, we present the first scalable exact method to discover time series motifs under DTW. Our method automatically performs the best trade-off between time-to-compute and tightness-of-lower-bounds for a novel hierarchy of lower bounds representation we introduce. We show that under realistic settings, our algorithm can admissibly prune up to 99.99% of the DTW computations.

Hash tables are ubiquitous. Properties such as an amortized constant time complexity for insertion and querying as well as a compact memory layout make them versatile associative data structures with manifold applications. The rapidly growing amount of data emerging in many fields motivated the need for accelerated hash tables designed for modern parallel architectures. In this work, we exploit the fast memory interface of modern GPUs together with a parallel hashing scheme tailored to improve global memory access patterns, to design WarpCore -- a versatile library of hash table data structures. Unique device-sided operations allow for building high performance data processing pipelines entirely on the GPU. Our implementation achieves up to 1.6 billion inserts and up to 4.3 billion retrievals per second on a single GV100 GPU thereby outperforming the state-of-the-art solutions cuDPP, SlabHash, and NVIDIA RAPIDS cuDF. This performance advantage becomes even more pronounced for high load factors of over $90\%$. To overcome the memory limitation of a single GPU, we scale our approach over a dense NVLink topology which gives us close-to-optimal weak scaling on DGX servers. We further show how WarpCore can be used for accelerating a real world bioinformatics application (metagenomic classification) with speedups of over two orders-of-magnitude against state-of-the-art CPU-based solutions. WC is written in C++/CUDA-C and is openly available at https://github.com/sleeepyjack/warpcore.

The Causal Bandit is a variant of the classic Bandit problem where an agent must identify the best action in a sequential decision-making process, where the reward distribution of the actions displays a non-trivial dependence structure that is governed by a causal model. All methods proposed thus far in the literature rely on exact prior knowledge of the causal model to obtain improved estimators for the reward. We formulate a new causal bandit algorithm that is the first to no longer rely on explicit prior causal knowledge and instead uses the output of causal discovery algorithms. This algorithm relies on a new estimator based on separating sets, a causal structure already known in causal discovery literature. We show that given a separating set, this estimator is unbiased, and has lower variance compared to the sample mean. We derive a concentration bound and construct a UCB-type algorithm based on this bound, as well as a Thompson sampling variant. We compare our algorithms with traditional bandit algorithms on simulation data. On these problems, our algorithms show a significant boost in performance.

The Internet has become an essential part of children's and adolescents' daily life. Social media platforms are used as educational and entertainment resources on daily bases by young users, leading enormous efforts to ensure their safety when interacting with various social media platforms. In this paper, we investigate the exposure of those users to inappropriate and malicious content in comments posted on YouTube videos targeting this demographic. We collected a large-scale dataset of approximately four million records, and studied the presence of malicious and inappropriate URLs embedded in the comments posted on these videos. Our results show a worrisome number of malicious and inappropriate URLs embedded in comments available for children and young users. In particular, we observe an alarming number of inappropriate and malicious URLs, with a high chance of kids exposure, since the average number of views on videos containing such URLs is 48 million. When using such platforms, children are not only exposed to the material available in the platform, but also to the content of the URLs embedded within the comments. This highlights the importance of monitoring the URLs provided within the comments, limiting the children's exposure to inappropriate content.

In multi-capacity ridesharing, multiple requests (e.g., customers, food items, parcels) with different origin and destination pairs travel in one resource. In recent years, online multi-capacity ridesharing services (i.e., where assignments are made online) like Uber-pool, foodpanda, and on-demand shuttles have become hugely popular in transportation, food delivery, logistics and other domains. This is because multi-capacity ridesharing services benefit all parties involved { the customers (due to lower costs), the drivers (due to higher revenues) and the matching platforms (due to higher revenues per vehicle/resource). Most importantly these services can also help reduce carbon emissions (due to fewer vehicles on roads). Online multi-capacity ridesharing is extremely challenging as the underlying matching graph is no longer bipartite (as in the unit-capacity case) but a tripartite graph with resources (e.g., taxis, cars), requests and request groups (combinations of requests that can travel together). The desired matching between resources and request groups is constrained by the edges between requests and request groups in this tripartite graph (i.e., a request can be part of at most one request group in the final assignment). While there have been myopic heuristic approaches employed for solving the online multi-capacity ridesharing problem, they do not provide any guarantees on the solution quality. To that end, this paper presents the first approach with bounds on the competitive ratio for online multi-capacity ridesharing (when resources rejoin the system at their initial location/depot after serving a group of requests).

We analyze the reservoir computation capability of the Lang-Kobayashi system by comparing the numerically computed recall capabilities and the eigenvalue spectrum. We show that these two quantities are deeply connected, and thus the reservoir computing performance is predictable by analyzing the eigenvalue spectrum. Our results suggest that any dynamical system used as a reservoir can be analyzed in this way as long as the reservoir perturbations are sufficiently small. Optimal performance is found for a system with the eigenvalues having real parts close to zero and off-resonant imaginary parts.

In this work we present a performance exploration on Eager K-truss, a linear-algebraic formulation of the K-truss graph algorithm. We address performance issues related to load imbalance of parallel tasks in symmetric, triangular graphs by presenting a fine-grained parallel approach to executing the support computation. This approach also increases available parallelism, making it amenable to GPU execution. We demonstrate our fine-grained parallel approach using implementations in Kokkos and evaluate them on an Intel Skylake CPU and an Nvidia Tesla V100 GPU. Overall, we observe between a 1.261. 48x improvement on the CPU and a 9.97-16.92x improvement on the GPU due to our fine-grained parallel formulation.

The performance of graph algorithms is often measured in terms of the number of traversed edges per second (TEPS). However, this performance metric is inadequate for a graph operation such as exact triangle counting. In triangle counting, execution times on graphs with a similar number of edges can be distinctly different as demonstrated by results from the past Graph Challenge entries. We discuss the need for an objective performance metric for graph operations and the desired characteristics of such a metric such that it more accurately captures the interactions between the amount of work performed and the capabilities of the hardware on which the code is executed. Using exact triangle counting as an example, we derive a metric that captures how certain techniques employed in many implementations improve performance. We demonstrate that our proposed metric can be used to evaluate and compare multiple approaches for triangle counting, using a SIMD approach as a case study against a scalar baseline.

In this paper, we derive a notion of word meaning in context from Fillmore's 'semantics of understanding', in which a listener draws on their knowledge of both language and the world to 'envision' the situation described in an utterance. We characterize utterance understanding as a combination of cognitive semantics and Discourse Representation Theory, formalized as a situation description system:a probabilistic model which takes utterance understanding to be the mental process of describing one or more situations that would account for an observed utterance. Our model captures the interplay of local and global contexts and their joint influence upon the lexical representation of sentence constituents. We implement the system using a directed graphical model, and apply it to examples containing various contextualization phenomena.

The secure command and control (C&C) of mobile agents arises in various settings including unmanned aerial vehicles, single pilot operations in commercial settings, and mobile robots to name a few. As more and more of these applications get integrated into aerospace and defense use cases, the security of the communication channel between the ground station and the mobile agent is of increasing importance. The development of quantum computing devices poses a unique threat to secure communications due to the vulnerability of asymmetric ciphers to Shor's algorithm. Given the active development of new quantum resistant encryption techniques, we report the first integration of post-quantum secure encryption schemes with the robot operating system (ROS) and C&C of mobile agents, in general. We integrate these schemes in the application and network layers, and study the performance of these methods by comparing them to present day security schemes such as the widely used RSA algorithm.

Knowledge graphs (KGs) are of great importance to many real world applications, but they generally suffer from incomplete information in the form of missing relations between entities. Knowledge graph completion (also known as relation prediction) is the task of inferring missing facts given existing ones. Most of the existing work is proposed by maximizing the likelihood of observed instance-level triples. Not much attention, however, is paid to the ontological information, such as type information of entities and relations. In this work, we propose a type-augmented relation prediction (TaRP) method, where we apply both the type information and instance-level information for relation prediction. In particular, type information and instance-level information are encoded as prior probabilities and likelihoods of relations respectively, and are combined by following Bayes' rule. Our proposed TaRP method achieves significantly better performance than state-of-the-art methods on three benchmark datasets: FB15K, YAGO26K-906, and DB111K-174. In addition, we show that TaRP achieves significantly improved data efficiency. More importantly, the type information extracted from a specific dataset can generalize well to other datasets through the proposed TaRP model.

The emergence of small and portable smart sensors have opened up new opportunities for many applications, including automated factories, smart cities and connected healthcare, broadly referred to as the "Internet of Things (IoT)". These devices produce time series data. While deep neural networks (DNNs) has been widely applied to computer vision, natural language processing and speech recognition, there is limited research on DNNs for time series prediction. Machine learning (ML) applications for time series prediction has traditionally involved predicting the next value in the series. However, in certain applications, segmenting the time series into a sequence of trends and predicting the next trend is preferred. Recently, a hybrid DNN algorithm, TreNet was proposed for trend prediction. TreNet, which combines an LSTM that takes in trendlines and a CNN that takes in point data was shown to have superior performance for trend prediction when compared to other approaches. However, the study used a standard cross-validation method which does not take into account the sequential nature of time series. In this work, we reproduce TreNet using a walk-forward validation method, which is more appropriate to time series data. We compare the performance of the hybrid TreNet algorithm, on the same three data sets used in the original study, to vanilla MLP, LSTM, and CNN that take in point data, and also to traditional ML algorithms, i.e. the Random Forest (RF), Support Vector Regression and Gradient Boosting Machine. Our results differ significantly from those reported for the original TreNet. In general TreNet still performs better than the vanilla DNN models, but not substantially so as reported for the original TreNet. Furthermore, our results show that the RF algorithm performed substantially better than TreNet on the methane data set.

We review how existing trends are relevant to the discussion of brain-computer interfaces and the data they would generate. Then, we posit how the commerce of neural data, dubbed Neurocapitalism, could be impacted by the maturation of brain-computer interface technology. We explore how this could pose fundamental changes to our way of interacting, as well as our sense of autonomy and identity. Because of the power inherent in the technology, and its potentially ruinous abuses, action must be taken before the appearance of the technology, and not come as a reaction to it. The widespread adoption of brain-computer interface technology will certainly change our way of life. Whether it is changed for the better or worse, depends on how well we prepare for its arrival.

Sepsis is one of the leading causes of death in Intensive Care Units (ICU). The strategy for treating sepsis involves the infusion of intravenous (IV) fluids and administration of antibiotics. Determining the optimal quantity of IV fluids is a challenging problem due to the complexity of a patient's physiology. In this study, we develop a data-driven optimization solution that derives the optimal quantity of IV fluids for individual patients. The proposed method minimizes the probability of severe outcomes by controlling the prescribed quantity of IV fluids and utilizes human-in-the-loop artificial intelligence. We demonstrate the performance of our model on 1122 ICU patients with sepsis diagnosis extracted from the MIMIC-III dataset. The results show that, on average, our model can reduce mortality by 22%. This study has the potential to help physicians synthesize optimal, patient-specific treatment strategies.

Aspect-based sentiment analysis (ABSA) aims to predict the sentiment towards a specific aspect in the text. However, existing ABSA test sets cannot be used to probe whether a model can distinguish the sentiment of the target aspect from the non-target aspects. To solve this problem, we develop a simple but effective approach to enrich ABSA test sets. Specifically, we generate new examples to disentangle the confounding sentiments of the non-target aspects from the target aspect's sentiment. Based on the SemEval 2014 dataset, we construct the Aspect Robustness Test Set (ARTS) as a comprehensive probe of the aspect robustness of ABSA models. Over 92% data of ARTS show high fluency and desired sentiment on all aspects by human evaluation. Using ARTS, we analyze the robustness of nine ABSA models, and observe, surprisingly, that their accuracy drops by up to 69.73%. Our code and new test set are available at https://github.com/zhijing-jin/ARTS_TestSet.

Multiscale methods for second order elliptic equations based on non-overlapping domain decomposition schemes have great potential to take advantage of multi-core, state-of-the-art parallel computers. These methods typically involve solving local boundary value problems followed by the solution of a global interface problem. Known iterative procedures for the solution of the interface problem have typically slow convergence, increasing the overall cost of the multiscale solver. To overcome this problem we develop a scalable recursive solution method for such interface problem that replaces the global problem by a family of small interface systems associated with adjacent subdomains, in a hierarchy of nested subdomains. Then, we propose a novel parallel algorithm to implement our recursive formulation in multi-core devices using the Multiscale Robin Coupled Method by Guiraldello et al. (2018), that can be seen as a generalization of several multiscale mixed methods. Through several numerical studies we show that the new algorithm is very fast and exhibits excellent strong and weak scalability. We consider very large problems, that can have billions of discretization cells, motivated by the numerical simulation of subsurface flows.

This paper presents a methodology and toolkit for creating a rule-based multi-domain conversational agent for transactions from (1) language annotations of the domains' database schemas and APIs and (2) a couple of hundreds of annotated human dialogues. There is no need for a large annotated training set, which is expensive to acquire. The toolkit uses a pre-defined abstract dialogue state machine to synthesize millions of dialogues based on the domains' information. The annotated and synthesized data are used to train a contextual semantic parser that interprets the user's latest utterance in the context of a formal representation of the conversation up to that point. Developers can refine the state machine to achieve higher accuracy. On the MultiWOZ benchmark, we achieve over 71% turn-by-turn slot accuracy on a cleaned, reannotated test set, without using any of the original training data. Our state machine can model 96% of the human agent turns. Our training strategy improves by 9% over a baseline that uses the same amount of hand-labeled data, showing the benefit of synthesizing data using the state machine.

Multiscale shape skeletonization on pixel adjacency graphs is an advanced intriguing research subject in the field of image processing, computer vision and data mining. The previous works in this area almost focused on the graph vertices. We proposed novel structured based graph morphological transformations based on edges opposite to the current node based transformations and used them for deploying skeletonization and reconstruction of infrared thermal images represented by graphs. The advantage of this method is that many widely used path based approaches become available within this definition of morphological operations. For instance, we use distance maps and image foresting transform (IFT) as two main path based methods are utilized for computing the skeleton of an image. Moreover, In addition, the open question proposed by Maragos et al (2013) about connectivity of graph skeletonization method are discussed and shown to be quite difficult to decide in general case.

Data classification is a major machine learning paradigm, which has been widely applied to solve a large number of real-world problems. Traditional data classification techniques consider only physical features (e.g., distance, similarity, or distribution) of the input data. For this reason, those are called \textit{low-level} classification. On the other hand, the human (animal) brain performs both low and high orders of learning and it has a facility in identifying patterns according to the semantic meaning of the input data. Data classification that considers not only physical attributes but also the pattern formation is referred to as \textit{high-level} classification. Several high-level classification techniques have been developed, which make use of complex networks to characterize data patterns and have obtained promising results. In this paper, we propose a pure network-based high-level classification technique that uses the betweenness centrality measure. We test this model in nine different real datasets and compare it with other nine traditional and well-known classification models. The results show us a competent classification performance.

For supervised learning models, the analysis of generalization ability (generalizability) is vital because the generalizability expresses how well a model will perform on unseen data. Traditional generalization methods, such as the VC dimension, do not apply to deep neural network (DNN) models. Thus, new theories to explain the generalizability of DNNs are required. In this study, we hypothesize that the DNN with a simpler decision boundary has better generalizability by the law of parsimony (Occam's Razor). We create the decision boundary complexity (DBC) score to define and measure the complexity of decision boundary of DNNs. The idea of the DBC score is to generate data points (called adversarial examples) on or near the decision boundary. Our new approach then measures the complexity of the boundary using the entropy of eigenvalues of these data. The method works equally well for high-dimensional data. We use training data and the trained model to compute the DBC score. And, the ground truth for model's generalizability is its test accuracy. Experiments based on the DBC score have verified our hypothesis. The DBC is shown to provide an effective method to measure the complexity of a decision boundary and gives a quantitative measure of the generalizability of DNNs.

Facility location problems often permit facilities to be located at any position. But what if this is not the case in practice? What if facilities can only be located at particular locations like a highway exit or close to a bus stop? We consider here the impact of such constraints on the location of facilities on the performance of strategy proof mechanisms for locating facilities.We study four different performance objectives: the total distance agents must travel to their closest facility, the maximum distance any agent must travel to their closest facility, and the utilitarian and egalitarian welfare.We show that constraining facilities to a limited set of locations makes all four objectives harder to approximate in general.

We study the impact on mechanisms for facility location of moving from one dimension to two (or more) dimensions and Euclidean or Manhattan distances. We consider three fundamental axiomatic properties: anonymity which is a basic fairness property, Pareto optimality which is one of the most important efficiency properties, and strategy proofness which ensures agents do not have an incentive to mis-report. We also consider how well such mechanisms can approximate the optimal welfare. Our results are somewhat negative. Moving from one dimension to two (or more) dimensions often makes these axiomatic properties more difficult to achieve. For example, with two facilities in Euclidean space or with just a single facility in Manhattan space, no mechanism is anonymous, Pareto optimal and strategy proof. By contrast, mechanisms on the line exist with all three properties.We also show that approximation ratios may increase when moving to two (or more) dimensions. All our impossibility results are minimal. If we drop one of the three axioms (anonymity, Pareto optimality or strategy proofness) multiple mechanisms satisfy the other two axioms.

Increasing single-cell DRAM error rates have pushed DRAM manufacturers to adopt on-die error-correction coding (ECC), which operates entirely within a DRAM chip to improve factory yield. The on-die ECC function and its effects on DRAM reliability are considered trade secrets, so only the manufacturer knows precisely how on-die ECC alters the externally-visible reliability characteristics. Consequently, on-die ECC obstructs third-party DRAM customers (e.g., test engineers, experimental researchers), who typically design, test, and validate systems based on these characteristics. To give third parties insight into precisely how on-die ECC transforms DRAM error patterns during error correction, we introduce Bit-Exact ECC Recovery (BEER), a new methodology for determining the full DRAM on-die ECC function (i.e., its parity-check matrix) without hardware tools, prerequisite knowledge about the DRAM chip or on-die ECC mechanism, or access to ECC metadata (e.g., error syndromes, parity information). BEER exploits the key insight that non-intrusively inducing data-retention errors with carefully-crafted test patterns reveals behavior that is unique to a specific ECC function. We use BEER to identify the ECC functions of 80 real LPDDR4 DRAM chips with on-die ECC from three major DRAM manufacturers. We evaluate BEER's correctness in simulation and performance on a real system to show that BEER is effective and practical across a wide range of on-die ECC functions. To demonstrate BEER's value, we propose and discuss several ways that third parties can use BEER to improve their design and testing practices. As a concrete example, we introduce and evaluate BEEP, the first error profiling methodology that uses the known on-die ECC function to recover the number and bit-exact locations of unobservable raw bit errors responsible for observable post-correction errors.

An important feature of many real world facility location problems are capacity limits on the facilities. We show here how capacity constraints make it harder to design strategy proof mechanisms for facility location, but counter-intuitively can improve the guarantees on how well we can approximate the optimal solution.

It is well observed that in deep learning and computer vision literature, visual data are always represented in a manually designed coding scheme (eg., RGB images are represented as integers ranging from 0 to 255 for each channel) when they are input to an end-to-end deep neural network (DNN) for any learning task. We boldly question whether the manually designed inputs are good for DNN training for different tasks and study whether the input to a DNN can be optimally learned end-to-end together with learning the weights of the DNN. In this paper, we propose the paradigm of {\em deep collective learning} which aims to learn the weights of DNNs and the inputs to DNNs simultaneously for given tasks. We note that collective learning has been implicitly but widely used in natural language processing while it has almost never been studied in computer vision. Consequently, we propose the lookup vision networks (Lookup-VNets) as a solution to deep collective learning in computer vision. This is achieved by associating each color in each channel with a vector in lookup tables. As learning inputs in computer vision has almost never been studied in the existing literature, we explore several aspects of this question through varieties of experiments on image classification tasks. Experimental results on four benchmark datasets, i.e., CIFAR-10, CIFAR-100, Tiny ImageNet, and ImageNet (ILSVRC2012) have shown several surprising characteristics of Lookup-VNets and have demonstrated the advantages and promise of Lookup-VNets and deep collective learning.

Component-based development is challenging in a distributed setting, for starters considering programming a task may involve the assembly of loosely-coupled remote components. In order for the task to be fulfilled, the supporting interaction among components should follow a well-defined protocol. In this paper we address a model for message passing component-based systems where components are assembled together with the protocol itself. Components can therefore be independent from the protocol, and reactive to messages in a flexible way. Our contribution is at the level of the type language that allows to capture component behaviour so as to check its compatibility with a protocol. We show the correspondence of component and type behaviours, which entails a progress property for components.

We initiate the development of a model-driven testing framework for message-passing systems. The notion of test for communicating systems cannot simply be borrowed from existing proposals. Therefore, we formalize a notion of suitable distributed tests for a given choreography and devise an algorithm that generates tests as projections of global views. Our algorithm abstracts away from the actual projection operation, for which we only set basic requirements. The algorithm can be instantiated by reusing existing projection operations (designed to generate local implementations of global models) as they satisfy our requirements. Finally, we show the correctness of the approach and validate our methodology via an illustrative example.

We investigate refinement in the context of choreographies. We introduce refinable global choreographies allowing for the underspecification of protocols, whose interactions can be refined into actual protocols. Arbitrary refinements may spoil well-formedness, that is the sufficient conditions that guarantee a protocol to be implementable. We introduce a typing discipline that enforces well-formedness of typed choreographies. Then we unveil the relation among refinable choregraphies and their admissible refinements in terms of an axiom scheme.

Self-supervised representation learning is an emerging research topic for its powerful capacity in learning with unlabeled data. As a mainstream self-supervised learning method, augmentation-based contrastive learning has achieved great success in various computer vision tasks that lack manual annotations. Despite current progress, the existing methods are often limited by extra cost on memory or storage, and their performance still has large room for improvement. Here we present a self-supervised representation learning method, namely AAG, which is featured by an auxiliary augmentation strategy and GNT-Xent loss. The auxiliary augmentation is able to promote the performance of contrastive learning by increasing the diversity of images. The proposed GNT-Xent loss enables a steady and fast training process and yields competitive accuracy. Experiment results demonstrate the superiority of AAG to previous state-of-the-art methods on CIFAR10, CIFAR100, and SVHN. Especially, AAG achieves 94.5% top-1 accuracy on CIFAR10 with batch size 64, which is 0.5% higher than the best result of SimCLR with batch size 1024.

We propose a webly-supervised representation learning method that does not suffer from the annotation unscalability of supervised learning, nor the computation unscalability of self-supervised learning. Most existing works on webly-supervised representation learning adopt a vanilla supervised learning method without accounting for the prevalent noise in the training data, whereas most prior methods in learning with label noise are less effective for real-world large-scale noisy data. We propose momentum prototypes (MoPro), a simple contrastive learning method that achieves online label noise correction, out-of-distribution sample removal, and representation learning. MoPro achieves state-of-the-art performance on WebVision, a weakly-labeled noisy dataset. MoPro also shows superior performance when the pretrained model is transferred to down-stream image classification and detection tasks. It outperforms the ImageNet supervised pretrained model by +10.5 on 1-shot classification on VOC, and outperforms the best self-supervised pretrained model by +17.3 when finetuned on 1\% of ImageNet labeled samples. Furthermore, MoPro is more robust to distribution shifts. Code and pretrained models are available at https://github.com/salesforce/MoPro.

Spectre and Meltdown attacks and their variants exploit performance optimization features to cause security breaches. Secret information is accessed and leaked through micro-architectural covert channels. New attack variants keep appearing and we do not have a systematic way to capture the critical characteristics of these attacks and evaluate why they succeed. In this paper, we provide a new attack-graph model for reasoning about speculative micro-architectural attacks. We model attacks as ordered dependency graphs, and define a new concept, i.e. security dependency, between a resource access and its prior authorization operation. We prove that a missing security dependency is equivalent to a race condition between authorization and access, which is a root cause of speculative execution attacks. We show detailed examples of how our attack graph models the Spectre and Meltdown attacks, and is generalizable to all the attack variants published so far. We also show that this attack model is very useful for identifying new attacks and for generalizing defense strategies. We show that the defenses proposed so far all fit under one of our defense strategies. We also explain how attack graphs can be constructed and point to this as very promising future work for tool designers.

Current federated learning algorithms take tens of communication rounds transmitting unwieldy model weights under ideal circumstances and hundreds when data is poorly distributed. Inspired by recent work on dataset distillation and distributed one-shot learning, we propose Distilled One-Shot Federated Learning, which reduces the number of communication rounds required to train a performant model to only one. Each client distills their private dataset and sends the synthetic data (e.g. images or sentences) to the server. The distilled data look like noise and become useless after model fitting. We empirically show that, in only one round of communication, our method can achieve 96% test accuracy on federated MNIST with LeNet (centralized 99%), 81% on federated IMDB with a customized CNN (centralized 86%), and 84% on federated TREC-6 with a Bi-LSTM (centralized 89%). Using only a few rounds, DOSFL can match the centralized baseline on all three tasks. By evading the need for model-wise updates (i.e., weights, gradients, loss, etc.), the total communication cost of DOSFL is reduced by over an order of magnitude. We believe that DOSFL represents a new direction orthogonal to previous work, towards weight-less and gradient-less federated learning.

There has been a recent wave of interest in intermediate trust models for differential privacy that eliminate the need for a fully trusted central data collector, but overcome the limitations of local differential privacy. This interest has led to the introduction of the shuffle model (Cheu et al., EUROCRYPT 2019; Erlingsson et al., SODA 2019) and revisiting the pan-private model (Dwork et al., ITCS 2010). The message of this line of work is that, for a variety of low-dimensional problems---such as counts, means, and histograms---these intermediate models offer nearly as much power as central differential privacy. However, there has been considerably less success using these models for high-dimensional learning and estimation problems. In this work, we show that, for a variety of high-dimensional learning and estimation problems, both the shuffle model and the pan-private model inherently incur an exponential price in sample complexity relative to the central model. For example, we show that, private agnostic learning of parity functions over $d$ bits requires $\Omega(2^{d/2})$ samples in these models, and privately selecting the most common attribute from a set of $d$ choices requires $\Omega(d^{1/2})$ samples, both of which are exponential separations from the central model. Our work gives the first non-trivial lower bounds for these problems for both the pan-private model and the general multi-message shuffle model.

Large-scale planting of trees has been proposed as a low-cost natural solution for carbon mitigation, but is hampered by poor selection of plantation sites, especially in developing countries. To aid in site selection, we develop the ePSA (e-Plantation Site Assistant) recommendation system based on algorithm fusion that combines physics-based/traditional forestry science knowledge with machine learning. ePSA assists forest range officers by identifying blank patches inside forest areas and ranking each such patch based on their tree growth potential. Experiments, user studies, and deployment results characterize the utility of the recommender system in shaping the long-term success of tree plantations as a nature climate solution for carbon mitigation in northern India and beyond.

Video style transfer is getting more attention in AI community for its numerous applications such as augmented reality and animation productions. Compared with traditional image style transfer, performing this task on video presents new challenges: how to effectively generate satisfactory stylized results for any specified style, and maintain temporal coherence across frames at the same time. Towards this end, we propose Multi-Channel Correction network (MCCNet), which can be trained to fuse the exemplar style features and input content features for efficient style transfer while naturally maintaining the coherence of input videos. Specifically, MCCNet works directly on the feature space of style and content domain where it learns to rearrange and fuse style features based on their similarity with content features. The outputs generated by MCC are features containing the desired style patterns which can further be decoded into images with vivid style textures. Moreover, MCCNet is also designed to explicitly align the features to input which ensures the output maintains the content structures as well as the temporal continuity. To further improve the performance of MCCNet under complex light conditions, we also introduce the illumination loss during training. Qualitative and quantitative evaluations demonstrate that MCCNet performs well in both arbitrary video and image style transfer tasks.

A visual homograph attack is a way that the attacker deceives the web users about which domain they are visiting by exploiting forged domains that look similar to the genuine domains. T. Thao et al. (IFIP SEC'19) proposed a homograph classification by applying conventional supervised learning algorithms on the features extracted from a single-character-based Structural Similarity Index (SSIM). This paper aims to improve the classification accuracy by combining their SSIM features with 199 features extracted from a N-gram model and applying advanced ensemble learning algorithms. The experimental result showed that our proposed method could enhance even 1.81% of accuracy and reduce 2.15% of false-positive rate. Furthermore, existing work applied machine learning on some features without being able to explain why applying it can improve the accuracy. Even though the accuracy could be improved, understanding the ground-truth is also crucial. Therefore, in this paper, we conducted an error empirical analysis and could obtain several findings behind our proposed approach.

The peridynamic theory reformulates the governing equation of continuum mechanics in an integro-differential form,which brings advantages in dealing with discontinuities,dynamics,and non-locality.The integro-differential formulation poses challenges to numerical solutions of complicated problems.While various numerical methods based on discretizing the computational domain have been developed and have their own merits,some important issues are yet to be solved,such as the computation of infinite domains,the treatment of softening of boundaries due to an incomplete horizon,and time error accumulation in dynamic processes.In this work,we develop the peridynamic boundary element method (PD-BEM).To this end,the boundary integral equations for static and dynamic problems are derived,and the corresponding numerical frameworks are presented.For static loading,this method gives the explicit equation solved directly without iterations.For dynamic loading,we solve the problem in the Laplace domain and obtain the results in the time domain via inversion.This treatment eliminates time error accumulation,and facilitates parallel computation.The computational results on static and dynamic examples within the bond-based peridynamic formulation exhibit several features.First,for non-destructive cases,the PD-BEM can be one to two orders of magnitude faster than the peridynamic meshless particle method (PD-MPM);second,it conserves the total energy much better than the PD-MPM;third,it does not exhibit spurious boundary softening phenomena.For destructive cases where new boundaries emerge during the loading process,we propose a coupling scheme where the PD-MPM is applied to the cracked region and the PD-BEM is applied to the un-cracked region such that the time of computation can be significantly reduced.The present method can be generalized to other subjects such as diffusion and multi-physical problems.

Discrete optimization is one of the most intractable problems in deep hashing. Previous methods usually mitigate this problem by binary approximation, substituting binary codes for real-values via activation functions or regularizations. However, such approximation leads to uncertainty between real-values and binary ones, degrading retrieval performance. In this paper, we propose a novel Deep Momentum Uncertainty Hashing (DMUH). It explicitly estimates the uncertainty during training and leverages the uncertainty information to guide the approximation process. Specifically, we model \emph{bit-level uncertainty} via measuring the discrepancy between the output of a hashing network and that of a momentum-updated network. The discrepancy of each bit indicates the uncertainty of the hashing network to the approximate output of that bit. Meanwhile, the mean discrepancy of all bits in a hashing code can be regarded as \emph{image-level uncertainty}. It embodies the uncertainty of the hashing network to the corresponding input image. The hashing bit and the image with higher uncertainty are paid more attention during optimization. To the best of our knowledge, this is the first work to study the uncertainty in hashing bits. Extensive experiments are conducted on four datasets to verify the superiority of our method, including CIFAR-10, NUS-WIDE, MS-COCO, and a million-scale dataset Clothing1M. Our method achieves best performance on all datasets and surpasses existing state-of-the-arts by a large margin, especially on Clothing1M.

An important problem in terrain analysis is modeling how water flows across a terrain creating floods by forming channels and filling depressions. In this paper we study a number of \emph{flow-query} related problems: Given a terrain $\Sigma$, represented as a triangulated $xy$-monotone surface with $n$ vertices, a rain distribution $R$ which may vary over time, determine how much water is flowing over a given edge as a function of time. We develop internal-memory as well as I/O-efficient algorithms for flow queries. This paper contains four main results: (i) We present an internal-memory algorithm that preprocesses $\Sigma$ into a linear-size data structure that for a (possibly time varying) rain distribution $R$ can return the flow-rate functions of all edges of $\Sigma$ in $O(\rho k+|\phi| \log n)$ time, where $\rho$ is the number of sinks in $\Sigma$, $k$ is the number of times the rain distribution changes, and $|\phi|$ is the total complexity of the flow-rate functions that have non-zero values; (ii) We also present an I/O-efficient algorithm for preprocessing $\Sigma$ into a linear-size data structure so that for a rain distribution $R$, it can compute the flow-rate function of all edges using $O(\text{Sort}(|\phi|))$ I/Os and $O(|\phi| \log |\phi|)$ internal computation time. (iii) $\Sigma$ can be preprocessed into a linear-size data structure so that for a given rain distribution $R$, the flow-rate function of an edge $(q,r) \in \Sigma$ under the single-flow direction (SFD) model can be computed more efficiently. (iv) We present an algorithm for computing the two-dimensional channel along which water flows using Manning's equation; a widely used empirical equation that relates the flow-rate of water in an open channel to the geometry of the channel along with the height of water in the channel.

This paper presents a neural network model to generate virtual violinist's 3-D skeleton movements from music audio. Improved from the conventional recurrent neural network models for generating 2-D skeleton data in previous works, the proposed model incorporates an encoder-decoder architecture, as well as the self-attention mechanism to model the complicated dynamics in body movement sequences. To facilitate the optimization of self-attention model, beat tracking is applied to determine effective sizes and boundaries of the training examples. The decoder is accompanied with a refining network and a bowing attack inference mechanism to emphasize the right-hand behavior and bowing attack timing. Both objective and subjective evaluations reveal that the proposed model outperforms the state-of-the-art methods. To the best of our knowledge, this work represents the first attempt to generate 3-D violinists' body movements considering key features in musical body movement.

Deep neural network (DNN), especially convolutional neural network, has achieved superior performance on image classification tasks. However, such performance is only guaranteed if the input to a trained model is similar to the training samples, i.e., the input follows the probability distribution of the training set. Out-Of-Distribution (OOD) samples do not follow the distribution of training set, and therefore the predicted class labels on OOD samples become meaningless. Classification-based methods have been proposed for OOD detection; however, in this study we show that this type of method is theoretically ineffective and practically breakable because of dimensionality reduction in the model. We also show that Glow likelihood-based OOD detection is ineffective as well. Our analysis is demonstrated on five open datasets, including a COVID-19 CT dataset. At last, we present a simple theoretical solution with guaranteed performance for OOD detection.

Summarization of multimedia data becomes increasingly significant as it is the basis for many real-world applications, such as question answering, Web search, and so forth. Most existing multi-modal summarization works however have used visual complementary features extracted from images rather than videos, thereby losing abundant information. Hence, we propose a novel multi-modal summarization task to summarize from a document and its associated video. In this work, we also build a baseline general model with effective strategies, i.e., bi-hop attention and improved late fusion mechanisms to bridge the gap between different modalities, and a bi-stream summarization strategy to employ text and video summarization simultaneously. Comprehensive experiments show that the proposed model is beneficial for multi-modal summarization and superior to existing methods. Moreover, we collect a novel dataset and it provides a new resource for future study that results from documents and videos.

Modern vehicles are equipped with various driver-assistance systems, including automatic lane keeping, which prevents unintended lane departures. Traditional lane detection methods incorporate handcrafted or deep learning-based features followed by postprocessing techniques for lane extraction using RGB cameras. The utilization of a RGB camera for lane detection tasks is prone to illumination variations, sun glare, and motion blur, which limits the performance of the lane detection method. The incorporation of an event camera for lane detection tasks in the perception stack of autonomous driving is one of the most promising solutions for mitigating challenges encountered by RGB cameras. In this work, Lane Detection using dynamic vision sensor (LDNet), is proposed, that is designed in an encoder-decoder manner with an atrous spatial pyramid pooling block followed by an attention-guided decoder for predicting and reducing false predictions in lane detection tasks. This decoder eliminates the implicit need for a postprocessing step. The experimental results show the significant improvement of $5.54\%$ and $5.03\%$ on the $F1$ scores in the multiclass and binary class lane detection tasks, respectively. Additionally, the $IoU$ scores of the proposed method surpass those of the best-performing state-of-the-art method by $6.50\%$ and $9.37\%$ in the multiclass and binary class tasks, respectively.

This work investigates the electrical impedance tomography (EIT) problem when only limited boundary measurements are available, which is known to be challenging due to the extreme ill-posedness. Based on the direct sampling method (DSM), we propose deep direct sampling methods (DDSMs) to locate inhomogeneous inclusions in which two types of deep neural networks (DNNs) are constructed to approximate the index function(functional): fully connected neural network(FNN) and convolutional neural network (CNN). The proposed DDSMs are easy to be implemented, capable of incorporating multiple Cauchy data pairs to achieve high-quality reconstruction and highly robust with respect to large noise. Additionally, the implementation of DDSMs adopts offline-online decomposition, which helps to reduce a lot of computational costs and makes DDSMs as efficient as the conventional DSM. The numerical experiments are presented to demonstrate the efficacy and show the potential benefits of combining DNN with DSM.

Most of the current user authentication systems are based on PIN code, password, or biometrics traits which can have some limitations in usage and security. Lifestyle authentication has become a new research approach. A promising idea for it is to use the location history since it is relatively unique. Even when people are living in the same area or have occasional travel, it does not vary from day to day. For Global Positioning System (GPS) data, the previous work used the longitude, the latitude, and the timestamp as the features for the classification. In this paper, we investigate a new approach utilizing the distance coherence which can be extracted from the GPS itself without the need to require other information. We applied three ensemble classification RandomForest, ExtraTrees, and Bagging algorithms; and the experimental result showed that the approach can achieve 99.42%, 99.12%, and 99.25% of accuracy, respectively.

Manually authoring 3D shapes is difficult and time consuming; generative models of 3D shapes offer compelling alternatives. Procedural representations are one such possibility: they offer high-quality and editable results but are difficult to author and often produce outputs with limited diversity. On the other extreme are deep generative models: given enough data, they can learn to generate any class of shape but their outputs have artifacts and the representation is not editable. In this paper, we take a step towards achieving the best of both worlds for novel 3D shape synthesis. We propose ShapeAssembly, a domain-specific "assembly-language" for 3D shape structures. ShapeAssembly programs construct shapes by declaring cuboid part proxies and attaching them to one another, in a hierarchical and symmetrical fashion. Its functions are parameterized with free variables, so that one program structure is able to capture a family of related shapes. We show how to extract ShapeAssembly programs from existing shape structures in the PartNet dataset. Then we train a deep generative model, a hierarchical sequence VAE, that learns to write novel ShapeAssembly programs. The program captures the subset of variability that is interpretable and editable. The deep model captures correlations across shape collections that are hard to express procedurally. We evaluate our approach by comparing shapes output by our generated programs to those from other recent shape structure synthesis models. We find that our generated shapes are more plausible and physically-valid than those of other methods. Additionally, we assess the latent spaces of these models, and find that ours is better structured and produces smoother interpolations. As an application, we use our generative model and differentiable program interpreter to infer and fit shape programs to unstructured geometry, such as point clouds.

Close your eyes and listen to music, one can easily imagine an actor dancing rhythmically along with the music. These dance movements are usually made up of dance movements you have seen before. In this paper, we propose to reproduce such an inherent capability of the human-being within a computer vision system. The proposed system consists of three modules. To explore the relationship between music and dance movements, we propose a cross-modal alignment module that focuses on dancing video clips, accompanied on pre-designed music, to learn a system that can judge the consistency between the visual features of pose sequences and the acoustic features of music. The learned model is then used in the imagination module to select a pose sequence for the given music. Such pose sequence selected from the music, however, is usually discontinuous. To solve this problem, in the spatial-temporal alignment module we develop a spatial alignment algorithm based on the tendency and periodicity of dance movements to predict dance movements between discontinuous fragments. In addition, the selected pose sequence is often misaligned with the music beat. To solve this problem, we further develop a temporal alignment algorithm to align the rhythm of music and dance. Finally, the processed pose sequence is used to synthesize realistic dancing videos in the imagination module. The generated dancing videos match the content and rhythm of the music. Experimental results and subjective evaluations show that the proposed approach can perform the function of generating promising dancing videos by inputting music.

We give an efficient algorithm to strongly refute \emph{semi-random} instances of all Boolean constraint satisfaction problems. The number of constraints required by our algorithm matches (up to polylogarithmic factors) the best-known bounds for efficient refutation of fully random instances. Our main technical contribution is an algorithm to strongly refute semi-random instances of the Boolean $k$-XOR problem on $n$ variables that have $\widetilde{O}(n^{k/2})$ constraints. (In a semi-random $k$-XOR instance, the equations can be arbitrary and only the right-hand sides are random.) One of our key insights is to identify a simple combinatorial property of random XOR instances that makes spectral refutation work. Our approach involves taking an instance that does not satisfy this property (i.e., is \emph{not} pseudorandom) and reducing it to a partitioned collection of $2$-XOR instances. We analyze these subinstances using a carefully chosen quadratic form as a proxy, which in turn is bounded via a combination of spectral methods and semidefinite programming. The analysis of our spectral bounds relies only on an off-the-shelf matrix Bernstein inequality. Even for the purely random case, this leads to a shorter proof compared to the ones in the literature that rely on problem-specific trace-moment computations.

An exoskeleton is a wearable electromechanical structure that is intended to resemble and allow movements in a manner similar to the human skeletal system. They can be used by both disabled and able people alike to increase physical strength in carrying out tasks that would be otherwise difficult, or as a rehabilitation device to aid in physiotherapeutic activities of a weakened body part. This paper intends to introduce a voicecontrolled upper body exoskeleton for industrial applications which can aid workers wearing it by reducing stresses on their arms and shoulders over longer periods and add up to 20kg more strength in lifting applications. The 3D design, calculations and considerations, and load analysis are presented along with brief results of a basic prototype model of the exoskeleton.

8-bit integer inference, as a promising direction in reducing both the latency and storage of deep neural networks, has made great progress recently. On the other hand, previous systems still rely on 32-bit floating point for certain functions in complex models (e.g., Softmax in Transformer), and make heavy use of quantization and de-quantization. In this work, we show that after a principled modification on the Transformer architecture, dubbed Integer Transformer, an (almost) fully 8-bit integer inference algorithm Scale Propagation could be derived. De-quantization is adopted when necessary, which makes the network more efficient. Our experiments on WMT16 En<->Ro, WMT14 En<->De and En->Fr translation tasks as well as the WikiText-103 language modelling task show that the fully 8-bit Transformer system achieves comparable performance with the floating point baseline but requires nearly 4X less memory footprint.

Segmentation of handwritten document images into text lines and words is one of the most significant and challenging tasks in the development of a complete Optical Character Recognition (OCR) system. This paper addresses the automatic segmentation of text words directly from unconstrained Bangla handwritten document images. The popular Distance transform (DT) algorithm is applied for locating the outer boundary of the word images. This technique is free from generating the over-segmented words. A simple post-processing procedure is applied to isolate the under-segmented word images, if any. The proposed technique is tested on 50 random images taken from CMATERdb1.1.1 database. Satisfactory result is achieved with a segmentation accuracy of 91.88% which confirms the robustness of the proposed methodology.

Recently, reconfigurable intelligent surfaces (RIS) have attracted a lot of attention due to their capability of extending cell coverage by reflecting signals toward the receiver. In this letter, we analyze the coverage of a downlink RIS-assisted network with one base station (BS) and one user equipment (UE). Since the RIS orientation and the horizontal distance between the RIS and the BS have a significant influence on the cell coverage, we formulate an RIS placement optimization problem to maximize the cell coverage by optimizing the RIS orientation and horizontal distance. To solve the formulated problem, a coverage maximization algorithm (CMA) is proposed, where a closed-form optimal RIS orientation is obtained. Numerical results verify our analysis.

We propose a variant of VAE capable of disentangling both variations within each class and variations shared across all classes. To represent these generative factors of data, we introduce two sets of continuous latent variables, private variable and public variable. Our proposed framework models the private variable as a Mixture of Gaussian and the public variable as a Gaussian, respectively. Each mode of the private variable is responsible for a class of the discrete variable. Most of the previous attempts to integrate the discrete generative factors to disentanglement assume statistical independence between the continuous and discrete variables. However, this assumption does not hold in general. Our proposed model, which we call Discond-VAE, DISentangles the class-dependent CONtinuous factors from the Discrete factors by introducing the private variables. The experiments show that Discond-VAE can discover the private and public factors from data qualitatively and quantitatively.

Unmanned vehicles often need to locate targets with high precision during work. In the unmanned material handling workshop, the unmanned vehicle needs to perform high-precision pose estimation of the workpiece to accurately grasp the workpiece. In this context, this paper proposes a high-precision unmanned vehicle target positioning system based on binocular vision. The system uses a region-based stereo matching algorithm to obtain a disparity map, and uses the RANSAC algorithm to extract position and posture features, which achives the estimation of the position and attitude of a six-degree-of-freedom cylindrical workpiece. In order to verify the effect of the system, this paper collects the accuracy and calculation time of the output results of the cylinder in different poses. The experimental data shows that the position accuracy of the system is 0.61~1.17mm and the angular accuracy is 1.95~5.13{\deg}, which can achieve better high-precision positioning effect.

Video Question Answering (Video QA) requires fine-grained understanding of both video and language modalities to answer the given questions. In this paper, we propose novel training schemes for multiple-choice video question answering with a self-supervised pre-training stage and a supervised contrastive learning in the main stage as an auxiliary learning. In the self-supervised pre-training stage, we transform the original problem format of predicting the correct answer into the one that predicts the relevant question to provide a model with broader contextual inputs without any further dataset or annotation. For contrastive learning in the main stage, we add a masking noise to the input corresponding to the ground-truth answer, and consider the original input of the ground-truth answer as a positive sample, while treating the rest as negative samples. By mapping the positive sample closer to the masked input, we show that the model performance is improved. We further employ locally aligned attention to focus more effectively on the video frames that are particularly relevant to the given corresponding subtitle sentences. We evaluate our proposed model on highly competitive benchmark datasets related to multiple-choice videoQA: TVQA, TVQA+, and DramaQA. Experimental results show that our model achieves state-of-the-art performance on all datasets. We also validate our approaches through further analyses.

Deploying Machine Learning (ML) algorithms within databases is a challenge due to the varied computational footprints of modern ML algorithms and the myriad of database technologies each with their own restrictive syntax. We introduce an Apache Spark-based micro-service orchestration framework that extends database operations to include web service primitives. Our system can orchestrate web services across hundreds of machines and takes full advantage of cluster, thread, and asynchronous parallelism. Using this framework, we provide large scale clients for intelligent services such as speech, vision, search, anomaly detection, and text analysis. This allows users to integrate ready-to-use intelligence into any datastore with an Apache Spark connector. To eliminate the majority of overhead from network communication, we also introduce a low-latency containerized version of our architecture. Finally, we demonstrate that the services we investigate are competitive on a variety of benchmarks, and present two applications of this framework to create intelligent search engines, and real time auto race analytics systems.

Conventional image retrieval techniques for Structure-from-Motion (SfM) suffer from the limit of effectively recognizing repetitive patterns and cannot guarantee to create just enough match pairs with high precision and high recall. In this paper, we present a novel retrieval method based on Graph Convolutional Network (GCN) to generate accurate pairwise matches without costly redundancy. We formulate image retrieval task as a node binary classification problem in graph data: a node is marked as positive if it shares the scene overlaps with the query image. The key idea is that we find that the local context in feature space around a query image contains rich information about the matchable relation between this image and its neighbors. By constructing a subgraph surrounding the query image as input data, we adopt a learnable GCN to exploit whether nodes in the subgraph have overlapping regions with the query photograph. Experiments demonstrate that our method performs remarkably well on the challenging dataset of highly ambiguous and duplicated scenes. Besides, compared with state-of-the-art matchable retrieval methods, the proposed approach significantly reduces useless attempted matches without sacrificing the accuracy and completeness of reconstruction.

The heavy traffic congestion problem has always been a concern for modern cities. To alleviate traffic congestion, researchers use reinforcement learning (RL) to develop better traffic signal control (TSC) algorithms in recent years. However, most RL models are trained and tested in the same traffic flow environment, which results in a serious overfitting problem. Since the traffic flow environment in the real world keeps varying, these models can hardly be applied due to the lack of generalization ability. Besides, the limited number of accessible traffic flow data brings extra difficulty in testing the generalization ability of the models. In this paper, we design a novel traffic flow generator based on Wasserstein generative adversarial network to generate sufficient diverse and quality traffic flows and use them to build proper training and testing environments. Then we propose a meta-RL TSC framework GeneraLight to improve the generalization ability of TSC models. GeneraLight boosts the generalization performance by combining the idea of flow clustering and model-agnostic meta-learning. We conduct extensive experiments on multiple real-world datasets to show the superior performance of GeneraLight on generalizing to different traffic flows.

The majority of adversarial machine learning research focuses on additive threat models, which add adversarial perturbation to input data. On the other hand, unlike image recognition problems, only a handful of threat models have been explored in the video domain. In this paper, we propose a novel adversarial attack against video recognition models, Multiplicative Adversarial Videos (MultAV), which imposes perturbation on video data by multiplication. MultAV has different noise distributions to the additive counterparts and thus challenges the defense methods tailored to resisting additive attacks. Moreover, it can be generalized to not only Lp-norm attacks with a new adversary constraint called ratio bound, but also different types of physically realizable attacks. Experimental results show that the model adversarially trained against additive attack is less robust to MultAV.

Randomized smoothing has been shown to provide good certified-robustness guarantees for high-dimensional classification problems. It uses the probabilities of predicting the top two most-likely classes around an input point under a smoothing distribution to generate a certified radius for a classifier's prediction. However, most smoothing methods do not give us any information about the \emph{confidence} with which the underlying classifier (e.g., deep neural network) makes a prediction. In this work, we propose a method to generate certified radii for the prediction confidence of the smoothed classifier. We consider two notions for quantifying confidence: average prediction score of a class and the margin by which the average prediction score of one class exceeds that of another. We modify the Neyman-Pearson lemma (a key theorem in randomized smoothing) to design a procedure for computing the certified radius where the confidence is guaranteed to stay above a certain threshold. Our experimental results on CIFAR-10 and ImageNet datasets show that using information about the distribution of the confidence scores allows us to achieve a significantly better certified radius than ignoring it. Thus, we demonstrate that extra information about the base classifier at the input point can help improve certified guarantees for the smoothed classifier.

In this paper, we consider data acquired by multimodal sensors capturing complementary aspects and features of a measured phenomenon. We focus on a scenario in which the measurements share mutual sources of variability but might also be contaminated by other measurement-specific sources such as interferences or noise. Our approach combines manifold learning, which is a class of nonlinear data-driven dimension reduction methods, with the well-known Riemannian geometry of symmetric and positive-definite (SPD) matrices. Manifold learning typically includes the spectral analysis of a kernel built from the measurements. Here, we take a different approach, utilizing the Riemannian geometry of the kernels. In particular, we study the way the spectrum of the kernels changes along geodesic paths on the manifold of SPD matrices. We show that this change enables us, in a purely unsupervised manner, to derive a compact, yet informative, description of the relations between the measurements, in terms of their underlying components. Based on this result, we present new algorithms for extracting the common latent components and for identifying common and measurement-specific components.

Differentially private federated learning has been intensively studied. The current works are mainly based on the \textit{curator model} or \textit{local model} of differential privacy. However, both of them have pros and cons. The curator model allows greater accuracy but requires a trusted analyzer. In the local model where users randomize local data before sending them to the analyzer, a trusted analyzer is not required, but the accuracy is limited. In this work, by leveraging the \textit{privacy amplification} effect in the recently proposed shuffle model of differential privacy, we achieve the best of two worlds, i.e., accuracy in the curator model and strong privacy without relying on any trusted party. We first propose an FL framework in the shuffle model and a simple protocol (SS-Simple) extended from existing work. We find that SS-Simple only provides an insufficient privacy amplification effect in FL since the dimension of the model parameter is quite large. To solve this challenge, we propose an enhanced protocol (SS-Double) to increase the privacy amplification effect by subsampling. Furthermore, for boosting the utility when the model size is greater than the user population, we propose an advanced protocol (SS-Topk) with gradient sparsification techniques. We also provide theoretical analysis and numerical evaluations of the privacy amplification of the proposed protocols. Experiments on real-world datasets validate that SS-Topk improves the testing accuracy by 60.7\% than the local model based FL. We highlight the observation that SS-Topk even can improve by 33.94\% accuracy than the curator model based FL without any trusted party. Compared with non-private FL, our protocol SS-Topk only lose 1.48\% accuracy under $(4.696, 10^{-5})$-DP.

Pretrained large-scale language models have increasingly demonstrated high accuracy on many natural language processing (NLP) tasks. However, the limited weight storage and computational speed on hardware platforms have impeded the popularity of pretrained models, especially in the era of edge computing. In this work, we propose an efficient transformer-based large-scale language representation using hardware-friendly block structure pruning. We incorporate the reweighted group Lasso into block-structured pruning for optimization. Besides the significantly reduced weight storage and computation, the proposed approach achieves high compression rates. Experimental results on different models (BERT, RoBERTa, and DistilBERT) on the General Language Understanding Evaluation (GLUE) benchmark tasks show that we achieve up to 5.0x with zero or minor accuracy degradation on certain task(s). Our proposed method is also orthogonal to existing compact pretrained language models such as DistilBERT using knowledge distillation, since a further 1.79x average compression rate can be achieved on top of DistilBERT with zero or minor accuracy degradation. It is suitable to deploy the final compressed model on resource-constrained edge devices.

Recent work (Feng et al., 2018) establishes the presence of short, uninterpretable input fragments that yield high confidence and accuracy in neural models. We refer to these as Minimal Prediction Preserving Inputs (MPPIs). In the context of question answering, we investigate competing hypotheses for the existence of MPPIs, including poor posterior calibration of neural models, lack of pretraining, and "dataset bias" (where a model learns to attend to spurious, non-generalizable cues in the training data). We discover a perplexing invariance of MPPIs to random training seed, model architecture, pretraining, and training domain. MPPIs demonstrate remarkable transferability across domains - closing half the gap between models' performance on comparably short queries and original queries. Additionally, penalizing over-confidence on MPPIs fails to improve either generalization or adversarial robustness. These results suggest the interpretability of MPPIs is insufficient to characterize generalization capacity of these models. We hope this focused investigation encourages a more systematic analysis of model behavior outside of the human interpretable distribution of examples.

The heterogeneous network is a robust data abstraction that can model entities of different types interacting in various ways. Such heterogeneity brings rich semantic information but presents nontrivial challenges in aggregating the heterogeneous relationships between objects - especially those of higher-order indirect relations. Recent graph neural network approaches for representation learning on heterogeneous networks typically employ the attention mechanism, which is often only optimized for predictions based on direct links. Furthermore, even though most deep learning methods can aggregate higher-order information by building deeper models, such a scheme can diminish the degree of interpretability. To overcome these challenges, we explore an architecture - Layer-stacked ATTention Embedding (LATTE) - that automatically decomposes higher-order meta relations at each layer to extract the relevant heterogeneous neighborhood structures for each node. Additionally, by successively stacking layer representations, the learned node embedding offers a more interpretable aggregation scheme for nodes of different types at different neighborhood ranges. We conducted experiments on several benchmark heterogeneous network datasets. In both transductive and inductive node classification tasks, LATTE can achieve state-of-the-art performance compared to existing approaches, all while offering a lightweight model. With extensive experimental analyses and visualizations, the framework can demonstrate the ability to extract informative insights on heterogeneous networks.

In this paper we propose and analyze a finite difference numerical scheme for the Poisson-Nernst-Planck equation (PNP) system. To understand the energy structure of the PNP model, we make use of the Energetic Variational Approach (EnVarA), so that the PNP system could be reformulated as a non-constant mobility $H^{-1}$ gradient flow, with singular logarithmic energy potentials involved. To ensure the unique solvability and energy stability, the mobility function is explicitly treated, while both the logarithmic and the electric potential diffusion terms are treated implicitly, due to the convex nature of these two energy functional parts. The positivity-preserving property for both concentrations, $n$ and $p$, is established at a theoretical level. This is based on the subtle fact that the singular nature of the logarithmic term around the value of $0$ prevents the numerical solution reaching the singular value, so that the numerical scheme is always well-defined. In addition, an optimal rate convergence analysis is provided in this work, in which many highly non-standard estimates have to be involved, due to the nonlinear parabolic coefficients. The higher order asymptotic expansion (up to third order temporal accuracy and fourth order spatial accuracy), the rough error estimate (to establish the $\ell^\infty$ bound for $n$ and $p$), and the refined error estimate have to be carried out to accomplish such a convergence result. In our knowledge, this work will be the first to combine the following three theoretical properties for a numerical scheme for the PNP system: (i) unique solvability and positivity, (ii) energy stability, and (iii) optimal rate convergence. A few numerical results are also presented in this article, which demonstrates the robustness of the proposed numerical scheme.

Music-to-visual style transfer is a challenging yet important cross-modal learning problem in the practice of creativity. Its major difference from the traditional image style transfer problem is that the style information is provided by music rather than images. Assuming that musical features can be properly mapped to visual contents through semantic links between the two domains, we solve the music-to-visual style transfer problem in two steps: music visualization and style transfer. The music visualization network utilizes an encoder-generator architecture with a conditional generative adversarial network to generate image-based music representations from music data. This network is integrated with an image style transfer method to accomplish the style transfer process. Experiments are conducted on WikiArt-IMSLP, a newly compiled dataset including Western music recordings and paintings listed by decades. By utilizing such a label to learn the semantic connection between paintings and music, we demonstrate that the proposed framework can generate diverse image style representations from a music piece, and these representations can unveil certain art forms of the same era. Subjective testing results also emphasize the role of the era label in improving the perceptual quality on the compatibility between music and visual content.

Traffic forecasting is an important prerequisite for the application of intelligent transportation systems in urban traffic networks. The existing works adopted RNN and CNN/GCN, among which GCRN is the state of art work, to characterize the temporal and spatial correlation of traffic flows. However, it is hard to apply GCRN to the large scale road networks due to high computational complexity. To address this problem, we propose to abstract the road network into a geometric graph and build a Fast Graph Convolution Recurrent Neural Network (FastGCRNN) to model the spatial-temporal dependencies of traffic flow. Specifically, We use FastGCN unit to efficiently capture the topological relationship between the roads and the surrounding roads in the graph with reducing the computational complexity through importance sampling, combine GRU unit to capture the temporal dependency of traffic flow, and embed the spatiotemporal features into Seq2Seq based on the Encoder-Decoder framework. Experiments on large-scale traffic data sets illustrate that the proposed method can greatly reduce computational complexity and memory consumption while maintaining relatively high accuracy.

This paper proposes a new pre-training method, called Code-Switching Pre-training (CSP for short) for Neural Machine Translation (NMT). Unlike traditional pre-training method which randomly masks some fragments of the input sentence, the proposed CSP randomly replaces some words in the source sentence with their translation words in the target language. Specifically, we firstly perform lexicon induction with unsupervised word embedding mapping between the source and target languages, and then randomly replace some words in the input sentence with their translation words according to the extracted translation lexicons. CSP adopts the encoder-decoder framework: its encoder takes the code-mixed sentence as input, and its decoder predicts the replaced fragment of the input sentence. In this way, CSP is able to pre-train the NMT model by explicitly making the most of the cross-lingual alignment information extracted from the source and target monolingual corpus. Additionally, we relieve the pretrain-finetune discrepancy caused by the artificial symbols like [mask]. To verify the effectiveness of the proposed method, we conduct extensive experiments on unsupervised and supervised NMT. Experimental results show that CSP achieves significant improvements over baselines without pre-training or with other pre-training methods.

Often in applications ranging from medical imaging and sensor networks to error correction and data science (and beyond), one needs to solve large-scale linear systems in which a fraction of the measurements have been corrupted. We consider solving such large-scale systems of linear equations $\mathbf{A}\mathbf{x}=\mathbf{b}$ that are inconsistent due to corruptions in the measurement vector $\mathbf{b}$. We develop several variants of iterative methods that converge to the solution of the uncorrupted system of equations, even in the presence of large corruptions. These methods make use of a quantile of the absolute values of the residual vector in determining the iterate update. We present both theoretical and empirical results that demonstrate the promise of these iterative approaches.

What is the frequency content of temporal logic formulas? That is, when we monitor a signal against a formula, which frequency bands of the signal are relevant to the logic and should be preserved, and which can be safely discarded? This question is relevant whenever signals are filtered or compressed before being monitored, which is almost always the case for analog signals. To answer this question, we focus on monitors that measure the robustness of a signal relative to a specification in Signal Temporal Logic. We prove that robustness monitors can be modeled using Volterra series. We then study the Fourier transforms of these Volterra representations, and provide a method to derive the Fourier transforms of entire formulas. We also make explicit the measurement process in temporal logic and re-define it on the basis of distributions to make it compatible with measurements in signal processing. Experiments illustrate these results. Beyond compression, this work enables the integration of temporal logic monitoring into common signal processing toolchains as just another signal processing operation, and enables a common formalism to study both logical and non-logical operations in the frequency domain, which we refer to as Logical Signal Processing.

We introduce a new notion of generalization-- Distributional Generalization-- which roughly states that outputs of a classifier at train and test time are close *as distributions*, as opposed to close in just their average error. For example, if we mislabel 30% of dogs as cats in the train set of CIFAR-10, then a ResNet trained to interpolation will in fact mislabel roughly 30% of dogs as cats on the *test set* as well, while leaving other classes unaffected. This behavior is not captured by classical generalization, which would only consider the average error and not the distribution of errors over the input domain. This example is a specific instance of our much more general conjectures which apply even on distributions where the Bayes risk is zero. Our conjectures characterize the form of distributional generalization that can be expected, in terms of problem parameters (model architecture, training procedure, number of samples, data distribution). We verify the quantitative predictions of these conjectures across a variety of domains in machine learning, including neural networks, kernel machines, and decision trees. These empirical observations are independently interesting, and form a more fine-grained characterization of interpolating classifiers beyond just their test error.

The global pandemic caused by COVID-19 affects our lives in all aspects. As of September 11, more than 28 million people have tested positive for COVID-19 infection, and more than 911,000 people have lost their lives in this virus battle. Some patients can not receive appropriate medical treatment due the limits of hospitalization volume and shortage of ICU beds. An estimated future hospitalization is critical so that medical resources can be allocated as needed. In this study, we propose to use 4 recurrent neural networks to infer hospitalization change for the following week compared with the current week. Results show that sequence to sequence model with attention achieves a high accuracy of 0.938 and AUC of 0.850 in the hospitalization prediction. Our work has the potential to predict the hospitalization need and send a warning to medical providers and other stakeholders when a re-surge initializes.

Deep neural networks have been shown to be vulnerable to membership inference attacks wherein the attacker aims to detect whether specific input data were used to train the model. These attacks can potentially leak private or proprietary data. We present a new extension of Fano's inequality and employ it to theoretically establish that the probability of success for a membership inference attack on a deep neural network can be bounded using the mutual information between its inputs and its activations. This enables the use of mutual information to measure the susceptibility of a DNN model to membership inference attacks. In our empirical evaluation, we show that the correlation between the mutual information and the susceptibility of the DNN model to membership inference attacks is 0.966, 0.996, and 0.955 for CIFAR-10, SVHN and GTSRB models, respectively.

To reach a broader audience and optimize traffic toward news articles, media outlets commonly run social media accounts and share their content with a short text summary. Despite its importance of writing a compelling message in sharing articles, research community does not own a sufficient level of understanding of what kinds of editing strategies are effective in promoting audience engagement. In this study, we aim to fill the gap by analyzing the current practices of media outlets using a data-driven approach. We first build a parallel corpus of original news articles and their corresponding tweets that were shared by eight media outlets. Then, we explore how those media edited tweets against original headlines, and the effects would be. To estimate the effects of editing news headlines for social media sharing in audience engagement, we present a systematic analysis that incorporates a causal inference technique with deep learning; using propensity score matching, it allows for estimating potential (dis-)advantages of an editing style compared to counterfactual cases where a similar news article is shared with a different style. According to the analyses of various editing styles, we report common and differing effects of the styles across the outlets. To understand the effects of various editing styles, media outlets could apply our easy-to-use tool by themselves.

In real-world applications, data do not reflect the ones commonly used for neural networks training, since they are usually few, unbalanced, unlabeled and can be available as a stream. Hence many existing deep learning solutions suffer from a limited range of applications, in particular in the case of online streaming data that evolve over time. To narrow this gap, in this work we introduce a novel and complex setting involving unsupervised meta-continual learning with unbalanced tasks. These tasks are built through a clustering procedure applied to a fitted embedding space. We exploit a meta-learning scheme that simultaneously alleviates catastrophic forgetting and favors the generalization to new tasks, even Out-of-Distribution ones. Moreover, to encourage feature reuse during the meta-optimization, we exploit a single inner loop taking advantage of an aggregated representation achieved through the use of a self-attention mechanism. Experimental results on few-shot learning benchmarks show competitive performance even compared to the supervised case. Additionally, we empirically observe that in an unsupervised scenario, the small tasks and the variability in the clusters pooling play a crucial role in the generalization capability of the network. Further, on complex datasets, the exploitation of more clusters than the true number of classes leads to higher results, even compared to the ones obtained with full supervision, suggesting that a predefined partitioning into classes can miss relevant structural information.

The field of computer vision has witnessed phenomenal progress in recent years partially due to the development of deep convolutional neural networks. However, deep learning models are notoriously sensitive to adversarial examples which are synthesized by adding quasi-perceptible noises on real images. Some existing defense methods require to re-train attacked target networks and augment the train set via known adversarial attacks, which is inefficient and might be unpromising with unknown attack types. To overcome the above issues, we propose a portable defense method, online alternate generator, which does not need to access or modify the parameters of the target networks. The proposed method works by online synthesizing another image from scratch for an input image, instead of removing or destroying adversarial noises. To avoid pretrained parameters exploited by attackers, we alternately update the generator and the synthesized image at the inference stage. Experimental results demonstrate that the proposed defensive scheme and method outperforms a series of state-of-the-art defending models against gray-box adversarial attacks.

Active inference is a normative framework for generating behaviour based upon the free energy principle, a global theory of self-organisation. This framework has been successfully used to solve reinforcement learning and stochastic control problems, yet, the formal relation between active inference and reward maximisation has not been fully explicated. In this paper, we consider the relation between active inference and dynamic programming under the Bellman equation, which underlies many approaches to reinforcement learning and control. Our contribution shows that, on finite-horizon partially observed Markov decision processes, dynamic programming is a limiting case of active inference. In active inference, agents select actions in order to maximise expected free energy. In the absence of ambiguity about the latent causes of outcomes, this reduces to matching a target distribution encoding the agent's preferences. When these target states correspond to rewarding states, this minimises risk or maximises expected reward, as in reinforcement learning. When states are partially observed or ambiguous, an active inference agent will choose the action that minimises both risk and ambiguity. This allows active inference agents to supplement their reward maximising (or exploitative) behaviour with novelty-seeking (or exploratory) behaviour. This speaks to the unifying potential of active inference, as the functional optimised during action selection subsumes many important quantities used in decision-making in the physical, engineering, and life sciences.

Recognizing toponyms and resolving them to their real-world referents is required for providing advanced semantic access to textual data. This process is often hindered by the high degree of variation in toponyms. Candidate selection is the task of identifying the potential entities that can be referred to by a toponym previously recognized. While it has traditionally received little attention in the research community, it has been shown that candidate selection has a significant impact on downstream tasks (i.e. entity resolution), especially in noisy or non-standard text. In this paper, we introduce a flexible deep learning method for candidate selection through toponym matching, using state-of-the-art neural network architectures. We perform an intrinsic toponym matching evaluation based on several new realistic datasets, which cover various challenging scenarios (cross-lingual and regional variations, as well as OCR errors). We report its performance on candidate selection in the context of the downstream task of toponym resolution, both on existing datasets and on a new manually-annotated resource of nineteenth-century English OCR'd text.

Structured belief states are crucial for user goal tracking and database query in task-oriented dialog systems. However, training belief trackers often requires expensive turn-level annotations of every user utterance. In this paper we aim at alleviating the reliance on belief state labels in building end-to-end dialog systems, by leveraging unlabeled dialog data towards semi-supervised learning. We propose a probabilistic dialog model, called the LAtent BElief State (LABES) model, where belief states are represented as discrete latent variables and jointly modeled with system responses given user inputs. Such latent variable modeling enables us to develop semi-supervised learning under the principled variational learning framework. Furthermore, we introduce LABES-S2S, which is a copy-augmented Seq2Seq model instantiation of LABES. In supervised experiments, LABES-S2S obtains strong results on three benchmark datasets of different scales. In utilizing unlabeled dialog data, semi-supervised LABES-S2S significantly outperforms both supervised-only and semi-supervised baselines. Remarkably, we can reduce the annotation demands to 50% without performance loss on MultiWOZ.

Object detectors are usually trained with large amount of labeled data, which is expensive and labor-intensive. Pre-trained detectors applied to unlabeled dataset always suffer from the difference of dataset distribution, also called domain shift. Domain adaptation for object detection tries to adapt the detector from labeled datasets to unlabeled ones for better performance. In this paper, we are the first to reveal that the region proposal network (RPN) and region proposal classifier~(RPC) in the endemic two-stage detectors (e.g., Faster RCNN) demonstrate significantly different transferability when facing large domain gap. The region classifier shows preferable performance but is limited without RPN's high-quality proposals while simple alignment in the backbone network is not effective enough for RPN adaptation. We delve into the consistency and the difference of RPN and RPC, treat them individually and leverage high-confidence output of one as mutual guidance to train the other. Moreover, the samples with low-confidence are used for discrepancy calculation between RPN and RPC and minimax optimization. Extensive experimental results on various scenarios have demonstrated the effectiveness of our proposed method in both domain-adaptive region proposal generation and object detection. Code is available at https://github.com/GanlongZhao/CST_DA_detection.

We present a method to automatically find security strategies for the use case of intrusion prevention. Following this method, we model the interaction between an attacker and a defender as a Markov game and let attack and defense strategies evolve through reinforcement learning and self-play without human intervention. Using a simple infrastructure configuration, we demonstrate that effective security strategies can emerge from self-play. This shows that self-play, which has been applied in other domains with great success, can be effective in the context of network security. Inspection of the converged policies show that the emerged policies reflect common-sense knowledge and are similar to strategies of humans. Moreover, we address known challenges of reinforcement learning in this domain and present an approach that uses function approximation, an opponent pool, and an autoregressive policy representation. Through evaluations we show that our method is superior to two baseline methods but that policy convergence in self-play remains a challenge.

Diffuse Large B-Cell Lymphoma (DLBCL) is the most common non-Hodgkin lymphoma. Though histologically DLBCL shows varying morphologies, no morphologic features have been consistently demonstrated to correlate with prognosis. We present a morphologic analysis of histology sections from 209 DLBCL cases with associated clinical and cytogenetic data. Duplicate tissue core sections were arranged in tissue microarrays (TMAs), and replicate sections were stained with H&E and immunohistochemical stains for CD10, BCL6, MUM1, BCL2, and MYC. The TMAs are accompanied by pathologist-annotated regions-of-interest (ROIs) that identify areas of tissue representative of DLBCL. We used a deep learning model to segment all tumor nuclei in the ROIs, and computed several geometric features for each segmented nucleus. We fit a Cox proportional hazards model to demonstrate the utility of these geometric features in predicting survival outcome, and found that it achieved a C-index (95% CI) of 0.635 (0.574,0.691). Our finding suggests that geometric features computed from tumor nuclei are of prognostic importance, and should be validated in prospective studies.

How do algorithmic decision aids introduced in business decision processes affect task performance? In a first experiment, we study effective collaboration. Faced with a decision, subjects alone have a success rate of 72%; Aided by a recommender that has a 75% success rate, their success rate reaches 76%. The human-system collaboration had thus a greater success rate than each taken alone. However, we noted a complacency/authority bias that degraded the quality of decisions by 5% when the recommender was wrong. This suggests that any lingering algorithmic bias may be amplified by decision aids. In a second experiment, we evaluated the effectiveness of 5 presentation variants in reducing complacency bias. We found that optional presentation increases subjects' resistance to wrong recommendations. We conclude by arguing that our metrics, in real usage scenarios, where decision aids are embedded as system-wide features in Business Process Management software, can lead to enhanced benefits.

In this paper, we propose Multi^2OIE, which performs open information extraction (open IE) by combining BERT with multi-head attention. Our model is a sequence-labeling system with an efficient and effective argument extraction method. We use a query, key, and value setting inspired by the Multimodal Transformer to replace the previously used bidirectional long short-term memory architecture with multi-head attention. Multi^2OIE outperforms existing sequence-labeling systems with high computational efficiency on two benchmark evaluation datasets, Re-OIE2016 and CaRB. Additionally, we apply the proposed method to multilingual open IE using multilingual BERT. Experimental results on new benchmark datasets introduced for two languages (Spanish and Portuguese) demonstrate that our model outperforms other multilingual systems without training data for the target languages.

Borehole resistivity measurements are routinely inverted in real-time during geosteering operations. The inversion process can be efficiently performed with the help of advanced artificial intelligence algorithms such as deep learning. These methods require a large dataset that relates multiple earth models with the corresponding borehole resistivity measurements. In here, we propose to use an advanced numerical method --refined isogeometric analysis (rIGA)-- to perform rapid and accurate 2.5D simulations and generate databases when considering arbitrary 2D earth models. Numerical results show that we can generate a meaningful synthetic database composed of 100,000 earth models with the corresponding measurements in 56 hours using a workstation equipped with two CPUs.

Wear is well known for causing material loss in a sliding interface. Available macroscopic approaches are bound to empirical fitting parameters, which range several orders of magnitude. Major advances in tribology have recently been achieved via Molecular Dynamics, although its use is strongly limited by computational cost. Here, we propose a study of the physical processes that lead to wear at the scale of the surface roughness, where adhesive junctions are formed between the asperities on the surface of the materials. Using a brittle formulation of the variational phase-field approach to fracture, we demonstrate that the failure mechanisms of an adhesive junction can be linked to its geometry. By imposing specific couplings between the damage and the elastic energy, we further investigate the triggering processes underlying each failure mechanism. We show that a large debris formation is mostly triggered by tensile stresses while shear stresses lead to small or no particle formation. We also study groups of junctions and discuss how microcontact interactions can be favored in some geometries to form macro-particles. This leads us to propose a classification in terms of macroscopic wear rate. Although based on a continuum approach, our phase-field calculations are able to effectively capture the failure of adhesive junctions, as observed through discrete Molecular Dynamics simulations.

Few-learn learning (FSL) is one of the key future steps in machine learning and has raised a lot of attention. However, in contrast to the rapid development in other domains, such as Computer Vision, the progress of FSL in Nature Language Processing (NLP) is much slower. One of the key reasons for this is the lacking of public benchmarks. NLP FSL researches always report new results on their own constructed few-shot datasets, which is pretty inefficient in results comparison and thus impedes cumulative progress. In this paper, we present FewJoint, a novel Few-Shot Learning benchmark for NLP. Different from most NLP FSL research that only focus on simple N-classification problems, our benchmark introduces few-shot joint dialogue language understanding, which additionally covers the structure prediction and multi-task reliance problems. This allows our benchmark to reflect the real-word NLP complexity beyond simple N-classification. Our benchmark is used in the few-shot learning contest of SMP2020-ECDT task-1. We also provide a compatible FSL platform to ease experiment set-up.

In this paper we focus on the problem of learning an optimal policy for Active Visual Search (AVS) of objects in known indoor environments with an online setup. Our POMP method uses as input the current pose of an agent (e.g. a robot) and a RGB-D frame. The task is to plan the next move that brings the agent closer to the target object. We model this problem as a Partially Observable Markov Decision Process solved by a Monte-Carlo planning approach. This allows us to make decisions on the next moves by iterating over the known scenario at hand, exploring the environment and searching for the object at the same time. Differently from the current state of the art in Reinforcement Learning, POMP does not require extensive and expensive (in time and computation) labelled data so being very agile in solving AVS in small and medium real scenarios. We only require the information of the floormap of the environment, an information usually available or that can be easily extracted from an a priori single exploration run. We validate our method on the publicly available AVD benchmark, achieving an average success rate of 0.76 with an average path length of 17.1, performing close to the state of the art but without any training needed. Additionally, we show experimentally the robustness of our method when the quality of the object detection goes from ideal to faulty.

For providing quick and accurate search results, a search engine maintains a local snapshot of the entire web. And, to keep this local cache fresh, it employs a crawler for tracking changes across various web pages. It would have been ideal if the crawler managed to update the local snapshot as soon as a page changed on the web. However, finite bandwidth availability and server restrictions mean that there is a bound on how frequently the different pages can be crawled. This then brings forth the following optimisation problem: maximise the freshness of the local cache subject to the crawling frequency being within the prescribed bounds. Recently, tractable algorithms have been proposed to solve this optimisation problem under different cost criteria. However, these assume the knowledge of exact page change rates, which is unrealistic in practice. We address this issue here. Specifically, we provide three novel schemes for online estimation of page change rates. All these schemes only need partial information about the page change process, i.e., they only need to know if the page has changed or not since the last crawl instance. Our first scheme is based on the law of large numbers, the second on the theory of stochastic approximation, while the third is an extension of the second and involves an additional momentum term. For all of these schemes, we prove convergence and, also, provide their convergence rates. As far as we know, the results concerning the third estimator is quite novel. Specifically, this is the first convergence type result for a stochastic approximation algorithm with momentum. Finally, we provide some numerical experiments (on real as well as synthetic data) to compare the performance of our proposed estimators with the existing ones (e.g., MLE).

The monitoring of underground criminal activities is often automated to maximize the data collection and to train ML models to automatically adapt data collection tools to different communities. On the other hand, sophisticated adversaries may adopt crawling-detection capabilities that may significantly jeopardize researchers' opportunities to perform the data collection, for example by putting their accounts under the spotlight and being expelled from the community. This is particularly undesirable in prominent and high-profile criminal communities where entry costs are significant (either monetarily or for example for background checking or other trust-building mechanisms). This paper presents CARONTE, a tool to semi-automatically learn virtually any forum structure for parsing and data-extraction, while maintaining a low profile for the data collection and avoiding the requirement of collecting massive datasets to maintain tool scalability. We showcase the tool against four underground forums, and compare the network traffic it generates (as seen from the adversary's position, i.e. the underground community's server) against state-of-the-art tools for web-crawling as well as human users.

Data skipping reduces I/O for SQL queries by skipping over irrelevant data objects (files) based on their metadata. We extend this notion by allowing developers to define their own data skipping metadata types and indexes using a flexible API. Our framework is the first to natively support data skipping for arbitrary data types (e.g. geospatial, logs) and queries with User Defined Functions (UDFs). We integrated our framework with Apache Spark and it is now deployed across multiple products/services at IBM. We present our extensible data skipping APIs, discuss index design, and implement various metadata indexes, requiring only around 30 lines of additional code per index. In particular we implement data skipping for a third party library with geospatial UDFs and demonstrate speedups of two orders of magnitude. Our centralized metadata approach provides a x3.6 speed up even when compared to queries which are rewritten to exploit Parquet min/max metadata. We demonstrate that extensible data skipping is applicable to broad class of applications, where user defined indexes achieve significant speedups and cost savings with very low development cost.

China has become one of the largest entertainment markets in the world in recent years. Due to the success of Xiaomi, many Chinese pop music industry entrepreneurs believe "Fans Economy" works in the pop music industry. "Fans Economy" is based on the assumption that pop music consumer market could be segmented based on artists. Each music artist has its own exclusive loyal fans. In this paper, we provide an insightful study of the pop music artists and fans social network. Particularly, we segment the pop music consumer market and pop music artists respectively. Our results show that due to the Matthew Effect and limited diversity of consumer market, "Fans Economy" does not work for the Chinese pop music industry.

Traditional event coreference systems usually rely on pipeline framework and hand-crafted features, which often face error propagation problem and have poor generalization ability. In this paper, we propose an End-to-End Event Coreference approach -- E3C neural network, which can jointly model event detection and event coreference resolution tasks, and learn to extract features from raw text automatically. Furthermore, because event mentions are highly diversified and event coreference is intricately governed by long-distance, semantic-dependent decisions, a type-guided event coreference mechanism is further proposed in our E3C neural network. Experiments show that our method achieves new state-of-the-art performance on two standard datasets.

We introduce and study two natural generalizations of the Connected VertexCover (VC) problem: the $p$-Edge-Connected and $p$-Vertex-Connected VC problem (where $p \geq 2$ is a fixed integer). Like Connected VC, both new VC problems are FPT, but do not admit a polynomial kernel unless $NP \subseteq coNP/poly$, which is highly unlikely. We prove however that both problems admit time efficient polynomial sized approximate kernelization schemes. We obtain an $O(2^{O(pk)}n^{O(1)})$-time algorithm for the $p$-Edge-Connected VC and an $O(2^{O(k^2)}n^{O(1)})$-time algorithm for the $p$-Vertex-Connected VC. Finally, we describe a $2(p+1)$-approximation algorithm for the $p$-Edge-Connected VC. The proofs for the new VC problems require more sophisticated arguments than for Connected VC. In particular, for the approximation algorithm we use Gomory-Hu trees and for the approximate kernels a result on small-size spanning $p$-vertex/edge-connected subgraph of a $p$-vertex/edge-connected graph obtained independently by Nishizeki and Poljak (1994) and Nagamochi and Ibaraki (1992).

We propose a Byzantine-robust variance-reduced stochastic gradient descent (SGD) method to solve the distributed finite-sum minimization problem when the data on the workers are not independent and identically distributed (i.i.d.). During the learning process, an unknown number of Byzantine workers may send malicious messages to the master node, leading to remarkable learning error. Most of the Byzantine-robust methods address this issue by using robust aggregation rules to aggregate the received messages, but rely on the assumption that all the regular workers have i.i.d. data, which is not the case in many federated learning applications. In light of the significance of reducing stochastic gradient noise for mitigating the effect of Byzantine attacks, we use a resampling strategy to reduce the impact of both inner variation (that describes the sample heterogeneity on every regular worker) and outer variation (that describes the sample heterogeneity among the regular workers), along with a stochastic average gradient algorithm (SAGA) to fully eliminate the inner variation. The variance-reduced messages are then aggregated with a robust geometric median operator. Under certain conditions, we prove that the proposed method reaches a neighborhood of the optimal solution with linear convergence rate, and the learning error is much smaller than those given by the state-of-the-art methods in the non-i.i.d. setting. Numerical experiments corroborate the theoretical results and show satisfactory performance of the proposed method.

We use the refined isogeometric analysis (rIGA) to solve generalized Hermitian eigenproblems $({Ku=\lambda Mu})$. The rIGA framework conserves the desirable properties of maximum-continuity isogeometric analysis (IGA) discretizations while reducing the computation cost of the solution through partitioning the computational domain by adding zero-continuity basis functions. As a result, rIGA enriches the approximation space and decreases the interconnection between degrees of freedom. We compare computational costs of rIGA versus those of IGA when employing a Lanczos eigensolver with a shift-and-invert spectral transformation. When all eigenpairs within a given interval ${[\lambda_s,\lambda_e]}$ are of interest, we select several shifts ${\sigma_k\in[\lambda_s,\lambda_e]}$ using a spectrum slicing technique. For each shift $\sigma_k$, the cost of factorization of the spectral transformation matrix ${K-\sigma_k M}$ drives the total computational cost of the eigensolution. Several multiplications of the operator matrices ${(K-\sigma_k M)^{-1} M}$ by vectors follow this factorization. Let $p$ be the polynomial degree of basis functions and assume that IGA has maximum continuity of ${p-1}$, while rIGA introduces $C^0$ separators to minimize the factorization cost. For this setup, our theoretical estimates predict computational savings to compute a fixed number of eigenpairs of up to ${O(p^2)}$ in the asymptotic regime, that is, large problem sizes. Yet, our numerical tests show that for moderately-sized eigenproblems, the total computational cost reduction is $O(p)$. Nevertheless, rIGA improves the accuracy of every eigenpair of the first $N_0$ eigenvalues and eigenfunctions. Here, we allow $N_0$ to be as large as the total number of eigenmodes of the original maximum-continuity IGA discretization.

Deep neural networks (DNNs) are usually over-parameterized to increase the likelihood of getting adequate initial weights by random initialization. Consequently, trained DNNs have many redundancies which can be pruned from the model to reduce complexity and improve the ability to generalize. Structural sparsity, as achieved by filter pruning, directly reduces the tensor sizes of weights and activations and is thus particularly effective for reducing complexity. We propose "Holistic Filter Pruning" (HFP), a novel approach for common DNN training that is easy to implement and enables to specify accurate pruning rates for the number of both parameters and multiplications. After each forward pass, the current model complexity is calculated and compared to the desired target size. By gradient descent, a global solution can be found that allocates the pruning budget over the individual layers such that the desired target size is fulfilled. In various experiments, we give insights into the training and achieve state-of-the-art performance on CIFAR-10 and ImageNet (HFP prunes 60% of the multiplications of ResNet-50 on ImageNet with no significant loss in the accuracy). We believe our simple and powerful pruning approach to constitute a valuable contribution for users of DNNs in low-cost applications.

ISCAS participated in two subtasks of SemEval 2020 Task 5: detecting counterfactual statements and detecting antecedent and consequence. This paper describes our system which is based on pre-trained transformers. For the first subtask, we train several transformer-based classifiers for detecting counterfactual statements. For the second subtask, we formulate antecedent and consequence extraction as a query-based question answering problem. The two subsystems both achieved third place in the evaluation. Our system is openly released at https://github.com/casnlu/ISCAS-SemEval2020Task5.

We study here several variants of the covariates fine balance problem where we generalize some of these problems and introduce a number of others. We present here a comprehensive complexity study of the covariates problems providing polynomial time algorithms, or a proof of NP-hardness. The polynomial time algorithms described are mostly combinatorial and rely on network flow techniques. In addition we present several fixed-parameter tractable results for problems where the number of covariates and the number of levels of each covariate are seen as a parameter.

Serverless computing shows good promise for efficiency and ease-of-use. Yet, there are only a few, scattered and sometimes conflicting reports on questions such as 'Why do so many companies adopt serverless?', 'When are serverless applications well suited?', and 'How are serverless applications currently implemented?' To address these questions, we analyze 89 serverless applications from open-source projects, industrial sources, academic literature, and scientific computing - the most extensive study to date.

We show a new simple algorithm that checks whether a given higher-order grammar generates a nonempty language of trees. The algorithm amounts to a procedure that transforms a grammar of order n to a grammar of order n-1, preserving nonemptiness, and increasing the size only exponentially. After repeating the procedure n times, we obtain a grammar of order 0, whose nonemptiness can be easily checked. Since the size grows exponentially at each step, the overall complexity is n-EXPTIME, which is known to be optimal. More precisely, the transformation (and hence the whole algorithm) is linear in the size of the grammar, assuming that the arity of employed nonterminals is bounded by a constant. The same algorithm allows to check whether an infinite tree generated by a higher-order recursion scheme is accepted by an alternating safety (or reachability) automaton, because this question can be reduced to the nonemptiness problem by taking a product of the recursion scheme with the automaton. A proof of correctness of the algorithm is formalised in the proof assistant Coq. Our transformation is motivated by a similar transformation of Asada and Kobayashi (2020) changing a word grammar of order n to a tree grammar of order n-1. The step-by-step approach can be opposed to previous algorithms solving the nonemptiness problem "in one step", being compulsorily more complicated.

We explore the performance of Bidirectional Encoder Representations from Transformers (BERT) at definition extraction. We further propose a joint model of BERT and Text Level Graph Convolutional Network so as to incorporate dependencies into the model. Our proposed model produces better results than BERT and achieves comparable results to BERT with fine tuned language model in DeftEval (Task 6 of SemEval 2020), a shared task of classifying whether a sentence contains a definition or not (Subtask 1).

Locating populations in rural areas of developing countries has attracted the attention of humanitarian mapping projects since it is important to plan actions that affect vulnerable areas. Recent efforts have tackled this problem as the detection of buildings in aerial images. However, the quality and the amount of rural building annotated data in open mapping services like OpenStreetMap (OSM) is not sufficient for training accurate models for such detection. Although these methods have the potential of aiding in the update of rural building information, they are not accurate enough to automatically update the rural building maps. In this paper, we explore a human-computer interaction approach and propose an interactive method to support and optimize the work of volunteers in OSM. The user is asked to verify/correct the annotation of selected tiles during several iterations and therefore improving the model with the new annotated data. The experimental results, with simulated and real user annotation corrections, show that the proposed method greatly reduces the amount of data that the volunteers of OSM need to verify/correct. The proposed methodology could benefit humanitarian mapping projects, not only by making more efficient the process of annotation but also by improving the engagement of volunteers.

We continue the study of the class of binary extended perfect propelinear codes constructed in the previous paper and consider their permutation automorphism (symmetry) groups and Steiner quadruple systems. We show that the automorphism group of the SQS of any such code coincides with the permutation automorphism group of the code. In particular, the SQS of these codes are complete invariants for the isomorphism classes of these codes. We obtain a criterion for the point transitivity of the automorphism group of SQS of proposed codes in terms of GL-equivalence (similar to EA-type equivalence for permutations of F^r). Based on these results we suggest a new construction for coordinate transitive and neighbor transitive extended perfect codes.

In the last decade, extended efforts have been poured into energy efficiency. Several energy consumption datasets were henceforth published, with each dataset varying in properties, uses and limitations. For instance, building energy consumption patterns are sourced from several sources, including ambient conditions, user occupancy, weather conditions and consumer preferences. Thus, a proper understanding of the available datasets will result in a strong basis for improving energy efficiency. Starting from the necessity of a comprehensive review of existing databases, this work is proposed to survey, study and visualize the numerical and methodological nature of building energy consumption datasets. A total of thirty-one databases are examined and compared in terms of several features, such as the geographical location, period of collection, number of monitored households, sampling rate of collected data, number of sub-metered appliances, extracted features and release date. Furthermore, data collection platforms and related modules for data transmission, data storage and privacy concerns used in different datasets are also analyzed and compared. Based on the analytical study, a novel dataset has been presented, namely Qatar university dataset, which is an annotated power consumption anomaly detection dataset. The latter will be very useful for testing and training anomaly detection algorithms, and hence reducing wasted energy. Moving forward, a set of recommendations is derived to improve datasets collection, such as the adoption of multi-modal data collection, smart Internet of things data collection, low-cost hardware platforms and privacy and security mechanisms. In addition, future directions to improve datasets exploitation and utilization are identified, including the use of novel machine learning solutions, innovative visualization tools and explainable recommender systems.

Driven by the need to coordinate activities of multiple agile development teams cooperating to produce a large software product, software-intensive organizations are turning to scaling agile software development frameworks. Despite the growing adoption of various scalin g agile frameworks, there is little empirical evidence of how effective their practices are in mitigating risk, especially in global software develop ment (GSD), where project failure is a known problem. In this study, we develop a GSD Risk Catalog of 63 risks to assess the degree to which two scaling agile frameworks--Disciplined Agile Delivery (DAD) and the Scaled Agile Framework (SAFe)--address software project risks in GSD. We examined data from two longitudinal case studies implementing each framework to identify the extent to which the framework practices address GSD risks. Scaling agile frameworks appear to help companies eliminate or mitigate many traditional risks in GSD, especially relating to users and customers. How ever, several important risks were not eliminated or mitigated. These persistent risks in the main belonged to the Environment quadrant highlighting t he inherent risk in developing software across geographic boundaries. Perhaps these frameworks (and arguably any framework), would have difficulty all eviating, issues that appear to be outside the immediate control of the organization.

We present Vax-a-Net; a technique for immunizing convolutional neural networks (CNNs) against adversarial patch attacks (APAs). APAs insert visually overt, local regions (patches) into an image to induce misclassification. We introduce a conditional Generative Adversarial Network (GAN) architecture that simultaneously learns to synthesise patches for use in APAs, whilst exploiting those attacks to adapt a pre-trained target CNN to reduce its susceptibility to them. This approach enables resilience against APAs to be conferred to pre-trained models, which would be impractical with conventional adversarial training due to the slow convergence of APA methods. We demonstrate transferability of this protection to defend against existing APAs, and show its efficacy across several contemporary CNN architectures.

This paper addresses the problem of approximating the set of all solutions for Multi-objective Markov Decision Processes. We show that in the vast majority of interesting cases, the number of solutions is exponential or even infinite. In order to overcome this difficulty we propose to approximate the set of all solutions by means of a limited precision approach based on White's multi-objective value-iteration dynamic programming algorithm. We prove that the number of calculated solutions is tractable and show experimentally that the solutions obtained are a good approximation of the true Pareto front.

Most peridynamics models adopt regular point distribution and unified horizon, limiting their flexibility and engineering applications. In this work, a micropolar peridynamics approach with non-unified horizon (NHPD) is proposed. This approach is implemented in a conventional finite element framework, using element-based discretization. By modifying the dual horizon approach into the pre-processing part, point dependent horizon and non-unified beam-like bonds are built. By implementing a domain correction strategy, the equivalence of strain energy density is assured. Then, a novel energy density-based failure criterion is presented which directly bridges the critical stretch to the mechanical strength. The numerical results indicate the weak mesh dependency of NHPD and the effectiveness of the new failure criterion. Moreover, it is proven that damage of solid with different non-local effects can lead to similar results by only adjusting the mechanical strength.

Adversarial attacks reveal important vulnerabilities and flaws of trained models. One potent type of attack are universal adversarial triggers, which are individual n-grams that, when appended to instances of a class under attack, can trick a model into predicting a target class. However, for inference tasks such as fact checking, these triggers often inadvertently invert the meaning of instances they are inserted in. In addition, such attacks produce semantically nonsensical inputs, as they simply concatenate triggers to existing samples. Here, we investigate how to generate adversarial attacks against fact checking systems that preserve the ground truth meaning and are semantically valid. We extend the HotFlip attack algorithm used for universal trigger generation by jointly minimising the target class loss of a fact checking model and the entailment class loss of an auxiliary natural language inference model. We then train a conditional language model to generate semantically valid statements, which include the found universal triggers. We find that the generated attacks maintain the directionality and semantic validity of the claim better than previous work.

The task of session search focuses on using interaction data to improve relevance for the user's next query at the session level. In this paper, we formulate session search as a personalization task under the framework of learning to rank. Personalization approaches re-rank results to match a user model. Such user models are usually accumulated over time based on the user's browsing behaviour. We use a pre-computed and transparent set of user models based on concepts from the social science literature. Interaction data are used to map each session to these user models. Novel features are then estimated based on such models as well as sessions' interaction data. Extensive experiments on test collections from the TREC session track show statistically significant improvements over current session search algorithms.

In this paper we consider the fundamental problem of finding subgraphs in highly dynamic distributed networks - networks which allow an arbitrary number of links to be inserted / deleted per round. We show that the problems of $k$-clique membership listing (for any $k\geq 3$), 4-cycle listing and 5-cycle listing can be deterministically solved in $O(1)$-amortized round complexity, even with limited logarithmic-sized messages. To achieve $k$-clique membership listing we introduce a very useful combinatorial structure which we name the robust $2$-hop neighborhood. This is a subset of the 2-hop neighborhood of a node, and we prove that it can be maintained in highly dynamic networks in $O(1)$-amortized rounds. We also show that maintaining the actual 2-hop neighborhood of a node requires near linear amortized time, showing the necessity of our definition. For $4$-cycle and $5$-cycle listing, we need edges within hop distance 3, for which we similarly define the robust $3$-hop neighborhood and prove it can be maintained in highly dynamic networks in $O(1)$-amortized rounds. We complement the above with several impossibility results. We show that membership listing of any other graph on $k\geq 3$ nodes except $k$-clique requires an almost linear number of amortized communication rounds. We also show that $k$-cycle listing for $k\geq 6$ requires $\Omega(\sqrt{n} / \log n)$ amortized rounds. This, combined with our upper bounds, paints a detailed picture of the complexity landscape for ultra fast graph finding algorithms in this highly dynamic environment.

Consciousness about power consumption at the appliance level can assist user in promoting energy efficiency in households. In this paper, a superior non-intrusive appliance recognition method that can provide particular consumption footprints of each appliance is proposed. Electrical devices are well recognized by the combination of different descriptors via the following steps: (a) investigating the applicability along with performance comparability of several time-domain (TD) feature extraction schemes; (b) exploring their complementary features; and (c) making use of a new design of the ensemble bagging tree (EBT) classifier. Consequently, a powerful feature extraction technique based on the fusion of TD features is proposed, namely fTDF, aimed at improving the feature discrimination ability and optimizing the recognition task. An extensive experimental performance assessment is performed on two different datasets called the GREEND and WITHED, where power consumption signatures were gathered at 1 Hz and 44000 Hz sampling frequencies, respectively. The obtained results revealed prime efficiency of the proposed fTDF based EBT system in comparison with other TD descriptors and machine learning classifiers.

With its growing use in industry, ROS is rapidly becoming a standard in robotics. While developments in ROS 2 show promise, the slow adoption cycles in industry will push widespread ROS 2 industrial adoption years from now. ROS will prevail in the meantime which raises the question: can ROS be used securely for industrial use cases even though its origins didn't consider it? The present study analyzes this question experimentally by performing a targeted offensive security exercise in a synthetic industrial use case involving ROS-Industrial and ROS packages. Our exercise results in four groups of attacks which manage to compromise the ROS computational graph, and all except one take control of most robotic endpoints at desire. To the best of our knowledge and given our setup, results do not favour the secure use of ROS in industry today, however, we managed to confirm that the security of certain robotic endpoints hold and remain optimistic about securing ROS industrial deployments.

Cultural research is dedicated to understanding the processes of knowledge dissemination and the social and technological practices in the book industry. Research on children books in the 19th century can be supported by computer systems. Specifically, the advances in digital image processing seem to offer great opportunities for analyzing and quantifying the visual components in the books. The production technology for illustrations in books in the 19th century was characterized by a shift from wood or copper engraving to lithography. We report classification experiments which intend to classify images based on the production technology. For a classification task that is also difficult for humans, the classification quality reaches only around 70%. We analyze some further error sources and identify reasons for the low performance.

We consider the online $\textsf{Bipartite Caching}$ problem where $n$ users are connected to $m$ caches in the form of a bipartite network. Each of the $m$ caches has a file storage capacity of $C$. There is a library consisting of $N >C$ distinct files. Each user can request any one of the files from the library at each time slot. We allow the file request sequences to be chosen in an adversarial fashion. A user's request at a time slot is satisfied if the requested file is already hosted on at least one of the caches connected to the user at that time slot. Our objective is to design an efficient online caching policy with minimal regret. In this paper, we propose $\textsf{LeadCache,}$ an online caching policy based on the $\textsf{Follow the Perturbed Leader}$ (FTPL) paradigm. We show that $\textsf{LeadCache}$ is regret optimal up to a multiplicative factor of $\tilde{O}(n^{0.375}).$ As a byproduct of our analysis, we design a new linear-time deterministic Pipage rounding procedure for the LP relaxation of a well-known NP-hard combinatorial optimization problem in this area. Our new rounding algorithm substantially improves upon the currently best-known complexity for this problem. Moreover, we show the surprising result that under mild Strong-Law-type assumptions on the file request sequence, the rate of file fetches to the caches approaches to zero under the $\textsf{LeadCache}$ policy. Finally, we derive a tight universal regret lower bound for the $\textsf{Bipartite Caching}$ problem, which critically makes use of results from graph coloring theory and certifies the announced approximation ratio.

The linear-chain Conditional Random Field (CRF) model is one of the most widely-used neural sequence labeling approaches. Exact probabilistic inference algorithms such as the forward-backward and Viterbi algorithms are typically applied in training and prediction stages of the CRF model. However, these algorithms require sequential computation that makes parallelization impossible. In this paper, we propose to employ a parallelizable approximate variational inference algorithm for the CRF model. Based on this algorithm, we design an approximate inference network that can be connected with the encoder of the neural CRF model to form an end-to-end network, which is amenable to parallelization for faster training and prediction. The empirical results show that our proposed approaches achieve a 12.7-fold improvement in decoding speed with long sentences and a competitive accuracy compared with the traditional CRF approach.

Parasitic extraction is a powerful tool in the design process of electromechanical devices, specifically as part of workflows that ensure electromagnetic compatibility. A novel scheme to extract impedances from CAD device models, suitable for a finite element implementation, is derived from Maxwell's equations in differential form. It provides a foundation for parasitic extraction across a broad frequency range and is able to handle inhomogeneous permittivities and permeabilities, making it more flexible than existing integral equation approaches. The approach allows for the automatic treatment of multi-port models of arbitrary conductor geometry without requiring any significant manual user interference. This is achieved by computing a special source current density from its given divergence on chosen terminal surfaces, subsequently using this current density to compute the electric field, and finally calculating the impedance via a scalar potential. A mandatory low-frequency stabilization scheme is outlined, facilitating the finite element implementation. Two useful quasistatic approximations and the advantageous special case of perfect electric conductors are treated theoretically. The magnetoquasistatic approximation is validated against an analytical model in a numerical experiment. Moreover, the intrinsic capability of the method to treat inhomogeneous permittivities and permeabilities is demonstrated with a simple capacitor-coil model including dielectric and magnetic core materials.

Recent studies indicate that current adversarial attack methods are flawed and easy to fail when encountering some deliberately designed defense. Sometimes even a slight modification in the model details will invalidate the attack. We find that training model with label smoothing can easily achieve striking accuracy under most gradient-based attacks. For instance, the robust accuracy of a WideResNet model trained with label smoothing on CIFAR-10 achieves 75% at most under PGD attack. To understand the reason underlying the subtle robustness, we investigate the relationship between label smoothing and adversarial robustness. Through theoretical analysis about the characteristics of the network trained with label smoothing and experiment verification of its performance under various attacks. We demonstrate that the robustness produced by label smoothing is incomplete based on the fact that its defense effect is volatile, and it cannot defend attacks transferred from a naturally trained model. Our study enlightens the research community to rethink how to evaluate the model's robustness appropriately.

One of the most impressive human endeavors of the past two decades is the collection and categorization of human knowledge in the free and accessible format that is Wikipedia. In this work we ask what makes a term worthy of entering this edifice of knowledge, and having a page of its own in Wikipedia? To what extent is this a natural product of on-going human discourse and discussion rather than an idiosyncratic choice of Wikipedia editors? Specifically, we aim to identify such "wiki-worthy" terms in a massive news corpus, and see if this can be done with no, or minimal, dependency on actual Wikipedia entries. We suggest a five-step pipeline for doing so, providing baseline results for all five, and the relevant datasets for benchmarking them. Our work sheds new light on the domain-specific Automatic Term Extraction problem, with the problem at hand being a domain-independent variant of it.

Ongoing climate change calls for fast and accurate weather and climate modeling. However, when solving large-scale weather prediction simulations, state-of-the-art CPU and GPU implementations suffer from limited performance and high energy consumption. These implementations are dominated by complex irregular memory access patterns and low arithmetic intensity that pose fundamental challenges to acceleration. To overcome these challenges, we propose and evaluate the use of near-memory acceleration using a reconfigurable fabric with high-bandwidth memory (HBM). We focus on compound stencils that are fundamental kernels in weather prediction models. By using high-level synthesis techniques, we develop NERO, an FPGA+HBM-based accelerator connected through IBM CAPI2 (Coherent Accelerator Processor Interface) to an IBM POWER9 host system. Our experimental results show that NERO outperforms a 16-core POWER9 system by 4.2x and 8.3x when running two different compound stencil kernels. NERO reduces the energy consumption by 22x and 29x for the same two kernels over the POWER9 system with an energy efficiency of 1.5 GFLOPS/Watt and 17.3 GFLOPS/Watt. We conclude that employing near-memory acceleration solutions for weather prediction modeling is promising as a means to achieve both high performance and high energy efficiency.

Unsupervised dimensionality reduction is one of the commonly used techniques in the field of high dimensional data recognition problems. The deep autoencoder network which constrains the weights to be non-negative, can learn a low dimensional part-based representation of data. On the other hand, the inherent structure of the each data cluster can be described by the distribution of the intraclass samples. Then one hopes to learn a new low dimensional representation which can preserve the intrinsic structure embedded in the original high dimensional data space perfectly. In this paper, by preserving the data distribution, a deep part-based representation can be learned, and the novel algorithm is called Distribution Preserving Network Embedding (DPNE). In DPNE, we first need to estimate the distribution of the original high dimensional data using the $k$-nearest neighbor kernel density estimation, and then we seek a part-based representation which respects the above distribution. The experimental results on the real-world data sets show that the proposed algorithm has good performance in terms of cluster accuracy and AMI. It turns out that the manifold structure in the raw data can be well preserved in the low dimensional feature space.

In this paper, a distribution system operator (DSO) framework is proposed for comprehensive retail and wholesale markets participation of distributed energy resource (DER) aggregators under uncertainty based on two-stage stochastic programming. Different kinds of DER aggregators including energy storage aggregators (ESAGs), demand response aggregators (DRAGs), electric vehicle (EV) aggregating charging stations (EVCSs), dispatchable distributed generation (DDG) aggregators (DDGAGs), and renewable energy aggregators (REAGs) are modeled. Distribution network operation constraints are considered using a linearized power flow. The problem is modeled using mixed-integer linear programming (MILP) which can be solved by using commercial solvers. Case studies are conducted to investigate the performance of the proposed DSO framework.

Stereo image pairs encode 3D scene cues into stereo correspondences between the left and right images. To exploit 3D cues within stereo images, recent CNN based methods commonly use cost volume techniques to capture stereo correspondence over large disparities. However, since disparities can vary significantly for stereo cameras with different baselines, focal lengths and resolutions, the fixed maximum disparity used in cost volume techniques hinders them to handle different stereo image pairs with large disparity variations. In this paper, we propose a generic parallax-attention mechanism (PAM) to capture stereo correspondence regardless of disparity variations. Our PAM integrates epipolar constraints with attention mechanism to calculate feature similarities along the epipolar line to capture stereo correspondence. Based on our PAM, we propose a parallax-attention stereo matching network (PASMnet) and a parallax-attention stereo image super-resolution network (PASSRnet) for stereo matching and stereo image super-resolution tasks. Moreover, we introduce a new and large-scale dataset named Flickr1024 for stereo image super-resolution. Experimental results show that our PAM is generic and can effectively learn stereo correspondence under large disparity variations in an unsupervised manner. Comparative results show that our PASMnet and PASSRnet achieve the state-of-the-art performance.

A robust and accurate 3D detection system is an integral part of autonomous vehicles. Traditionally, a majority of 3D object detection algorithms focus on processing 3D point clouds using voxel grids or bird's eye view (BEV). Recent works, however, demonstrate the utilization of the graph neural network (GNN) as a promising approach to 3D object detection. In this work, we propose an attention based feature aggregation technique in GNN for detecting objects in LiDAR scan. We first employ a distance-aware down-sampling scheme that not only enhances the algorithmic performance but also retains maximum geometric features of objects even if they lie far from the sensor. In each layer of the GNN, apart from the linear transformation which maps the per node input features to the corresponding higher level features, a per node masked attention by specifying different weights to different nodes in its first ring neighborhood is also performed. The masked attention implicitly accounts for the underlying neighborhood graph structure of every node and also eliminates the need of costly matrix operations thereby improving the detection accuracy without compromising the performance. The experiments on KITTI dataset show that our method yields comparable results for 3D object detection.

Dealing with the inconsistency between a foreground object and a background image is a challenging task in high-fidelity image composition. State-of-the-art methods strive to harmonize the composed image by adapting the style of foreground objects to be compatible with the background image, whereas the potential shadow of foreground objects within the composed image which is critical to the composition realism is largely neglected. In this paper, we propose an Adversarial Image Composition Net (AIC-Net) that achieves realistic image composition by considering potential shadows that the foreground object projects in the composed image. A novel branched generation mechanism is proposed, which disentangles the generation of shadows and the transfer of foreground styles for optimal accomplishment of the two tasks simultaneously. A differentiable spatial transformation module is designed which bridges the local harmonization and the global harmonization to achieve their joint optimization effectively. Extensive experiments on pedestrian and car composition tasks show that the proposed AIC-Net achieves superior composition performance qualitatively and quantitatively.

Many NLP tasks have benefited from transferring knowledge from contextualized word embeddings, however the picture of what type of knowledge is transferred is incomplete. This paper studies the types of linguistic phenomena accounted for by language models in the context of a Conversational Question Answering (CoQA) task. We identify the problematic areas for the finetuned RoBERTa, BERT and DistilBERT models through systematic error analysis - basic arithmetic (counting phrases), compositional semantics (negation and Semantic Role Labeling), and lexical semantics (surprisal and antonymy). When enhanced with the relevant linguistic knowledge through multitask learning, the models improve in performance. Ensembles of the enhanced models yield a boost between 2.2 and 2.7 points in F1 score overall, and up to 42.1 points in F1 on the hardest question classes. The results show differences in ability to represent compositional and lexical information between RoBERTa, BERT and DistilBERT.

As combating climate change has become a top priority and as many countries are taking steps to make their power generation sustainable, there is a marked increase in the use of renewable energy sources (RESs) for electricity generation. Among these RESs, solar photovoltaics (PV) is one of the most popular sources of energy connected to LV distribution networks. With the greater integration of solar PV into LV distribution networks, utility providers impose caps to solar penetration in order to operate their network safely and within acceptable norms. One parameter that restricts solar PV penetration is unbalances created by loads and single-phase rooftop schemes connected to LV distribution grids. In this paper, a novel method is proposed to mitigate voltage unbalance in LV distribution grids by optimally re-phasing grid-connected rooftop PV systems. A modified version of the discrete bacterial foraging optimization algorithm (DBFOA) is introduced as the optimization technique to minimize the overall voltage unbalance of the network as the objective function, subjected to various network and operating parameters. The impact of utilizing the proposed PV re-phasing technique as opposed to a fixed phase configuration are compared based on overall voltage unbalance, which was observed hourly throughout the day. The case studies show that the proposed approach can significantly mitigate the overall voltage unbalance during the daytime and can facilitate to increase the usable PV capacity of the considered network by 77%.

We consider a contextual online learning (multi-armed bandit) problem with high-dimensional covariate $\mathbf{x}$ and decision $\mathbf{y}$. The reward function to learn, $f(\mathbf{x},\mathbf{y})$, does not have a particular parametric form. The literature has shown that the optimal regret is $\tilde{O}(T^{(d_x+d_y+1)/(d_x+d_y+2)})$, where $d_x$ and $d_y$ are the dimensions of $\mathbf x$ and $\mathbf y$, and thus it suffers from the curse of dimensionality. In many applications, only a small subset of variables in the covariate affect the value of $f$, which is referred to as \textit{sparsity} in statistics. To take advantage of the sparsity structure of the covariate, we propose a variable selection algorithm called \textit{BV-LASSO}, which incorporates novel ideas such as binning and voting to apply LASSO to nonparametric settings. Our algorithm achieves the regret $\tilde{O}(T^{(d_x^*+d_y+1)/(d_x^*+d_y+2)})$, where $d_x^*$ is the effective covariate dimension. The regret matches the optimal regret when the covariate is $d^*_x$-dimensional and thus cannot be improved. Our algorithm may serve as a general recipe to achieve dimension reduction via variable selection in nonparametric settings.

We consider a generalization of the fundamental online metrical service systems (MSS) problem where the feasible region can be transformed between requests. In this problem, which we call T-MSS, an algorithm maintains a point in a metric space and has to serve a sequence of requests. Each request is a map (transformation) $f_t\colon A_t\to B_t$ between subsets $A_t$ and $B_t$ of the metric space. To serve it, the algorithm has to go to a point $a_t\in A_t$, paying the distance from its previous position. Then, the transformation is applied, modifying the algorithm's state to $f_t(a_t)$. Such transformations can model, e.g., changes to the environment that are outside of an algorithm's control, and we therefore do not charge any additional cost to the algorithm when the transformation is applied. The transformations also allow to model requests occurring in the $k$-taxi problem. We show that for $\alpha$-Lipschitz transformations, the competitive ratio is $\Theta(\alpha)^{n-2}$ on $n$-point metrics. Here, the upper bound is achieved by a deterministic algorithm and the lower bound holds even for randomized algorithms. For the $k$-taxi problem, we prove a competitive ratio of $\tilde O((n\log k)^2)$. For chasing convex bodies, we show that even with contracting transformations no competitive algorithm exists. The problem T-MSS has a striking connection to the following deep mathematical question: Given a finite metric space $M$, what is the required cardinality of an extension $\hat M\supseteq M$ where each partial isometry on $M$ extends to an automorphism? We give partial answers for special cases.

Recent studies have reported biases in machine learning image classifiers, especially against particular demographic groups. Counterfactual examples for an input---perturbations that change specific features but not others---have been shown to be useful for evaluating explainability and fairness of machine learning models. However, generating counterfactual examples for images is non-trivial due to the underlying causal structure governing the various features of an image. To be meaningful, generated perturbations need to satisfy constraints implied by the causal model. We present a method for generating counterfactuals by incorporating a known causal graph structure in a conditional variant of Adversarially Learned Inference (ALI). The proposed approach learns causal relationships between the specified attributes of an image and generates counterfactuals in accordance with these relationships. On Morpho-MNIST and CelebA datasets, the method generates counterfactuals that can change specified attributes and their causal descendants while keeping other attributes constant. As an application, we apply the generated counterfactuals from CelebA images to evaluate fairness biases in a classifier that predicts attractiveness of a face.

In compressive learning, a mixture model (a set of centroids or a Gaussian mixture) is learned from a sketch vector, that serves as a highly compressed representation of the dataset. This requires solving a non-convex optimization problem, hence in practice approximate heuristics (such as CLOMPR) are used. In this work we explore, by numerical simulations, properties of this non-convex optimization landscape and those heuristics.

In deep learning, it is usually assumed that the optimization process is conducted on a shape-fixed loss surface. Differently, we first propose a novel concept of deformation mapping in this paper to affect the behaviour of the optimizer. Vertical deformation mapping (VDM), as a type of deformation mapping, can make the optimizer enter a flat region, which often implies better generalization performance. Moreover, we design various VDMs, and further provide their contributions to the loss surface. After defining the local M region, theoretical analyses show that deforming the loss surface can enhance the gradient descent optimizer's ability to filter out sharp minima. With visualizations of loss landscapes, we evaluate the flatnesses of minima obtained by both the original optimizer and optimizers enhanced by VDMs on CIFAR-100. The experimental results show that VDMs do find flatter regions. Moreover, we compare popular convolutional neural networks enhanced by VDMs with the corresponding original ones on ImageNet, CIFAR-10, and CIFAR-100. The results are surprising: there are significant improvements on all of the involved models equipped with VDMs. For example, the top-1 test accuracy of ResNet-20 on CIFAR-100 increases by 1.46%, with insignificant additional computational overhead.

The accurate prediction of the solar Diffuse Fraction (DF), sometimes called the Diffuse Ratio, is an important topic for solar energy research. In the present study, the current state of Diffuse Irradiance research is discussed and then three robust, Machine Learning (ML) models, are examined using a large dataset (almost 8 years) of hourly readings from Almeria, Spain. The ML models used herein, are a hybrid Adaptive Network-based Fuzzy Inference System (ANFIS), a single Multi-Layer Perceptron (MLP) and a hybrid Multi-Layer Perceptron-Grey Wolf Optimizer (MLP-GWO). These models were evaluated for their predictive precision, using various Solar and Diffuse Fraction (DF) irradiance data, from Spain. The results were then evaluated using two frequently used evaluation criteria, the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE). The results showed that the MLP-GWO model, followed by the ANFIS model, provided a higher performance, in both the training and the testing procedures.

We propose a system that uses video as the input to track the position of objects relative to their surrounding environment in real-time. The neural network employed is trained on a 100% synthetic dataset coming from our own automated generator. The positional tracker relies on a range of 1 to n video cameras placed around an arena of choice. The system returns the positions of the tracked objects relative to the broader world by understanding the overlapping matrices formed by the cameras and therefore these can be extrapolated into real world coordinates. In most cases, we achieve a higher update rate and positioning precision than any of the existing GPS-based systems, in particular for indoor objects or those occluded from clear sky.

The present project aims to use machine learning, specifically neural networks (NN), to learn the trajectories of a set of coupled ordinary differential equations (ODEs) and decrease compute times for obtaining ODE solutions by using this surragate model. As an example system of proven biological significance, we use an ODE model of a gene regulatory circuit of cyanobacteria related to photosynthesis \cite{original_biology_Kehoe, Sundus_math_model}. Using data generated by a numeric solution to the exemplar system, we train several long-short-term memory neural networks. We stopping training when the networks achieve an accuracy of of 3\% on testing data resulting in networks able to predict values in the ODE time series ranging from 0.25 minutes to 6.25 minutes beyond input values. We observed computational speed ups ranging from 9.75 to 197 times when comparing prediction compute time with compute time for obtaining the numeric solution. Given the success of this proof of concept, we plan on continuing this project in the future and will attempt to realize the same computational speed-ups in the context of an agent-based modeling platfom.

Conformal mapping, a classical topic in complex analysis and differential geometry, has become a subject of great interest in the area of surface parameterization in recent decades with various applications in science and engineering. However, most of the existing conformal parameterization algorithms only focus on simply-connected surfaces and cannot be directly applied to surfaces with holes. In this work, we propose two novel algorithms for computing the conformal parameterization of multiply-connected surfaces. We first develop an efficient method for conformally parameterizing an open surface with one hole to an annulus on the plane. Based on this method, we then develop an efficient method for conformally parameterizing an open surface with $k$ holes onto a unit disk with $k$ circular holes. The conformality and bijectivity of the mappings are ensured by quasi-conformal theory. Numerical experiments and applications are presented to demonstrate the effectiveness of the proposed methods.

A model of face representation, inspired by the biology of the visual system, is compared to experimental data on the perception of facial similarity. The face representation model uses aggregate primary visual cortex (V1) cell responses topographically linked to a grid covering the face, allowing comparison of shape and texture at corresponding points in two facial images. When a set of relatively similar faces was used as stimuli, this Linked Aggregate Code (LAC) predicted human performance in similarity judgment experiments. When faces of perceivable categories were used, dimensions such as apparent sex and race emerged from the LAC model without training. The dimensional structure of the LAC similarity measure for the mixed category task displayed some psychologically plausible features but also highlighted differences between the model and the human similarity judgements. The human judgements exhibited a racial perceptual bias that was not shared by the LAC model. The results suggest that the LAC based similarity measure may offer a fertile starting point for further modelling studies of face representation in higher visual areas, including studies of the development of biases in face perception.

Event cameras are novel vision sensors that sample, in an asynchronous fashion, brightness increments with low latency and high temporal resolution. The resulting streams of events are of high value by themselves, especially for high speed motion estimation. However, a growing body of work has also focused on the reconstruction of intensity frames from the events, as this allows bridging the gap with the existing literature on appearance- and frame-based computer vision. Recent work has mostly approached this intensity reconstruction problem using neural networks trained with synthetic, ground-truth data. Nevertheless, since accurate ground truth is only available in simulation, these methods are subject to the reality gap and, to ensure generalizability, their training datasets need to be carefully designed. In this work, we approach, for the first time, the reconstruction problem from a self-supervised learning perspective. Our framework combines estimated optical flow and the event-based photometric constancy to train neural networks without the need for any ground-truth or synthetic data. Results across multiple datasets show that the performance of the proposed approach is in line with the state-of-the-art.

This paper presents a novel numerical method for the hybrid reliability analysis by using the uncertainty theory. Aleatory uncertainty and epistemic uncertainty are considered simultaneously in this method. Epistemic uncertainty is characterized by the uncertainty theory, and the effect of epistemic uncertainty is quantified by the sub-additive uncertain measure. Then, under the framework of the chance theory which can be interpreted as the combination of the probability theory and the uncertainty theory, a general uncertainty quantification model is established to deal with the hybrid reliability analysis problem, then the corresponding reliability metric is defined. After that, to improve the feasibility of the proposed model, by utilizing the polar coordinate transformation based dimension reduction method, a numerical analysis method for the hybrid reliability model are provided. At last, several application cases are presented to prove the effectiveness of the proposed method for the reliability analysis under hybrid uncertainty. The comparisons between the results of the proposed method and the Monte Carlo simulation also illustrate the merit of this method.

With the growing constraints on power budget and increasing hardware failure rates, the operation of future exascale systems faces several challenges. Towards this, resource awareness and adaptivity by enabling malleable jobs has been actively researched in the HPC community. Malleable jobs can change their computing resources at runtime and can significantly improve HPC system performance. However, due to the rigid nature of popular parallel programming paradigms such as MPI and lack of support for dynamic resource management in batch systems, malleable jobs have been largely unrealized. In this paper, we extend the SLURM batch system to support the execution and batch scheduling of malleable jobs. The malleable applications are written using a new adaptive parallel paradigm called Invasive MPI which extends the MPI standard to support resource-adaptivity at runtime. We propose two malleable job scheduling strategies to support performance-aware and power-aware dynamic reconfiguration decisions at runtime. We implement the strategies in SLURM and evaluate them on a production HPC system. Results for our performance-aware scheduling strategy show improvements in makespan, average system utilization, average response, and waiting times as compared to other scheduling strategies. Moreover, we demonstrate dynamic power corridor management using our power-aware strategy.

Video representation learning has recently attracted attention in computer vision due to its applications for activity and scene forecasting or vision-based planning and control. Video prediction models often learn a latent representation of video which is encoded from input frames and decoded back into images. Even when conditioned on actions, purely deep learning based architectures typically lack a physically interpretable latent space. In this study, we use a differentiable physics engine within an action-conditional video representation network to learn a physical latent representation. We propose supervised and self-supervised learning methods to train our network and identify physical properties. The latter uses spatial transformers to decode physical states back into images. The simulation scenarios in our experiments comprise pushing, sliding and colliding objects, for which we also analyze the observability of the physical properties. In experiments we demonstrate that our network can learn to encode images and identify physical properties like mass and friction from videos and action sequences in the simulated scenarios. We evaluate the accuracy of our supervised and self-supervised methods and compare it with a system identification baseline which directly learns from state trajectories. We also demonstrate the ability of our method to predict future video frames from input images and actions.

Federated learning (FL) has enabled training models collaboratively from multiple data owning parties without sharing their data. Given the privacy regulations of patient's healthcare data, learning-based systems in healthcare can greatly benefit from privacy-preserving FL approaches. However, typical model aggregation methods in FL are sensitive to local model updates, which may lead to failure in learning a robust and accurate global model. In this work, we implement and evaluate different robust aggregation methods in FL applied to healthcare data. Furthermore, we show that such methods can detect and discard faulty or malicious local clients during training. We run two sets of experiments using two real-world healthcare datasets for training medical diagnosis classification tasks. Each dataset is used to simulate the performance of three different robust FL aggregation strategies when facing different poisoning attacks. The results show that privacy preserving methods can be successfully applied alongside Byzantine-robust aggregation techniques. We observed in particular how using differential privacy (DP) did not significantly impact the final learning convergence of the different aggregation strategies.

Neural Controlled Differential Equations (Neural CDEs) are the continuous-time analogue of an RNN, just as Neural ODEs are analogous to ResNets. However just like RNNs, training Neural CDEs can be difficult for long time series. Here, we propose to apply a technique drawn from stochastic analysis, namely the log-ODE method. Instead of using the original input sequence, our procedure summarises the information over local time intervals via the log-signature map, and uses the resulting shorter stream of log-signatures as the new input. This represents a length/channel trade-off. In doing so we demonstrate efficacy on problems of length up to 17k observations and observe significant training speed-ups, improvements in model performance, and reduced memory requirements compared to the existing algorithm.

The recovery of the underlying low-rank structure of clean data corrupted with sparse noise/outliers is attracting increasing interest. However, in many low-level vision problems, the exact target rank of the underlying structure, the particular locations and values of the sparse outliers are not known. Thus, the conventional methods can not separate the low-rank and sparse components completely, especially gross outliers or deficient observations. Therefore, in this study, we employ the Minimum Description Length (MDL) principle and atomic norm for low-rank matrix recovery to overcome these limitations. First, we employ the atomic norm to find all the candidate atoms of low-rank and sparse terms, and then we minimize the description length of the model in order to select the appropriate atoms of low-rank and the sparse matrix, respectively. Our experimental analyses show that the proposed approach can obtain a higher success rate than the state-of-the-art methods even when the number of observations is limited or the corruption ratio is high. Experimental results about synthetic data and real sensing applications (high dynamic range imaging, background modeling, removing shadows and specularities) demonstrate the effectiveness, robustness and efficiency of the proposed method.

We present a novel bilateral negotiation model that allows a self-interested agent to learn how to negotiate over multiple issues in the presence of user preference uncertainty. The model relies upon interpretable strategy templates representing the tactics the agent should employ during the negotiation and learns template parameters to maximize the average utility received over multiple negotiations, thus resulting in optimal bid acceptance and generation. Our model also uses deep reinforcement learning to evaluate threshold utility values, for those tactics that require them, thereby deriving optimal utilities for every environment state. To handle user preference uncertainty, the model relies on a stochastic search to find user model that best agrees with a given partial preference profile. Multi-objective optimization and multi-criteria decision-making methods are applied at negotiation time to generate Pareto-optimal outcomes thereby increasing the number of successful (win-win) negotiations. Rigorous experimental evaluations show that the agent employing our model outperforms the winning agents of the 10th Automated Negotiating Agents Competition (ANAC'19) in terms of individual as well as social-welfare utilities.

This paper develops an accurate, efficient filter (called the `TT filter') for tracking multiple targets using a spatially-distributed network of amplitude sensors that estimate distance but not direction. Several innovations are included in the algorithm that increase accuracy and reduce complexity. For initial target acquisition once tracking begins, a constrained Hessian search is used to find the maximum likelihood (ML) target vector, based on the measurement model and a Gaussian approximation of the prior. The Hessian at the ML vector is used to give an initial approximation of the negative log likelihood for the target vector distribution: corrections are applied if the Hessian is not positive definite due to the near-far problem. Further corrections are made by applying a transformation that matches the known nonlinearity introduced by distance-only sensors. A set of integration points is constructed using this information, which are used to estimate the mean and moments of the target vector distribution. Results show that the TT filter gives superior accuracy and lower complexity than previous alternatives such as Kalman-based or particle filters.

Existing neural network-based autonomous systems are shown to be vulnerable against adversarial attacks, therefore sophisticated evaluation on their robustness is of great importance. However, evaluating the robustness only under the worst-case scenarios based on known attacks is not comprehensive, not to mention that some of them even rarely occur in the real world. In addition, the distribution of safety-critical data is usually multimodal, while most traditional attacks and evaluation methods focus on a single modality. To solve the above challenges, we propose a flow-based multimodal safety-critical scenario generator for evaluating decisionmaking algorithms. The proposed generative model is optimized with weighted likelihood maximization and a gradient-based sampling procedure is integrated to improve the sampling efficiency. The safety-critical scenarios are generated by querying the task algorithms and the log-likelihood of the generated scenarios is in proportion to the risk level. Experiments on a self-driving task demonstrate our advantages in terms of testing efficiency and multimodal modeling capability. We evaluate six Reinforcement Learning algorithms with our generated traffic scenarios and provide empirical conclusions about their robustness.

Insurance fraud occurs when policyholders file claims that are exaggerated or based on intentional damages. This contribution develops a fraud detection strategy by extracting insightful information from the social network of a claim. First, we construct a network by linking claims with all their involved parties, including the policyholders, brokers, experts, and garages. Next, we establish fraud as a social phenomenon in the network and use the BiRank algorithm with a fraud specific query vector to compute a fraud score for each claim. From the network, we extract features related to the fraud scores as well as the claims' neighborhood structure. Finally, we combine these network features with the claim-specific features and build a supervised model with fraud in motor insurance as the target variable. Although we build a model for only motor insurance, the network includes claims from all available lines of business. Our results show that models with features derived from the network perform well when detecting fraud and even outperform the models using only the classical claim-specific features. Combining network and claim-specific features further improves the performance of supervised learning models to detect fraud. The resulting model flags highly suspicions claims that need to be further investigated. Our approach provides a guided and intelligent selection of claims and contributes to a more effective fraud investigation process.

In an effort to overcome limitations of reward-driven feature learning in deep reinforcement learning (RL) from images, we propose decoupling representation learning from policy learning. To this end, we introduce a new unsupervised learning (UL) task, called Augmented Temporal Contrast (ATC), which trains a convolutional encoder to associate pairs of observations separated by a short time difference, under image augmentations and using a contrastive loss. In online RL experiments, we show that training the encoder exclusively using ATC matches or outperforms end-to-end RL in most environments. Additionally, we benchmark several leading UL algorithms by pre-training encoders on expert demonstrations and using them, with weights frozen, in RL agents; we find that agents using ATC-trained encoders outperform all others. We also train multi-task encoders on data from multiple environments and show generalization to different downstream RL tasks. Finally, we ablate components of ATC, and introduce a new data augmentation to enable replay of (compressed) latent images from pre-trained encoders when RL requires augmentation. Our experiments span visually diverse RL benchmarks in DeepMind Control, DeepMind Lab, and Atari, and our complete code is available at https://github.com/astooke/rlpyt/rlpyt/ul.

We consider the problem of encoding a set of vectors into a minimal number of bits while preserving information on their Euclidean geometry. We show that this task can be accomplished by applying a Johnson-Lindenstrauss embedding and subsequently binarizing each vector by comparing each entry of the vector to a uniformly random threshold. Using this simple construction we produce two encodings of a dataset such that one can query Euclidean information for a pair of points using a small number of bit operations up to a desired additive error - Euclidean distances in the first case and inner products and squared Euclidean distances in the second. In the latter case, each point is encoded in near-linear time. The number of bits required for these encodings is quantified in terms of two natural complexity parameters of the dataset - its covering numbers and localized Gaussian complexity - and shown to be near-optimal.

In this paper the argument is made that for true novel view synthesis of objects, where the object can be synthesized from any viewpoint, an explicit 3D shape representation isdesired. Our method estimates point clouds to capture the geometry of the object, which can be freely rotated into the desired view and then projected into a new image. This image, however, is sparse by nature and hence this coarse view is used as the input of an image completion network to obtain the dense target view. The point cloud is obtained using the predicted pixel-wise depth map, estimated from a single RGB input image,combined with the camera intrinsics. By using forward warping and backward warpingbetween the input view and the target view, the network can be trained end-to-end without supervision on depth. The benefit of using point clouds as an explicit 3D shape for novel view synthesis is experimentally validated on the 3D ShapeNet benchmark. Source code and data will be available at https://lhoangan.github.io/pc4novis/.

Gab is an online social network often associated with the alt-right political movement and users barred from other networks. It presents an interesting opportunity for research because near-complete data is available from day one of the network's creation. In this paper, we investigate the evolution of the user interaction graph, that is the graph where a link represents a user interacting with another user at a given time. We view this graph both at different times and at different timescales. The latter is achieved by using sliding windows on the graph which gives a novel perspective on social network data. The Gab network is relatively slowly growing over the period of months but subject to large bursts of arrivals over hours and days. We identify plausible events that are of interest to the Gab community associated with the most obvious such bursts. The network is characterised by interactions between `strangers' rather than by reinforcing links between `friends'. Gab usage follows the diurnal cycle of the predominantly US and Europe based users. At off-peak hours the Gab interaction network fragments into sub-networks with absolutely no interaction between them. A small group of users are highly influential across larger timescales, but a substantial number of users gain influence for short periods of time. Temporal analysis at different timescales gives new insights above and beyond what could be found on static graphs.

Deep neural networks (DNNs) fail to learn effectively under label noise and have been shown to memorize random labels which affect their generalization performance. We consider learning in isolation, using one-hot encoded labels as the sole source of supervision, and a lack of regularization to discourage memorization as the major shortcomings of the standard training procedure. Thus, we propose Noisy Concurrent Training (NCT) which leverages collaborative learning to use the consensus between two models as an additional source of supervision. Furthermore, inspired by trial-to-trial variability in the brain, we propose a counter-intuitive regularization technique, target variability, which entails randomly changing the labels of a percentage of training samples in each batch as a deterrent to memorization and over-generalization in DNNs. Target variability is applied independently to each model to keep them diverged and avoid the confirmation bias. As DNNs tend to prioritize learning simple patterns first before memorizing the noisy labels, we employ a dynamic learning scheme whereby as the training progresses, the two models increasingly rely more on their consensus. NCT also progressively increases the target variability to avoid memorization in later stages. We demonstrate the effectiveness of our approach on both synthetic and real-world noisy benchmark datasets.

Dimensionality reduction and clustering are often used as preliminary steps for many complex machine learning tasks. The presence of noise and outliers can deteriorate the performance of such preprocessing and therefore impair the subsequent analysis tremendously. In manifold learning, several studies indicate solutions for removing background noise or noise close to the structure when the density is substantially higher than that exhibited by the noise. However, in many applications, including astronomical datasets, the density varies alongside manifolds that are buried in a noisy background. We propose a novel method to extract manifolds in the presence of noise based on the idea of Ant colony optimization. In contrast to the existing random walk solutions, our technique captures points which are locally aligned with major directions of the manifold. Moreover, we empirically show that the biologically inspired formulation of ant pheromone reinforces this behavior enabling it to recover multiple manifolds embedded in extremely noisy data clouds. The algorithm's performance is demonstrated in comparison to the state-of-the-art approaches, such as Markov Chain, LLPD, and Disperse, on several synthetic and real astronomical datasets stemming from an N-body simulation of a cosmological volume.

One of the major challenges in using distributed learning to train complicated models with large data sets is to deal with stragglers effect. As a solution, coded computation has been recently proposed to efficiently add redundancy to the computation tasks. In this technique, coding is used across data sets, and computation is done over coded data, such that the results of an arbitrary subset of worker nodes with a certain size are enough to recover the final results. The major challenges with those approaches are (1) they are limited to polynomial function computations, (2) the size of the subset of servers that we need to wait for grows with the multiplication of the size of the data set and the model complexity (the degree of the polynomial), which can be prohibitively large, (3) they are not numerically stable for computation over real numbers. In this paper, we propose Berrut Approximated Coded Computing (BACC), as an alternative approach, which is not limited to polynomial function computation. In addition, the master node can approximately calculate the final results, using the outcomes of any arbitrary subset of available worker nodes. The approximation approach is proven to be numerically stable with low computational complexity. In addition, the accuracy of the approximation is established theoretically and verified by simulation results in different settings such as distributed learning problems. In particular, BACC is used to train a deep neural network on a cluster of servers, which outperforms repetitive computation (repetition coding) in terms of the rate of convergence.

Recent work proposes a family of contextual embeddings that significantly improves the accuracy of sequence labelers over non-contextual embeddings. However, there is no definite conclusion on whether we can build better sequence labelers by combining different kinds of embeddings in various settings. In this paper, we conduct extensive experiments on 3 tasks over 18 datasets and 8 languages to study the accuracy of sequence labeling with various embedding concatenations and make three observations: (1) concatenating more embedding variants leads to better accuracy in rich-resource and cross-domain settings and some conditions of low-resource settings; (2) concatenating additional contextual sub-word embeddings with contextual character embeddings hurts the accuracy in extremely low-resource settings; (3) based on the conclusion of (1), concatenating additional similar contextual embeddings cannot lead to further improvements. We hope these conclusions can help people build stronger sequence labelers in various settings.

Deep Metric Learning (DML) provides a crucial tool for visual similarity and zero-shot retrieval applications by learning generalizing embedding spaces, although recent work in DML has shown strong performance saturation across training objectives. However, generalization capacity is known to scale with the embedding space dimensionality. Unfortunately, high dimensional embeddings also create higher retrieval cost for downstream applications. To remedy this, we propose S2SD - Simultaneous Similarity-based Self-distillation. S2SD extends DML with knowledge distillation from auxiliary, high-dimensional embedding and feature spaces to leverage complementary context during training while retaining test-time cost and with negligible changes to the training time. Experiments and ablations across different objectives and standard benchmarks show S2SD offering notable improvements of up to 7% in Recall@1, while also setting a new state-of-the-art. Code available at https://github.com/MLforHealth/S2SD.

We investigate the List $H$-Coloring problem, the generalization of graph coloring that asks whether an input graph $G$ admits a homomorphism to the undirected graph $H$ (possibly with loops), such that each vertex $v \in V(G)$ is mapped to a vertex on its list $L(v) \subseteq V(H)$. An important result by Feder, Hell, and Huang [JGT 2003] states that List $H$-Coloring is polynomial-time solvable if $H$ is a so-called bi-arc graph, and NP-complete otherwise. We investigate the NP-complete cases of the problem from the perspective of polynomial-time sparsification: can an $n$-vertex instance be efficiently reduced to an equivalent instance of bitsize $O(n^{2-\varepsilon})$ for some $\varepsilon > 0$? We prove that if $H$ is not a bi-arc graph, then List $H$-Coloring does not admit such a sparsification algorithm unless $NP \subseteq coNP/poly$. Our proofs combine techniques from kernelization lower bounds with a study of the structure of graphs $H$ which are not bi-arc graphs.

Satisfiability modulo theories (SMT) solving has become a critical part of many static analyses, including symbolic execution, refinement type checking, and model checking. We propose Formulog, a domain-specific language that makes it possible to write a range of SMT-based static analyses in a way that is both close to their formal specifications and amenable to high-level optimizations and efficient evaluation. Formulog extends the logic programming language Datalog with a first-order functional language and mechanisms for representing and reasoning about SMT formulas; a novel type system supports the construction of expressive formulas, while ensuring that neither normal evaluation nor SMT solving goes wrong. Our case studies demonstrate that a range of SMT-based analyses can naturally and concisely be encoded in Formulog, and that -- thanks to this encoding -- high-level Datalog-style optimizations can be automatically and advantageously applied to these analyses.

Pre-trained models for programming language have achieved dramatic empirical improvements on a variety of code-related tasks such as code search, code completion, code summarization, etc. However, existing pre-trained models regard a code snippet as a sequence of tokens, while ignoring the inherent structure of code, which provides crucial code semantics and would enhance the code understanding process. We present GraphCodeBERT, a pre-trained model for programming language that considers the inherent structure of code. Instead of taking syntactic-level structure of code like abstract syntax tree (AST), we use data flow in the pre-training stage, which is a semantic-level structure of code that encodes the relation of "where-the-value-comes-from" between variables. Such a semantic-level structure is neat and does not bring an unnecessarily deep hierarchy of AST, the property of which makes the model more efficient. We develop GraphCodeBERT based on Transformer. In addition to using the task of masked language modeling, we introduce two structure-aware pre-training tasks. One is to predict code structure edges, and the other is to align representations between source code and code structure. We implement the model in an efficient way with a graph-guided masked attention function to incorporate the code structure. We evaluate our model on four tasks, including code search, clone detection, code translation, and code refinement. Results show that code structure and newly introduced pre-training tasks can improve GraphCodeBERT and achieves state-of-the-art performance on the four downstream tasks. We further show that the model prefers structure-level attentions over token-level attentions in the task of code search.

A new method for the simulation of evolving multi-domains problems has been introduced in previous works (RealIMotion), Florez et al. (2020) and further developed in parallel in the context of isotropic Grain Growth (GG) with no consideration for the effects of the Stored Energy (SE) due to dislocations. The methodology consists in a new front-tracking approach where one of the originality is that not only interfaces between grains are discretized but their bulks are also meshed and topological changes of the domains are driven by selective local remeshing operations performed on the Finite Element (FE) mesh. In this article, further developments and studies of the model will be presented, mainly on the development of a model taking into account grain boundary migration by (GBM) SE. Further developments for the nucleation of new grains will be presented, allowing to model Dynamic Recrystallization (DRX) and Post-Dynamic Recrystallization (PDRX) phenomena. The accuracy and the performance of the numerical algorithms have been proven to be very promising in Florez et al. (2020). Here the results for multiple test cases will be given in order to validate the accuracy of the model taking into account GG and SE. The computational performance will be evaluated for the DRX and PDRX mechanisms and compared to a classical Finite Element (FE) framework using a Level-Set (LS) formulation.

The world is facing a huge health crisis due to the rapid transmission of coronavirus (COVID-19). Several guidelines were issued by the World Health Organization (WHO) for protection against the spread of coronavirus. According to WHO, the most effective preventive measure against COVID-19 is wearing a mask in public places and crowded areas. It is very difficult to monitor people manually in these areas. In this paper, a transfer learning model is proposed to automate the process of identifying the people who are not wearing mask. The proposed model is built by fine-tuning the pre-trained state-of-the-art deep learning model, InceptionV3. The proposed model is trained and tested on the Simulated Masked Face Dataset (SMFD). Image augmentation technique is adopted to address the limited availability of data for better training and testing of the model. The model outperformed the other recently proposed approaches by achieving an accuracy of 99.9% during training and 100% during testing.

We present a method for microtubule tracking in electron microscopy volumes. Our method first identifies a sparse set of voxels that likely belong to microtubules. Similar to prior work, we then enumerate potential edges between these voxels, which we represent in a candidate graph. Tracks of microtubules are found by selecting nodes and edges in the candidate graph by solving a constrained optimization problem incorporating biological priors on microtubule structure. For this, we present a novel integer linear programming formulation, which results in speed-ups of three orders of magnitude and an increase of 53% in accuracy compared to prior art (evaluated on three 1.2 x 4 x 4$\mu$m volumes of Drosophila neural tissue). We also propose a scheme to solve the optimization problem in a block-wise fashion, which allows distributed tracking and is necessary to process very large electron microscopy volumes. Finally, we release a benchmark dataset for microtubule tracking, here used for training, testing and validation, consisting of eight 30 x 1000 x 1000 voxel blocks (1.2 x 4 x 4$\mu$m) of densely annotated microtubules in the CREMI data set (https://github.com/nilsec/micron).

Finding objects is essential for almost any daily-life visual task. Saliency models have been useful to predict fixation locations in natural images, but are static, i.e., they provide no information about the time-sequence of fixations. Nowadays, one of the biggest challenges in the field is to go beyond saliency maps to predict a sequence of fixations related to a visual task, such as searching for a given target. Bayesian observer models have been proposed for this task, as they represent visual search as an active sampling process. Nevertheless, they were mostly evaluated on artificial images, and how they adapt to natural images remains largely unexplored. Here, we propose a unified Bayesian model for visual search guided by saliency maps as prior information. We validated our model with a visual search experiment in natural scenes recording eye movements. We show that, although state-of-the-art saliency models perform well in predicting the first two fixations in a visual search task, their performance degrades to chance afterward. This suggests that saliency maps alone are good to model bottom-up first impressions, but are not enough to explain the scanpaths when top-down task information is critical. Thus, we propose to use them as priors of Bayesian searchers. This approach leads to a behavior very similar to humans for the whole scanpath, both in the percentage of target found as a function of the fixation rank and the scanpath similarity, reproducing the entire sequence of eye movements.

As scientists worldwide search for answers to the overwhelmingly unknown behind the deadly pandemic, the literature concerning COVID-19 has been growing exponentially. Keeping abreast of the body of literature at such a rapidly advancing pace poses significant challenges not only to active researchers but also to the society as a whole. Although numerous data resources have been made openly available, the analytic and synthetic process that is essential in effectively navigating through the vast amount of information with heightened levels of uncertainty remains a significant bottleneck. We introduce a generic method that facilitates the data collection and sense-making process when dealing with a rapidly growing landscape of a research domain such as COVID-19 at multiple levels of granularity. The method integrates the analysis of structural and temporal patterns in scholarly publications with the delineation of thematic concentrations and the types of uncertainties that may offer additional insights into the complexity of the unknown. We demonstrate the application of the method in a study of the COVID-19 literature.

Allowing users to interact with multi-document summarizers is a promising direction towards improving and customizing summary results. Different ideas for interactive summarization have been proposed in previous work but these solutions are highly divergent and incomparable. In this paper, we develop an end-to-end evaluation framework for expansion-based interactive summarization, which considers the accumulating information along an interactive session. Our framework includes a procedure of collecting real user sessions and evaluation measures relying on standards, but adapted to reflect interaction. All of our solutions are intended to be released publicly as a benchmark, allowing comparison of future developments in interactive summarization. We demonstrate the use of our framework by evaluating and comparing baseline implementations that we developed for this purpose, which will serve as part of our benchmark. Our extensive experimentation and analysis of these systems motivate our design choices and support the viability of our framework.

Virtual Big Data (VBD) proved to be effective to alleviate mode collapse and vanishing generator gradient as two major problems of Generative Adversarial Neural Networks (GANs) very recently. In this paper, we investigate the capability of VBD to address two other major challenges in Machine Learning including deep autoencoder training and imbalanced data classification. First, we prove that, VBD can significantly decrease the validation loss of autoencoders via providing them a huge diversified training data which is the key to reach better generalization to minimize the over-fitting problem. Second, we use the VBD to propose the first projection-based method called cross-concatenation to balance the skewed class distributions without over-sampling. We prove that, cross-concatenation can solve uncertainty problem of data driven methods for imbalanced classification.

The recent outbreak of COVID-19 has affected millions of individuals around the world and has posed a significant challenge to global healthcare. From the early days of the pandemic, it became clear that it is highly contagious and that human mobility contributes significantly to its spread. In this paper, we study the impact of population movement on the spread of COVID-19, and we capitalize on recent advances in the field of representation learning on graphs to capture the underlying dynamics. Specifically, we create a graph where nodes correspond to a country's regions and the edge weights denote human mobility from one region to another. Then, we employ graph neural networks to predict the number of future cases, encoding the underlying diffusion patterns that govern the spread into our learning model. Furthermore, to account for the limited amount of training data, we capitalize on the pandemic's asynchronous outbreaks across countries and use a model-agnostic meta-learning based method to transfer knowledge from one country's model to another's. We compare the proposed approach against simple baselines and more traditional forecasting techniques in 3 European countries. Experimental results demonstrate the superiority of our method, highlighting the usefulness of GNNs in epidemiological prediction. Transfer learning provides the best model, highlighting its potential to improve the accuracy of the predictions in case of secondary waves, if data from past/parallel outbreaks is utilized.

This master's thesis work has the objective of performing an analysis of the methods for detecting communities in networks. As an initial part, I study of the main features of graph theory and communities, as well as common measures in this problem. Subsequently, I was performed a review of the main methods of detecting communities, developing a classification, taking into account its characteristics and computational complexity for the detection of strengths and weaknesses in the methods, as well as later works. Then, study the problem of classification of a clustering method, this in order to evaluate the quality of the communities detected by analyzing different measures. Finally conclusions are elaborated and possible lines of work that can be derived.

Citizen-generated counter speech is a promising way to fight hate speech and promote peaceful, non-polarized discourse. However, there is a lack of large-scale longitudinal studies of its effectiveness for reducing hate speech. We investigate the effectiveness of counter speech using several different macro- and micro-level measures of over 180,000 political conversations that took place on German Twitter over four years. We report on the dynamic interactions of hate and counter speech over time and provide insights into whether, as in `classic' bullying situations, organized efforts are more effective than independent individuals in steering online discourse. Taken together, our results build a multifaceted picture of the dynamics of hate and counter speech online. They suggest that organized hate speech produced changes in the public discourse. Counter speech, especially when organized, could help in curbing hate speech in online discussions.

Memes are graphics and text overlapped so that together they present concepts that become dubious if one of them is absent. It is spread mostly on social media platforms, in the form of jokes, sarcasm, motivating, etc. After the success of BERT in Natural Language Processing (NLP), researchers inclined to Visual-Linguistic (VL) multimodal problems like memes classification, image captioning, Visual Question Answering (VQA), and many more. Unfortunately, many memes get uploaded each day on social media platforms that need automatic censoring to curb misinformation and hate. Recently, this issue has attracted the attention of researchers and practitioners. State-of-the-art methods that performed significantly on other VL dataset, tends to fail on memes classification. In this context, this work aims to conduct a comprehensive study on memes classification, generally on the VL multimodal problems and cutting edge solutions. We propose a generalized framework for VL problems. We cover the early and next-generation works on VL problems. Finally, we identify and articulate several open research issues and challenges. This is the first study that presents the generalized view of the advanced classification techniques concerning memes classification to the best of our knowledge. We believe this study presents a clear road-map for the Machine Learning (ML) research community to implement and enhance memes classification techniques.

Passwords should be easy to remember, yet expiration policies mandate their frequent change. Caught in the crossfire between these conflicting requirements, users often adopt creative methods to perform slight variations over time. While easily fooling the most basic checks for similarity, these schemes lead to a substantial decrease in actual security, because leaked passwords, albeit expired, can be effectively exploited as seeds for crackers. This work describes an approach based on Bloom filters to detect password similarity, which can be used to discourage password reuse habits. The proposed scheme intrinsically obfuscates the stored passwords to protect them in case of database leaks, and can be tuned to be resistant to common cryptanalytic techniques, making it suitable for usage on exposed systems.

We propose a new method for training an agent via an evolutionary strategy (ES), in which we iteratively improve a set of samples to imitate: Starting with a random set, in every iteration we replace a subset of the samples with samples from the best trajectories discovered so far. The evaluation procedure for this set is to train, via supervised learning, a randomly initialised neural network (NN) to imitate the set and then execute the acquired policy against the environment. Our method is thus an ES based on a fitness function that expresses the effectiveness of imitating an evolving data subset. This is in contrast to other ES techniques that iterate over the weights of the policy directly. By observing the samples that the agent selects for learning, it is possible to interpret and evaluate the evolving strategy of the agent more explicitly than in NN learning. In our experiments, we trained an agent to solve the OpenAI Gym environment Bipedalwalker-v3 by imitating an evolutionarily selected set of only 25 samples with a NN with only a few thousand parameters. We further test our method on the Procgen game Plunder and show here as well that the proposed method is an interpretable, small, robust and effective alternative to other ES or policy gradient methods.

We propose a generalizable framework for the population estimation of dense, informal settlements in low-income urban areas--so called 'slums'--using high-resolution satellite imagery. Precise population estimates are a crucial factor for efficient resource allocations by government authorities and NGO's, for instance in medical emergencies. We utilize equitable ground-truth data, which is gathered in collaboration with local communities: Through training and community mapping, the local population contributes their unique domain knowledge, while also maintaining agency over their data. This practice allows us to avoid carrying forward potential biases into the modeling pipeline, which might arise from a less rigorous ground-truthing approach. We contextualize our approach in respect to the ongoing discussion within the machine learning community, aiming to make real-world machine learning applications more inclusive, fair and accountable. Because of the resource intensive ground-truth generation process, our training data is limited. We propose a gridded population estimation model, enabling flexible and customizable spatial resolutions. We test our pipeline on three experimental site in Nigeria, utilizing pre-trained and fine-tune vision networks to overcome data sparsity. Our findings highlight the difficulties of transferring common benchmark models to real-world tasks. We discuss this and propose steps forward.

Snapshot matrices of hyperbolic equations have a slow singular value decay, resulting in inefficient reduced-order models. We develop on the idea of inducing a faster singular value decay by computing snapshots on a transformed spatial domain, or the so-called snapshot calibration/transformation. We are particularly interested in problems involving shock collision, shock rarefaction-fan collision, shock formation, etc. For such problems, we propose a realizable algorithm to compute the spatial transform using monotonic feature matching. We consider discontinuities and kinks as features, and by carefully partitioning the parameter domain, we ensure that the spatial transform has properties that are desirable both from a theoretical and an implementation standpoint. We use these properties to prove that our method results in a fast m-width decay of a so-called calibrated manifold. A crucial observation we make is that due to calibration, the m-width does not only depend on m but also on the accuracy of the full order model, which is in contrast to elliptic and parabolic problems that do not need calibration. The method we propose only requires the solution snapshots and not the underlying partial differential equation (PDE) and is therefore, data-driven. We perform several numerical experiments to demonstrate the effectiveness of our method.

We provide new algorithms for maintaining approximate distances in a weighted undirected graph $G = (V, E)$ subject to edge deletions. Our first result is an algorithm that maintains $(1+\epsilon)$-approximate distances from a set of $s$ sources in $\tilde{O}(sm)$ total update time, assuming that $s= n^{\Omega(1)}$, $\epsilon = \Omega(1)$ and $|E|= n^{1+\Omega(1)}$. This matches the best known static algorithm, up to polylogarithmic factors for a wide range of settings. The currently best known algorithm for the problem is obtained by running the single-source algorithm of [Henzinger, Krinninger and Nanongkai, FOCS'14] independently from each source. Our result improves over the update time bound of this solution by removing a $2^{\tilde{O}(\log^{3/4} n)}$ factor. Additionally, we can maintain a $(1+\epsilon)$-approximate single-source shortest paths with amortized update time of $2^{\tilde{O}(\sqrt{\log n})}$, when $0< \epsilon<1$ is a constant and $|E|= n2^{\tilde{\Omega}(\sqrt{\log n})}$. This improves over the best known update time of $2^{\tilde{O}(\log^{3/4} n)}$ by [Henzinger, Krinninger and Nanongkai, FOCS'14]. Furthermore, for any integer $k \geq 1$ we give an algorithm for maintaining $(2k-1)(1+\epsilon)$-approximate all-pairs-shortest-paths, in $\tilde{O}(mn^{1/k})$ total update time and $O(k)$ query time\footnote{Throughout this paper we use the notation $\tilde{O}(f(n))$ to hide factors of $O(\text{polylog } (f(n)))$.}. This improves over the result of [Chechik, FOCS'18] in a twofold way. Namely, we improve the total update time bound by removing an $n^{o(1)}$ factor and reduce the query time from $O(\log \log (nW))$ to $O(k)$. Our results are based on a new decremental hopset construction that may be of independent interest.

In large-area forest inventories a trade-off between the amount of data to be sampled and the costs of collecting the data is necessary. It is not always possible to have a very large data sample when dealing with sampling-based inventories. It is therefore necessary to optimize the sampling design in order to achieve optimal population parameter estimation. On the contrary, the availability of remote sensing (RS) data correlated with the forest inventory variables is usually much higher. The combination of RS and the sampled field measurement data is often used for improving the forest inventory parameter estimation. In addition, it is also reasonable to study the utilization of RS data in inventory sampling, which can further improve the estimation of forest variables. In this study, we propose a data sampling method based on Bayesian optimization which uses RS data in forest inventory sample selection. The presented method applies the learned functional relationship between the RS and inventory data in new sampling decisions. We evaluate our method by conducting simulated sampling experiments with both synthetic data and measured data from the Aland region in Finland. The proposed method is benchmarked against two baseline methods: simple random sampling and the local pivotal method. The results of the simulated experiments show the best results in terms of MSE values for the proposed method when the functional relationship between RS and inventory data is correctly learned from the available training data.

Soft robots are notoriously hard to control. This is partly due to the scarcity of models able to capture their complex continuum mechanics, resulting in a lack of control methodologies that take full advantage of body compliance. Currently available simulation methods are either too computational demanding or overly simplistic in their physical assumptions, leading to a paucity of available simulation resources for developing such control schemes. To address this, we introduce Elastica, a free, open-source simulation environment for soft, slender rods that can bend, twist, shear and stretch. We demonstrate how Elastica can be coupled with five state-of-the-art reinforcement learning algorithms to successfully control a soft, compliant robotic arm and complete increasingly challenging tasks.

How meaning is represented in the brain is still one of the big open questions in neuroscience. Does a word (e.g., bird) always have the same representation, or does the task under which the word is processed alter its representation (answering "can you eat it?" versus "can it fly?")? The brain activity of subjects who read the same word while performing different semantic tasks has been shown to differ across tasks. However, it is still not understood how the task itself contributes to this difference. In the current work, we study Magnetoencephalography (MEG) brain recordings of participants tasked with answering questions about concrete nouns. We investigate the effect of the task (i.e. the question being asked) on the processing of the concrete noun by predicting the millisecond-resolution MEG recordings as a function of both the semantics of the noun and the task. Using this approach, we test several hypotheses about the task-stimulus interactions by comparing the zero-shot predictions made by these hypotheses for novel tasks and nouns not seen during training. We find that incorporating the task semantics significantly improves the prediction of MEG recordings, across participants. The improvement occurs 475-550ms after the participants first see the word, which corresponds to what is considered to be the ending time of semantic processing for a word. These results suggest that only the end of semantic processing of a word is task-dependent, and pose a challenge for future research to formulate new hypotheses for earlier task effects as a function of the task and stimuli.

Graph Neural Networks are perfectly suited to capture latent interactions occurring in the spatio-temporal domain. But when an explicit structure is not available, as in the visual domain, it is not obvious what atomic elements should be represented as nodes. They should depend on the context and the kinds of relations that we are interested in. We are focusing on modeling relations between instances by proposing a method that takes advantage of the locality assumption to create nodes that are clearly localised in space. Current works are using external object detectors or fixed regions to extract features corresponding to graph nodes, while we propose a module for generating the regions associated with each node dynamically, without explicit object-level supervision. Conditioned on the input, for each node we predict the location and size of a region and use them to pool node features using a differentiable mechanism. Constructing these localised, adaptive nodes makes our model biased towards object-centric representations and we show that it improves the modeling of visual interactions. By relying on a few localized nodes, our method learns to focus on salient regions leading to a more explainable model. Our model achieves superior results on video classification tasks involving instance interactions.

In this paper we present a novel radar-camera sensor fusion framework for accurate object detection and distance estimation in autonomous driving scenarios. The proposed architecture uses a middle-fusion approach to fuse the radar point clouds and RGB images. Our radar object proposal network uses radar point clouds to generate 3D proposals from a set of 3D prior boxes. These proposals are mapped to the image and fed into a Radar Proposal Refinement (RPR) network for objectness score prediction and box refinement. The RPR network utilizes both radar information and image feature maps to generate accurate object proposals and distance estimations. The radar-based proposals are combined with image-based proposals generated by a modified Region Proposal Network (RPN). The RPN has a distance regression layer for estimating distance for every generated proposal. The radar-based and image-based proposals are merged and used in the next stage for object classification. Experiments on the challenging nuScenes dataset show our method outperforms other existing radar-camera fusion methods in the 2D object detection task while at the same time accurately estimates objects' distances.

Since the Lipschitz properties of convolutional neural network (CNN) are widely considered to be related to adversarial robustness, we theoretically characterize the $\ell_1$ norm and $\ell_\infty$ norm of 2D multi-channel convolutional layers and provide efficient methods to compute the exact $\ell_1$ norm and $\ell_\infty$ norm. Based on our theorem, we propose a novel regularization method termed norm decay, which can effectively reduce the norms of CNN layers. Experiments show that norm-regularization methods, including norm decay, weight decay, and singular value clipping, can improve generalization of CNNs. However, we are surprised to find that they can slightly hurt adversarial robustness. Furthermore, we compute the norms of layers in the CNNs trained with three different adversarial training frameworks and find that adversarially robust CNNs have comparable or even larger norms than their non-adversarially robust counterparts. Moreover, we prove that under a mild assumption, adversarially robust classifiers can be achieved with neural networks and an adversarially robust neural network can have arbitrarily large Lipschitz constant. For these reasons, enforcing small norms of CNN layers may be neither effective nor necessary in achieving adversarial robustness. Our code is available at https://github.com/youweiliang/norm_robustness.

DRAM Main memory is a performance bottleneck for many applications due to the high access latency. In-DRAM caches work to mitigate this latency by augmenting regular-latency DRAM with small-but-fast regions of DRAM that serve as a cache for the data held in the regular-latency region of DRAM. While an effective in-DRAM cache can allow a large fraction of memory requests to be served from a fast DRAM region, the latency savings are often hindered by inefficient mechanisms for relocating copies of data into and out of the fast regions. Existing in-DRAM caches have two sources of inefficiency: (1) the data relocation granularity is an entire multi-kilobyte row of DRAM; and (2) because the relocation latency increases with the physical distance between the slow and fast regions, multiple fast regions are physically interleaved among slow regions to reduce the relocation latency, resulting in increased hardware area and manufacturing complexity. We propose a new substrate, FIGARO, that uses existing shared global buffers among subarrays within a DRAM bank to provide support for in-DRAM data relocation across subarrays at the granularity of a single cache block. FIGARO has a distance-independent latency within a DRAM bank, and avoids complex modifications to DRAM. Using FIGARO, we design a fine-grained in-DRAM cache called FIGCache. The key idea of FIGCache is to cache only small, frequently-accessed portions of different DRAM rows in a designated region of DRAM. By caching only the parts of each row that are expected to be accessed in the near future, we can pack more of the frequently-accessed data into FIGCache, and can benefit from additional row hits in DRAM. Our evaluations show that FIGCache improves the average performance of a system using DDR4 DRAM by 16.3% and reduces average DRAM energy consumption by 7.8% for 8-core workloads, over a conventional system without in-DRAM caching.

The increasing importance of robots and automation creates a demand for learnable controllers which can be obtained through various approaches such as Evolutionary Algorithms (EAs) or Reinforcement Learning (RL). Unfortunately, these two families of algorithms have mainly developed independently and there are only a few works comparing modern EAs with deep RL algorithms. We show that Multidimensional Archive of Phenotypic Elites (MAP-Elites), which is a modern EA, can deliver better-performing solutions than one of the state-of-the-art RL methods, Proximal Policy Optimization (PPO) in the generation of locomotion controllers for a simulated hexapod robot. Additionally, extensive hyper-parameter tuning shows that MAP-Elites displays greater robustness across seeds and hyper-parameter sets. Generally, this paper demonstrates that EAs combined with modern computational resources display promising characteristics and have the potential to contribute to the state-of-the-art in controller learning.

Empathy is critical to successful mental health support. Empathy measurement has predominantly occurred in synchronous, face-to-face settings, and may not translate to asynchronous, text-based contexts. Because millions of people use text-based platforms for mental health support, understanding empathy in these contexts is crucial. In this work, we present a computational approach to understanding how empathy is expressed in online mental health platforms. We develop a novel unifying theoretically-grounded framework for characterizing the communication of empathy in text-based conversations. We collect and share a corpus of 10k (post, response) pairs annotated using this empathy framework with supporting evidence for annotations (rationales). We develop a multi-task RoBERTa-based bi-encoder model for identifying empathy in conversations and extracting rationales underlying its predictions. Experiments demonstrate that our approach can effectively identify empathic conversations. We further apply this model to analyze 235k mental health interactions and show that users do not self-learn empathy over time, revealing opportunities for empathy training and feedback.

Self-supervised pre-training of transformer models has revolutionized NLP applications. Such pre-training with language modeling objectives provides a useful initial point for parameters that generalize well to new tasks with fine-tuning. However, fine-tuning is still data inefficient -- when there are few labeled examples, accuracy can be low. Data efficiency can be improved by optimizing pre-training directly for future fine-tuning with few examples; this can be treated as a meta-learning problem. However, standard meta-learning techniques require many training tasks in order to generalize; unfortunately, finding a diverse set of such supervised tasks is usually difficult. This paper proposes a self-supervised approach to generate a large, rich, meta-learning task distribution from unlabeled text. This is achieved using a cloze-style objective, but creating separate multi-class classification tasks by gathering tokens-to-be blanked from among only a handful of vocabulary terms. This yields as many unique meta-training tasks as the number of subsets of vocabulary terms. We meta-train a transformer model on this distribution of tasks using a recent meta-learning framework. On 17 NLP tasks, we show that this meta-training leads to better few-shot generalization than language-model pre-training followed by finetuning. Furthermore, we show how the self-supervised tasks can be combined with supervised tasks for meta-learning, providing substantial accuracy gains over previous supervised meta-learning.

We develop primal-dual coordinate methods for solving bilinear saddle-point problems of the form $\min_{x \in \mathcal{X}} \max_{y\in\mathcal{Y}} y^\top A x$ which contain linear programming, classification, and regression as special cases. Our methods push existing fully stochastic sublinear methods and variance-reduced methods towards their limits in terms of per-iteration complexity and sample complexity. We obtain nearly-constant per-iteration complexity by designing efficient data structures leveraging Taylor approximations to the exponential and a binomial heap. We improve sample complexity via low-variance gradient estimators using dynamic sampling distributions that depend on both the iterates and the magnitude of the matrix entries. Our runtime bounds improve upon those of existing primal-dual methods by a factor depending on sparsity measures of the $m$ by $n$ matrix $A$. For example, when rows and columns have constant $\ell_1/\ell_2$ norm ratios, we offer improvements by a factor of $m+n$ in the fully stochastic setting and $\sqrt{m+n}$ in the variance-reduced setting. We apply our methods to computational geometry problems, i.e. minimum enclosing ball, maximum inscribed ball, and linear regression, and obtain improved complexity bounds. For linear regression with an elementwise nonnegative matrix, our guarantees improve on exact gradient methods by a factor of $\sqrt{\mathrm{nnz}(A)/(m+n)}$.

Deep neural networks require large training sets but suffer from high computational cost and long training times. Training on much smaller training sets while maintaining nearly the same accuracy would be very beneficial. In the few-shot learning setting, a model must learn a new class given only a small number of samples from that class. One-shot learning is an extreme form of few-shot learning where the model must learn a new class from a single example. We propose the `less than one'-shot learning task where models must learn $N$ new classes given only $M<N$ examples and we show that this is achievable with the help of soft labels. We use a soft-label generalization of the k-Nearest Neighbors classifier to explore the intricate decision landscapes that can be created in the `less than one'-shot learning setting. We analyze these decision landscapes to derive theoretical lower bounds for separating $N$ classes using $M<N$ soft-label samples and investigate the robustness of the resulting systems.

Given a stream of entries in a multi-aspect data setting i.e., entries having multiple dimensions, how can we detect anomalous activities? For example, in the intrusion detection setting, existing work seeks to detect anomalous events or edges in dynamic graph streams, but this does not allow us to take into account additional attributes of each entry. Our work aims to define a streaming multi-aspect data anomaly detection framework, termed MStream, which can detect unusual group anomalies as they occur, in a dynamic manner. MStream has the following properties: (a) it detects anomalies in multi-aspect data including both categorical and numeric attributes; (b) it is online, thus processing each record in constant time and constant memory; (c) it can capture the correlation between multiple aspects of the data. MStream is evaluated over the KDDCUP99, CICIDS-DoS, UNSW-NB 15 and CICIDS-DDoS datasets, and outperforms state-of-the-art baselines.

Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges in an online manner, for the purpose of detecting unusual behavior, using constant time and memory? Existing approaches aim to detect individually surprising edges. In this work, we propose MIDAS, which focuses on detecting microcluster anomalies, or suddenly arriving groups of suspiciously similar edges, such as lockstep behavior, including denial of service attacks in network traffic data. We further propose MIDAS-F, to solve the problem by which anomalies are incorporated into the algorithm's internal states, creating a 'poisoning' effect which can allow future anomalies to slip through undetected. MIDAS-F introduces two modifications: 1) We modify the anomaly scoring function, aiming to reduce the 'poisoning' effect of newly arriving edges; 2) We introduce a conditional merge step, which updates the algorithm's data structures after each time tick, but only if the anomaly score is below a threshold value, also to reduce the `poisoning' effect. Experiments show that MIDAS-F has significantly higher accuracy than MIDAS. MIDAS has the following properties: (a) it detects microcluster anomalies while providing theoretical guarantees about its false positive probability; (b) it is online, thus processing each edge in constant time and constant memory, and also processes the data 130 to 929 times faster than state-of-the-art approaches; (c) it provides 41% to 55% higher accuracy (in terms of ROC-AUC) than state-of-the-art approaches.

In this paper, we introduce a simple yet effective approach that can boost the vanilla ResNet-50 to 80%+ Top-1 accuracy on ImageNet without any tricks. Generally, our method is based on the recently proposed MEAL, i.e., ensemble knowledge distillation via discriminators. We further simplify it through 1) adopting the similarity loss and discriminator only on the final outputs and 2) using the average of softmax probabilities from all teacher ensembles as the stronger supervision for distillation. One crucial perspective of our method is that the one-hot/hard label should not be used in the distillation process. We show that such a simple framework can achieve state-of-the-art results without involving any commonly-used techniques, such as 1) architecture modification; 2) outside training data beyond ImageNet; 3) autoaug/randaug; 4) cosine learning rate; 5) mixup/cutmix training; 6) label smoothing; etc. On ImageNet, our method obtains 80.67% top-1 accuracy using a single crop-size of 224X224 on the vanilla ResNet-50, outperforming the previous state-of-the-arts by a remarkable margin under the same network structure. Our result can be regarded as a new strong baseline on ResNet-50 using knowledge distillation. To our best knowledge, this is the first work that is able to boost vanilla ResNet-50 to surpass 80% on ImageNet without architecture modification or additional training data. Our code and models are available at: https://github.com/szq0214/MEAL-V2.

Mitigating the risk arising from extreme events is a fundamental goal with many applications, such as the modelling of natural disasters, financial crashes, epidemics, and many others. To manage this risk, a vital step is to be able to understand or generate a wide range of extreme scenarios. Existing approaches based on Generative Adversarial Networks (GANs) excel at generating realistic samples, but seek to generate typical samples, rather than extreme samples. Hence, in this work, we propose ExGAN, a GAN-based approach to generate realistic and extreme samples. To model the extremes of the training distribution in a principled way, our work draws from Extreme Value Theory (EVT), a probabilistic approach for modelling the extreme tails of distributions. For practical utility, our framework allows the user to specify both the desired extremeness measure, as well as the desired extremeness probability they wish to sample at. Experiments on real US Precipitation data show that our method generates realistic samples, based on visual inspection and quantitative measures, in an efficient manner. Moreover, generating increasingly extreme examples using ExGAN can be done in constant time (with respect to the extremeness probability), as opposed to the exponential time required by the baseline approach.

Computing systems that can tolerate effects of errors in their communicated data values can trade this tolerance for improved resource efficiency. Many important applications of computing, such as embedded sensor systems, can tolerate errors that are bounded in their distribution of deviation from correctness (distortion). We present a channel adaptation technique which modulates properties of I/O channels typical in embedded sensor systems, to provide a tradeoff between I/O power dissipation and distortion of communicated data. We provide an efficient-to-compute formulation for the distribution of integer distortion accounting for the distribution of transmitted values. Using this formulation we implement our value-deviation-bounded (VDB) channel adaptation. We experimentally quantify the achieved reduction in power dissipation on a hardware prototype integrated with the required programmable channel modulation circuitry. We augment these experimental measurements with an analysis of the distributions of distortions. We show that our probabilistic VDB channel adaptation can provide up to a 2$\times$ reduction in I/O power dissipation. When synthesized for a miniature low-power FPGA intended for use in sensor interfaces, a register transfer level implementation of the channel adaptation control logic requires only 106 flip-flops and 224 4-input LUTs for implementing per-bit channel adaptation on serialized streams of 8-bit sensor data.

X-radiography (X-ray imaging) is a widely used imaging technique in art investigation. It can provide information about the condition of a painting as well as insights into an artist's techniques and working methods, often revealing hidden information invisible to the naked eye. In this paper, we deal with the problem of separating mixed X-ray images originating from the radiography of double-sided paintings. Using the visible color images (RGB images) from each side of the painting, we propose a new Neural Network architecture, based upon 'connected' auto-encoders, designed to separate the mixed X-ray image into two simulated X-ray images corresponding to each side. In this proposed architecture, the convolutional auto encoders extract features from the RGB images. These features are then used to (1) reproduce both of the original RGB images, (2) reconstruct the hypothetical separated X-ray images, and (3) regenerate the mixed X-ray image. The algorithm operates in a totally self-supervised fashion without requiring a sample set that contains both the mixed X-ray images and the separated ones. The methodology was tested on images from the double-sided wing panels of the \textsl{Ghent Altarpiece}, painted in 1432 by the brothers Hubert and Jan van Eyck. These tests show that the proposed approach outperforms other state-of-the-art X-ray image separation methods for art investigation applications.

We prove a categorical duality between a class of abstract algebras of partial functions and a class of (small) topological categories. The algebras are the isomorphs of collections of partial functions closed under the operations of composition, antidomain, range, and preferential union (or 'override'). The topological categories are those whose space of objects is a Stone space, source map is a local homeomorphism, target map is open, and all of whose arrows are epimorphisms.

We introduce a novel Bayesian phase estimation technique based on adaptive grid refinement method. This method automatically chooses the number particles needed for accurate phase estimation using grid refinement and cell merging strategies such that the total number of particles needed at each step is minimal. The proposed method provides a powerful alternative to traditional sampling based sequential Monte Carlo method which tend to fail in certain instances such as when the posterior distribution is bimodal. We also combine grid based and sampling based methods as hybrid particle filter where grid based method can be used to estimate a small but dominant set of parameters and Liu-West (LW) based SMC for the remaining set of parameters. Principal kurtosis analysis can be used to decide the choice of parameters for grid refinement method and for sampling based methods. We provide numerical results comparing the performance of the proposed grid refinement method with Liu-West resampling based SMC. Numerical results suggest that the proposed method is quite promising for quantum phase estimation. It can be easily adapted to Hamiltonian learning which is a very useful technique for estimating unknown parameters of a Hamiltonian and for characterizing unknown quantum devices.

Recently, Dvo\v{r}\'ak, Norin, and Postle introduced flexibility as an extension of list coloring on graphs [JGT 19']. In this new setting, each vertex $v$ in some subset of $V(G)$ has a request for a certain color $r(v)$ in its list of colors $L(v)$. The goal is to find an $L$ coloring satisfying many, but not necessarily all, of the requests. The main studied question is whether there exists a universal constant $\epsilon >0$ such that any graph $G$ in some graph class $\mathcal{C}$ satisfies at least $\epsilon$ proportion of the requests. More formally, for $k > 0$ the goal is to prove that for any graph $G \in \mathcal{C}$ on vertex set $V$, with any list assignment $L$ of size $k$ for each vertex, and for every $R \subseteq V$ and a request vector $(r(v): v\in R, ~r(v) \in L(v))$, there exists an $L$-coloring of $G$ satisfying at least $\epsilon|R|$ requests. If this is true, then $\mathcal{C}$ is called $\epsilon$-flexible for lists of size $k$. Choi et al. [arXiv 20'] introduced the notion of weak flexibility, where $R = V$. We further develop this direction by introducing a tool to handle weak flexibility. We demonstrate this new tool by showing that for every positive integer $b$ there exists $\epsilon(b)>0$ so that the class of planar graphs without $K_4, C_5 , C_6 , C_7, B_b$ is weakly $\epsilon(b)$-flexible for lists of size $4$ (here $K_n$, $C_n$ and $B_n$ are the complete graph, a cycle, and a book on $n$ vertices, respectively). We also show that the class of planar graphs without $K_4, C_5 , C_6 , C_7, B_5$ is $\epsilon$-flexible for lists of size $4$. The results are tight as these graph classes are not even 3-colorable.

A near-optimal reconstruction of the radiance of a High Dynamic Range scene from an exposure stack can be obtained by modeling the camera noise distribution. The latent radiance is then estimated using Maximum Likelihood Estimation. But this requires a well-calibrated noise model of the camera, which is difficult to obtain in practice. We show that an unbiased estimation of comparable variance can be obtained with a simpler Poisson noise estimator, which does not require the knowledge of camera-specific noise parameters. We demonstrate this empirically for four different cameras, ranging from a smartphone camera to a full-frame mirrorless camera. Our experimental results are consistent for simulated as well as real images, and across different camera settings.

CubeSats are miniature satellites used to carry experimental payloads into orbit, where it is often critical to precisely control their spatial orientation. One way to do this is through the use of magnetorquers, which can be integrated into PCBs. This technique saves considerable space and capital when compared with more common torque-rod magnetorquer systems. Here we derive a method of analyzing different PCB-integrated magnetorquer geometries, parametrizing them such that the moment and efficiency are maximized. Furthermore, by modulating the trace width, the trace number, and other electrical characteristics of the magnetorquer coil, this paper optimizes the generated magnetic moment. Both constant voltage and constant current sources are analyzed as inputs. These optimizations are then simulated in COMSOL for multiple geometries, and it is found that there exists an optimal geometry, given a specified power dissipation. Simulations verify the general trend and maxima of these derivations, barring small, consistent re-scaling in the magnitude of the coil resistance. It is also found that these PCB-magnetorquers provide a sufficient alternative to commercial coil magnetorquers - particularly in volume-restricted configurations. Optimizations for common PCB-implementable geometries on small satellites are tabulated in the Appendix.

Social gravity law widely exists in human travel, population migration, commodity trade, information communication, scientific collaboration and so on. Why is there such a simple law in many complex social systems is an interesting question. Although scientists from fields of statistical physics, complex systems, economics and transportation science have explained the social gravity law, a theoretical explanation including two dominant mechanisms, namely individual interaction and bounded rationality, is still lacking. Here we present a free utility model from the perspective of individual choice behavior to explain the social gravity law. The basic assumption is that bounded rational individuals interacting with each other will trade off the expected utility and information-processing cost to maximize their own utility. The previous explanations of the social gravity law including the maximum entropy model, the free cost model, the Logit model and the destination choice game model are all special cases under our model. Further, we extend the free utility model to the dummy network and real transportation network. This model not only helps us to better understand the underlying mechanisms of spatial interaction patterns in complex social systems, but also provides a new perspective for understanding the potential function in game theory and the user equilibrium model in transportation science.

Recent epidemiological studies have hypothesized that the prevalence of cortical cataracts is closely related to ultraviolet radiation. However, the prevalence of nuclear cataracts is higher in elderly people in tropical areas than in temperate areas. The dominant factors inducing nuclear cataracts have been widely debated. In this study, the temperature increase in the lens due to exposure to ambient conditions was computationally quantified in subjects of 50-60 years of age in tropical and temperate areas, accounting for differences in thermoregulation. A thermoregulatory response model was extended to consider elderly people in tropical areas. The time course of lens temperature for different weather conditions in five cities in Asia was computed. The temperature was higher around the mid and posterior part of the lens, which coincides with the position of the nuclear cataract. The duration of higher temperatures in the lens varied, although the daily maximum temperatures were comparable. A strong correlation (adjusted R2 > 0.85) was observed between the prevalence of nuclear cataract and the computed cumulative thermal dose in the lens. We propose the use of a cumulative thermal dose to assess the prevalence of nuclear cataracts. Cumulative wet-bulb globe temperature, a new metric computed from weather data, would be useful for practical assessment in different cities.

This paper focuses on wake on intent (WOI) techniques for platforms with limited compute and memory. Our approach of utterance-level intent classification is based on a sequence of keywords in the utterance instead of a single fixed key phrase. The keyword sequence is transformed into four types of input features, namely acoustics, phones, word2vec and speech2vec for individual intent learning and then fused decision making. If a wake intent is detected, it will trigger the power-costly ASR afterwards. The system is trained and tested on a newly collected internal dataset in Intel called AMIE, which will be reported in this paper for the first time. It is demonstrated that our novel technique with the representation of the key-phrases successfully achieved a noise robust intent classification in different domains including in-car human-machine communications. The wake on intent system will be low-power and low-complexity, which makes it suitable for always on operations in real life hardware-based applications.

In this work, we numerically study linear stability of multiple steady-state solutions to a type of steric Poisson--Nernst--Planck (PNP) equations with Dirichlet boundary conditions, which are applicable to ion channels. With numerically found multiple steady-state solutions, we obtain $S$-shaped current-voltage and current-concentration curves, showing hysteretic response of ion conductance to voltages and boundary concentrations with memory effects. Boundary value problems are proposed to locate bifurcation points and predict the local bifurcation diagram near bifurcation points on the $S$-shaped curves. Numerical approaches for linear stability analysis are developed to understand the stability of the steady-state solutions that are only numerically available. Finite difference schemes are proposed to solve a derived eigenvalue problem involving differential operators. The linear stability analysis reveals that the $S$-shaped curves have two linearly stable branches of different conductance levels and one linearly unstable intermediate branch, exhibiting classical bistable hysteresis. As predicted in the linear stability analysis, transition dynamics, from a steady-state solution on the unstable branch to a one on the stable branches, are led by perturbations associated to the mode of the dominant eigenvalue. Further numerical tests demonstrate that the finite difference schemes proposed in the linear stability analysis are second-order accurate. Numerical approaches developed in this work can be applied to study linear stability of a class of time-dependent problems around their steady-state solutions that are computed numerically.

This document gives a specification for the model used in [1]. It presents a simple way of optimizing mutual information between some input and the attractors of a (noisy) network, using a genetic algorithm. The nodes of this network are modeled as simplified versions of the structures described in the "interface theory of perception" [2]. Accordingly, the system is referred to as a "perceptual network". The present paper is an edited version of technical parts of [1] and serves as accompanying text for the Python implementation PerceptualNetworks, freely available under [3]. 1. Prentner, R., and Fields, C.. Using AI methods to Evaluate a Minimal Model for Perception. OpenPhilosophy 2019, 2, 503-524. 2. Hoffman, D. D., Prakash, C., and Singh, M.. The Interface Theory of Perception. Psychonomic Bulletin and Review 2015, 22, 1480-1506. 3. Prentner, R.. PerceptualNetworks. https://github.com/RobertPrentner/PerceptualNetworks. (accessed September 17 2020)

Automatic forecasting is the task of receiving a time series and returning a forecast for the next time steps without any human intervention. We propose an approach for automatic forecasting based on Gaussian Processes (GPs). So far, the main limits of GPs on this task have been the lack of a criterion for the selection of the kernel and the long times required for training different competing kernels. We design a fixed additive kernel, which contains the components needed to model most time series. During training the unnecessary components are made irrelevant by automatic relevance determination. We assign priors to each hyperparameter. We design the priors by analyzing a separate set of time series through a hierarchical GP. The resulting model performs very well on different types of time series, being competitive or outperforming the state-of-the-art approaches.Thanks to the priors, we reliably estimate the parameters with a single restart; this speedup makes the model efficient to train and suitable for processing a large number of time series.

Multidimensional Scaling (MDS) is one of the first fundamental manifold learning methods. It can be categorized into several methods, i.e., classical MDS, kernel classical MDS, metric MDS, and non-metric MDS. Sammon mapping and Isomap can be considered as special cases of metric MDS and kernel classical MDS, respectively. In this tutorial and survey paper, we review the theory of MDS, Sammon mapping, and Isomap in detail. We explain all the mentioned categories of MDS. Then, Sammon mapping, Isomap, and kernel Isomap are explained. Out-of-sample embedding for MDS and Isomap using eigenfunctions and kernel mapping are introduced. Then, Nystrom approximation and its use in landmark MDS and landmark Isomap are introduced for big data embedding. We also provide some simulations for illustrating the embedding by these methods.

As the number of installed meters in buildings increases, there is a growing number of data time-series that could be used to develop data-driven models to support and optimize building operation. However, building data sets are often characterized by errors and missing values, which are considered, by the recent research, among the main limiting factors on the performance of the proposed models. Motivated by the need to address the problem of missing data in building operation, this work presents a data-driven approach to fill these gaps. In this study, three different autoencoder neural networks are trained to reconstruct missing indoor environment data time-series in a data set collected in an office building in Aachen, Germany. The models are applicable for different time-series obtained from room automation, such as indoor air temperature, relative humidity and $CO_{2}$ data streams. The results prove that the proposed methods outperform classic numerical approaches and they result in reconstructing the corresponding variables with average RMSEs of 0.42 {\deg}C, 1.30 % and 78.41 ppm, respectively.

Recently, suction-based robotic systems with microscopic features or active suction components have been proposed to grip rough and irregular surfaces. However, sophisticated fabrication methods or complex control systems are required for such systems, and robust attachment to rough real-world surfaces still remains a grand challenge. Here, we propose a fully soft robotic gripper, where a flat elastic membrane is used to conform and contact parts or surfaces well, where an internal negative pressure exerted on the air-sealed membrane induces the suction-based gripping. 3D printing in combination with soft molding techniques enable the fabrication of the soft gripper. Robust attachment to complex 3D and rough surfaces is enabled by the surface-conformable soft flat membrane, which generates strong and robust suction at the contact interface. Such robust attachment to rough and irregular surfaces enables manipulation of a broad range of real-world objects, such as an egg, lime, and foiled package, without any physical damage. Compared to the conventional suction cup designs, the proposed suction gripper design shows a four-fold increase in gripping performance on rough surfaces. Furthermore, the structural and material simplicity of the proposed gripper architecture facilitates its system-level integration with other soft robotic peripherals, which can enable broader impact in diverse fields, such as digital manufacturing, robotic manipulation, and medical gripping applications.

In this paper, we propose an online speaker diarization system based on Relation Network, named RenoSD. Unlike conventional diariztion systems which consist of several independently-optimized modules, RenoSD implements voice-activity-detection (VAD), embedding extraction, and speaker identity association using a single deep neural network. The most striking feature of RenoSD is that it adopts a meta-learning strategy for speaker identity association. In particular, the relation network learns to learn a deep distance metric in a data-driven way and it can determine through a simple forward pass whether two given segments belong to the same speaker. As such, RenoSD can be performed in an online manner with low latency. Experimental results on AMI and CALLHOME datasets show that the proposed RenoSD system achieves consistent improvements over the state-of-the-art x-vector baseline. Compared with an existing online diarization system named UIS-RNN, RenoSD achieves a better performance using much fewer training data and at a lower time complexity.

We consider active learning (AL) in an uncertain environment in which trade-off between multiple risk measures need to be considered. As an AL problem in such an uncertain environment, we study Mean-Variance Analysis in Bayesian Optimization (MVA-BO) setting. Mean-variance analysis was developed in the field of financial engineering and has been used to make decisions that take into account the trade-off between the average and variance of investment uncertainty. In this paper, we specifically focus on BO setting with an uncertain component and consider multi-task, multi-objective, and constrained optimization scenarios for the mean-variance trade-off of the uncertain component. When the target blackbox function is modeled by Gaussian Process (GP), we derive the bounds of the two risk measures and propose AL algorithm for each of the above three problems based on the risk measure bounds. We show the effectiveness of the proposed AL algorithms through theoretical analysis and numerical experiments.

We analyze the solution regularity and discrete-time approximations of extended mean field control (extended MFC) problems, which seek optimal control of McKean-Vlasov dynamics whose coefficients involve mean field interactions both on the state and actions, and where objectives are optimized over open-loop strategies. We show that for a large class of extended MFC problems, the unique optimal open-loop control is 1/2-H\"{o}lder continuous in time. Based on the solution regularity, we prove that the value functions of such extended MFC problems can be approximated by those with piecewise constant controls and discrete-time state processes arising from Euler-Maruyama time stepping up to an order 1/2 error, which is optimal in our setting. We further show that any $\epsilon$-optimal controls of these discrete-time problems converge to the optimal control of the original problems. To establish the time regularity of optimal controls and the convergence of time discretizations, we extend the canonical path regularity results to general coupled McKean-Vlasov forward-backward stochastic differential equations, which are of independent interest.

Blind single image deblurring has been a challenge over many decades due to the ill-posed nature of the problem. In this paper, we propose a single-frame blind deblurring solution with the aid of Laplacian filters. Utilized Residual Dense Network has proven its strengths in superresolution task, thus we selected it as a baseline architecture. We evaluated the proposed solution with state-of-art DNN methods on a benchmark dataset. The proposed method shows significant improvement in image quality measured objectively and subjectively.

Being a very promising technology, with impressive advances in the recent years, it is still unclear how quantum computing will scale to satisfy the requirements of its most powerful applications. Although continued progress in the fabrication and control of qubits is required, quantum computing scalability will depend as well on a comprehensive architectural design considering a multi-core approach as an alternative to the traditional monolithic version, hence including a communications perspective. However, this goes beyond introducing mere interconnects. Rather, it implies consolidating the full communications stack in the quantum computer architecture. In this paper, we propose a double full-stack architecture encompassing quantum computation and quantum communications, which we use to address the monolithic versus multi-core question with a structured design methodology. For that, we revisit the different quantum computing layers to capture and model their essence by highlighting the open design variables and performance metrics. Using behavioral models and actual measurements from existing quantum computers, the results of simulations suggest that multi-core architectures may effectively unleash the full quantum computer potential.

Quantum state tomography is a powerful, but resource-intensive, general solution for numerous quantum information processing tasks. This motivates the design of robust tomography procedures that use relevant resources as sparingly as possible. Important cost factors include the number of state copies and measurement settings, as well as classical postprocessing time and memory. In this work, we present and analyze an online tomography algorithm that is designed to optimize all the aforementioned resources at the cost of a worse dependence on accuracy. The protocol is the first to give optimal performance in terms of rank and dimension for state copies, measurement settings and memory. Classical runtime is also reduced substantially. Further improvements are possible by executing the algorithm on a quantum computer, giving a quantum speedup for quantum state tomography.

In an increasing connected world, resilience is an important ability for a system to retain its original function when perturbations happen. Even though we understand small-scale resilience well, our understanding of large-scale networked resilience is limited. Recent research in network-level resilience and node-level resilience pattern has advanced our understanding of the relationship between topology and dynamics across network scales. However, the effect of uncertainty in a large-scale networked system is not clear, especially when uncertainties cascade between connected nodes. In order to quantify resilience uncertainty across the network resolutions (macro to micro), we develop an arbitrary polynomial chaos (aPC) expansion method to estimate the resilience subject to parameter uncertainties with arbitrary distributions. For the first time and of particular importance, is our ability to identify the probability of a node in losing its resilience and how the different model parameters contribute to this risk. We test this using a generic networked bi-stable system and this will aid practitioners to both understand macro-scale behaviour and make micro-scale interventions.

The problem of finding distributions of input parameters for deterministic mechanistic models to match distributions of model outputs to stochastic observations, i.e., the "Stochastic Inverse Problem" (SIP), encompasses a range of common tasks across a variety of scientific disciplines. Here, we demonstrate that SIP could be reformulated as a constrained optimization problem and adapted for applications in intervention studies to simultaneously infer model input parameters for two sets of observations, under control conditions and under an intervention. In the constrained optimization problem, the solution of SIP is enforced to accommodate the prior knowledge on the model input parameters and to produce outputs consistent with given observations by minimizing the divergence between the inferred distribution of input parameters and the prior. Unlike in standard SIP, the prior incorporates not only knowledge about model input parameters for objects in each set, but also information on the joint distribution or the deterministic map between the model input parameters in two sets of observations. To solve standard and intervention SIP, we employed conditional generative adversarial networks (GANs) and designed novel GANs that incorporate multiple generators and discriminators and have structures that reflect the underlying constrained optimization problems. This reformulation allows us to build computationally scalable solutions to tackle complex model input parameter inference scenarios, which appear routinely in physics, biophysics, economics and other areas, and which currently could not be handled with existing methods.

This paper proposes a new appliance identification scheme by introducing a novel approach for extracting highly discriminative characteristic sets that can considerably distinguish between various appliance footprints. In this context, a precise and powerful characteristic projection technique depending on fuzzy-neighbors-preserving analysis based QR-decomposition (FNPA-QR) is applied on the extracted energy consumption time-domain features. The FNPA-QR aims to diminish the distance among the between class features and increase the gap among features of dissimilar categories. Following, a novel bagging decision tree (BDT) classifier is also designed to further improve the classification accuracy. The proposed technique is then validated on three appliance energy consumption datasets, which are collected at both low and high frequency. The practical results obtained point out the outstanding classification rate of the time-domain based FNPA-QR and BDT.

A blood cell lineage consists of several consecutive developmental stages from the pluri- or multipotent stem cell to a state of terminal differentiation. Despite their importance for human biology, the regulatory pathways and gene networks that govern these differentiation processes are not yet fully understood. This is in part due to challenges associated with delineating the interactions between transcription factors (TFs) and their target genes. A possible path forward in this issue is provided by increasingly available expression data as a basis for linking differentiation stages and gene activities. Here, we present a novel hierarchical approach to identify characteristic expression peak patterns that global regulators expose along the differentiation path of cell lineages. Based on such simple patterns, we identify cell state-specific marker genes and extract TFs that likely drive their differentiation. Integration of the mean expression values of stage-specific key player genes yields a distinct peaking pattern for each lineage that is used to identify further genes in the dataset behaving similarly. Incorporating the set of TFs which regulate these genes incurred at a set of stage-specific regulators controlling the biological process of cell fate. As proof of concept, we consider two expression datasets covering key differentiation events in blood cell formation of mice.

Objective: Modern medicine needs to shift from a wait and react, curative discipline to a preventative, interdisciplinary science aiming at providing personalised, systemic and precise treatment plans to patients. The aim of this work is to present how the integration of machine learning approaches with mechanistic computational modelling could yield a reliable infrastructure to run probabilistic simulations where the entire organism is considered as a whole. Methods: We propose a general framework that composes advanced AI approaches and integrates mathematical modelling in order to provide a panoramic view over current and future physiological conditions. The proposed architecture is based on a graph neural network (GNNs) forecasting clinically relevant endpoints (such as blood pressure) and a generative adversarial network (GANs) providing a proof of concept of transcriptomic integrability. Results: We show the results of the investigation of pathological effects of overexpression of ACE2 across different signalling pathways in multiple tissues on cardiovascular functions. We provide a proof of concept of integrating a large set of composable clinical models using molecular data to drive local and global clinical parameters and derive future trajectories representing the evolution of the physiological state of the patient. Significance: We argue that the graph representation of a computational patient has potential to solve important technological challenges in integrating multiscale computational modelling with AI. We believe that this work represents a step forward towards a healthcare digital twin.

Deep learning is transforming most areas of science and technology, including electron microscopy. This review paper offers a practical perspective aimed at developers with limited familiarity. For context, we review popular applications of deep learning in electron microscopy. Following, we discuss hardware and software needed to get started with deep learning and interface with electron microscopes. We then review neural network components, popular architectures, and their optimization. Finally, we discuss future directions of deep learning in electron microscopy.

The contributions of this paper are twofold. First, we show the potential interest of Complex-Valued Neural Network (CVNN) on classification tasks for complex-valued datasets. To highlight this assertion, we investigate an example of complex-valued data in which the real and imaginary parts are statistically dependent through the property of non-circularity. In this context, the performance of fully connected feed-forward CVNNs is compared against a real-valued equivalent model. The results show that CVNN performs better for a wide variety of architectures and data structures. CVNN accuracy presents a statistically higher mean and median and lower variance than Real-Valued Neural Network (RVNN). Furthermore, if no regularization technique is used, CVNN exhibits lower overfitting. The second contribution is the release of a Python library (Barrachina 2019) using Tensorflow as back-end that enables the implementation and training of CVNNs in the hopes of motivating further research on this area.

In this paper, we develop a knowledge-assisted deep reinforcement learning (DRL) algorithm to design wireless schedulers in the fifth-generation (5G) cellular networks with time-sensitive traffic. Since the scheduling policy is a deterministic mapping from channel and queue states to scheduling actions, it can be optimized by using deep deterministic policy gradient (DDPG). We show that a straightforward implementation of DDPG converges slowly, has a poor quality-of-service (QoS) performance, and cannot be implemented in real-world 5G systems, which are non-stationary in general. To address these issues, we propose a theoretical DRL framework, where theoretical models from wireless communications are used to formulate a Markov decision process in DRL. To reduce the convergence time and improve the QoS of each user, we design a knowledge-assisted DDPG (K-DDPG) that exploits expert knowledge of the scheduler deign problem, such as the knowledge of the QoS, the target scheduling policy, and the importance of each training sample, determined by the approximation error of the value function and the number of packet losses. Furthermore, we develop an architecture for online training and inference, where K-DDPG initializes the scheduler off-line and then fine-tunes the scheduler online to handle the mismatch between off-line simulations and non-stationary real-world systems. Simulation results show that our approach reduces the convergence time of DDPG significantly and achieves better QoS than existing schedulers (reducing 30% ~ 50% packet losses). Experimental results show that with off-line initialization, our approach achieves better initial QoS than random initialization and the online fine-tuning converges in few minutes.

Most traffic state forecast algorithms when applied to urban road networks consider only the links in close proximity to the target location. However, for longer-term forecasts also the traffic state of more distant links or regions of the network are expected to provide valuable information for a data-driven algorithm. This paper studies these expectations of using a network clustering algorithm and one year of Floating Car (FCD) collected by a large fleet of vehicles. First, a clustering algorithm is applied to the data in order to extract congestion-prone regions in the Munich city network. The level of congestion inside these clusters is analyzed with the help of statistical tools. Clear spatio-temporal congestion patterns and correlations between the clustered regions are identified. These correlations are integrated into a K- Nearest Neighbors (KNN) travel time prediction algorithm. In a comparison with other approaches, this method achieves the best results. The statistical results and the performance of the KNN predictor indicate that the consideration of the network-wide traffic is a valuable feature for predictors and a promising way to develop more accurate algorithms in the future.

Boosting is a general method to convert a weak learner (which generates hypotheses that are just slightly better than random) into a strong learner (which generates hypotheses that are much better than random). Recently, Arunachalam and Maity gave the first quantum improvement for boosting, by combining Freund and Schapire's AdaBoost algorithm with a quantum algorithm for approximate counting. Their booster is faster than classical boosting as a function of the VC-dimension of the weak learner's hypothesis class, but worse as a function of the quality of the weak learner. In this paper we give a substantially faster and simpler quantum boosting algorithm, based on Servedio's SmoothBoost algorithm.

Turbulent fluid flows are ubiquitous in nature and technology, and are mathematically described by the incompressible Navier-Stokes equations (INSE). A hallmark of turbulence is spontaneous generation of intense whirls, resulting from amplification of the fluid rotation-rate (vorticity) by its deformation-rate (strain). This interaction, encoded in the non-linearity of INSE, is non-local, i.e., depends on the entire state of the flow, constituting a serious hindrance in turbulence theory and in establishing regularity of INSE. Here, we unveil a novel aspect of this interaction, by separating strain into local and non-local contributions utilizing the Biot-Savart integral of vorticity in a sphere of radius R. Analyzing highly-resolved numerical turbulent solutions to INSE, we find that when vorticity becomes very large, the local strain over small R surprisingly counteracts further amplification. This uncovered self-attenuation mechanism is further shown to be connected to local Beltramization of the flow, and could provide a direction in establishing the regularity of INSE.

Neural networks have been achieving high generalization performance on many tasks despite being highly over-parameterized. Since classical statistical learning theory struggles to explain this behavior, much effort has recently been focused on uncovering the mechanisms behind it, in the hope of developing a more adequate theoretical framework and having a better control over the trained models. In this work, we adopt an alternate perspective, viewing the neural network as a dynamical system displacing input particles over time. We conduct a series of experiments and, by analyzing the network's behavior through its displacements, we show the presence of a low kinetic energy displacement bias in the transport map of the network, and link this bias with generalization performance. From this observation, we reformulate the learning problem as follows: finding neural networks which solve the task while transporting the data as efficiently as possible. This offers a novel formulation of the learning problem which allows us to provide regularity results for the solution network, based on Optimal Transport theory. From a practical viewpoint, this allows us to propose a new learning algorithm, which automatically adapts to the complexity of the given task, and leads to networks with a high generalization ability even in low data regimes.

We derive the backpropagation algorithm for spiking neural networks composed of leaky integrate-and-fire neurons operating in continuous time. This algorithm, EventProp, computes the exact gradient of an arbitrary loss function of spike times and membrane potentials by backpropagating errors in time. For the first time, by leveraging methods from optimal control theory, we are able to backpropagate errors through spike discontinuities and avoid approximations or smoothing operations. EventProp can be applied to spiking networks with arbitrary connectivity, including recurrent, convolutional and deep feed-forward architectures. While we consider the leaky integrate-and-fire neuron model in this work, our methodology to derive the gradient can be applied to other spiking neuron models. As errors are backpropagated in an event-based manner (at spike times), EventProp requires the storage of state variables only at these times, providing favorable memory requirements. We demonstrate learning using gradients computed via EventProp in a deep spiking network using an event-based simulator and a non-linearly separable dataset encoded using spike time latencies. Our work supports the rigorous study of gradient-based methods to train spiking neural networks while providing insights toward the development of learning algorithms in neuromorphic hardware.

Computational models lie at the intersection of basic neuroscience and healthcare applications because they allow researchers to test hypotheses \textit{in silico} and predict the outcome of experiments and interactions that are very hard to test in reality. Yet, what is meant by "computational model" is understood in many different ways by researchers in different fields of neuroscience and psychology, hindering communication and collaboration. In this review, we point out the state of the art of computational modeling in Electroencephalography (EEG) and outline how these models can be used to integrate findings from electrophysiology, network-level models, and behavior. On the one hand, computational models serve to investigate the mechanisms that generate brain activity, for example measured with EEG, such as the transient emergence of oscillations at different frequency bands and/or with different spatial topographies. On the other hand, computational models serve to design experiments and test hypotheses \emph{in silico}. The final purpose of computational models of EEG is to obtain a comprehensive understanding of the mechanisms that underlie the EEG signal. This is crucial for an accurate interpretation of EEG measurements that may ultimately serve in the development of novel clinical applications.

Driven by the need for principled extraction of features from time series,we introduce the iterated-sums signature over any commutative semiring.The case of the tropical semiring is a central, and our motivating, example,as it leads to features of (real-valued) time series that are not easily availableusing existing signature-type objects.