October 01, 2025
Abstract. This paper tackles optimization problems whose objective and constraints involve a trained Neural Network (NN
), where the goal is to maximize \(f(\Phi(x))\) subject to \(c(\Phi(x)) \leq 0\), with \(f\) smooth, \(c\) general and non-stringent, and \(\Phi\) an already trained and possibly
nonwhite-box NN
. We address two challenges regarding this problem: identifying ascent directions for local search, and ensuring reliable convergence towards relevant local solutions. To this end, we re-purpose the notion of directional
NN
attacks as efficient optimization subroutines, since directional NN
attacks use the neural structure of \(\Phi\) to compute perturbations of \(x\) that
steer \(\Phi(x)\) in prescribed directions. Precisely, we develop an attack
operator that computes attacks of \(\Phi\) at any \(x\) along the
direction \(\nabla f(\Phi(x))\). Then, we propose a hybrid algorithm combining the attack
operator with derivative-free optimization (DFO
) techniques, designed for numerical reliability by
remaining oblivious to the structure of the problem. We consider the cdsm
algorithm, which offers asymptotic guarantees to converge to a local solution under mild assumptions on the problem. The resulting method alternates between attack-based
steps for heuristic yet fast local intensification and cdsm
steps for certified convergence and numerical reliability. Experiments on three problems show that this hybrid approach consistently outperforms standard DFO
baselines.
Deep Learning, Neural Network Attacks, Hybrid Methods, Derivative-Free Optimization, Simulation-Based Optimization, Neural Surrogate Models, Digital Twins.
We are grateful to Paul A. Patience and Bruno Blais, from the Department of Chemical Engineering at Polytechnique Montréal, for providing the Physics-Informed Neural Network used in our numerical experiment in Section 5.4. This research was supported by SCALE-AI through its Academic Research Chairs program.
Neural Networks (NN
s) are modelling tools recognized across various scientific domains. Beyond their widespread use in classification (such as in computer vision [1] and medical diagnostics [2]), NN
s are
increasingly deployed for applications such as, among many others, physics-informed learning [3], [4] and uncertainty quantification [5]. This
allows NN
s to be used as digital twins [6] of a physical system, that is, rich models of the system
allowing for real-time decision-making. This evolving use has given rise to a class of optimization problems in which a trained NN
\(\Phi: \mathbb{R}^n \to \mathbb{R}^m\) acts as a surrogate function mapping
the decision variables to a system response. The objective is to optimize the output of a task-specific goal function \(f: \mathbb{R}^m \to \mathbb{R}\) defined over the space of outputs of \(\Phi\), subject to constraints defined via a constraints function \(c: \mathbb{R}^m \to \mathbb{R}^p\). This leads to what we refer to as the composite problem of optimization through a
neural network, \[\label{problem:P} \underset{x \in \mathbb{R}^n}{\mathop{\mathrm{maximize}}} \quad f(\Phi(x)) \quad \text{subject to} \quad c(\Phi(x)) \leq 0,\tag{1}\] where \((n, m, p) \in (\mathbb{N}^*)^3\) denote the dimensions of,
respectively, the variables space, the NN
output space, and the constraints function output space. We study Problem 1 under the following Assumption 1.
Assumption 1 (Problem requirements). The NN
\(\Phi\) encodes a continuous function, the goal function \(f\) is differentiable with gradient \(\nabla f\), and the feasible set \(F \triangleq\{x \in \mathbb{R}^n: c(\Phi(x)) \leq 0\}\) induced by the constraints function \(c\) is nonempty, ample (that
is, \(F \subseteq \mathrm{cl}(\mathrm{int}(F))\)) and compact.
In this work, the NN
\(\Phi\) is considered already trained and not tunable anymore. Moreover, we do not impose that \(\Phi\) is given as a white-box
NN
. A process mapping some inputs to associated outputs is a white-box when the mathematical function it encodes is explicitly exposed, fully accessible, and analytically exploitable. In the case of a NN
, that means that
the architecture and weights of the NN
are all given. In contrast, the algorithm proposed in this paper requires no information about the NN
model besides the continuity of the function it encodes, and our numerical implementation
relies only on backpropagation [7], a tool that most frameworks for NN
s provide and that may be used with no
explicit knowledge of the structure of the NN
. Hence, in this work we say that \(\Phi\) is possibly a nonwhite-box NN
. We remark that, in the terminology of [8] and of derivative-free optimization (DFO
) [9], our
setting considers \(\Phi\) as a black-box NN
; however, we also remark that some authors, such as [10], would instead regard our requirements as grey-box access; hence our choice of terminology.
To the best of our knowledge, only a few papers tackle Problem 1 , and most do so under more restrictive assumptions. They typically assume a linear goal function \(f\), a polyhedral feasible
set \(F\), and moreover that \(\Phi\) is provided as a white-box NN
[11]–[15]. Yet, the widespread adoption of NN
s across industrial and engineering applications suggests that instances of Problem 1 involving nonwhite-box NN
s are increasingly
common, for example, in cases where \(\Phi\) is given as a compiled file since this allows it to run faster at the cost of transparency. To motivate the importance of tackling this broader setting, let us present two
contexts where it naturally appears.
A classical framework in DFO
[9], [16] considers problems of
simulation-based optimization [17]–[19], that have
the form “\(\mathop{\mathrm{maximize}}_{x \in \mathbb{R}^n} f(B(x))\) subject to \(c(B(x)) \leq 0\)” where \(f\) and \(c\)
are defined as in Problem 1 and the mapping \(B: \mathbb{R}^n \to \mathbb{R}^m\) denotes an intractable process that, typically, runs a costly numerical simulation parameterized by \(x \in \mathbb{R}^n\). This setting, popular in engineering design [20], [21], has the same composite structure as Problem 1 , differing only in that the intermediate mapping is the simulator \(B\)
rather than the NN
\(\Phi\). Yet, an increasing trend replaces simulators \(B\) by trained NN
s \(\Phi\) [22] acting as digital twins, which yields instances of Problem 1 with key advantages: evaluating \(\Phi\) is typically orders of magnitude faster than running \(B\); and even if treated as a nonwhite-box, \(\Phi\) retains a neural architecture that can be
exploited (for example by backpropagation). This motivates the development of dedicated optimization algorithms.
Counterfactual explanations [10], [23] for a NN
input consist in minimal changes to the input that induce a desired change in the output. When considering a classification task for a NN
classifier \(\Phi:
\mathbb{R}^n \to \llbracket 1,m\rrbracket\), counterfactual explanations \(x_\mathrm{cfa}\in \mathbb{R}^n\) for an input \(x_\mathrm{ini}\in \mathbb{R}^n\) and a target class \(y^\sharp \neq \Phi(x_\mathrm{ini})\) are solutions to “\(\mathop{\mathrm{minimize}}_{x \in \mathbb{R}^n} \lVert{x-x_\mathrm{ini}}\rVert^2\) subject to \(\Phi(x) =
y^\sharp\)”. The notion also extends to decision-focused learning and contextual optimization [24], [25], where the NN
\(\Phi: \mathbb{R}^n \to
\mathbb{R}^m\) maps a so-called context \(x \in \mathbb{R}^n\) to parameters for a downstream optimization task “\(\mathop{\mathrm{minimize}}_{y \in \mathcal{C}}
g(\Phi(x),y)\)”, involving some function \(g\) usually convex and some set \(\mathcal{C}\) that is typically combinatorial, for which an optimal decision rule \(y^*(x) \in \mathop{\mathrm{argmin}}_{y \in \mathcal{C}} g(\Phi(x), y)\) may be selected. In this setting, a counterfactual explanation of a context \(x_\mathrm{ini}\in \mathbb{R}^n\) consists in
a context \(x_\mathrm{cfa}\in \mathbb{R}^n\) close to \(x_\mathrm{ini}\) and such that a given target decision \(y^\sharp \in \mathcal{C}\), usually proposed
by an expert or a benchmark policy, is preferable to \(y^*(x_\mathrm{ini})\) for the problem induced by \(\Phi(x_\mathrm{cfa})\) [26], [27]. This results in the \(\varepsilon\)-relative counterfactual problem, for any \(\varepsilon \in \mathbb{R}_+\), which aligns with Problem 1 since it is given by \[\underset{x \in \mathbb{R}^n}{\mathop{\mathrm{minimize}}} \quad \left\lVert{x-x_\mathrm{ini}}\right\rVert^2 \quad \text{subject to} \quad (1+\varepsilon)g(\Phi(x), y^\sharp) \leq g(\Phi(x),y^*(x_\mathrm{ini})).\]
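As a concrete illustration, the feasibility test of the \(\varepsilon\)-relative counterfactual problem can be sketched on a toy decision task where \(g\) is linear, \(\mathcal{C}\) contains the canonical basis vectors, and \(\Phi\) is a small hand-written map; all functions and names below are illustrative assumptions, not taken from this paper.

```python
import numpy as np

# Toy downstream task: C = canonical basis decisions, g(theta, y) = <theta, y> (illustrative)
C = [np.eye(3)[i] for i in range(3)]
g = lambda theta, y: float(theta @ y)
Phi = lambda x: np.array([x[0], x[0] + x[1], x[1]])   # toy context-to-parameters map

def y_star(x):
    """Optimal decision rule y*(x) in argmin_{y in C} g(Phi(x), y)."""
    return min(C, key=lambda y: g(Phi(x), y))

def is_eps_relative_counterfactual(x_cfa, x_ini, y_sharp, eps):
    """Feasibility test of the eps-relative counterfactual constraint:
    (1 + eps) * g(Phi(x_cfa), y_sharp) <= g(Phi(x_cfa), y*(x_ini))."""
    return (1 + eps) * g(Phi(x_cfa), y_sharp) <= g(Phi(x_cfa), y_star(x_ini))
```

Solving the full problem would then minimize \(\lVert x - x_\mathrm{ini}\rVert^2\) over the contexts passing this test.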
This paper addresses the growing need to solve increasingly challenging instances of Problem 1 , particularly in settings where the NN
\(\Phi\) is not provided as a white-box. Our
main insight is that a core challenge in Problem 1, namely identifying ascent directions for \(f \circ \Phi\) despite the complex structure of \(\Phi\), can be
alleviated through directional NN
attacks. Indeed, directional NN
attacks exploit the structure of a NN
to find a perturbation of the input that steers its outputs in a desired direction. This paper
formalizes the idea of solving Problem 1 by iteratively computing directional attacks of \(\Phi\) at the incumbent solution \(x^k\) in the direction given by \(\nabla f(\Phi(x^k))\). Importantly, state-of-the-art NN
attack techniques span a broad range of access levels: some assume white-box access to the NN
, while others require only the ability to
backpropagate, making them usable when \(\Phi\) is not a white-box. Moreover, this approach may be hybridized with existing optimization algorithms, acting as an additional step at all iterations to search for ascent
directions using explicitly the neural structure of \(\Phi\). In particular, we hybridize this so-called attack
step with DFO
algorithms that aim for global search capability and numerical
reliability in problems with nonconvexity, and nondifferentiability or poor gradient information. Therefore, in this paper,
we formalize the concept of directional NN
attacks, and we show that they can be repurposed to compute ascent directions for Problem 1 . We highlight that standard NN
attack algorithms
can be used off-the-shelf for this purpose, and we describe how to embed them in a local optimization framework. While powerful when successful, this approach is inherently local and sometimes prone to failure for various technical reasons. This
contribution forms the content of Section 3;
we hybridize the above attack-based approach with the covering direct search method (cdsm
) [28], a
DFO
method that is guaranteed to asymptotically converge to a local solution to Problem 1 under Assumption 1. Our
resulting hybrid algorithm combines, at each iteration, two concepts with natural synergy: an attack
step leveraging the internal structure of \(\Phi\) for fast local improvement, and steps from
cdsm
that allow for globalization strategies and asymptotic convergence towards a local solution. This contribution is detailed in Section 4;
we conduct numerical experiments on three problems to assess our hybrid algorithm against DFO
baselines. The first problem acts as a proof of concept, while the other two are drawn from the simulation-based optimization and
counterfactual explanation contexts highlighted in our motivation. Our experiments highlight that the attack
step is not a reliable standalone method, but it nevertheless contributes significantly to the performance of our hybrid method.
Overall, our hybrid method appears faster than the DFO
baselines thanks to the attack
step, and as reliable thanks to the steps from cdsm
. Details about our experiments are given in Section 5.
We leave a literature review to Section 2 and a discussion foreseeing future work to Section 6. We conclude this section by introducing some notation used throughout the paper.
We denote by \(\left<\cdot,\cdot\right>\) the dot product in \(\mathbb{R}^m\), by \(\lVert{\cdot}\rVert\) some norm in \(\mathbb{R}^n\), by \((e_i)_{i=1}^{n}\) the vectors of the canonical basis of \(\mathbb{R}^n\), and by \(\mathbb{1} \triangleq\sum_{i=1}^{n}e_i\) the vector of all ones in \(\mathbb{R}^n\). For all sets \(\mathcal{D}\subseteq \mathbb{R}^n\), we denote the positive spanning set of \(\mathcal{D}\) by \(\mathrm{PSpan}(\mathcal{D})\). For all \(r \in \mathbb{R}_+\), all \(x \in \mathbb{R}^n\), and all \(y \in \mathbb{R}^m\), we respectively denote by \(\mathcal{B}^n_r(x)\) and \(\mathcal{B}^m_r(y)\) the closed ball of radius \(r\) in \(\mathbb{R}^n\) centred at \(x\) and the closed ball of radius \(r\) in \(\mathbb{R}^m\) centred at \(y\). We omit the parentheses for the balls centred at \(0\), that is, \(\mathcal{B}_r^n \triangleq\mathcal{B}^n_r(0)\) and \(\mathcal{B}_r^m \triangleq\mathcal{B}^m_r(0)\). We denote the rectified linear unit function by \(\texttt{ReLU}: v \in \mathbb{R}^a \mapsto [\max(v_i,0)]_{i=1}^{a} \in \mathbb{R}^a\) and the softmax function by \(\texttt{softmax}: v \in \mathbb{R}^a \mapsto [\exp(v_i) / \sum_{j=1}^{a}\exp(v_j)]_{i=1}^{a} \in [0,1]^a\), regardless of the dimension \(a \in \mathbb{N}^*\) of the input vector \(v\). For all couples \((x_1, x_2) \in \mathbb{R}^n \times \mathbb{R}^n\), we say that \(x_2\) dominates \(x_1\) (with respect to Problem 1 ) if \(c(\Phi(x_2)) \leq 0\) and \(f(\Phi(x_2)) > f(\Phi(x_1))\). Similarly, for all couples \((x,d) \in \mathbb{R}^n \times \mathbb{R}^n\), we say that \(d\) is an ascent direction emanating from \(x\) (with respect to Problem 1 ) if \(x+d\) dominates \(x\). Note that in this work, we allow for non-unitary directions.
A few prior studies tackled Problem 1 , but under more restrictive assumptions [11]–[15]. They typically assume a linear goal function \(f\), a polyhedral feasible set \(F\), and moreover that \(\Phi\) is
provided as a white-box NN
. These approaches effectively exploit the structure of \(\Phi\), but their requirements contrast with the reality of many practical applications, where \(\Phi\) may be deployed as an opaque or compiled artifact to maximize inference speed, which makes white-box assumptions and layer-wise manipulations impractical.
The general setting of Problem 1 has been less explored. It relates to DFO
because of the possible nonwhite-box nature of \(\Phi\) and the potential nonsmoothness of \(f \circ \Phi\). We are not aware of any DFO
method specifically designed for Problem 1 , but most general-purpose DFO
methods [9], [16], [29] may be applied. In this paper, we
build our hybrid method upon the covering
direct search method (cdsm
) algorithm [28]. This choice is
motivated by two aspects: the cdsm
is an extension of the popular and widely studied direct search method [9] from DFO
,
and it has guarantees of convergence to a local solution to Problem 1 under Assumption 1. Nevertheless, considering the
cdsm
is not stringent. As shown in [28], many DFO
methods may be adapted to inherit the same convergence
properties as cdsm
when enhanced with a covering step.
The idea of leveraging the neural structure of \(\Phi\) to identify ascent directions for \(f \circ \Phi\) relates to the literature on NN
attacks. The seminal work of [30] introduces the notion of NN
attacks. There are now several solvers to compute NN
attacks [31]–[34], and
we design our attack
operator so that it may leverage any of them. We also remark that a desirable property of NN
attacks is that they return successful attacks whenever one exists, which connects with the notion of
NN
verification [35]–[37], an active research area [38], [39]. Neural network verification asks whether, for a given NN
\(\Phi\), an input set \(X\) contains an element \(x\) such that \(\Phi(x)\) lies within a specified output set \(Y\). NN
verification is typically a computationally demanding problem, and most parts
of the \((\alpha,\beta)\)-CROWN
solver [36] address precisely this verification
task, albeit primarily in binary classification contexts.
In short, our work significantly departs from the existing literature on optimization through NN
s by addressing Problem 1 without assuming that \(\Phi\) is a white-box
NN
. It also departs from the existing literature on DFO
by proposing a hybrid method that explicitly leverages the neural structure of \(\Phi\).
NN Attacks

This section addresses the first objective of the paper: establishing directional NN
attacks (formalized in Section 3.1) as a practical tool for solving Problem 1 . Specifically, in Section 3.2 we demonstrate that such attacks, when appropriately constructed, can be used to identify ascent directions (even when \(\Phi\) is a nonwhite-box NN
model).
NN Attacks

First, we contextualize the notion of NN
attack and illustrate its purpose in Section 3.1.1. Then we formalize the notion of directional NN
attacks in Section 3.1.2. Finally, we discuss practical approaches for computing such attacks in Section 3.1.3.
The notion of NN
attacks originates from the seminal work of [30], which empirically demonstrates
a striking vulnerability of many neural networks: small, carefully crafted perturbations of an input can lead to large changes in the output of the network. The notion of NN
attacks is initially formalized for NN
s focusing on
classification, and it is a central topic in adversarial machine learning, used both to manipulate models’ classifications and to evaluate robustness. A NN
attack consists in finding an alteration of an input (within a ball of
pre-defined radius) that changes the classification from the class predicted initially to any other. Similarly, a targeted NN
attack consists in altering the input in order to make it classified as a specific class. In the context of
the present work, the NN
s we consider may not perform classification, but nevertheless the notion of NN
attacks admits a direct extension that we formalize in Section 3.1.2. Before going to this point, let us illustrate the observation from [30] that a NN
classification may be drastically vulnerable to slight alterations of the input.
Consider PyTorch’s ResNet18
NN
for image classification (with its default pretrained weights). This network is designed to classify images according to \(1000\) pre-selected classes. That is, \(\texttt{ResNet18}\) takes as inputs RGB images with \(224 \times 224\) pixels, represented as tensors in \(\mathbb{I}\triangleq[0,1]^{3\times224\times224}\), and outputs vectors in the 1000-dimensional space \(\mathbb{P}^{1000} \triangleq\{p \in [0,1]^{1000}: \sum_{\ell=1}^{1000} p_\ell = 1\}\),
where each component represents the plausibility that the image depicts an item from the associated class. Then the image is classified to the class \(\ell \in \llbracket 1,1000\rrbracket\) with the highest value \(p_\ell\). Consider the image in Figure 1 (left), which depicts a Samoyed dog. After preprocessing, the image becomes a tensor \(x \in \mathbb{I}\)
(centre left), and ResNet18
correctly classifies it as a Samoyed with \(88\%\) confidence. Then, consider a targeted attack aiming to shift the classification of \(x\) from a
Samoyed to a crane. This consists in finding a small perturbation \(d\) (e.g., with \(\lVert{d}\rVert_\infty \leq 10^{-2}\)) that shifts the network’s output \(\texttt{ResNet18}(x+d)\) towards the one-hot vector associated to the “\(\mathrm{Crane}\)” class. The altered image \(x+d\) (centre right) remains
visually indistinguishable from the original, yet the classification changes drastically: ResNet18
assigns nearly \(100\%\) confidence to the “\(\mathrm{Crane}\)” class. As
shown on the right of the figure, \(d\) alters most of the pixels, but only slightly so that the result is visually unchanged.
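The mechanics of such a targeted attack can be sketched on a toy model. The snippet below is a minimal illustration rather than the actual ResNet18 experiment: it performs an FGSM-style targeted step (one sign-gradient step on the cross-entropy toward the target class) on a small random linear softmax classifier, halving the radius until the target confidence increases; all names and the model are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())       # numerically stabilized softmax
    return e / e.sum()

def targeted_step(W, x, target, radius):
    """One FGSM-style targeted step: d = -radius * sign(grad), where grad is the
    gradient at x of the cross-entropy loss -log softmax(W x)[target]."""
    p = softmax(W @ x)
    onehot = np.zeros(len(p)); onehot[target] = 1.0
    grad = W.T @ (p - onehot)
    return -radius * np.sign(grad)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 50))      # toy 3-class linear "network"
x = rng.normal(size=50)
p0 = softmax(W @ x)
target = int(np.argmin(p0))       # attack toward the least plausible class
r = 1e-2                          # small perturbation budget, as in the text
d = targeted_step(W, x, target, r)
while softmax(W @ (x + d))[target] <= p0[target]:   # shrink r until the attack succeeds
    r /= 2
    d = targeted_step(W, x, target, r)
```

Since the loss strictly decreases along the sign-gradient direction for small enough radii, the loop terminates with a perturbation \(d\) (with \(\lVert d\rVert_\infty \leq 10^{-2}\)) that raises the target class's confidence.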
We now formalize the notion of directional NN
attack in Definition 1. We build it from that of a targeted
NN
attack, which we therefore introduce first. For simplicity, we state both notions directly in terms of the NN
\(\Phi\) and the constraints function \(c\). Their
formulations also involve a norm \(\lVert{\cdot}\rVert\) and a loss function \(\mathcal{L}: \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}_+\) that are left generic since numerical tools
computing NN
attacks implement several choices. We discuss popular practical choices for \(\lVert{\cdot}\rVert\) and \(\mathcal{L}\) in Remark 1. We also adapt Definition 1 to directional NN attacks with respect to only some components in Remark 2.
First, the notion of targeted NN
attack must be adapted from the context of NN
s performing classification. Consider a loss function \(\mathcal{L}: \mathbb{R}^m \times \mathbb{R}^m \to
\mathbb{R}_+\) and a norm \(\lVert{\cdot}\rVert\). For all \((x,r,y) \in \mathbb{R}^n \times \mathbb{R}_+ \times \mathbb{R}^m\), a targeted attack on \(\Phi\) at \(x\) with radius \(r\) and target \(y\) (with respect to \(\mathcal{L}\) and
the ball \(\mathcal{B}_r^n\) in the \(\lVert{\cdot}\rVert\) norm) consists in finding an input alteration \(d \in \mathbb{R}^n\) feasible for the problem
\[\label{problem:targeted_attack} \underset{d \in \mathcal{B}_r^n}{\mathop{\mathrm{minimize}}} \quad \mathcal{L}\left(\Phi(x+d) , y\right) \quad \text{subject to} \quad c(\Phi(x+d)) \leq 0.\tag{2}\] All feasible elements are said to be feasible attacks, all global
solutions are said to be optimal attacks, and all feasible attacks \(d\) satisfying \(\mathcal{L}(\Phi(x+d),y) < \mathcal{L}(\Phi(x),y)\) are said to be successful
attacks. From that notion, we may formalize that of a directional NN
attack as Definition 1.
Definition 1 (Directional NN
attack). Consider a loss function \(\mathcal{L}: \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}_+\) and a norm \(\lVert{\cdot}\rVert\). For all \(x \in \mathbb{R}^n\), define the differential NN
\(\Phi_x: d \in \mathbb{R}^n \mapsto \Phi(x+d)-\Phi(x) \in
\mathbb{R}^m\). Then, for all points \(x \in \mathbb{R}^n\), all radii \(r \in \mathbb{R}_+\), and all directions \(u \in \mathbb{R}^m\), an
attack of \(\Phi\) at \(x\) of radius \(r\) in direction \(u\) (with respect to \(\mathcal{L}\) and the ball \(\mathcal{B}_r^n\) in the \(\lVert{\cdot}\rVert\) norm) consists in a targeted attack on \(\Phi_x\)
at \(0\) with radius \(r\) and target \(u\). That is, a directional NN
attack consists in finding a feasible element for the problem
\[\label{problem:directional_attack} \underset{d \in \mathcal{B}_r^n}{\mathop{\mathrm{minimize}}} \quad \mathcal{L}\left(\Phi_x(d) , u\right) \quad \text{subject to} \quad c(\Phi(x)+\Phi_x(d)) \leq 0.\tag{3}\] We denote by \(\mathcal{A}_\mathcal{L}(x, r,
u)\) the set of feasible attacks, by \(\mathcal{A}_\mathcal{L}^+(x, r, u)\) the set of successful attacks, and by \(\mathcal{A}_\mathcal{L}^*(x, r, u)\) the set of optimal
attacks.
Remark 1. As Definition 1 stresses, all solvers from the literature designed for targeted attacks may be used off-the-shelf to compute directional attacks, since the latter is a specific instance of the former. Moreover, Definition 1 is compatible with any loss function and any norm. However, as shown in Section 3.1.3, many numerical tools are restricted to the \(\lVert{\cdot}\rVert_\infty\) norm, and either the square-error loss function \(\mathcal{L}_{\mathrm{SE}}\) or the cross-entropy loss function \(\mathcal{L}_{\mathrm{CE}}\). These two losses are defined as follows, for all \((y_1,y_2) \in \mathbb{R}^m \times \mathbb{R}^m\), and by applying the logarithm component-wise, \[\label{equation:usual_losses}\mathcal{L}_{\mathrm{SE}}(y_1,y_2) \triangleq\left\lVert{y_2-y_1}\right\rVert_2^2 \quad \text{and} \quad \mathcal{L}_{\mathrm{CE}}(y_1,y_2) \triangleq-\left<\ln(\texttt{softmax}(y_1))~,~\texttt{softmax}(y_2)\right>.\qquad{(1)}\]
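A minimal numerical sketch of these two losses, assuming the Euclidean 2-norm for \(\mathcal{L}_{\mathrm{SE}}\) and a numerically stabilized softmax:

```python
import numpy as np

def softmax(v):
    e = np.exp(np.asarray(v, float) - np.max(v))   # shift for numerical stability
    return e / e.sum()

def loss_se(y1, y2):
    """Square-error loss: L_SE(y1, y2) = ||y2 - y1||_2^2."""
    return float(np.sum((np.asarray(y2, float) - np.asarray(y1, float)) ** 2))

def loss_ce(y1, y2):
    """Cross-entropy loss: L_CE(y1, y2) = -<ln(softmax(y1)), softmax(y2)>."""
    return float(-np.dot(np.log(softmax(y1)), softmax(y2)))
```

Note that \(\mathcal{L}_{\mathrm{CE}}(y,y) > 0\) in general: the cross-entropy of a distribution with itself equals its entropy, which vanishes only on one-hot vectors.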
Remark 2. Definition 1 is designed so that, at any \(x\) and any
direction \(u\), an optimal attack direction \(d\) aligns all components of \(\Phi_x(d)\) with all components of \(u\). However, in some contexts (such as those in Remark 6), we may want \(\Phi_x(d)\) to match \(u\) with respect to some components only. For ease of presentation, we adapt Definition 1 only to the case where the first \(a\) components of \(u\) are of interest. We then seek a direction \(d \in
\mathbb{R}^n\) such that the first \(a\) components of \(\Phi_x(d)\) agree with the first \(a\) components of \(u\)
while the remaining \(m-a\) components of \(\Phi_x(d)\) are free. Given a loss function \(\mathcal{L}: \mathbb{R}^a \times \mathbb{R}^a \to \mathbb{R}_+\), a
directional NN
attack of \(\Phi\) at \(x\) with radius \(r\) in the first \(a\) components of the
direction \(u\) consists in defining \(A \triangleq\begin{bmatrix} I_a ~ 0\end{bmatrix} \in \mathbb{R}^{a \times m}\) and seeking \(d \in \mathbb{R}^n\) feasible for the problem \[\underset{d \in \mathcal{B}_r^n}{\mathop{\mathrm{minimize}}} \quad \mathcal{L}\left(A\Phi_x(d) , Au\right) \quad \text{subject to} \quad c(\Phi(x)+\Phi_x(d)) \leq 0.\]
Several numerical tools are available for computing NN
attacks. Most of them are implemented in PyTorch [40] or TensorFlow [41], and rely on one of the loss functions in Remark 1. The open-source solvers we are aware of are listed in Table 1. While these tools are designed for targeted attacks only, Definition 1 shows how to use them to compute directional attacks. Consequently, all listed solvers can be used to compute directional attacks as well. In addition, all solvers except \((\alpha,\beta)\)-CROWN
are compatible with nonwhite-box NN
models since both PyTorch and TensorFlow allow backpropagation. However, none of the solvers from this list currently supports constraints on
the input, so we introduce a reformulation in Section 3.2.2 to address this limitation.
Software | Frameworks | Losses | Norms | Additional requirements
---|---|---|---|---
\((\alpha,\beta)\)-CROWN [34] | PT | \(\mathcal{L}_{\mathrm{SE}}\) | \(\infty\) | \(c \equiv 0\), \(\Phi\) white-box ReLU-based model
FoolBox [33] | PT, TF | \(\mathcal{L}_{\mathrm{CE}}\) | \(2\) or \(\infty\) | \(c \equiv 0\)
ART [32] | many | \(\mathcal{L}_{\mathrm{CE}}\) | \(2\) or \(\infty\) | \(c \equiv 0\)
Torchattacks [31] | PT | \(\mathcal{L}_{\mathrm{CE}}\) | \(2\) or \(\infty\) | \(c \equiv 0\), \(\Phi: [0,1]^n \to \mathbb{R}^m\)
NN Attacks

We now leverage directional NN
attacks to solve Problem 1 . Specifically, we show that, for any point \(x \in \mathbb{R}^n\), a directional attack of \(\Phi\) at \(x\) in the direction \(\nabla f(\Phi(x))\) likely yields an ascent direction for Problem 1 . This section formalizes this
idea and analyzes its theoretical properties. To this end, we introduce the attack
operator in Definition 2, which
performs the aforementioned attack. The norm is left unspecified in this definition since it does not impact the theoretical properties of the attack
operator, but in practice we consider the \(\lVert{\cdot}\rVert_\infty\) norm.
Definition 2 (attack
operator). Given a loss function \(\mathcal{L}: \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}_+\), we name attack
operator any set-valued function \(\texttt{attack}: \mathbb{R}^n \times \mathbb{R}_+ \to 2^{\mathbb{R}^n}\) such that \(\texttt{attack}(x,r) \subseteq \mathcal{A}_\mathcal{L}(x, r, \nabla f(\Phi(x)))\) holds for all \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\).
Note that many functions satisfy Definition 2, e.g., the empty-set operator \(\texttt{attack}\equiv \emptyset\) and the ideal operator given by \(\texttt{attack}(x,r) \triangleq\mathcal{A}_\mathcal{L}^*(x, r, \nabla f(\Phi(x)))\) for all \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\). Nevertheless, in practice, we choose a numerical solver from the literature, and for all \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\) we define \(\texttt{attack}(x,r)\) as the output of that solver tackling Problem 3 at \(u \triangleq\nabla f(\Phi(x))\). Definition 2 is tailored to encompass all possible outputs that a solver may return. Indeed, the solver either fails (so it returns \(\emptyset\), which is allowed) or returns a feasible attack \(d \in \mathcal{A}_\mathcal{L}(x,r,u)\) that has no guarantee to be optimal or even successful (hence, we do not impose that \(\texttt{attack}(x,r) \subseteq \mathcal{A}_\mathcal{L}^*(x,r,u)\) or \(\texttt{attack}(x,r) \subseteq \mathcal{A}_\mathcal{L}^+(x,r,u)\)).
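A minimal attack operator satisfying Definition 2 can be sketched as follows, assuming \(c \equiv 0\), the \(\ell_\infty\) ball, the square-error loss, and a single sign-gradient step (computed here by finite differences, standing in for backpropagation) in place of an off-the-shelf solver from Table 1; the toy network and goal function are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(8, 5)), rng.normal(size=(3, 8))
Phi = lambda x: np.tanh(W2 @ np.tanh(W1 @ x))   # toy trained NN (illustrative)
grad_f = lambda y: 2.0 * y                      # gradient of the goal f(y) = ||y||^2

def loss_se(y1, y2):
    return float(np.sum((y2 - y1) ** 2))

def attack(x, r, eps=1e-6):
    """Set-valued attack operator: one sign-gradient step on d -> L_SE(Phi_x(d), u)
    with u = grad f(Phi(x)), kept inside the l_inf ball of radius r (c == 0 assumed)."""
    u = grad_f(Phi(x))
    Phi_x = lambda d: Phi(x + d) - Phi(x)
    base = loss_se(Phi_x(np.zeros_like(x)), u)  # L_SE(0, u) = ||u||^2
    # finite-difference gradient of the loss at d = 0
    g = np.array([(loss_se(Phi_x(eps * e), u) - base) / eps for e in np.eye(len(x))])
    d = -r * np.sign(g)
    return {tuple(d)}   # any subset of the feasible attacks is a valid operator output
```

The operator returns a singleton here, but per Definition 2 it could equally return the empty set when the inner solver fails.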
In Section 3.2.1, we prove that if the \(\texttt{attack}\) operator is actually guaranteed to identify successful attacks,
then its outputs are guaranteed to be ascent directions for Problem 1. In Section 3.2.2, we explain that although this setting is not
fully supported by existing solvers, an attack
operator defined via current NN
attacks solvers already acts as a good heuristic for computing ascent directions.
attack Operator

This section establishes a theoretical setting guaranteeing that the attack
operator yields an ascent direction for Problem 1 whenever one exists. This setting requires that the \(\texttt{attack}\) operator returns successful attack directions when some exist, and that the loss function is well-suited in a sense that we introduce below. We formalize these requirements in Assumption 2 and our claim in Proposition 1.
We say that a function \(\mathcal{L}: \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}_+\) satisfies the Well-Suited Loss Property (WSLP
) when
\[\label{equation:well_suited_losses} \forall (y_1,y_2) \in \mathbb{R}^m \times \mathbb{R}^m, \quad \mathcal{L}(y_1,y_2) < \mathcal{L}(0,y_2) \implies \left<y_1,y_2\right> > 0.\tag{4}\] Roughly speaking, the WSLP
ensures that for any \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\) and with \(u \triangleq\nabla
ensures that for any \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\) and with \(u \triangleq\nabla
f(\Phi(x))\), successful attacks for Problem 3 are alterations of \(x\) that drive the output of \(\Phi\) in the direction of the
gradient of \(f\). Thus, according to the first-order approximation of \(f\) near \(\Phi(x)\), the point \(\Phi(x+d)\)
likely satisfies \(f(\Phi(x+d)) > f(\Phi(x))\). This observation is formalized in Proposition 1.
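For instance, the square-error loss \(\mathcal{L}_{\mathrm{SE}}\) of Remark 1 satisfies the WSLP; a short verification (with \(\left<\cdot,\cdot\right>\) the Euclidean inner product):
\[
\mathcal{L}_{\mathrm{SE}}(y_1,y_2) < \mathcal{L}_{\mathrm{SE}}(0,y_2)
\iff \left\lVert{y_2-y_1}\right\rVert_2^2 < \left\lVert{y_2}\right\rVert_2^2
\iff \left\lVert{y_1}\right\rVert_2^2 - 2\left<y_1,y_2\right> < 0
\implies \left<y_1,y_2\right> > \tfrac{1}{2}\left\lVert{y_1}\right\rVert_2^2 \geq 0,
\]
where the first strict inequality excludes \(y_1 = 0\), so that \(\left<y_1,y_2\right> > 0\) indeed holds.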
Assumption 2 (Successful attack
operator). The \(\texttt{attack}\) operator relies on a loss function \(\mathcal{L}\) satisfying the WSLP
,
and for all \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\), the set it returns satisfies \(\texttt{attack}(x,r) \subseteq \mathcal{A}_\mathcal{L}^+(x,r,\nabla f(\Phi(x)))\). Moreover, for
all \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\) such that \(\mathcal{A}_\mathcal{L}^+(x,r,\nabla f(\Phi(x)))\) is nonempty, \(\texttt{attack}(x,r)\) is
nonempty.
Proposition 1. Under Assumptions 1 and 2, for all \(x \in \mathbb{R}^n\), there exists \(\overline{r}(x) > 0\) such that for all \(r \in [0,\overline{r}(x)]\), every \(d \in \texttt{attack}(x, r)\) is an ascent direction for Problem 1 emanating from \(x\).
Proof. Let Assumptions 1 and 2 hold. Define \[\begin{array}{rlcl} \forall y \in \mathbb{R}^m, & \overline{\rho}(y) & \triangleq & \max\{ \rho \in [0,+\infty] : \quad (\forall u \in \mathcal{B}^m_\rho ~\text{such that}~ \left<u,\nabla f(y)\right> > 0,~ f(y+u) > f(y)) \}, \\ \forall x \in \mathbb{R}^n, & \overline{r}(x) & \triangleq & \max\{ r \in [0,+\infty] : \quad (\forall d \in \mathcal{B}^n_r,~ \lVert{\Phi_x(d)}\rVert \leq \overline{\rho}(\Phi(x))) \}. \end{array}\] First, for all \(y \in \mathbb{R}^m\), the first-order approximation of \(f\) near \(y\) ensures that \(\overline{\rho}(y) > 0\). Moreover, for all \(x \in \mathbb{R}^n\), the continuity of \(\Phi\) from Assumption 1 ensures that \(\overline{r}(x) > 0\). We deduce that \[\label{equation:property_in_proof_careful_attacks}\forall d \in \mathcal{B}^n_{\overline{r}(x)} ~\text{such that}~ \left<\Phi_x(d)~,~\nabla f(\Phi(x))\right> > 0, \qquad f(\Phi(x+d)) > f(\Phi(x)).\tag{5}\] If moreover \(c(\Phi(x+d)) \leq 0\), then \(d\) is an ascent direction emanating from \(x\). Second, Assumption 2 ensures that for all \(x \in \mathbb{R}^n\) and all \(r \in [0,\overline{r}(x)]\), all \(d \in \texttt{attack}(x,r) \subseteq \mathcal{A}_\mathcal{L}^+(x,r,\nabla f(\Phi(x)))\) satisfy \[\begin{array}{rrl} & c(\Phi(x+d)) \leq 0 & \text{since}~ d ~\text{is feasible}, \\[0.5em] & d \in \mathcal{B}^n_{\overline{r}(x)} & \text{since}~ \left\lVert{d}\right\rVert \leq r \leq \overline{r}(x), \\[0.5em] & \mathcal{L}(\Phi_x(d), \nabla f(\Phi(x))) < \mathcal{L}(0, \nabla f(\Phi(x))) & \text{since}~ d ~\text{is a successful attack}, \\ \text{so} & \left<\Phi_x(d)~,~\nabla f(\Phi(x))\right> > 0 & \text{since}~ \mathcal{L}~\text{satisfies the WSLP}. \end{array}\] The result then follows from 5. ◻
Proposition 1 validates the first goal of this paper: directional
NN
attacks may be used as tools for solving Problem 1 . Under Assumption 2, for all \(x \in \mathbb{R}^n\), all attacks identified by \(\texttt{attack}(x,r)\) with \(r \leq \overline{r}(x)\) are ascent directions for Problem 1 . Remark 3 shows an algorithmic way to search for a radius \(0 < r \leq \overline{r}(x)\), Remark 4 discusses whether \(\texttt{attack}(x,r) = \emptyset\), and Remark 5 adapts our methodology to the regularity of \(f\). Finally, Remark 6 focuses on cases where \(f\) has active subspaces, which require special care to avoid a deterioration of the performance of the attack
operator.
However, an attack
operator constructed from any solver in Table 1 violates Assumption 2, so it may not yield a reliable optimization algorithm. Nevertheless, such an attack
operator remains a valuable component within a broader optimization strategy. Section 3.2.2 substantiates these claims.
Remark 3. Given a point \(x \in \mathbb{R}^n\) and under Assumptions 1 and 2, Proposition 1 shows that we must select a radius \(r \in\; ]0,\overline{r}(x)]\) to ensure that all \(d \in \texttt{attack}(x,r)\) are ascent directions. Finding such a radius is not difficult in general. Indeed, \(\overline{r}(x)\) is strictly positive (even though it may be difficult to compute since it depends on the local Lipschitz constant of \(\Phi\) at \(x\)), so a simple workaround is to initialize \(r \triangleq 1\) and halve it until every \(d \in \texttt{attack}(x,r)\) is an ascent direction.
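The backtracking scheme of Remark 3 can be sketched as follows; `attack` and `is_ascent` are hypothetical callables standing in for the \(\texttt{attack}\) operator and the ascent test (checking \(f(\Phi(x+d)) > f(\Phi(x))\) and \(c(\Phi(x+d)) \leq 0\)).

```python
def find_ascent_radius(x, attack, is_ascent, r0=1.0, max_halvings=30):
    """Halve r from r0 until every direction returned by attack(x, r)
    is an ascent direction, as suggested in Remark 3.

    attack(x, r) returns a (possibly empty) list of directions d, and
    is_ascent(x, d) checks that x + d improves the objective while
    remaining feasible; both are placeholders for the paper's operators.
    """
    r = r0
    for _ in range(max_halvings):
        ds = attack(x, r)
        if ds and all(is_ascent(x, d) for d in ds):
            return r, ds
        r /= 2.0
    return None, []  # no suitable radius found within the budget
```

On a toy one-dimensional instance where the attack returns the step `r` itself and ascent means improving \(x \mapsto -x^2\) from \(x = -1\), radii larger than \(2\) fail and the scheme halves down to an accepted radius.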
Remark 4. For all \(x \in \mathbb{R}^n\) and all \(r \in [0,\overline{r}(x)]\), Proposition 1 does not ensure that \(\texttt{attack}(x,r) \neq \emptyset\). It is possible to prove that if either \(\mathcal{B}_{\overline{r}(x)}^n(x) \cap F = \emptyset\) or \(x\) satisfies a necessary optimality condition for Problem 1 (given by \(\left<\Phi_x(d),\nabla f(\Phi(x))\right> \leq 0\) for all \(d \in \mathcal{B}_{\overline{r}(x)}^n\) such that \(x+d \in F\)), then \(\texttt{attack}(x, r)\) is empty for all \(r \in [0,\overline{r}(x)]\). However, the converse does not hold. It is theoretically possible that \(\mathcal{A}_\mathcal{L}^+(x,r,\nabla f(\Phi(x))) = \emptyset\) for all \(r \in [0,\overline{r}(x)]\) at a point \(x \in \mathbb{R}^n\) that does not satisfy this necessary optimality condition, in which case the attack operator returns an empty set of attacks at \(x\).
Remark 5. The \(\texttt{attack}\) operator computes directional NN
attacks at all \(x \in \mathbb{R}^n\) in the gradient ascent direction \(\nabla f(\Phi(x))\) only. Yet, other directions may be considered depending on the regularity of \(f\), such as the Hessian ascent direction \([\nabla^2 f(\Phi(x))]^{-1} \nabla f(\Phi(x))\) when \(f\) is twice differentiable. When \(f\) is nonsmooth, subgradient ascent directions, simplex gradient ascent directions [42], [43], or even simplex Hessian directions [44], may be considered. For all \(y \in
\mathbb{R}^m\) and all \(r \in \mathbb{R}_+^*\), the (canonical forward) simplex gradient of \(f\) of radius \(r\) at \(y\) is the vector \(\nabla_r f(y) \triangleq r^{-1}[f(y+re_i)-f(y)]_{i=1}^{m}\), and the (canonical forward) simplex Hessian of \(f\) of radius \(r\) at \(y\) is the matrix \(\nabla_r^2 f(y) \triangleq r^{-1}[\nabla_r f(y+re_j) - \nabla_r f(y)]_{j=1}^{m}\).
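The forward simplex gradient above is straightforward to compute; a minimal sketch (the simplex Hessian is obtained analogously by applying the same difference scheme to the simplex gradient):

```python
import numpy as np

def simplex_gradient(f, y, r):
    """Canonical forward simplex gradient of f at y with radius r:
    the i-th entry is (f(y + r*e_i) - f(y)) / r."""
    y = np.asarray(y, dtype=float)
    fy = f(y)
    g = np.empty(y.size)
    for i in range(y.size):
        e = np.zeros(y.size)
        e[i] = 1.0
        g[i] = (f(y + r * e) - fy) / r
    return g
```

For a smooth \(f\), small radii recover the true gradient up to \(\mathcal{O}(r)\); for example, on \(f(y) = y_1^2 + 3y_2\) at \(y = (1,2)\) the simplex gradient approaches \((2, 3)\).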
Remark 6. The attack
operator may fall short of its best potential when \(f\) has an active subspace [45], [46] of dimension \(a \ll m\). We say that \(f\) admits such a
subspace if there exist a mapping \(g: \mathbb{R}^a \to \mathbb{R}\) and a projection matrix \(A \triangleq\begin{bmatrix} I_a ~ 0\end{bmatrix} \in \mathbb{R}^{a \times m}\) such that \(f(y) = g(Ay)\) for all \(y \in \mathbb{R}^m\). In this case, for all \(x \in \mathbb{R}^n\), the gradient takes the form \(\nabla f(\Phi(x)) = A^\top \nabla g(A\Phi(x))\), whose last \(m-a\) components vanish, so a directional NN attack from Definition 1 seeks a direction \(d \in \mathbb{R}^n\) that steers \(\Phi_x(d)\) along \(A^\top \nabla g(A\Phi(x))\). However, the last \(m-a\) components of \(\Phi_x(d)\) are irrelevant, since they do not affect \(f\). If \(a\) is known, the attack operator can be adapted as in Remark 2; otherwise, we may learn \(A\) on the fly [47] and adapt accordingly. Active subspaces arise naturally, for example, when \(\Phi\) has the form \(\Phi(x) = [\Psi(x), x]\) for all \(x \in \mathbb{R}^n\), with \(\Psi: \mathbb{R}^n \to \mathbb{R}^k\) another NN, which allows modelling constraints on \(x\) via the function \(c\). In such cases, the objective of the problem may depend only on \(\Psi(x)\), so the goal function \(f: \mathbb{R}^{k+n} \to \mathbb{R}\) has an active subspace of dimension \(a = k\).
attack Operators

Proposition 1 describes an attack operator that returns ascent directions for Problem 1 . However, its setting (encapsulated in Assumption 2) may not hold in practice, where we construct an attack operator by selecting a solver \(\mathrm{S}\) and, for all \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\), defining \(\texttt{attack}(x,r)\) as the output of \(\mathrm{S}\) solving Problem 3 at \(u \triangleq\nabla f(\Phi(x))\). This section discusses the gap between Assumption 2 and practical settings, and introduces a heuristic yet efficient attack operator built from existing solvers. This heuristic operator is stated in Definition 3, and its performance is assessed in Section 5.
Assumption 2 primarily fails in practice because its setting is incompatible with the solvers listed in Table 1, for four reasons. First, Assumption 2 requires the attack operator to return a successful attack whenever one exists. To the best of our knowledge, only \((\alpha,\beta)\)-CROWN [34] offers such guarantees, and only under the requirement that \(\Phi\) is a white-box. Second, most solvers use the loss function \(\mathcal{L}_{\mathrm{CE}}\), which does not satisfy the WSLP, and moreover, they do not easily allow for a change of the loss function. Among the two losses \(\mathcal{L}_{\mathrm{CE}}\) and \(\mathcal{L}_{\mathrm{SE}}\), only \(\mathcal{L}_{\mathrm{SE}}\) satisfies the WSLP, and only \((\alpha,\beta)\)-CROWN relies on \(\mathcal{L}_{\mathrm{SE}}\). Third, no solver handles Problem 3 exactly, as none supports input constraints. Fourth, some solvers assume input domains restricted to \([0,1]^n\) rather than \(\mathbb{R}^n\). For the last two drawbacks, we introduce a workaround that makes existing solvers compatible with our framework.
First, we reformulate Problem 1 as the unconstrained problem \[\label{problem:P95practical} \underset{\begingroup\scriptstyle\renewcommand{\arraystretch}{1.0}\begin{array}{c}x \in \mathbb{R}^n\end{array}\endgroup}{\mathop{\mathrm{maximize}}} \quad \widetilde{f}(\widetilde{\Phi}(x)),\tag{6}\] where \[\widetilde{\Phi}:\left\{\begin{array}{ccl} \mathbb{R}^n & \to & \mathbb{R}^{m+p}\\ x & \mapsto & \begin{bmatrix} \Phi(x) \\ \texttt{ReLU}(c(\Phi(x))) \end{bmatrix} \end{array}\right. \qquad \text{and} \qquad \widetilde{f}:\left\{\begin{array}{ccl} \mathbb{R}^{m+p} & \to & \mathbb{R}\\ \begin{bmatrix} y \\ z \end{bmatrix} & \mapsto & f(y)-\left\lVert{z}\right\rVert_2^2. \end{array}\right.\] Second, for all \(r \in \mathbb{R}_+\) we define the affine map \(d_r: \delta \mapsto 2r(\delta - \mathbb{1}/2)\) that maps \([0,1]^n\) onto the ball \(\mathcal{B}^n_r\) in the \(\lVert{\cdot}\rVert_\infty\) norm, and for all \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\) we define the scaled differential network \[\widetilde{\Phi}_{(x,r)}:\left\{\begin{array}{ccl} [0,1]^n & \to & \mathbb{R}^{m+p}\\ \delta & \mapsto & \widetilde{\Phi}_x(d_r(\delta)), \end{array}\right.\] so that for any \((x, r, u) \in \mathbb{R}^n \times \mathbb{R}_+ \times \mathbb{R}^{m+p}\), we can compute a directional attack on \(\widetilde{\Phi}\) at \(x\) of radius \(r\) in direction \(u\) by solving a targeted attack on \(\widetilde{\Phi}_{(x,r)}\) at \(\mathbb{1}/2\) with radius \(1/2\) and target \(u\). That is, we solve the problem \[\label{problem:directional95attack95practical} \underset{\begingroup\scriptstyle\renewcommand{\arraystretch}{1.0}\begin{array}{c}\delta \in \mathcal{B}_{1/2}^n\end{array}\endgroup}{\mathop{\mathrm{minimize}}} \quad \mathcal{L}\left(\widetilde{\Phi}_{(x,r)}(\mathbb{1}/2+\delta) , u\right),\tag{7}\] which is compatible with most practical solvers.
By construction, any \(\delta\) has the same classification (optimal, successful, or unsuccessful) for this problem as \(d_r(\mathbb{1}/2+\delta)\) for the original directional attack of \(\widetilde{\Phi}\) at \(x\) with radius \(r\). We now express the practical \(\texttt{attack}\) operator accordingly, in Definition 3.
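The relaxation \((\widetilde{\Phi}, \widetilde{f})\) of Problem 6 can be sketched directly; `Phi`, `f`, and `c` are placeholder callables standing in for the paper's functions, and `m` is the output dimension of \(\Phi\).

```python
import numpy as np

def make_relaxation(Phi, f, c, m):
    """Build the pair (Phi_tilde, f_tilde) of Problem 6: Phi_tilde
    stacks Phi(x) with ReLU(c(Phi(x))), and f_tilde subtracts the
    squared 2-norm of the constraint violation from f."""
    def Phi_tilde(x):
        y = Phi(x)
        return np.concatenate([y, np.maximum(c(y), 0.0)])  # ReLU on c

    def f_tilde(yz):
        y, z = yz[:m], yz[m:]
        return f(y) - float(z @ z)  # penalize violation ||z||^2

    return Phi_tilde, f_tilde
```

At a feasible point the penalty block is zero and \(\widetilde{f}(\widetilde{\Phi}(x)) = f(\Phi(x))\); at an infeasible point the violation is subtracted off.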
Definition 3 (practical attack operator). Let \(\mathrm{S}\) be a solver for targeted NN attacks. The \(\texttt{attack}\) operator induced by \(\mathrm{S}\) is the set-valued function \(\texttt{attack}_\mathrm{S}: \mathbb{R}^n \times \mathbb{R}_+ \to 2^{\mathbb{R}^n}\) that, for all \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\), maps \((x,r)\) to the set of outputs of \(\mathrm{S}\) solving Problem 7 with \(u \triangleq\nabla \widetilde{f}(\widetilde{\Phi}(x))\), using the loss \(\mathcal{L}\) fixed by \(\mathrm{S}\).
We emphasize that the current gap between theoretical and practical attack
operators may narrow as solver implementations advance. The workaround above is heuristic, since it consists of an inexact relaxation of Problem 1 and since the \(\texttt{attack}_\mathrm{S}\) operator may return attack directions that are infeasible for the original problem. In our numerical experiments in Section 5, we consider the \(\texttt{attack}_\mathrm{S}\) operator defined using the Torchattacks library [31] as the solver \(\mathrm{S}\) (specifically, its implementations of the fgsm
[48] and pgd
[49] algorithms under the \(\lVert{\cdot}\rVert_\infty\) norm and the \(\mathcal{L}_{\mathrm{CE}}\) loss function).
NN Attacks and DFO

This section addresses the second goal of the paper, of designing a hybrid optimization algorithm that combines directional NN
attacks (as in Section 3) with techniques from
derivative-free optimization (DFO
) to tackle Problem 1 regardless of the structure of \(\Phi\). The field of DFO
[9], [16] studies optimization methods that assume little accessible structure on the problem. The books [9], [16] and the survey [29] give broad coverage of all classes of DFO
techniques. In this work, we focus on the direct search methods (dsm
) [9] because the covering dsm (cdsm) [28] variant possesses the
strongest convergence properties among DFO
methods.
Section 4.1 reviews the cdsm
[28] and its asymptotic convergence
guarantees towards a local solution. Section 4.2 then presents our hybrid algorithm (Algorithm 3) along with its convergence result (Theorem 1) inherited from those of the cdsm
.
Before going through this section, let us stress that our main motivation to design our hybrid method from DFO
methods is to address our lack of assumptions about the structure of \(\Phi\). The core idea of
our methodology is to hybridize NN
attacks with an optimization algorithm that is well-suited for the problem at hand. Then, in contexts where \(\Phi\) is a white-box NN
, it may be preferable to
consider another algorithm that exploits the explicit neural architecture of \(\Phi\), instead of the cdsm
or any DFO
algorithms that are oblivious to this structure. We leave this discussion about
Algorithm 3 to Section 6. However, we also stress that the independence of DFO methods from the structure of \(\Phi\) also provides numerical reliability, although at the cost of speed. Algorithm 3 may therefore serve as a baseline method to assess the performance of more sophisticated algorithms that exploit the structure of \(\Phi\).
cdsm Algorithm

All dsm
algorithms proceed iteratively, with each iteration comprising two steps. The first one is named the search
step. It is optional, but it allows for many user-defined strategies (e.g., globalization techniques) with few
restrictions. The second one is named the poll
step. This step is mandatory and has a more stringent definition, as it underpins all theoretical guarantees. Historically, the dsm
class has been split into two subclasses: mesh
dsm
and sufficient increase dsm
, each defining some specific restrictions on the search
and poll
steps. We refer to [9] for mesh dsm
, and to [16] for sufficient increase dsm
. Both subclasses of
dsm
ensure that some limit points they generate satisfy necessary optimality conditions for Problem 1 . Nevertheless, a recent advance on dsm unifies these two subclasses. The cdsm (covering dsm) [28] introduces a third step named the covering
step, which also has a rigid
definition but strengthens the convergence properties of the dsm. This step, when properly defined and under mild assumptions, ensures convergence to local solutions regardless of the implementations of the search
and poll
steps. To our knowledge, at the time of writing the cdsm
is the only DFO
algorithm with this general convergence guarantee (although [28] highlights that the covering
step may be added into most DFO
algorithms and provide them the same convergence properties). We now overview the search
and poll
steps as
allowed in the cdsm
, and then we introduce the covering
step.
The search
step offers a light framework for evaluating trial points chosen by the user. Its purpose in practice is to allow for globalization strategies, such as the variable neighbourhood search [50], [51]. At each iteration \(k \in \mathbb{N}\), the search
step usually relies on the current incumbent solution \(x^k\), the current poll
radius \(r^k\) (to be specified in the next paragraph), and the trial points
history \(\mathcal{H}^k\) that consists of the set of all points evaluated by the algorithm up to iteration \(k\); and the set \(\texttt{search}(x^k,r^k,\mathcal{H}^k) \subseteq \mathbb{R}^n\) is only required to be finite. Accordingly, we formalize a search
operator as follows.
Definition 4 (search
operator). A set-valued function \(\texttt{search}: \mathbb{R}^n \times \mathbb{R}_+ \times 2^{\mathbb{R}^n} \to 2^{\mathbb{R}^n}\) is a search
operator if \(\texttt{search}(x,r,\mathcal{H}) \subseteq \mathbb{R}^n\) is empty or finite for all \((x,r,\mathcal{H}) \in \mathbb{R}^n \times \mathbb{R}_+ \times
2^{\mathbb{R}^n}\).
The poll
step has a more restrictive definition. It consists, at all iterations \(k \in \mathbb{N}\), in a local search around \(x^k\) of some radius \(r^k > 0\). Most of the literature defines \(\texttt{poll}(x^k,r^k,\mathcal{H}^k)\) as a positive spanning set with all elements having a norm of at most \(r^k\), that is, \(\mathrm{PSpan}(\texttt{poll}(x^k,r^k,\mathcal{H}^k)) = \mathbb{R}^n\) with \(\texttt{poll}(x^k,r^k,\mathcal{H}^k) \subseteq
\mathcal{B}^n_{r^k}\). We formalize the poll
operator accordingly. A popular strategy [52] samples a unit vector \(v^k\), builds the Householder matrix \(M^k \triangleq I-2(v^k)(v^k)^\top\) (which is orthogonal), and sets \(\texttt{poll}(x^k,r^k,\mathcal{H}^k) \triangleq\{\pm
r^kM^ke_i\}_{i=1}^{n}\). In addition, popular approaches such as [53], [54] use the
set \(\mathcal{H}^k\) to construct a local quadratic model of \(f \circ \Phi\) near \(x^k\) and add the gradient of that model at \(x^k\) to the set of trial directions.
Definition 5 (poll
operator). A set-valued function \(\texttt{poll}: \mathbb{R}^n \times \mathbb{R}_+ \times 2^{\mathbb{R}^n} \to 2^{\mathbb{R}^n}\) is a poll
operator if \(\mathrm{PSpan}(\texttt{poll}(x,r,\mathcal{H})) = \mathbb{R}^n\) and \(\texttt{poll}(x,r,\mathcal{H}) \subseteq \mathcal{B}^n_r\) hold for all \((x,r,\mathcal{H}) \in \mathbb{R}^n \times \mathbb{R}_+ \times 2^{\mathbb{R}^n}\).
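The Householder-based poll set recalled before Definition 5 can be sketched as follows (a minimal sketch; the paper's implementation may differ in details such as the norm used for the ball).

```python
import numpy as np

def householder_poll(n, r, rng):
    """Poll set {±r * M e_i} with M = I - 2 v v^T for a random unit v.
    M is orthogonal (a Householder reflection), so the 2n returned
    directions positively span R^n and each has Euclidean norm r."""
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    M = np.eye(n) - 2.0 * np.outer(v, v)  # Householder reflection
    cols = [r * M[:, i] for i in range(n)]
    return [d for col in cols for d in (col, -col)]
```

Since the directions form a scaled orthogonal basis together with its negatives, any nonzero vector has a positive inner product with at least one of them.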
Finally, the covering
step [28] may be added to the dsm
on top of the search
and
poll
steps. This step consists, at all iterations \(k \in \mathbb{N}\), in evaluating points in the ball \(\mathcal{B}^n_1(x^k)\) that are far enough from the trial points
history \(\mathcal{H}^k\). The fixed radius (here set to \(1\) for simplicity) ensures that all accumulation points of \((x^k)_{k \in \mathbb{N}}\) are local
solutions. Below, we consider a concise definition of the covering
operator for simplicity, although we note that [28] formalizes alternatives that are easier to compute.
Definition 6 (covering operator). The covering operator is the set-valued function \(\texttt{covering}: \mathbb{R}^n \times 2^{\mathbb{R}^n} \to 2^{\mathbb{R}^n}\) defined by \(\texttt{covering}(\cdot,\emptyset) \equiv \{0\}\) and by \(\texttt{covering}(x,\mathcal{H}) \triangleq\mathop{\mathrm{argmax}}_{d \in \mathcal{B}^n_1} \mathrm{dist}(x+d,\mathcal{H})\) for all \((x,\mathcal{H}) \in \mathbb{R}^n \times 2^{\mathbb{R}^n}\) with \(\mathcal{H}\neq \emptyset\).
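The exact argmax in Definition 6 is itself a hard subproblem; a Monte-Carlo approximation (a sketch in the spirit of the cheaper variants that [28] formalizes, not the paper's implementation) samples candidates in the unit ball and keeps the one farthest from the history:

```python
import numpy as np

def approx_covering(x, history, rng, n_candidates=256):
    """Approximate covering(x, H): sample candidate directions
    uniformly in the unit ball and return the one whose trial point
    x + d maximizes the distance to the history H."""
    if len(history) == 0:
        return np.zeros_like(x)
    H = np.asarray(history)
    best_d, best_dist = None, -1.0
    for _ in range(n_candidates):
        d = rng.standard_normal(x.size)
        # rescale to a uniform draw in the unit ball
        d *= rng.random() ** (1.0 / x.size) / np.linalg.norm(d)
        dist = float(np.min(np.linalg.norm(H - (x + d), axis=1)))
        if dist > best_dist:
            best_d, best_dist = d, dist
    return best_d
```

The experiments in Section 5 use an even simpler variant that evaluates a single uniformly random point in \(\mathcal{B}_1(x^k)\).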
The cdsm
algorithm is summarized in Algorithm 2 below. As we now formalize in Proposition 2, its convergence properties established in [28] hold for Problem 1 .
Proposition 2 (Adapted from [28]). Let \((x^k)_{k \in \mathbb{N}}\) be the sequence of incumbent solutions generated by Algorithm 2 solving Problem 1 under Assumption 1. Then \((x^k)_{k \in \mathbb{N}}\) admits at least one accumulation point, and all of them are local solutions to Problem 1 .
NN Attacks and cdsm

The cdsm
described in Algorithm 2 is purely based on DFO
techniques, and thus it ignores the structure of Problem 1 . Consequently, the cdsm
remains applicable
to hard instances, but it usually converges slowly in practice. In contrast, our technique from Section 3.2.2, relying on directional NN
attacks, offers a
heuristic yet potentially efficient intensification strategy. Then, each of these two approaches offsets the other’s limitations. Motivated by this synergy, we propose a hybrid algorithm that combines directional NN
attacks with the
steps of the cdsm
. At each iteration \(k \in \mathbb{N}\), our algorithm first attempts a directional NN
attack via the attack
operator from Section 3.2. The outcome of this attack determines how the algorithm proceeds: (i) if the attack yields a sufficient increase (formalized in Definition 7 below), the iteration is accepted and the cdsm
steps are skipped, (ii) if it yields a simple increase, the
iteration continues with the cdsm
steps applied from the improved point, (iii) if the attack fails, the cdsm
steps proceed from the current point. This logic is formalized in Algorithm 3 given
below.
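The branching logic of the hybrid scheme can be sketched as one iteration with placeholder callables (`attack`, `cdsm_steps`, `phi_f` for \(f \circ \Phi\), and `rho` for the sufficient increase test); this is a sketch of the control flow, not the paper's implementation.

```python
import numpy as np

def hybrid_iteration(x, r, H, attack, cdsm_steps, phi_f, rho):
    """One iteration of the hybrid scheme: try an attack step first,
    (i) accept and skip the cdsm steps on sufficient increase,
    (ii) continue the cdsm steps from the improved point on simple
    increase, (iii) run the cdsm steps from x if the attack fails."""
    best = x
    for d in attack(x, r):
        t = x + d
        if rho(t, x):                 # (i) sufficient increase
            return t
        if phi_f(t) > phi_f(best):    # (ii) simple increase
            best = t
    return cdsm_steps(best, r, H)     # (ii)/(iii) cdsm steps
```

A successful attack that clears the sufficient increase threshold returns immediately; otherwise the cdsm machinery takes over from the best point seen.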
Definition 7 (Sufficient increase). Given a couple \((\tau,\varepsilon) \in \mathbb{R}_+^* \times \mathbb{R}_+^*\), the sufficient increase function is the function \[\rho:\left\{\begin{array}{ccl} \mathbb{R}^n \times \mathbb{R}^n & \to & \{0,1\}\\ (x_1, x_2) & \mapsto & 1 ~\text{if}~ \dfrac{f(\Phi(x_1))-f(\Phi(x_2))}{\left\lvert{f(\Phi(x_2))}\right\rvert+\varepsilon} \geq \tau, ~0~ \text{otherwise.} \end{array}\right.\] For all \((x_1, x_2) \in \mathbb{R}^n \times \mathbb{R}^n\), we say that \(x_1\) yields a sufficient increase over \(x_2\) if \(\rho(x_1, x_2) = 1\).
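The test of Definition 7 is a one-liner on the two objective values; the defaults below are the values \((\tau,\varepsilon) = (10^{-3}, 10^{-10})\) used in the experiments of Section 5.

```python
def sufficient_increase(fp1, fp2, tau=1e-3, eps=1e-10):
    """Sufficient increase test of Definition 7 on the objective values
    fp1 = f(Phi(x1)) and fp2 = f(Phi(x2)): returns 1 iff the relative
    improvement of x1 over x2 is at least tau."""
    return 1 if (fp1 - fp2) / (abs(fp2) + eps) >= tau else 0
```

For instance, a 10% relative improvement passes the default threshold, while a 0.05% improvement does not.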
This algorithm inherits the convergence analysis of the cdsm
, as formalized in the next Theorem 1.
Theorem 1. Let \((x^k)_{k \in \mathbb{N}}\) be the sequence of incumbent solutions generated by Algorithm 3 solving Problem 1 under Assumption 1. Then \((x^k)_{k \in \mathbb{N}}\) admits at least one accumulation point, and all such points are local solutions to Problem 1 .
Proof. This proof relies on [28], which states the following. Suppose that Assumption 1 holds, and define \(x^0 \in F\) and \(\mathcal{H}^0 \subseteq \mathbb{R}^n\). For all \(k \in \mathbb{N}\), set \(\mathcal{T}_{\tt{C}}^k \triangleq\{x^k+d,~ d \in \texttt{covering}(x^k,\mathcal{H}^k)\}\), select \(\mathcal{H}^{k+1}\) such that \(\mathcal{H}^k \cup \mathcal{T}_{\tt{C}}^k \subseteq \mathcal{H}^{k+1}\), and select \(x^{k+1} \in \mathop{\mathrm{argmax}}f(\Phi(\mathcal{H}^{k+1} \cap F))\). Then \((x^k)_{k \in \mathbb{N}}\) admits at least one accumulation point, and all of them are local solutions to Problem 1 .
Let \((x^{k+1}, \mathcal{H}^{k+1})_{k \in \mathbb{N}}\) be the sequence generated by Algorithm 3 under Assumption 1, and denote by \(K \subseteq \mathbb{N}\) the set of all iterations at which the covering step is executed. We show that \(K\) contains all sufficiently large integers. Indeed, \((x^{k+1}, \mathcal{H}^{k+1})_{k \in K}\) then satisfies the construction of [28] above, and \((x^k)_{k \in \mathbb{N}}\) and \((x^{k+1})_{k \in K}\) share the same set of accumulation points, so the result follows. By
Assumption 1, \(f(\Phi(F))\) is bounded above since \(F\) is compact and \(f \circ \Phi\) is continuous. Then, we get that \(\lim_{k \in \mathbb{N}} f(\Phi(x^k)) \leq \sup f(\Phi(F)) < +\infty\) since, by construction, \((f(\Phi(x^k)))_{k
\in \mathbb{N}}\) is increasing and \(x^k \in F\) for all \(k \in \mathbb{N}\). Moreover, by construction, \(\mathbb{N}\setminus K\) contains exactly
the iterations skipping the covering
step, so \(\mathbb{N}\setminus K = \{k \in \mathbb{N}: \rho(t_\texttt{atk}^k, x^k) = 1\) and \(x^{k+1} = t_\texttt{atk}^k\}\). Then,
each \(k \in \mathbb{N}\setminus K\) yields \(\rho(x^{k+1},x^k) = 1\), and then \(f(\Phi(x^{k+1})) \geq f(\Phi(x^k)) + \tau(\left\lvert{f(\Phi(x^k))}\right\rvert+\varepsilon) \geq f(\Phi(x^k)) + \tau\varepsilon\). We deduce that \(\mathbb{N}\setminus K\) contains at most \((\tau\varepsilon)^{-1}(\sup f(\Phi(F))-f(\Phi(x^0)))\) elements, which is finite, so \(K\) contains all sufficiently large integers, as desired. ◻
In this section, we evaluate the numerical performance and behaviour of Algorithm 3 on three problems with diverse structures. On each problem, we conduct the next three analyses.
General performance comparison. We evaluate the performance of Algorithm 3 against three baselines (detailed in Section 5.1): two from the
DFO
literature, and one consisting solely of repeated calls to the attack
operator. For each algorithm, we track the sequence of all evaluated points and we record the best objective value found as a function of the evaluation
budget.
Contribution of each step. We analyze the respective roles of the attack
and cdsm
steps within Algorithm 3. At each iteration, we record which step contributes to improving
the current incumbent solution. Precisely, we track whether the attack
step leads to a sufficient increase skipping the cdsm
steps, a simple increase, or a failure. In the latter two cases, we moreover track which
cdsm
step succeeds, if any. For comparison, we also conduct the same analysis for the baseline methods.
Alternative attack
operators. We evaluate the effectiveness of the two \(\texttt{attack}_\mathrm{S}\) operators proposed in Section 3.2.2, based on the Torchattacks [31] versions of
the fgsm
[48] and pgd
[49] algorithms respectively. To this end, we extract a sample \((x^{k(j)})_{j \in \mathbb{J}}\) (with \(\mathbb{J}\subset \mathbb{N}\)) of incumbent
solutions from the sequence \((x^k)_{k \in \mathbb{N}}\) generated by Algorithm 3. This sample spans a range of objective values for Problem 1 , enabling
evaluation across different optimization stages. For each \(j \in \mathbb{J}\) and several radii \(r \in \mathbb{R}_+\), we test whether \(\texttt{attack}_\mathrm{S}(x^{k(j)}, r)\) yields an ascent direction. We also track the runtime and the associated values \(f(\Phi(x^{k(j)}))\) to relate ascent potential with point quality.
The full numerical setup is presented in Section 5.1. Section 5.2 explores a proof-of-concept task to align the prediction returned by
the ResNet18
network with a fixed target. Section 5.3 addresses a counterfactual explanation task involving a NN
that generates Warcraft maps, as proposed in [27]. Finally, Section 5.4 tackles a chemical engineering optimization problem of maximizing
the production of a biodiesel, using a Physics-Informed NN
(PINN
) from [55] which models a
chemical reaction based on reaction time and the power of a heat flux.
Overall, our experiments show that Algorithm 3 outperforms the three baselines. It also appears that the attack
step contributes to the performance of Algorithm 3 in
its early iterations (when the incumbent solution remains far from the solution), since it often leads to a sufficient increase that allows skipping the cdsm
steps. However, the performance of the attack
step decreases in later
iterations, so the cdsm
steps are performed more often at the end of the optimization process. Finally, the two variants of the \(\texttt{attack}_\mathrm{S}\) operator have similar potential. Both are efficient
at low-quality points, and both exhibit a drop in performance as solutions converge. However, the fgsm algorithm is noticeably faster than the pgd
algorithm. This suggests implementing the \(\texttt{attack}_\mathrm{S}\) operator via fgsm
rather than pgd
, since fgsm
is faster than pgd
while its lower accuracy has little impact on the performance.
As discussed in Section 3.2.2, our experiments solve the relaxed Problem 6 rather than Problem 1 .
This relaxation enables the use of existing solvers for NN
attacks. The methods compared in our experiments are described below. They all adopt the relaxed problem, to ensure a fair comparison.
NN attacks only (\(\mathbb{M}_{\texttt{atk}}\)). This method uses the \(\texttt{attack}_\mathrm{S}\) operator from Definition 3,
instantiated with the Torchattacks [31] implementation of the fgsm
algorithm [48] under the \(\lVert{\cdot}\rVert_\infty\) norm and the \(\mathcal{L}_{\mathrm{CE}}\) loss function. We
conducted preliminary experiments with a variant relying on the pgd
algorithm [49] instead, but we observed that this stronger attack
yields similar results, so we favour fgsm
since it is faster. The algorithm reads as follows, given \(x^0 \in \mathbb{R}^n\) and \(r^0 \in \mathbb{R}_+^*\) and some fine-tuned
expansion and shrinking radius parameters: \[\forall k \in \mathbb{N}, \quad \left\{\begin{array}{lll} \mathcal{T}_{\tt{A}}^k & \triangleq& \{x^k+d,~ d \in \texttt{attack}_\mathrm{S}(x^k, r^k) \cup
\texttt{attack}_\mathrm{S}(x^k, \frac{11}{10}r^k)\}, \\ t_{\tt{A}}^k & \triangleq& \mathop{\mathrm{argmax}}\widetilde{f}(\widetilde{\Phi}(\mathcal{T}_{\tt{A}}^k)), \\ x^{k+1} & \triangleq&
\mathop{\mathrm{argmax}}\{\widetilde{f}(\widetilde{\Phi}(t_{\tt{A}}^k)), \widetilde{f}(\widetilde{\Phi}(x^k))\}, \\ r^{k+1} & \triangleq& \frac{11}{10}r^k ~\text{if}~ t_{\tt{A}}^k \neq x^k,~ \frac{2}{3}r^k ~\text{otherwise}.
\end{array}\right.\]
Random line search (\(\mathbb{M}_{\texttt{rls}}\)). This baseline follows the random line search (rls) method [56], known for its consistent empirical performance in
high-dimensional DFO
settings despite its theoretical guarantees weaker than those of cdsm
. We implement the method as follows, given \((x^0,r^0) \in \mathbb{R}^n \times \mathbb{R}_+^*\) and some
fine-tuned numerical values: \[\forall k \in \mathbb{N}, \quad \left\{\begin{array}{lll} d_{\tt{L}}^k & \triangleq& \text{random (uniform) draw on the unit sphere of}~ \mathbb{R}^n, \\ \mathcal{T}_{\tt{L}}^k &
\triangleq& \{x^k+rd_{\tt{L}}^k,~ r \in \{\frac{13}{10}r^k, r^k, \frac{10}{13}r^k\} \}, \\ t_{\tt{L}}^k & \triangleq& \mathop{\mathrm{argmax}}\widetilde{f}(\widetilde{\Phi}(\mathcal{T}_{\tt{L}}^k)), \\ x^{k+1} & \triangleq&
\mathop{\mathrm{argmax}}\{\widetilde{f}(\widetilde{\Phi}(t_{\tt{L}}^k)), \widetilde{f}(\widetilde{\Phi}(x^k))\}, \\ r^{k+1} & \triangleq& \left\lVert{t_{\tt{L}}^k-x^k}\right\rVert ~\text{if}~ t_{\tt{L}}^k \neq x^k,~ \frac{2}{3}r^k
~\text{otherwise}. \end{array}\right.\]
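One iteration of this specification can be sketched as follows, assuming a numpy objective `obj` standing in for \(\widetilde{f} \circ \widetilde{\Phi}\).

```python
import numpy as np

def rls_step(x, r, obj, rng):
    """One random-line-search iteration as specified above: draw a
    uniform unit direction, test the step lengths 13r/10, r, and
    10r/13, move to the best trial if it improves on x, and set the
    radius to the accepted step length (or shrink by 2/3 on failure)."""
    d = rng.standard_normal(x.size)
    d /= np.linalg.norm(d)  # uniform direction on the unit sphere
    trials = [x + s * d for s in (1.3 * r, r, r / 1.3)]
    t = max(trials, key=obj)
    if obj(t) > obj(x):
        return t, np.linalg.norm(t - x)
    return x, (2.0 / 3.0) * r
```

Whether a given draw improves the incumbent depends on the direction, but the update never decreases the objective and keeps the radius positive.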
cdsm baseline (\(\mathbb{M}_{\texttt{cdsm}}\)). This method runs Algorithm 2 on Problem 6 . For each \(k \in \mathbb{N}\), the covering step is simplified so that it evaluates a random point chosen uniformly in the ball \(\mathcal{B}_1(x^k)\), the poll
step follows the scheme from [53], and
the search
step evaluates one point in a random (uniform) direction with length \(r^0\sqrt{s^k}\), where \(s^k\) is the number of search
steps executed up to
iteration \(k\).
Hybrid method (\(\mathbb{M}_{\texttt{hyb}}\)). This method runs Algorithm 3 on Problem 6 . We define the attack
step via the attack
operator from Definition 3 and the cdsm
steps as in the \(\mathbb{M}_{\texttt{cdsm}}\) method. The
sufficient increase function for the attack
step is defined via \((\tau,\varepsilon) \triangleq(10^{-3},10^{-10})\).
Each algorithm terminates when all radii become smaller than \(10^{-5}\). We implement our methods, available at this GitHub repository, in Python \(3.12.2\) and run them on a single-thread Intel Xeon Gold 6258R CPU clocked at \(2.70\) GHz. The entire experimental process (each experiment run sequentially, with each method run sequentially within each experiment) takes seven hours for Section 5.2, two hours for Section 5.3, and ten minutes for Section 5.4. However, it is possible to parallelize most of these computations, as discussed in the repository.
This task builds upon the ResNet18
classifier discussed in Section 3.1.1 and on the barycentric image function \(\texttt{bary}: \mathbb{I}^n \times \mathbb{R}^n \to \mathbb{I}\) defined by \(\texttt{bary}(I;w) \triangleq\sum_{\ell=1}^{n} w_\ell I_\ell\) for all sets \(I
\triangleq(I_\ell)_{\ell=1}^{n} \in \mathbb{I}^n\) of \(n\) images and all vectors \(w \in \mathbb{R}^n\) of weights. We fix \(n \triangleq 100\) and
we select \(I \in \mathbb{I}^n\) such that the input images are mutually dissimilar. Then, we define \[\Phi:\left\{\begin{array}{ccl} \mathbb{R}^n & \to & \mathbb{P}^{1000}\\ x &
\mapsto & \texttt{ResNet18}(\texttt{bary}(I;\texttt{softmax}(x))) \end{array}\right. \quad \text{and} \quad f:\left\{\begin{array}{ccl} \mathbb{P}^{1000} & \to & \mathbb{R}\\ y & \mapsto & -\left\lVert{y-\texttt{ResNet18}(I_1)}\right\rVert
\end{array}\right.\] and \(c: x \mapsto \begin{bmatrix} (x_i-10)(x_i+10) \end{bmatrix}_{i=1}^{n}\), defining the feasible set \(F \triangleq[-10, 10]^n\). The solution is the
vector \(x^* \in \mathbb{R}^n\) with \(x^*_1 \triangleq 10\) and \(x^*_i \triangleq-10\) for all \(i \neq 1\), and we
initialize \(x^0 \triangleq 0\). Although synthetic, this setup enables an assessment of algorithmic behaviour in a simple and controlled setting.
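The barycentric composition at the core of this task can be sketched with numpy (the `ResNet18` classifier is omitted; images are placeholder arrays):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax, mapping R^n to the simplex."""
    z = np.exp(x - np.max(x))
    return z / z.sum()

def bary(I, w):
    """Barycentric image: the weighted sum sum_l w_l * I_l of the n
    stacked images I (shape (n, ...)) with weight vector w."""
    return np.tensordot(w, I, axes=1)
```

At the solution \(x^*\), the softmax places almost all mass on the first coordinate, so \(\texttt{bary}(I;\texttt{softmax}(x^*)) \approx I_1\) and the objective approaches its maximum of \(0\).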
Let us begin by analyzing the results of the general performance comparison on this problem. Figure 4 shows that among the three baselines, only the \(\mathbb{M}_{\texttt{cdsm}}\) method converges to a
near-optimal solution. The \(\mathbb{M}_{\texttt{rls}}\) method evaluates numerous points with little progress, as confirmed by Figure 5 showing that few of its \(1000\) iterations are successful. This is likely due to a narrow cone of ascent directions near the initial point, a known difficulty in DFO
. The \(\mathbb{M}_{\texttt{atk}}\)
method also underperforms, halting early on a suboptimal point with most iterations providing only marginal improvements. In contrast, the \(\mathbb{M}_{\texttt{cdsm}}\) method succeeds in closely approximating the optimum,
albeit with numerous trial points. The \(\mathbb{M}_{\texttt{hyb}}\) method outperforms all the baselines, as it requires significantly fewer points to evaluate to reach the solution.
We now turn to the contribution of each step, depicted in Figure 5. The \(\mathbb{M}_{\texttt{cdsm}}\) method mostly progresses with the poll
step, while the covering
and search
steps have a limited contribution to the process. This domination of the poll
step among the cdsm
components carries over to the \(\mathbb{M}_{\texttt{hyb}}\) method.
However, most of the performance of the \(\mathbb{M}_{\texttt{hyb}}\) method results from the attack
step, which succeeds roughly half of the time and often yields a sufficient increase skipping the
cdsm
steps. This highlights the strength of combining attack techniques with a local search. We believe that the poll
step assists the attack
step by repositioning the point at which the attack is performed in case of
failure. For example, the neighbourhood of \(x^0\) appears challenging for the attack
step (as shown by the stagnation of the \(\mathbb{M}_{\texttt{atk}}\) method in Figure 4), but the
poll
step identifies new incumbent solutions at which the attack
step performs better.
Finally, the analysis of alternative attack operators confirms the reliability of the attack
operator in identifying ascent directions throughout the optimization process. Figure 6 shows that fgsm
and pgd
attacks are similarly effective, so we favour fgsm
since it is six times faster on average.
Our second problem, adapted from [27], focuses on optimal counterfactual explanations for structured prediction models involving an NN that provides the parameters of a combinatorial optimization layer. We adapt the flagship application discussed in [27], which
searches for \(\varepsilon\)-relative counterfactual Warcraft maps with respect to the lightest path problem, in the sense defined in Section 1.
The video game Warcraft is a tactical board game where some areas of the maps (represented as images in the space \(\mathbb{W}\triangleq[0,1]^{3 \times 96 \times 96}\) endowed with some norm \(\lVert{\cdot}\rVert_\mathbb{W}\)) are lighter to cross than others. To reflect this, [27] designs a
NN
\(\texttt{costmap}: \mathbb{W}\to \mathbb{C}\triangleq\mathbb{R}_+^{12 \times 12}\) that splits any map \(\mathcal{W}\in \mathbb{W}\) into \(12
\times 12\) areas of \(8 \times 8\) pixels and estimates the cost for a character to enter all areas. We endow the space \(\mathbb{C}\) with the sum-of-entries norm \(\lVert{\cdot}\rVert_\mathbb{C}\) and the entry-wise inner product \(\odot\). Thus, for any Warcraft map \(\mathcal{W}\in \mathbb{W}\) and any path \(\mathrm{p}\in \{0,1\}^{12 \times 12}\) crossing the map, the cost of \(\mathrm{p}\) along \(\mathcal{W}\) reads as \[\texttt{costpath}(\mathcal{W};\mathrm{p}) \triangleq\left\lVert{\texttt{costmap}(\mathcal{W}) \odot \mathrm{p}}\right\rVert_\mathbb{C}.\] We consider a Warcraft map \(\mathcal{W}_\mathrm{ini}\in
\mathbb{W}\) and its cost map \(\mathcal{C}_\mathrm{ini}\triangleq\texttt{costmap}(\mathcal{W}_\mathrm{ini}) \in \mathbb{C}\), and we compute the lightest path \(\mathrm{p}^*_\mathrm{ini}\in
\{0,1\}^{12 \times 12}\) joining the Northwestern corner of \(\mathcal{W}_\mathrm{ini}\) to the Southeastern one. Then we pick an alternative path \(\mathrm{p}^\sharp \neq
\mathrm{p}^*_\mathrm{ini}\) joining the same corners. Our goal is to produce an alternative map that remains close to \(\mathcal{W}_\mathrm{ini}\) and favours this alternative path over the original one. More
specifically, we seek an \(\varepsilon\)-relative counterfactual explanation of \(\mathcal{W}_\mathrm{ini}\) with respect to the function \(\texttt{costpath}\) and the path \(\mathrm{p}^\sharp\), where we fix \(\varepsilon \triangleq 1\). We therefore search for a map \(\mathcal{W}_\mathrm{cfa}\in \mathbb{W}\) that solves the problem \[\underset{\begingroup\scriptstyle\renewcommand{\arraystretch}{1.0}\begin{array}{c}\mathcal{W}\in
\mathbb{W}\end{array}\endgroup}{\mathop{\mathrm{minimize}}} \quad \left\lVert{\mathcal{W}-\mathcal{W}_\mathrm{ini}}\right\rVert_\mathbb{W}^2 \quad \text{subject to} \quad (1+\varepsilon)\texttt{costpath}(\mathcal{W};\mathrm{p}^\sharp) \leq
\texttt{costpath}(\mathcal{W};\mathrm{p}^*_\mathrm{ini}).\] In other words, the goal is to find an alternative map \(\mathcal{W}_\mathrm{cfa}\in \mathbb{W}\) as close as possible to \(\mathcal{W}_\mathrm{ini}\), but on which the path \(\mathrm{p}^\sharp\) is at least twice as light as the path \(\mathrm{p}^*_\mathrm{ini}\). We illustrate this
problem in Figure 7.
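To illustrate the combinatorial layer, here is a small self-contained sketch (not the authors' code) that computes \(\texttt{costpath}\) as the sum-of-entries norm of \(\texttt{costmap}(\mathcal{W}) \odot \mathrm{p}\), and recovers the lightest corner-to-corner path by Dijkstra's algorithm on a toy \(3 \times 3\) cost map; the 4-neighbour connectivity is an assumption made for this sketch.

```python
import heapq
import numpy as np

def costpath(cost_map, path_mask):
    """Cost of a path: sum of the entries of the cost map selected by
    the binary path mask (the sum-of-entries norm of the product)."""
    return float(np.sum(cost_map * path_mask))

def lightest_path(cost_map):
    """Dijkstra on the grid: cheapest 4-connected path from the
    north-west corner to the south-east corner, paying the cost of
    every entered cell. Returns the path as a binary mask."""
    n, m = cost_map.shape
    dist = np.full((n, m), np.inf)
    prev = {}
    dist[0, 0] = cost_map[0, 0]
    heap = [(dist[0, 0], (0, 0))]
    while heap:
        d, (i, j) = heapq.heappop(heap)
        if d > dist[i, j]:
            continue
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            a, b = i + di, j + dj
            if 0 <= a < n and 0 <= b < m and d + cost_map[a, b] < dist[a, b]:
                dist[a, b] = d + cost_map[a, b]
                prev[(a, b)] = (i, j)
                heapq.heappush(heap, (dist[a, b], (a, b)))
    mask = np.zeros((n, m))
    node = (n - 1, m - 1)
    while True:
        mask[node] = 1.0
        if node == (0, 0):
            break
        node = prev[node]
    return mask

cmap = np.array([[1.0, 9.0, 9.0],
                 [1.0, 1.0, 9.0],
                 [9.0, 1.0, 1.0]])
p_star = lightest_path(cmap)
print(costpath(cmap, p_star))  # total cost of the lightest path
```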
However, the above problem has no constraints guaranteeing that the image encoded by \(\mathcal{W}_\mathrm{cfa}\) is visually similar to \(\mathcal{W}_\mathrm{ini}\), or more generally to
a real map from the game. To enforce this requirement, we rely on another NN from [27], which we denote by \(\texttt{warcraft}: \mathbb{X}\to \mathbb{W}\) (where \(\mathbb{X}\triangleq\mathbb{R}^n\) with \(n \triangleq 64\)). This NN
is designed to generate
credible Warcraft maps from abstract input vectors, where credible means that the images generated by \(\texttt{warcraft}\) may not represent maps that truly exist in-game, but look like real ones. By design, all \(x \in \mathbb{R}^n\) with \(\lvert{\lVert{x}\rVert_\mathbb{X}-\sqrt{n}}\rvert \leq 1\) yield a credible \(\texttt{warcraft}(x)\). We assume that we know
the vector \(x_\mathrm{ini}\in \mathbb{R}^n\) that generates \(\mathcal{W}_\mathrm{ini}\) as \(\mathcal{W}_\mathrm{ini}\triangleq\texttt{warcraft}(x_\mathrm{ini})\). By design, any \(x \approx x_\mathrm{ini}\) generates \(\mathcal{W}\triangleq\texttt{warcraft}(x)
\approx \mathcal{W}_\mathrm{ini}\), so \(\lVert{x-x_\mathrm{ini}}\rVert_\mathbb{X}\) acts as a surrogate of \(\lVert{\mathcal{W}-\mathcal{W}_\mathrm{ini}}\rVert_\mathbb{W}\).
Moreover, to enforce even further proximity between maps, we consider \(\lVert{\texttt{costmap}(\mathcal{W})-\mathcal{C}_\mathrm{ini}}\rVert_\mathbb{C}\) as another surrogate of \(\lVert{\mathcal{W}-\mathcal{W}_\mathrm{ini}}\rVert_\mathbb{W}\). Then, instead of searching for \(\mathcal{W}_\mathrm{cfa}\in \mathbb{W}\) directly, we seek a counterfactual \(x_\mathrm{cfa}\in \mathbb{X}\) and then obtain \(\mathcal{W}_\mathrm{cfa}\triangleq\texttt{warcraft}(x_\mathrm{cfa})\). We also use the surrogate norms instead of the true norm. We therefore
solve \[\begin{array}{cl} \underset{\begingroup\scriptstyle\renewcommand{\arraystretch}{1.0}\begin{array}{c}x \in \mathbb{X}\end{array}\endgroup}{\mathop{\mathrm{minimize}}} &
\begin{array}{l}\left\lVert{x-x_\mathrm{ini}}\right\rVert_\mathbb{X}^2 + \left\lVert{\texttt{costmap}(\texttt{warcraft}(x))-\mathcal{C}_\mathrm{ini}}\right\rVert_\mathbb{C}^2\end{array} \\ \\ \text{subject to} &
\begin{array}[t]{ll}\left\lVert{x}\right\rVert_\mathbb{X}\in \left[\sqrt{n}-1 , \sqrt{n}+1\right], \\ (1+\varepsilon)\texttt{costpath}(\texttt{warcraft}(x);\mathrm{p}^\sharp) \leq \texttt{costpath}(\texttt{warcraft}(x);\mathrm{p}^*_\mathrm{ini}).
\end{array} \end{array}\] This reformulation is easier to solve than the original problem, since \(\mathbb{X}\) is a usual vector space instead of a space of images, and its dimension \(n = 64\) is much smaller than that of \(\mathbb{W}\) (\(3 \times 96 \times 96 = 27648\)). Moreover, for any \(x \in \mathbb{X}\), we do not need to record \(\texttt{warcraft}(x)\): both the objective and the constraint involving the \(\texttt{costpath}\) function can be expressed in terms of \(\texttt{costmap}(\texttt{warcraft}(x))\) directly.
We fit this problem in the framework of Problem 1 as follows. First, the NN
\(\Phi\) is \[\Phi:\left\{\begin{array}{ccl} \mathbb{X} & \to &
\mathbb{Y}\triangleq\mathbb{X}\times \mathbb{C}\\ x & \mapsto & \left(x, \texttt{costmap}\left(\texttt{warcraft}\left(x\right)\right) \right) \end{array}\right.\] and the goal function \(f\) is given by \[f:\left\{\begin{array}{ccl} \mathbb{Y} & \to & \mathbb{R}\\ y \triangleq(x,c) & \mapsto & -\left( \left\lVert{x-x_\mathrm{ini}}\right\rVert_\mathbb{X}^2 +
\left\lVert{c-\mathcal{C}_\mathrm{ini}}\right\rVert_\mathbb{C}^2\right). \end{array}\right.\]
Next, we define the constraints function by \[c:\left\{\begin{array}{ccl} \mathbb{Y} & \to & \mathbb{R}^2\\ y \triangleq(x,c) & \mapsto & \begin{bmatrix} \left(\left\lVert{x}\right\rVert_\mathbb{X}-\sqrt{n}-1\right)\left(\left\lVert{x}\right\rVert_\mathbb{X}-\sqrt{n}+1\right) \\[0.25em] (1+\varepsilon)\left\lVert{c \odot \mathrm{p}^\sharp}\right\rVert_\mathbb{C}- \left\lVert{c \odot \mathrm{p}^*_\mathrm{ini}}\right\rVert_\mathbb{C} \end{bmatrix}, \end{array}\right.\] so that all \(x \in \mathbb{X}\) with \(c(\Phi(x)) \leq 0\) yield a credible and valid counterfactual explanation \(\texttt{warcraft}(x)\), since they satisfy \(\lvert{\lVert{x}\rVert_\mathbb{X}-\sqrt{n}}\rvert \leq 1\) and \((1+\varepsilon)\texttt{costpath}(\texttt{warcraft}(x),\mathrm{p}^\sharp) \leq \texttt{costpath}(\texttt{warcraft}(x),\mathrm{p}^*_\mathrm{ini})\). We initialize all methods with \(x^0 \triangleq x_\mathrm{ini}\), and the counterfactual maps calculated by each method are shown in Figure 8. The returned maps are all visually close. This suggests that the late-stage refinements in this problem consist of fine-tuning. For example, all generated maps exhibit a gorge through the mountain to lighten the path \(\mathrm{p}^\sharp\), though the precise placement of this gorge varies across solutions.
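The definitions above can be sketched as follows; the latent vector \(x_\mathrm{ini}\), cost map \(\mathcal{C}_\mathrm{ini}\), and the two paths below are random placeholders for illustration, not data from [27].

```python
import numpy as np

n, eps = 64, 1.0
rng = np.random.default_rng(1)

# hypothetical placeholders for the fixed problem data
x_ini = rng.normal(size=n)
C_ini = rng.uniform(1.0, 5.0, size=(12, 12))       # costmap(W_ini)
p_sharp = np.zeros((12, 12)); p_sharp[0, :] = 1.0  # alternative path
p_star = np.zeros((12, 12)); p_star[:, 0] = 1.0    # original lightest path

def f(y):
    """Goal function (maximized): negated sum of the squared distances
    to the initial latent vector and the initial cost map."""
    x, c = y
    return -(np.sum((x - x_ini)**2) + np.sum((c - C_ini)**2))

def c_fun(y):
    """Constraint vector with c_fun(y) <= 0: the credibility band
    | ||x|| - sqrt(n) | <= 1 (written in product form) and the
    counterfactual condition (1+eps) costpath(p_sharp) <= costpath(p_star)."""
    x, c = y
    nx = np.linalg.norm(x)
    band = (nx - np.sqrt(n) - 1.0) * (nx - np.sqrt(n) + 1.0)
    cfa = (1.0 + eps) * np.sum(c * p_sharp) - np.sum(c * p_star)
    return np.array([band, cfa])

y0 = (x_ini, C_ini)
print(f(y0))     # zero at the initial point
print(c_fun(y0))
```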
The results displayed in Figure 9 show that our hybrid method \(\mathbb{M}_{\texttt{hyb}}\) delivers the best overall performance. The DFO
methods \(\mathbb{M}_{\texttt{cdsm}}\) and \(\mathbb{M}_{\texttt{rls}}\) are slow, but this is expected since most DFO
methods are not tailored to settings beyond a few dozen variables [9] unless dedicated advanced techniques are used. The \(\mathbb{M}_{\texttt{atk}}\) method quickly approaches a high-quality
solution but fails to refine it further. Its progress curve in Figure 9 (barely visible in the top-left corner of the plot) quickly plateaus. Our hybrid method \(\mathbb{M}_{\texttt{hyb}}\) acts as a trade-off between all these methods, as it is only slightly slower than \(\mathbb{M}_{\texttt{atk}}\) but converges to a solution with quality similar to
that of \(\mathbb{M}_{\texttt{cdsm}}\) and \(\mathbb{M}_{\texttt{rls}}\).
The behaviour illustrated in Figure 10 shows that the \(\mathbb{M}_{\texttt{atk}}\) method stops early, with about half of its iterations ending in a successful
attack
. Since the map returned by this method is visually close to those produced by the other methods, this confirms the effectiveness of the attack
operator for early-stage progress, but also its limitations for late-stage
refinement. The \(\mathbb{M}_{\texttt{hyb}}\) method exhibits a similar pattern. The attack
step succeeds in roughly half of the first \(350\) iterations (oftentimes yielding
sufficient increases in the first \(100\) iterations). Beyond this point, however, the success rate drops sharply, and subsequent improvements are almost entirely driven by the cdsm steps.
Figure 11 corroborates these observations. It shows that the attack operator reliably identifies ascent directions when the current solution is far from optimal, while its effectiveness
drops as the objective approaches optimality. The fgsm
and pgd
variants achieve comparable success rates across the sample, yet fgsm
is substantially faster, confirming it as our preferred implementation.
This problem is constructed from the chemical engineering study of bio-diesel production from [55]. The chemical reaction they study involves five chemical species: the desired methyl ester (\(\mathrm{ME}\)) species, and four intermediate reactants: triglycerides (\(\mathrm{TG}\)), diglycerides (\(\mathrm{DG}\)), monoglycerides (\(\mathrm{MG}\)), and glycerol (\(\mathrm{G}\)). For each species, we denote by \([\cdot](t,Q)\) its concentration (in \(\mathrm{mol}.\mathrm{L}^{-1}\)) in the reactor tank after a reaction of duration \(t\) (in seconds) under constant heat input with power \(Q\) (in Watts). We also denote by \(T(t,Q)\) the reactor temperature (in Celsius degrees) under these conditions.
Our goal is to determine the pair \((t,Q) \in \mathbb{R}_+^2\) that maximizes the production of \(\mathrm{ME}\), measured by the average product-to-reactant conversion ratio over the reaction horizon, which quantifies the efficiency of converting the reactants into biodiesel. To ensure physical plausibility, we impose two constraints. First, we prevent evaporation of \(\mathrm{ME}\) (with boiling point at \(65^\circ\mathrm{C}\)) by requiring \(T(s,Q) \leq 65\) for all \(s \in [0,t]\). Second, we enforce a global energy budget, requiring that the total energy input \(Qt\) (in Joules) does not exceed \(500\). The resulting optimization problem reads as \[\begin{array}{cl} \underset{\begingroup\scriptstyle\renewcommand{\arraystretch}{1.0}\begin{array}{c}(t,Q) \in \mathbb{R}_+^2\end{array}\endgroup}{\mathop{\mathrm{maximize}}} & \begin{array}{l}\dfrac{1}{t}\displaystyle\int_{0}^{t} \dfrac{[\mathrm{ME}](s,Q)}{[\mathrm{TG}](s,Q) + [\mathrm{DG}](s,Q) + [\mathrm{MG}](s,Q) + [\mathrm{G}](s,Q)} ds\end{array} \\ \\ \text{subject to} & \begin{array}[t]{ll}T(s,Q) \leq 65,~ \forall s \in [0,t], \\ Qt \leq 500.\end{array} \end{array}\]
We solve this problem by a simulation-based optimization approach. To simulate the chemical process, we rely on the PINN
trained in [55], available at this GitHub repository. As discussed in Section 1, this setting corresponds to a trend in
simulation-based optimization where the chosen numerical model is a trained NN
instead of a conventional numerical simulator. The PINN
predicts, for any given reaction time \(t\) and heating power \(Q\), the concentrations of the five species involved in the reaction and the reactor temperature at the end of the process. Formally, the model reads as the function \[\texttt{PINN}:\left\{\begin{array}{ccl} \mathbb{R}_+^2 & \to & \mathbb{R}^6\\ (t,Q) & \mapsto & \begin{bmatrix} [\mathrm{TG}](t,Q) ~~ [\mathrm{DG}](t,Q) ~~ [\mathrm{MG}](t,Q) ~~ [\mathrm{G}](t,Q) ~~ [\mathrm{ME}](t,Q)
~~ T(t,Q) \end{bmatrix}. \end{array}\right.\] Since the PINN
only provides predictions at the queried time \(t\), we estimate intermediate values over \([0,t]\) by
discretizing time into \(N\) steps (fixed at \(N=100\)) and evaluating \(\texttt{PINN}(\tfrac{it}{N},Q)\) for all \(i \in
\llbracket 0,N\rrbracket\). We thus define \[\Psi:\left\{\begin{array}{ccl} \mathbb{R}_+^2 & \to & \mathbb{R}^{6(N+1)}\\ (t,Q) & \mapsto & \begin{bmatrix} \texttt{PINN}\!\left(\tfrac{it}{N}, Q\right)
\end{bmatrix}_{i=0}^{N}. \end{array}\right.\] To ensure physically meaningful predictions, additional constraints are imposed. First, the input domain is bounded by \(0 \leq t \leq 120\) and \(0 \leq Q \leq 12\), following [55], to remain near the training regime of the model and to reflect practical operating limits. Second, all concentrations are required to be nonnegative. While this holds physically, the constraint prevents occasional violations due to modelling imperfections in the PINN.
Then, the problem is expressed as an instance of Problem 1 by defining the input space \(\mathbb{X}\triangleq\mathbb{R}_+^2\), the predictions space \(\mathbb{O}\triangleq\mathbb{R}^{6(N+1)}\), and the neural mapping \[\Phi:\left\{\begin{array}{ccl} \mathbb{X} & \to & \mathbb{O}\times \mathbb{X}\\ x & \mapsto & (\Psi(x), x).
\end{array}\right.\] Then, considering that elements \(z \in \mathbb{O}\) have their components indexed starting from \(0\), we define the objective function \(f\) as \[f:\left\{\begin{array}{ccl} \mathbb{O}\times \mathbb{X} & \to & \mathbb{R}\\ (z,x) & \mapsto & \dfrac{1}{N+1} \sum\limits_{i=0}^{N} \dfrac{ z_{6i+4} }{ z_{6i} + z_{6i+1} +
z_{6i+2} + z_{6i+3} }, \end{array}\right.\] so that by design, for all \(x \triangleq(t,Q) \in \mathbb{X}\), \[f(\Phi(x)) = \dfrac{1}{N+1} \sum\limits_{i=0}^{N} \dfrac{
[\mathrm{ME}]\left(\frac{it}{N},Q\right) }{ [\mathrm{TG}]\left(\frac{it}{N},Q\right) + [\mathrm{DG}]\left(\frac{it}{N},Q\right) + [\mathrm{MG}]\left(\frac{it}{N},Q\right) + [\mathrm{G}]\left(\frac{it}{N},Q\right) }\] is a discretization of the
product-to-reactant conversion ratio in the objective of the initial problem. We define our constraint function by \[c:\left\{\begin{array}{ccl} \mathbb{O}\times \mathbb{X} & \to & \mathbb{R}^{6(N+1)} \times
\mathbb{R}^5\\ (z, x) & \mapsto & [c_\mathrm{outputs}(z),~ c_\mathrm{inputs}(x)], \end{array}\right.\] where the constraints function \(c_\mathrm{outputs}\) and \(c_\mathrm{inputs}\) enclose respectively output-based and input-based components. These two functions are given by \[c_\mathrm{outputs}:\left\{\begin{array}{ccl} \mathbb{O} & \to &
\mathbb{R}^{6(N+1)}\\ z & \mapsto & \begin{bmatrix} -z_{6i} \\ -z_{6i+1} \\ -z_{6i+2} \\ -z_{6i+3} \\ -z_{6i+4} \\ z_{6i+5}-65 \end{bmatrix}_{i=0}^{N} \end{array}\right. \quad \text{and} \quad c_\mathrm{inputs}:\left\{\begin{array}{ccl} \mathbb{X}
& \to & \mathbb{R}^5\\ x & \mapsto & \begin{bmatrix} -t \\ t-120 \\ -Q \\ Q-12 \\ Qt-500 \end{bmatrix}, \end{array}\right.\] so that for all \(x \triangleq(t,Q) \in \mathbb{X}\) and denoting by \(z \triangleq\Psi(x)\), the inequality \(c(\Phi(x)) \leq 0\) is equivalent to \(c_\mathrm{outputs}(z) \leq 0\) (enforcing nonnegativity of all concentrations and
admissible reactor temperatures) and \(c_\mathrm{inputs}(x) \leq 0\) (ensuring bounds and the energy budget). Note that this problem naturally falls within the framework of active subspaces (Remark 6), since for all \(y \triangleq(z,x) \in \mathbb{O}\times \mathbb{X}\), the value of \(f(y)\) is independent of \(x\) and of the temperature components \((z_{6i+5})_{i=0}^{N}\). Accordingly, we adapt the attack operator to account for this reduced subspace.
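A compact sketch of this instance follows; the pinn function is a smooth hypothetical stand-in for the trained PINN of [55], so the returned values are illustrative only.

```python
import numpy as np

N = 100

def pinn(t, Q):
    """Hypothetical stand-in for the trained PINN of [55]: returns the
    five concentrations [TG, DG, MG, G, ME] and the temperature T at
    time t under heating power Q (smooth, physically flavoured dummy)."""
    k = 0.02 * (1.0 + 0.1 * Q)
    tg = np.exp(-k * t)
    me = 1.0 - tg
    dg = mg = g = 0.05 * tg + 0.01
    temp = 20.0 + 0.3 * Q * (1.0 - np.exp(-t / 30.0))
    return np.array([tg, dg, mg, g, me, temp])

def Psi(t, Q):
    """Time discretization: stack PINN predictions at i*t/N, i=0..N."""
    return np.concatenate([pinn(i * t / N, Q) for i in range(N + 1)])

def f(z):
    """Average product-to-reactant conversion ratio over the horizon."""
    ratios = [z[6*i + 4] / (z[6*i] + z[6*i+1] + z[6*i+2] + z[6*i+3])
              for i in range(N + 1)]
    return sum(ratios) / (N + 1)

def c_outputs(z):
    """Nonnegative concentrations and temperature at most 65 C."""
    rows = [np.array([-z[6*i], -z[6*i+1], -z[6*i+2], -z[6*i+3],
                      -z[6*i+4], z[6*i+5] - 65.0]) for i in range(N + 1)]
    return np.concatenate(rows)

def c_inputs(t, Q):
    """Bounds 0 <= t <= 120, 0 <= Q <= 12 and the budget Q*t <= 500."""
    return np.array([-t, t - 120.0, -Q, Q - 12.0, Q * t - 500.0])

t, Q = 60.0, 8.0
z = Psi(t, Q)
feasible = np.all(c_outputs(z) <= 0) and np.all(c_inputs(t, Q) <= 0)
print(f(z), feasible)
```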
The results shown in Figure 12 highlight that all methods succeed on this problem. Both DFO
baselines converge, with the \(\mathbb{M}_{\texttt{cdsm}}\) method
being more efficient than the \(\mathbb{M}_{\texttt{rls}}\) method. Interestingly, the \(\mathbb{M}_{\texttt{atk}}\) method also converges, suggesting that the attack
operator
is particularly effective in this setting, despite this method being heuristic and prone to early failure in general. Finally, the \(\mathbb{M}_{\texttt{hyb}}\) method is nearly as inexpensive as \(\mathbb{M}_{\texttt{atk}}\), yet ultimately attains the best objective value among all methods. Taken together, these results strongly suggest that the attack
step plays a central role in the good performance
of \(\mathbb{M}_{\texttt{hyb}}\) on this problem.
Figure 13 confirms this observation. The \(\mathbb{M}_{\texttt{atk}}\) method begins with several successes from the attack
step,
which yield fast early progress. Similarly, the early iterations of \(\mathbb{M}_{\texttt{hyb}}\) are dominated by sufficient increases from the attack
successes bypassing the cdsm
step. As the
optimization advances, however, the contribution of the attack
step diminishes and the cdsm
component of \(\mathbb{M}_{\texttt{hyb}}\) takes over, driving the late-stage fine-tuning observed in
Figure 12.
Figure 14 further illustrates the potential of the attack operator in this context: the operator reliably finds dominating directions throughout most of the optimization
process, up until the final convergence phase. Both fgsm
and pgd
algorithms perform well overall, though pgd
appears slightly more robust in later stages, identifying ascent directions even near the optimum. However,
these directions require very small radii and come at a higher computational cost, which reinforces the practicality of fgsm
as the preferred default implementation.
To conclude this paper, we discuss some strengths and limitations of our hybrid method in Section 6.1, and we outline promising avenues for future research in Section 6.2.
The theoretical analysis of Algorithm 3, given by Theorem 1, is limited to an asymptotic convergence.
It is difficult to conduct a non-asymptotic analysis valid for all instances of Problem 1 enclosed by the broad framework resulting from Assumption 1. In particular, such analyses are scarce in the DFO
literature. Nevertheless, a strength of Theorem 1 is that Algorithm 3 is guaranteed to identify a local solution to Problem 1, even in hard instances. Another positive trait of the covering
step is that Algorithm 3 scans a dense set of points in a neighbourhood around that solution, which eventually gives a precise understanding of the landscape of
Problem 1 near that solution.
Our hybrid method enjoys a twofold flexibility. The core components of Algorithm 3 (the attack
operator and the cdsm
routine) are largely independent. Each may be chosen depending on the problem at hand. This flexibility allows our hybrid method to be adapted to a wide range of contexts, but a downside is the burden of choosing the most suitable components. Let us discuss some guidelines on how to define each component of our hybrid
method depending on the context.
First, the attack
operator may be defined from several algorithms for NN
attacks. An ideal choice depends on whether \(\Phi\) is given as a white-box NN
. Indeed, if \(\Phi\) is a white-box NN
, we recommend using an algorithm for NN
attacks that exploits this explicit structure, such as those in the \((\alpha,\beta)\)-CROWN
solver. If, instead, \(\Phi\) is a nonwhite-box NN
, then we suggest using the fgsm
algorithm since it is fast and our numerical
experiments hint that more accurate algorithms such as pgd yield a negligible difference in the performance of the attack
operator. However, this observation should not hide the fact that the attack
operator sometimes
lacks reliability in late stages of the optimization process.
Second, in Algorithm 3, we choose to hybridize a generic attack
operator with the cdsm
from DFO
. Other DFO
algorithms could be substituted for cdsm
,
but unfortunately, to the best of our knowledge, no consensus currently exists in the DFO literature regarding the most appropriate choice for a given problem. The cdsm
is suited for cases where either \(\Phi\) is a nonwhite-box NN
or \(f\) or \(c\) are generic nonlinear functions, so that little structure of the problem can be exploited. In such contexts, where
it is usual to consider DFO
methods, Algorithm 3 outperforms the two state-of-the-art DFO
baselines we considered. However, in other contexts, we suggest hybridizing the attack step with other algorithms instead. A DFO method is likely not the most efficient approach when \(\Phi\) is a white-box NN and both \(f\) and \(c\) are simple enough. For example, when methods from the literature (see Section 2) on optimization through trained NN
are applicable (that is,
when \(\Phi\) is a ReLU
-based white-box NN
and \(f\) is linear and \(c\) yields a polyhedral feasible set), they presumably should
be preferred. Similarly, when \(f\) is nonlinear but simple enough (for example, quadratic) and \(c\) yields a polyhedral feasible set, then it is presumably more efficient to adapt these
methods than to consider a DFO
method.
Numerous avenues for research stem from our work, on both the theoretical and practical sides.
First, we may strengthen the attack
component itself, both in the choice of ascent directions for \(f\) and in the way attacks are computed. Beyond gradients, Remark 5 suggests using alternative ascent proxies such as simplex or finite-difference gradients when \(f\) lacks smoothness. Adapting our analysis to these settings is straightforward, and the literature on simplex gradients and simplex Hessians [42]–[44] offers principled constructions that
could yield more reliable attack directions. On the numerical side, our experiments indicate that speed often matters more than ultimate attack accuracy. This makes fgsm
an appealing default option, although this choice is likely
problem-dependent. In particular, fgsm
is likely not sufficient in cases where the structure of \(\Phi\) makes it difficult to compute successful attacks. More broadly, we may also develop nonwhite-box attack
solvers that natively handle input constraints and possess guarantees of success whenever an attack exists. Such constrained attacks would close the gap with Assumption 2 and remove the workaround from Section 3.2.2, and would also be broadly useful beyond our
context.
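As a sketch of such an ascent proxy (our own illustration, using a simple quadratic test function rather than a problem from the paper), a forward-difference gradient estimate can stand in for \(\nabla f\) in the attack step:

```python
import numpy as np

def fd_gradient(f, y, h=1e-6):
    """Forward-difference estimate of grad f at y; usable as an ascent
    proxy in the attack step when an exact gradient is unavailable."""
    g = np.empty_like(y)
    fy = f(y)
    for i in range(y.size):
        e = np.zeros_like(y)
        e[i] = h
        g[i] = (f(y + e) - fy) / h
    return g

f = lambda y: -np.sum((y - np.array([1.0, 2.0]))**2)
g = fd_gradient(f, np.zeros(2))
print(g)  # close to the exact gradient (2, 4)
```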
Second, we plan to broaden the scope of problems addressed, with a particular emphasis on regularity and constraints. The cdsm
is designed to cope with possible discontinuities [28], and the assumptions underlying its convergence are tight. This makes the extension of Algorithm 3 to discontinuous \(f\), \(c\), or \(\Phi\) natural, provided the attack
follows the ascent-proxy adaptations discussed above. On the constraints side, we may replace
strict feasibility or global relaxation with mature mechanisms from DFO
such as the progressive barrier [57]. This could improve efficiency by
allowing controlled, temporary infeasibility, which is common in successful DFO
practice.
Third, we expect benefits from enhancing the DFO
engine that surrounds the attack
in our hybrid method. Within cdsm
, precision and speed can be improved by established tools, such as local quadratic models [53], [54] to inform poll
directions, Bayesian strategies [58] to steer the search
, and variable neighbourhood rules [50], [51]. These improvements are independent of the attack component and can be integrated without altering convergence guarantees, offering routes to better
late-stage refinement.
Fourth, we see opportunities to leverage problem structure more explicitly. When \(\Phi\) is a white-box NN
, hybridizing the attack
with methods tailored to optimization through trained
NN
s (see Section 2) should be preferred to structure-agnostic DFO
methods. Moreover, the composite form invites alternative formulations, e.g., introducing an auxiliary
variable \(y\) with the constraint that \(y=\Phi(x)\) and optimizing \(f(y)\) under \(c(y) \leq 0\), or relaxing this
coupling via penalties. These viewpoints connect naturally with partitioned [59] and parametric optimization [60], and with optimization on manifolds [61].
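As a sketch of this penalized reformulation (our notation; \(\mu > 0\) is a penalty parameter and \(\lVert{\cdot}\rVert_\mathbb{Y}\) an assumed norm on \(\mathbb{Y}\)), one may consider \[\underset{\begingroup\scriptstyle\renewcommand{\arraystretch}{1.0}\begin{array}{c}x \in \mathbb{X},~ y \in \mathbb{Y}\end{array}\endgroup}{\mathop{\mathrm{maximize}}} \quad f(y) - \mu \left\lVert{y - \Phi(x)}\right\rVert_\mathbb{Y}^2 \quad \text{subject to} \quad c(y) \leq 0,\] which heuristically recovers the exact coupling \(y = \Phi(x)\) as \(\mu \to \infty\).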
Combining attack-based steps with such dedicated methods may unlock further performance gains.
Finally, a broader experimental campaign across additional architectures (e.g., physics-informed models [3], [4] and digital twins [6]) would clarify when lightweight attacks suffice and when stronger attacks are warranted. Although the above fields are active, we identified few pre-trained and publicly available NN
surrogates that suit our
needs.
solar: A solar thermal power plant simulator for blackbox optimization benchmarking,” Optimization and Engineering, 2025, doi: 10.1007/s11081-024-09952-x.