October 01, 2025
Abstract. This paper tackles optimization problems whose objective and constraints involve a trained Neural Network (NN
), where the goal is to maximize \(f(\Phi(x))\) subject to \(c(\Phi(x)) \leq 0\), with \(f\) smooth, \(c\) general and non-stringent, and \(\Phi\) an already trained and possibly
nonwhite-box NN
. We address two challenges regarding this problem: identifying ascent directions for local search, and ensuring reliable convergence towards relevant local solutions. To this end, we re-purpose the notion of directional
NN
attacks as efficient optimization subroutines, since directional NN
attacks use the neural structure of \(\Phi\) to compute perturbations of \(x\) that
steer \(\Phi(x)\) in prescribed directions. Precisely, we develop an attack
operator that computes attacks of \(\Phi\) at any \(x\) along the
direction \(\nabla f(\Phi(x))\). Then, we propose a hybrid algorithm combining the attack
operator with derivative-free optimization (DFO
) techniques, designed for numerical reliability by
remaining oblivious to the structure of the problem. We consider the cdsm
algorithm, which offers asymptotic guarantees to converge to a local solution under mild assumptions on the problem. The resulting method alternates between attack-based
steps for heuristic yet fast local intensification and cdsm
steps for certified convergence and numerical reliability. Experiments on three problems show that this hybrid approach consistently outperforms standard DFO
baselines.
Deep Learning, Neural Network Attacks, Hybrid Methods, Derivative-Free Optimization, Simulation-Based Optimization, Neural Surrogate Models, Digital Twins.
We are grateful to Paul A. Patience and Bruno Blais, from the Department of Chemical Engineering at Polytechnique Montréal, for providing the Physics-Informed Neural Network used in our numerical experiment in Section 5.4. This research was supported by SCALE-AI through its Academic Research Chairs program.
Neural Networks (NN
s) are modelling tools recognized across various scientific domains. Beyond their widespread use in classification (such as in computer vision [1] and medical diagnostics [2]), NN
s are
increasingly deployed for applications such as, among many others, physics-informed learning [3], [4] and uncertainty quantification [5]. This
allows NN
s to be used as digital twins [6] of a physical system, that is, rich models of the system
allowing for real-time decision-making. This evolving use has given rise to a class of optimization problems in which a trained NN
\(\Phi: \mathbb{R}^n \to \mathbb{R}^m\) acts as a surrogate function mapping
the decision variables to a system response. The objective is to optimize the output of a task-specific goal function \(f: \mathbb{R}^m \to \mathbb{R}\) defined over the space of outputs of \(\Phi\), subject to constraints defined via a constraints function \(c: \mathbb{R}^m \to \mathbb{R}^p\). This leads to what we refer to as the composite problem of optimization through a
neural network, \[\label{problem:P} \underset{x \in \mathbb{R}^n}{\mathop{\mathrm{maximize}}} \quad f(\Phi(x)) \quad \text{subject to} \quad c(\Phi(x)) \leq 0,\tag{1}\] where \((n, m, p) \in (\mathbb{N}^*)^3\) denote the dimensions of,
respectively, the variables space, the NN
output space, and the constraints function output space. We study Problem 1 under the following Assumption 1.
Assumption 1 (Problem requirements). The NN
\(\Phi\) encodes a continuous function, the goal function \(f\) is differentiable with gradient \(\nabla f\), and the feasible set \(F \triangleq\{x \in \mathbb{R}^n: c(\Phi(x)) \leq 0\}\) induced by the constraints function \(c\) is nonempty, ample (that
is, \(F \subseteq \mathrm{cl}(\mathrm{int}(F))\)) and compact.
In this work, the NN
\(\Phi\) is considered already trained and not tunable anymore. Moreover, we do not impose that \(\Phi\) is given as a white-box
NN
. A process mapping some inputs to associated outputs is a white-box when the mathematical function it encodes is explicitly exposed, fully accessible, and analytically exploitable. In the case of a NN
, that means that
the architecture and weights of the NN
are all given. In contrast, the algorithm proposed in this paper requires no information about the NN
model besides the continuity of the function it encodes, and our numerical implementation
relies only on backpropagation [7], a tool that most frameworks for NN
s provide and that may be used with no
explicit knowledge of the structure of the NN
. Hence, in this work we say that \(\Phi\) is possibly a nonwhite-box NN
. We remark that, in the terminology of [8] and of derivative-free optimization (DFO
) [9], our
setting considers \(\Phi\) as a black-box NN
; however, we also remark that some authors, such as [10], would instead regard our requirements as grey-box access; hence our choice of terminology.
To the best of our knowledge, only a few papers tackle Problem 1 , and most do so under more restrictive assumptions. They typically assume a linear goal function \(f\), a polyhedral feasible
set \(F\), and moreover that \(\Phi\) is provided as a white-box NN
[11]–[15]. Yet, the widespread adoption of NN
s across industrial and engineering applications suggests that instances of Problem 1 involving nonwhite-box NN
s are increasingly
common, for example, in cases where \(\Phi\) is given as a compiled file since this allows it to run faster at the cost of transparency. To motivate the importance of tackling this broader setting, let us present two
contexts where it naturally appears.
A classical framework in DFO
[9], [16] considers problems of
simulation-based optimization [17]–[19], that have
the form “\(\mathop{\mathrm{maximize}}_{x \in \mathbb{R}^n} f(B(x))\) subject to \(c(B(x)) \leq 0\)” where \(f\) and \(c\)
are defined as in Problem 1 and the mapping \(B: \mathbb{R}^n \to \mathbb{R}^m\) denotes an intractable process that, typically, runs a costly numerical simulation parameterized by \(x \in \mathbb{R}^n\). This setting, popular in engineering design [20], [21], has the same composite structure as Problem 1 , differing only in that the intermediate mapping is the simulator \(B\)
rather than the NN
\(\Phi\). Yet, an increasing trend replaces simulators \(B\) by trained NN
s \(\Phi\) [22] acting as digital twins, which yields instances of Problem 1 with key advantages: evaluating \(\Phi\) is typically orders of magnitude faster than running \(B\); and even if treated as a nonwhite-box, \(\Phi\) retains a neural architecture that can be
exploited (for example by backpropagation). This motivates the development of dedicated optimization algorithms.
Counterfactual explanations [10], [23] for a NN
input consist in minimal changes to the input that induce a desired change in the output. When considering a classification task for a NN
classifier \(\Phi:
\mathbb{R}^n \to \llbracket 1,m\rrbracket\), counterfactual explanations \(x_\mathrm{cfa}\in \mathbb{R}^n\) for an input \(x_\mathrm{ini}\in \mathbb{R}^n\) and a target class \(y^\sharp \neq \Phi(x_\mathrm{ini})\) are solutions to “\(\mathop{\mathrm{minimize}}_{x \in \mathbb{R}^n} \lVert{x-x_\mathrm{ini}}\rVert^2\) subject to \(\Phi(x) =
y^\sharp\)”. The notion also extends to decision-focused learning and contextual optimization [24], [25], where the NN
\(\Phi: \mathbb{R}^n \to
\mathbb{R}^m\) maps a so-called context \(x \in \mathbb{R}^n\) to parameters for a downstream optimization task “\(\mathop{\mathrm{minimize}}_{y \in \mathcal{C}}
g(\Phi(x),y)\)”, involving some function \(g\) usually convex and some set \(\mathcal{C}\) that is typically combinatorial, for which an optimal decision rule \(y^*(x) \in \mathop{\mathrm{argmin}}_{y \in \mathcal{C}} g(\Phi(x), y)\) may be selected. In this setting, a counterfactual explanation of a context \(x_\mathrm{ini}\in \mathbb{R}^n\) consists in
a context \(x_\mathrm{cfa}\in \mathbb{R}^n\) close to \(x_\mathrm{ini}\) and such that a given target decision \(y^\sharp \in \mathcal{C}\), usually proposed
by an expert or a benchmark policy, is preferable to \(y^*(x_\mathrm{ini})\) for the problem induced by \(\Phi(x_\mathrm{cfa})\) [26], [27]. This results in the \(\varepsilon\)-relative counterfactual problem, for any \(\varepsilon \in \mathbb{R}_+\), which aligns with Problem 1 since it is given by \[\underset{x \in \mathbb{R}^n}{\mathop{\mathrm{minimize}}} \quad \left\lVert{x-x_\mathrm{ini}}\right\rVert^2 \quad \text{subject to} \quad (1+\varepsilon)g(\Phi(x), y^\sharp) \leq g(\Phi(x),y^*(x_\mathrm{ini})).\]
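As a concrete illustration, the feasibility test of the \(\varepsilon\)-relative counterfactual problem can be sketched on a toy decision task where \(g\) is linear, \(\mathcal{C}\) contains the canonical basis vectors, and \(\Phi\) is a small hand-written map; all functions and names below are illustrative assumptions, not taken from this paper.

```python
import numpy as np

# Toy downstream task: C = canonical basis decisions, g(theta, y) = <theta, y> (illustrative)
C = [np.eye(3)[i] for i in range(3)]
g = lambda theta, y: float(theta @ y)
Phi = lambda x: np.array([x[0], x[0] + x[1], x[1]])   # toy context-to-parameters map

def y_star(x):
    """Optimal decision rule y*(x) in argmin_{y in C} g(Phi(x), y)."""
    return min(C, key=lambda y: g(Phi(x), y))

def is_eps_relative_counterfactual(x_cfa, x_ini, y_sharp, eps):
    """Feasibility test of the eps-relative counterfactual constraint:
    (1 + eps) * g(Phi(x_cfa), y_sharp) <= g(Phi(x_cfa), y*(x_ini))."""
    return (1 + eps) * g(Phi(x_cfa), y_sharp) <= g(Phi(x_cfa), y_star(x_ini))
```

Solving the full problem would then minimize \(\lVert x - x_\mathrm{ini}\rVert^2\) over the contexts passing this test.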
This paper addresses the growing need to solve increasingly challenging instances of Problem 1 , particularly in settings where the NN
\(\Phi\) is not provided as a white-box. Our
main insight is that a core challenge in Problem 1, namely identifying ascent directions for \(f \circ \Phi\) despite the complex structure of \(\Phi\), can be
alleviated through directional NN
attacks. Indeed, directional NN
attacks exploit the structure of a NN
to find a perturbation of the input that steers its outputs in a desired direction. This paper
formalizes the idea of solving Problem 1 by iteratively computing directional attacks of \(\Phi\) at the incumbent solution \(x^k\) in the direction given by \(\nabla f(\Phi(x^k))\). Importantly, state-of-the-art NN
attack techniques span a broad range of access levels: some assume white-box access to the NN
, while others require only the ability to
backpropagate, making them usable when \(\Phi\) is not a white-box. Moreover, this approach may be hybridized with existing optimization algorithms, acting as an additional step at all iterations to search for ascent
directions using explicitly the neural structure of \(\Phi\). In particular, we hybridize this so-called attack
step with DFO
algorithms that aim for global search capability and numerical
reliability in problems with nonconvexity, and nondifferentiability or poor gradient information. Therefore, in this paper,
we formalize the concept of directional NN
attacks, and we show that they can be repurposed to compute ascent directions for Problem 1 . We highlight that standard NN
attack algorithms
can be used off-the-shelf for this purpose, and we describe how to embed them in a local optimization framework. While powerful when successful, this approach is inherently local and sometimes prone to failure for various technical reasons. This
contribution forms the content of Section 3;
we hybridize the above attack-based approach with the covering direct search method (cdsm
) [28], a
DFO
method that is guaranteed to asymptotically converge to a local solution to Problem 1 under Assumption 1. Our
resulting hybrid algorithm combines, at each iteration, two concepts with natural synergy: an attack
step leveraging the internal structure of \(\Phi\) for fast local improvement, and steps from
cdsm
that allow for globalization strategies and asymptotic convergence towards a local solution. This contribution is detailed in Section 4;
we conduct numerical experiments on three problems to assess our hybrid algorithm against DFO
baselines. The first problem acts as a proof of concept, while the other two are drawn from the simulation-based optimization and
counterfactual explanation contexts highlighted in our motivation. Our experiments highlight that the attack
step is not a reliable standalone method, but it nevertheless contributes significantly to the performance of our hybrid method.
Overall, our hybrid method appears faster than the DFO
baselines thanks to the attack
step, and as reliable thanks to the steps from cdsm
. Details about our experiments are given in Section 5.
We leave a literature review to Section 2 and a discussion foreseeing future work to Section 6. We conclude this section by introducing some notation used throughout the paper.
We denote by \(\left<\cdot,\cdot\right>\) the dot product in \(\mathbb{R}^m\), by \(\lVert{\cdot}\rVert\) some norm in \(\mathbb{R}^n\), by \((e_i)_{i=1}^{n}\) the vectors of the canonical basis of \(\mathbb{R}^n\), and by \(\mathbb{1} \triangleq\sum_{i=1}^{n}e_i\) the vector of all ones in \(\mathbb{R}^n\). For all sets \(\mathcal{D}\subseteq \mathbb{R}^n\), we denote the positive spanning set of \(\mathcal{D}\) by \(\mathrm{PSpan}(\mathcal{D})\). For all \(r \in \mathbb{R}_+\), all \(x \in \mathbb{R}^n\), and all \(y \in \mathbb{R}^m\), we respectively denote by \(\mathcal{B}^n_r(x)\) and \(\mathcal{B}^m_r(y)\) the closed ball of radius \(r\) in \(\mathbb{R}^n\) centred at \(x\) and the closed ball of radius \(r\) in \(\mathbb{R}^m\) centred at \(y\). We omit the parentheses for the balls centred at \(0\), that is, \(\mathcal{B}_r^n \triangleq\mathcal{B}^n_r(0)\) and \(\mathcal{B}_r^m \triangleq\mathcal{B}^m_r(0)\). We denote the rectified linear unit function by \(\texttt{ReLU}: v \in \mathbb{R}^a \mapsto [\max(v_i,0)]_{i=1}^{a} \in \mathbb{R}^a\) and the softmax function by \(\texttt{softmax}: v \in \mathbb{R}^a \mapsto [\exp(v_i) / \sum_{j=1}^{a}\exp(v_j)]_{i=1}^{a} \in [0,1]^a\), regardless of the dimension \(a \in \mathbb{N}^*\) of the input vector \(v\). For all couples \((x_1, x_2) \in \mathbb{R}^n \times \mathbb{R}^n\), we say that \(x_2\) dominates \(x_1\) (with respect to Problem 1 ) if \(c(\Phi(x_2)) \leq 0\) and \(f(\Phi(x_2)) > f(\Phi(x_1))\). Similarly, for all couples \((x,d) \in \mathbb{R}^n \times \mathbb{R}^n\), we say that \(d\) is an ascent direction emanating from \(x\) (with respect to Problem 1 ) if \(x+d\) dominates \(x\). Note that in this work, we allow for non-unitary directions.
A few prior studies tackled Problem 1 , but under more restrictive assumptions [11]–[15]. They typically assume a linear goal function \(f\), a polyhedral feasible set \(F\), and moreover that \(\Phi\) is
provided as a white-box NN
. These approaches effectively exploit the structure of \(\Phi\), but their requirements contrast with the reality of many practical applications, where \(\Phi\) may be deployed as an opaque or compiled artifact to maximize inference speed, which makes white-box assumptions and layer-wise manipulations impractical.
The general setting of Problem 1 has been less explored. It relates to DFO
because of the possible nonwhite-box nature of \(\Phi\) and the potential nonsmoothness of \(f \circ \Phi\). We are not aware of any DFO
method specifically designed for Problem 1 , but most general-purpose DFO
methods [9], [16], [29] may be applied. In this paper, we
build our hybrid method upon the covering
direct search method (cdsm
) algorithm [28]. This choice is
motivated by two aspects: the cdsm
is an extension of the popular and widely studied direct search method [9] from DFO
,
and it has guarantees of convergence to a local solution to Problem 1 under Assumption 1. Nevertheless, considering the
cdsm
is not stringent. As shown in [28], many DFO
methods may be adapted to inherit the same convergence
properties as cdsm
when enhanced with a covering step.
The idea of leveraging the neural structure of \(\Phi\) to identify ascent directions for \(f \circ \Phi\) relates to the literature on NN
attacks. The seminal work of [30] introduces the notion of NN
attacks. There are now several solvers to compute NN
attacks [31]–[34], and
we design our attack
operator so that it may leverage any of them. We also remark that a desirable property of NN
attacks is that they return successful attacks whenever one exists, which connects with the notion of
NN
verification [35]–[37], an active research area [38], [39]. Neural network verification asks whether, for a given NN
\(\Phi\), an input set \(X\) contains an element \(x\) such that \(\Phi(x)\) lies within a specified output set \(Y\). NN
verification is typically a computationally demanding problem, and most parts
of the \((\alpha,\beta)\)-CROWN
solver [36] address precisely this verification
task, albeit primarily in binary classification contexts.
In short, our work significantly departs from the existing literature on optimization through NN
s by addressing Problem 1 without assuming that \(\Phi\) is a white-box
NN
. It also departs from the existing literature on DFO
by proposing a hybrid method that explicitly leverages the neural structure of \(\Phi\).
NN Attacks

This section addresses the first objective of the paper: establishing directional NN
attacks (formalized in Section 3.1) as a practical tool for solving Problem 1 . Specifically, in Section 3.2 we demonstrate that such attacks, when appropriately constructed, can be used to identify ascent directions (even when \(\Phi\) is a nonwhite-box NN
model).
NN Attacks

First, we contextualize the notion of NN
attack and illustrate its purpose in Section 3.1.1. Then we formalize the notion of directional NN
attacks in Section 3.1.2. Finally, we discuss practical approaches for computing such attacks in Section 3.1.3.
The notion of NN
attacks originates from the seminal work of [30], which empirically demonstrates
a striking vulnerability of many neural networks: small, carefully crafted perturbations of an input can lead to large changes in the output of the network. The notion of NN
attacks is initially formalized for NN
s focusing on
classification, and it is a central topic in adversarial machine learning, used both to manipulate models’ classifications and to evaluate robustness. A NN
attack consists in finding an alteration of an input (within a ball of
pre-defined radius) that changes the classification from the class predicted initially to any other. Similarly, a targeted NN
attack consists in altering the input in order to make it classified as a specific class. In the context of
the present work, the NN
s we consider may not perform classification, but nevertheless the notion of NN
attacks admits a direct extension that we formalize in Section 3.1.2. Before going to this point, let us illustrate the observation from [30] that a NN
classification may be drastically vulnerable to slight alterations of the input.
Consider PyTorch’s ResNet18
NN
for image classification (with its default pretrained weights). This network is designed to classify images according to \(1000\) pre-selected classes. That is, \(\texttt{ResNet18}\) takes as inputs RGB images with \(224 \times 224\) pixels, represented as tensors in \(\mathbb{I}\triangleq[0,1]^{3\times224\times224}\), and outputs vectors in the 1000-dimensional space \(\mathbb{P}^{1000} \triangleq\{p \in [0,1]^{1000}: \sum_{\ell=1}^{1000} p_\ell = 1\}\),
where each component represents the plausibility that the image depicts an item from the associated class. Then the image is classified to the class \(\ell \in \llbracket 1,1000\rrbracket\) with the highest value \(p_\ell\). Consider the image in Figure 1 (left), which depicts a Samoyed dog. After preprocessing, the image becomes a tensor \(x \in \mathbb{I}\)
(centre left), and ResNet18
correctly classifies it as a Samoyed with \(88\%\) confidence. Then, consider a targeted attack aiming to shift the classification of \(x\) from a
Samoyed to a crane. This consists in finding a small perturbation \(d\) (e.g., with \(\lVert{d}\rVert_\infty \leq 10^{-2}\)) that shifts the network’s output \(\texttt{ResNet18}(x+d)\) towards the one-hot vector associated to the “\(\mathrm{Crane}\)” class. The altered image \(x+d\) (centre right) remains
visually indistinguishable from the original, yet the classification changes drastically: ResNet18
assigns nearly \(100\%\) confidence to the “\(\mathrm{Crane}\)” class. As
shown on the right of the figure, \(d\) alters most of the pixels, but only slightly so that the result is visually unchanged.
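The mechanics of such a targeted attack can be sketched on a toy model. The snippet below is a minimal illustration rather than the actual ResNet18 experiment: it performs an FGSM-style targeted step (one sign-gradient step on the cross-entropy toward the target class) on a small random linear softmax classifier, halving the radius until the target confidence increases; all names and the model are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())       # numerically stabilized softmax
    return e / e.sum()

def targeted_step(W, x, target, radius):
    """One FGSM-style targeted step: d = -radius * sign(grad), where grad is the
    gradient at x of the cross-entropy loss -log softmax(W x)[target]."""
    p = softmax(W @ x)
    onehot = np.zeros(len(p)); onehot[target] = 1.0
    grad = W.T @ (p - onehot)
    return -radius * np.sign(grad)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 50))      # toy 3-class linear "network"
x = rng.normal(size=50)
p0 = softmax(W @ x)
target = int(np.argmin(p0))       # attack toward the least plausible class
r = 1e-2                          # small perturbation budget, as in the text
d = targeted_step(W, x, target, r)
while softmax(W @ (x + d))[target] <= p0[target]:   # shrink r until the attack succeeds
    r /= 2
    d = targeted_step(W, x, target, r)
```

Since the loss strictly decreases along the sign-gradient direction for small enough radii, the loop terminates with a perturbation \(d\) (with \(\lVert d\rVert_\infty \leq 10^{-2}\)) that raises the target class's confidence.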
We now formalize the notion of directional NN
attack in Definition 1. We build it from that of a targeted
NN
attack, which we therefore introduce first. For simplicity, we state both notions directly in terms of the NN
\(\Phi\) and the constraints function \(c\). Their
formulations also involve a norm \(\lVert{\cdot}\rVert\) and a loss function \(\mathcal{L}: \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}_+\) that are left generic since numerical tools
computing NN
attacks implement several choices. We discuss popular practical choices for \(\lVert{\cdot}\rVert\) and \(\mathcal{L}\) in Remark 1. We also adapt Definition 1 to directional NN attacks with respect to only some components in Remark 2.
First, the notion of targeted NN
attack must be adapted from the context of NN
s performing classification. Consider a loss function \(\mathcal{L}: \mathbb{R}^m \times \mathbb{R}^m \to
\mathbb{R}_+\) and a norm \(\lVert{\cdot}\rVert\). For all \((x,r,y) \in \mathbb{R}^n \times \mathbb{R}_+ \times \mathbb{R}^m\), a targeted attack on \(\Phi\) at \(x\) with radius \(r\) and target \(y\) (with respect to \(\mathcal{L}\) and
the ball \(\mathcal{B}_r^n\) in the \(\lVert{\cdot}\rVert\) norm) consists in finding an input alteration \(d \in \mathbb{R}^n\) feasible for the problem
\[\label{problem:targeted_attack} \underset{d \in \mathcal{B}_r^n}{\mathop{\mathrm{minimize}}} \quad \mathcal{L}\left(\Phi(x+d) , y\right) \quad \text{subject to} \quad c(\Phi(x+d)) \leq 0.\tag{2}\] All feasible elements are said to be feasible attacks, all global
solutions are said to be optimal attacks, and all feasible attacks \(d\) satisfying \(\mathcal{L}(\Phi(x+d),y) < \mathcal{L}(\Phi(x),y)\) are said to be successful
attacks. From that notion, we may formalize that of a directional NN
attack as Definition 1.
Definition 1 (Directional NN
attack). Consider a loss function \(\mathcal{L}: \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}_+\) and a norm \(\lVert{\cdot}\rVert\). For all \(x \in \mathbb{R}^n\), define the differential NN
\(\Phi_x: d \in \mathbb{R}^n \mapsto \Phi(x+d)-\Phi(x) \in
\mathbb{R}^m\). Then, for all points \(x \in \mathbb{R}^n\), all radii \(r \in \mathbb{R}_+\), and all directions \(u \in \mathbb{R}^m\), an
attack of \(\Phi\) at \(x\) of radius \(r\) in direction \(u\) (with respect to \(\mathcal{L}\) and the ball \(\mathcal{B}_r^n\) in the \(\lVert{\cdot}\rVert\) norm) consists in a targeted attack on \(\Phi_x\)
at \(0\) with radius \(r\) and target \(u\). That is, a directional NN
attack consists in finding a feasible element for the problem
\[\label{problem:directional_attack} \underset{d \in \mathcal{B}_r^n}{\mathop{\mathrm{minimize}}} \quad \mathcal{L}\left(\Phi_x(d) , u\right) \quad \text{subject to} \quad c(\Phi(x)+\Phi_x(d)) \leq 0.\tag{3}\] We denote by \(\mathcal{A}_\mathcal{L}(x, r,
u)\) the set of feasible attacks, by \(\mathcal{A}_\mathcal{L}^+(x, r, u)\) the set of successful attacks, and by \(\mathcal{A}_\mathcal{L}^*(x, r, u)\) the set of optimal
attacks.
Remark 1. As Definition 1 stresses, all solvers from the literature designed for targeted attacks may be used off-the-shelf to compute directional attacks, since the latter is a specific instance of the former. Moreover, Definition 1 is compatible with any loss function and any norm. However, as shown in Section 3.1.3, many numerical tools are restricted to the \(\lVert{\cdot}\rVert_\infty\) norm, and either the square-error loss function \(\mathcal{L}_{\mathrm{SE}}\) or the cross-entropy loss function \(\mathcal{L}_{\mathrm{CE}}\). These two losses are defined as follows, for all \((y_1,y_2) \in \mathbb{R}^m \times \mathbb{R}^m\), and by applying the logarithm component-wise, \[\label{equation:usual_losses}\mathcal{L}_{\mathrm{SE}}(y_1,y_2) \triangleq\left\lVert{y_2-y_1}\right\rVert_2^2 \quad \text{and} \quad \mathcal{L}_{\mathrm{CE}}(y_1,y_2) \triangleq-\left<\ln(\texttt{softmax}(y_1))~,~\texttt{softmax}(y_2)\right>.\qquad{(1)}\]
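A minimal numerical sketch of these two losses, assuming the Euclidean 2-norm for \(\mathcal{L}_{\mathrm{SE}}\) and a numerically stabilized softmax:

```python
import numpy as np

def softmax(v):
    e = np.exp(np.asarray(v, float) - np.max(v))   # shift for numerical stability
    return e / e.sum()

def loss_se(y1, y2):
    """Square-error loss: L_SE(y1, y2) = ||y2 - y1||_2^2."""
    return float(np.sum((np.asarray(y2, float) - np.asarray(y1, float)) ** 2))

def loss_ce(y1, y2):
    """Cross-entropy loss: L_CE(y1, y2) = -<ln(softmax(y1)), softmax(y2)>."""
    return float(-np.dot(np.log(softmax(y1)), softmax(y2)))
```

Note that \(\mathcal{L}_{\mathrm{CE}}(y,y) > 0\) in general: the cross-entropy of a distribution with itself equals its entropy, which vanishes only on one-hot vectors.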
Remark 2. Definition 1 is designed so that, at any \(x\) and any
direction \(u\), an optimal attack direction \(d\) aligns all components of \(\Phi_x(d)\) with all components of \(u\). However, in some contexts (such as those in Remark 6), we may want \(\Phi_x(d)\) to match \(u\) with respect to some components only. For ease of presentation, we adapt Definition 1 only to the case where the first \(a\) components of \(u\) are of interest. We then seek a direction \(d \in
\mathbb{R}^n\) such that the first \(a\) components of \(\Phi_x(d)\) agree with the first \(a\) components of \(u\)
while the remaining \(m-a\) components of \(\Phi_x(d)\) are free. Given a loss function \(\mathcal{L}: \mathbb{R}^a \times \mathbb{R}^a \to \mathbb{R}_+\), a
directional NN
attack of \(\Phi\) at \(x\) with radius \(r\) in the first \(a\) components of the
direction \(u\) consists in defining \(A \triangleq\begin{bmatrix} I_a ~ 0\end{bmatrix} \in \mathbb{R}^{a \times m}\) and seeking \(d \in \mathbb{R}^n\) feasible for the problem \[\underset{d \in \mathcal{B}_r^n}{\mathop{\mathrm{minimize}}} \quad \mathcal{L}\left(A\Phi_x(d) , Au\right) \quad \text{subject to} \quad c(\Phi(x)+\Phi_x(d)) \leq 0.\]
Several numerical tools are available for computing NN
attacks. Most of them are implemented in PyTorch [40] or TensorFlow [41], and rely on one of the loss functions in Remark 1. The open-source solvers we are aware of are listed in Table 1. While these tools are designed for targeted attacks only, Definition 1 shows how to use them to compute directional attacks. Consequently, all listed solvers can be used to compute directional attacks as well. In addition, all solvers except \((\alpha,\beta)\)-CROWN
are compatible with nonwhite-box NN
models since both PyTorch and TensorFlow allow backpropagation. However, none of the solvers from this list currently supports constraints on
the input, so we introduce a reformulation in Section 3.2.2 to address this limitation.
Software | Frameworks | Losses | Norms | Additional requirements
---|---|---|---|---
\((\alpha,\beta)\)-CROWN [34] | PT | \(\mathcal{L}_{\mathrm{SE}}\) | \(\infty\) | \(c \equiv 0\), \(\Phi\) white-box ReLU-based model
FoolBox [33] | PT, TF | \(\mathcal{L}_{\mathrm{CE}}\) | \(2\) or \(\infty\) | \(c \equiv 0\)
ART [32] | many | \(\mathcal{L}_{\mathrm{CE}}\) | \(2\) or \(\infty\) | \(c \equiv 0\)
Torchattacks [31] | PT | \(\mathcal{L}_{\mathrm{CE}}\) | \(2\) or \(\infty\) | \(c \equiv 0\), \(\Phi: [0,1]^n \to \mathbb{R}^m\)
NN Attacks

We now leverage directional NN
attacks to solve Problem 1 . Specifically, we show that, for any point \(x \in \mathbb{R}^n\), a directional attack of \(\Phi\) at \(x\) in the direction \(\nabla f(\Phi(x))\) likely yields an ascent direction for Problem 1 . This section formalizes this
idea and analyzes its theoretical properties. To this end, we introduce the attack
operator in Definition 2, which
performs the aforementioned attack. The norm is left unspecified in this definition since it does not impact the theoretical properties of the attack
operator, but in practice we consider the \(\lVert{\cdot}\rVert_\infty\) norm.
Definition 2 (attack
operator). Given a loss function \(\mathcal{L}: \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}_+\), we name attack
operator any set-valued function \(\texttt{attack}: \mathbb{R}^n \times \mathbb{R}_+ \to 2^{\mathbb{R}^n}\) such that \(\texttt{attack}(x,r) \subseteq \mathcal{A}_\mathcal{L}(x, r, \nabla f(\Phi(x)))\) holds for all \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\).
Note that many functions satisfy Definition 2, e.g., the empty-set operator \(\texttt{attack}\equiv \emptyset\) and the ideal operator given by \(\texttt{attack}(x,r) \triangleq\mathcal{A}_\mathcal{L}^*(x, r, \nabla f(\Phi(x)))\) for all \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\). Nevertheless, in practice, we choose a numerical solver from the literature, and for all \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\) we define \(\texttt{attack}(x,r)\) as the output of that solver tackling Problem 3 at \(u \triangleq\nabla f(\Phi(x))\). Definition 2 is tailored to encompass all possible outputs that a solver may return. Indeed, the solver either fails (so it returns \(\emptyset\), which is allowed) or returns a feasible attack \(d \in \mathcal{A}_\mathcal{L}(x,r,u)\) that has no guarantee to be optimal or even successful (hence, we do not impose that \(\texttt{attack}(x,r) \subseteq \mathcal{A}_\mathcal{L}^*(x,r,u)\) or \(\texttt{attack}(x,r) \subseteq \mathcal{A}_\mathcal{L}^+(x,r,u)\)).
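A minimal attack operator satisfying Definition 2 can be sketched as follows, assuming \(c \equiv 0\), the \(\ell_\infty\) ball, the square-error loss, and a single sign-gradient step (computed here by finite differences, standing in for backpropagation) in place of an off-the-shelf solver from Table 1; the toy network and goal function are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(8, 5)), rng.normal(size=(3, 8))
Phi = lambda x: np.tanh(W2 @ np.tanh(W1 @ x))   # toy trained NN (illustrative)
grad_f = lambda y: 2.0 * y                      # gradient of the goal f(y) = ||y||^2

def loss_se(y1, y2):
    return float(np.sum((y2 - y1) ** 2))

def attack(x, r, eps=1e-6):
    """Set-valued attack operator: one sign-gradient step on d -> L_SE(Phi_x(d), u)
    with u = grad f(Phi(x)), kept inside the l_inf ball of radius r (c == 0 assumed)."""
    u = grad_f(Phi(x))
    Phi_x = lambda d: Phi(x + d) - Phi(x)
    base = loss_se(Phi_x(np.zeros_like(x)), u)  # L_SE(0, u) = ||u||^2
    # finite-difference gradient of the loss at d = 0
    g = np.array([(loss_se(Phi_x(eps * e), u) - base) / eps for e in np.eye(len(x))])
    d = -r * np.sign(g)
    return {tuple(d)}   # any subset of the feasible attacks is a valid operator output
```

The operator returns a singleton here, but per Definition 2 it could equally return the empty set when the inner solver fails.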
In Section 3.2.1, we prove that if the \(\texttt{attack}\) operator is actually guaranteed to identify successful attacks,
then its outputs are guaranteed to be ascent directions for Problem 1. In Section 3.2.2, we explain that although this setting is not
fully supported by existing solvers, an attack
operator defined via current NN
attacks solvers already acts as a good heuristic for computing ascent directions.
attack Operator

This section establishes a theoretical setting guaranteeing that the attack
operator yields an ascent direction for Problem 1 whenever one exists. This setting requires that the \(\texttt{attack}\) operator returns successful attack directions when some exist, and that the loss function is well-suited in a sense that we introduce below. We formalize these requirements in Assumption 2 and our claim in Proposition 1.
We say that a function \(\mathcal{L}: \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}_+\) satisfies the Well-Suited Loss Property (WSLP
) when
\[\label{equation:well_suited_losses} \forall (y_1,y_2) \in \mathbb{R}^m \times \mathbb{R}^m, \quad \mathcal{L}(y_1,y_2) < \mathcal{L}(0,y_2) \implies \left<y_1,y_2\right> > 0.\tag{4}\] Roughly speaking, the WSLP
ensures that for any \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\) and with \(u \triangleq\nabla
ensures that for any \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\) and with \(u \triangleq\nabla
f(\Phi(x))\), successful attacks for Problem 3 are alterations of \(x\) that drive the output of \(\Phi\) in the direction of the
gradient of \(f\). Thus, according to the first-order approximation of \(f\) near \(\Phi(x)\), the point \(\Phi(x+d)\)
likely satisfies \(f(\Phi(x+d)) > f(\Phi(x))\). This observation is formalized in Proposition 1.
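For instance, the square-error loss \(\mathcal{L}_{\mathrm{SE}}\) of Remark 1 satisfies the WSLP; a short verification (with \(\left<\cdot,\cdot\right>\) the Euclidean inner product):
\[
\mathcal{L}_{\mathrm{SE}}(y_1,y_2) < \mathcal{L}_{\mathrm{SE}}(0,y_2)
\iff \left\lVert{y_2-y_1}\right\rVert_2^2 < \left\lVert{y_2}\right\rVert_2^2
\iff \left\lVert{y_1}\right\rVert_2^2 - 2\left<y_1,y_2\right> < 0
\implies \left<y_1,y_2\right> > \tfrac{1}{2}\left\lVert{y_1}\right\rVert_2^2 \geq 0,
\]
where the first strict inequality excludes \(y_1 = 0\), so that \(\left<y_1,y_2\right> > 0\) indeed holds.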
Assumption 2 (Successful attack
operator). The \(\texttt{attack}\) operator relies on a loss function \(\mathcal{L}\) satisfying the WSLP
,
and for all \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\), the set it returns satisfies \(\texttt{attack}(x,r) \subseteq \mathcal{A}_\mathcal{L}^+(x,r,\nabla f(\Phi(x)))\). Moreover, for
all \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\) such that \(\mathcal{A}_\mathcal{L}^+(x,r,\nabla f(\Phi(x)))\) is nonempty, \(\texttt{attack}(x,r)\) is
nonempty.
Proposition 1. Under Assumptions 1 and 2, for all \(x \in \mathbb{R}^n\), there exists \(\overline{r}(x) > 0\) such that for all \(r \in [0,\overline{r}(x)]\), every \(d \in \texttt{attack}(x, r)\) is an ascent direction for Problem 1 emanating from \(x\).
Proof. Let Assumptions 1 and 2 hold. Define \[\begin{array}{rlcl} \forall y \in \mathbb{R}^m, & \overline{\rho}(y) & \triangleq & \max\{ \rho \in [0,+\infty] : \quad (\forall u \in \mathcal{B}^m_\rho ~\text{such that}~ \left<u,\nabla f(y)\right> > 0,~ f(y+u) > f(y)) \}, \\ \forall x \in \mathbb{R}^n, & \overline{r}(x) & \triangleq & \max\{ r \in [0,+\infty] : \quad (\forall d \in \mathcal{B}^n_r,~ \lVert{\Phi_x(d)}\rVert \leq \overline{\rho}(\Phi(x))) \}. \end{array}\] First, for all \(y \in \mathbb{R}^m\), the first-order approximation of \(f\) near \(y\) ensures that \(\overline{\rho}(y) > 0\). Moreover, for all \(x \in \mathbb{R}^n\), the continuity of \(\Phi\) from Assumption 1 ensures that \(\overline{r}(x) > 0\). We deduce that \[\label{equation:property_in_proof_careful_attacks}\forall d \in \mathcal{B}^n_{\overline{r}(x)} ~\text{such that}~ \left<\Phi_x(d)~,~\nabla f(\Phi(x))\right> > 0, \qquad f(\Phi(x+d)) > f(\Phi(x)).\tag{5}\] If moreover \(c(\Phi(x+d)) \leq 0\), then \(d\) is an ascent direction emanating from \(x\). Second, Assumption 2 ensures that for all \(x \in \mathbb{R}^n\) and all \(r \in [0,\overline{r}(x)]\), all \(d \in \texttt{attack}(x,r) \subseteq \mathcal{A}_\mathcal{L}^+(x,r,\nabla f(\Phi(x)))\) satisfy \[\begin{array}{rrl} & c(\Phi(x+d)) \leq 0 & \text{since}~ d ~\text{is feasible}, \\[0.5em] & d \in \mathcal{B}^n_{\overline{r}(x)} & \text{since}~ \left\lVert{d}\right\rVert \leq r \leq \overline{r}(x), \\[0.5em] & \mathcal{L}(\Phi_x(d), \nabla f(\Phi(x))) < \mathcal{L}(0, \nabla f(\Phi(x))) & \text{since}~ d ~\text{is a successful attack}, \\ \text{so} & \left<\Phi_x(d)~,~\nabla f(\Phi(x))\right> > 0 & \text{since}~ \mathcal{L}~\text{satisfies the WSLP}. \end{array}\] The result then follows from 5. ◻
Proposition 1 validates the first goal of this paper: directional
NN
attacks may be used as tools for solving Problem 1 . Under Assumption 2, for all \(x \in \mathbb{R}^n\), all attacks identified by \(\texttt{attack}(x,r)\) with \(r \leq \overline{r}(x)\) are ascent directions for Problem 1 . Remark 3 shows an algorithmic way to search for a radius \(0 < r \leq \overline{r}(x)\), Remark 4 discusses whether \(\texttt{attack}(x,r) = \emptyset\), and Remark 5 adapts our methodology to the regularity of \(f\). Finally, Remark 6 focuses on cases where \(f\) has active subspaces, which require special care to avoid a deterioration of the performance of the attack
operator.
However, an attack
operator constructed from any solver in Table 1 violates Assumption 2, so it may not yield a reliable optimization algorithm. Nevertheless, such an attack
operator remains a valuable component within a broader optimization strategy. Section 3.2.2 substantiates these claims.
Remark 3. Given a point \(x \in \mathbb{R}^n\) and under Assumptions 1 and 2, Proposition 1 shows that we must select a radius \(r \in\; ]0,\overline{r}(x)]\) to ensure that all \(d \in \texttt{attack}(x,r)\) are ascent directions. Finding such a radius is not difficult in general. Indeed, \(\overline{r}(x)\) is strictly positive (even though it may be difficult to compute since it depends on the local Lipschitz constant of \(\Phi\) at \(x\)), so a simple workaround is to initialize \(r \triangleq 1\) and halve it until every \(d \in \texttt{attack}(x,r)\) is an ascent direction.
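The backtracking scheme of Remark 3 can be sketched as follows; `attack` and `is_ascent` are hypothetical callables standing in for the \(\texttt{attack}\) operator and the ascent test (checking \(f(\Phi(x+d)) > f(\Phi(x))\) and \(c(\Phi(x+d)) \leq 0\)).

```python
def find_ascent_radius(x, attack, is_ascent, r0=1.0, max_halvings=30):
    """Halve r from r0 until every direction returned by attack(x, r)
    is an ascent direction, as suggested in Remark 3.

    attack(x, r) returns a (possibly empty) list of directions d, and
    is_ascent(x, d) checks that x + d improves the objective while
    remaining feasible; both are placeholders for the paper's operators.
    """
    r = r0
    for _ in range(max_halvings):
        ds = attack(x, r)
        if ds and all(is_ascent(x, d) for d in ds):
            return r, ds
        r /= 2.0
    return None, []  # no suitable radius found within the budget
```

On a toy one-dimensional instance where the attack returns the step `r` itself and ascent means improving \(x \mapsto -x^2\) from \(x = -1\), radii larger than \(2\) fail and the scheme halves down to an accepted radius.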
Remark 4. For all \(x \in \mathbb{R}^n\) and all \(r \in [0,\overline{r}(x)]\), Proposition 1 does not ensure that \(\texttt{attack}(x,r) \neq \emptyset\). It is possible to prove that if either \(\mathcal{B}_{\overline{r}(x)}^n(x) \cap F = \emptyset\) or \(x\) satisfies a necessary optimality condition for Problem 1 (given by \(\left<\Phi_x(d),\nabla f(\Phi(x))\right> \leq 0\) for all \(d \in \mathcal{B}_{\overline{r}(x)}^n\) such that \(x+d \in F\)), then \(\texttt{attack}(x, r)\) is empty for all \(r \in [0,\overline{r}(x)]\). However, the converse does not hold. It is theoretically possible that \(\mathcal{A}_\mathcal{L}^+(x,r,\nabla f(\Phi(x))) = \emptyset\) for all \(r \in [0,\overline{r}(x)]\) at a point \(x \in \mathbb{R}^n\) that does not satisfy this necessary optimality condition, in which case the attack operator returns an empty set of attacks at \(x\).
Remark 5. The \(\texttt{attack}\) operator computes directional NN
attacks at all \(x \in \mathbb{R}^n\) in the gradient ascent direction \(\nabla f(\Phi(x))\) only. Yet, other directions may be considered depending on the regularity of \(f\), such as the Hessian ascent direction \([\nabla^2 f(\Phi(x))]^{-1} \nabla f(\Phi(x))\) when \(f\) is twice differentiable. When \(f\) is nonsmooth, subgradient ascent directions, simplex gradient ascent directions [42], [43], or even simplex Hessian directions [44], may be considered. For all \(y \in
\mathbb{R}^m\) and all \(r \in \mathbb{R}_+^*\), the (canonical forward) simplex gradient of \(f\) of radius \(r\) at \(y\) is the vector \(\nabla_r f(y) \triangleq r^{-1}[f(y+re_i)-f(y)]_{i=1}^{m}\), and the (canonical forward) simplex Hessian of \(f\) of radius \(r\) at \(y\) is the matrix \(\nabla_r^2 f(y) \triangleq r^{-1}[\nabla_r f(y+re_j) - \nabla_r f(y)]_{j=1}^{m}\).
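The forward simplex gradient above is straightforward to compute; a minimal sketch (the simplex Hessian is obtained analogously by applying the same difference scheme to the simplex gradient):

```python
import numpy as np

def simplex_gradient(f, y, r):
    """Canonical forward simplex gradient of f at y with radius r:
    the i-th entry is (f(y + r*e_i) - f(y)) / r."""
    y = np.asarray(y, dtype=float)
    fy = f(y)
    g = np.empty(y.size)
    for i in range(y.size):
        e = np.zeros(y.size)
        e[i] = 1.0
        g[i] = (f(y + r * e) - fy) / r
    return g
```

For a smooth \(f\), small radii recover the true gradient up to \(\mathcal{O}(r)\); for example, on \(f(y) = y_1^2 + 3y_2\) at \(y = (1,2)\) the simplex gradient approaches \((2, 3)\).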
Remark 6. The attack
operator may fall short of its best potential when \(f\) has an active subspace [45], [46] of dimension \(a \ll m\). We say that \(f\) admits such a
subspace if there exist a mapping \(g: \mathbb{R}^a \to \mathbb{R}\) and a projection matrix \(A \triangleq\begin{bmatrix} I_a ~ 0\end{bmatrix} \in \mathbb{R}^{a \times m}\) such that \(f(y) = g(Ay)\) for all \(y \in \mathbb{R}^m\). In this case, for all \(x \in \mathbb{R}^n\), the gradient takes the form \(\nabla f(\Phi(x)) = A^\top \nabla g(A\Phi(x))\), whose last \(m-a\) components vanish, so a directional NN attack from Definition 1 seeks a direction \(d \in \mathbb{R}^n\) that steers \(\Phi_x(d)\) along \(A^\top \nabla g(A\Phi(x))\). However, the last \(m-a\) components of \(\Phi_x(d)\) are irrelevant, since they do not affect \(f\). If \(a\) is known, the attack operator can be adapted as in Remark 2; otherwise, we may learn \(A\) on the fly [47] and adapt accordingly. Active subspaces arise naturally, for example, when \(\Phi\) has the form \(\Phi(x) = [\Psi(x), x]\) for all \(x \in \mathbb{R}^n\), with \(\Psi: \mathbb{R}^n \to \mathbb{R}^k\) another NN, which allows modelling constraints on \(x\) via the function \(c\). In such cases, the objective of the problem may depend only on \(\Psi(x)\), so the goal function \(f: \mathbb{R}^{k+n} \to \mathbb{R}\) has an active subspace of dimension \(a = k\).
attack Operators

Proposition 1 describes an attack operator that returns ascent directions for Problem 1 . However, its setting (encapsulated in Assumption 2) may not hold in practice, where we construct an attack operator by selecting a solver \(\mathrm{S}\) and, for all \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\), defining \(\texttt{attack}(x,r)\) as the output of \(\mathrm{S}\) solving Problem 3 at \(u \triangleq\nabla f(\Phi(x))\). This section discusses the gap between Assumption 2 and practical settings, and introduces a heuristic yet efficient attack operator built from existing solvers. This heuristic operator is stated in Definition 3, and its performance is assessed in Section 5.
Assumption 2 primarily fails in practice because its setting is incompatible with the solvers listed in Table 1, for four reasons. First, Assumption 2 requires the attack operator to return a successful attack whenever one exists. To the best of our knowledge, only \((\alpha,\beta)\)-CROWN [34] offers such guarantees, and only under the requirement that \(\Phi\) is a white-box. Second, most solvers use the loss function \(\mathcal{L}_{\mathrm{CE}}\), which does not satisfy the WSLP, and moreover, they do not easily allow for a change of the loss function. Among the two losses \(\mathcal{L}_{\mathrm{CE}}\) and \(\mathcal{L}_{\mathrm{SE}}\), only \(\mathcal{L}_{\mathrm{SE}}\) satisfies the WSLP, and only \((\alpha,\beta)\)-CROWN relies on \(\mathcal{L}_{\mathrm{SE}}\). Third, no solver handles Problem 3 exactly, as none supports input constraints. Fourth, some solvers assume input domains restricted to \([0,1]^n\) rather than \(\mathbb{R}^n\). For the last two drawbacks, we introduce a workaround that makes existing solvers compatible with our framework.
First, we reformulate Problem 1 as the unconstrained problem \[\label{problem:P95practical} \underset{\begingroup\scriptstyle\renewcommand{\arraystretch}{1.0}\begin{array}{c}x \in \mathbb{R}^n\end{array}\endgroup}{\mathop{\mathrm{maximize}}} \quad \widetilde{f}(\widetilde{\Phi}(x)),\tag{6}\] where \[\widetilde{\Phi}:\left\{\begin{array}{ccl} \mathbb{R}^n & \to & \mathbb{R}^{m+p}\\ x & \mapsto & \begin{bmatrix} \Phi(x) \\ \texttt{ReLU}(c(\Phi(x))) \end{bmatrix} \end{array}\right. \qquad \text{and} \qquad \widetilde{f}:\left\{\begin{array}{ccl} \mathbb{R}^{m+p} & \to & \mathbb{R}\\ \begin{bmatrix} y \\ z \end{bmatrix} & \mapsto & f(y)-\left\lVert{z}\right\rVert_2^2. \end{array}\right.\] Second, for all \(r \in \mathbb{R}_+\) we define the affine map \(d_r: \delta \mapsto 2r(\delta - \mathbb{1}/2)\) that maps \([0,1]^n\) onto the ball \(\mathcal{B}^n_r\) in the \(\lVert{\cdot}\rVert_\infty\) norm, and for all \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\) we define the scaled differential network \[\widetilde{\Phi}_{(x,r)}:\left\{\begin{array}{ccl} [0,1]^n & \to & \mathbb{R}^{m+p}\\ \delta & \mapsto & \widetilde{\Phi}_x(d_r(\delta)), \end{array}\right.\] so that for any \((x, r, u) \in \mathbb{R}^n \times \mathbb{R}_+ \times \mathbb{R}^{m+p}\), we can compute a directional attack on \(\widetilde{\Phi}\) at \(x\) of radius \(r\) in direction \(u\) by solving a targeted attack on \(\widetilde{\Phi}_{(x,r)}\) at \(\mathbb{1}/2\) with radius \(1/2\) and target \(u\). That is, we solve the problem \[\label{problem:directional95attack95practical} \underset{\begingroup\scriptstyle\renewcommand{\arraystretch}{1.0}\begin{array}{c}\delta \in \mathcal{B}_{1/2}^n\end{array}\endgroup}{\mathop{\mathrm{minimize}}} \quad \mathcal{L}\left(\widetilde{\Phi}_{(x,r)}(\mathbb{1}/2+\delta) , u\right),\tag{7}\] which is compatible with most practical solvers.
By construction, any \(\delta\) has the same classification (optimal, successful, or unsuccessful) for this problem as \(d_r(\mathbb{1}/2+\delta)\) for the original directional attack of \(\widetilde{\Phi}\) at \(x\) with radius \(r\). We now express the practical \(\texttt{attack}\) operator accordingly, in Definition 3.
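The relaxation \((\widetilde{\Phi}, \widetilde{f})\) of Problem 6 can be sketched directly; `Phi`, `f`, and `c` are placeholder callables standing in for the paper's functions, and `m` is the output dimension of \(\Phi\).

```python
import numpy as np

def make_relaxation(Phi, f, c, m):
    """Build the pair (Phi_tilde, f_tilde) of Problem 6: Phi_tilde
    stacks Phi(x) with ReLU(c(Phi(x))), and f_tilde subtracts the
    squared 2-norm of the constraint violation from f."""
    def Phi_tilde(x):
        y = Phi(x)
        return np.concatenate([y, np.maximum(c(y), 0.0)])  # ReLU on c

    def f_tilde(yz):
        y, z = yz[:m], yz[m:]
        return f(y) - float(z @ z)  # penalize violation ||z||^2

    return Phi_tilde, f_tilde
```

At a feasible point the penalty block is zero and \(\widetilde{f}(\widetilde{\Phi}(x)) = f(\Phi(x))\); at an infeasible point the violation is subtracted off.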
Definition 3 (practical attack operator). Let \(\mathrm{S}\) be a solver for targeted NN attacks. The \(\texttt{attack}\) operator induced by \(\mathrm{S}\) is the set-valued function \(\texttt{attack}_\mathrm{S}: \mathbb{R}^n \times \mathbb{R}_+ \to 2^{\mathbb{R}^n}\) that, for all \((x,r) \in \mathbb{R}^n \times \mathbb{R}_+\), maps \((x,r)\) to the set of outputs of \(\mathrm{S}\) solving Problem 7 with \(u \triangleq\nabla \widetilde{f}(\widetilde{\Phi}(x))\), using the loss \(\mathcal{L}\) fixed by \(\mathrm{S}\).
We emphasize that the current gap between theoretical and practical attack
operators may narrow as solver implementations advance. The workaround above is heuristic, since it consists of an inexact relaxation of Problem 1 and since the \(\texttt{attack}_\mathrm{S}\) operator may return attack directions that are infeasible for the original problem. In our numerical experiments in Section 5, we consider the \(\texttt{attack}_\mathrm{S}\) operator defined using the Torchattacks library [31] as the solver \(\mathrm{S}\) (specifically, its implementations of the fgsm
[48] and pgd
[49] algorithms under the \(\lVert{\cdot}\rVert_\infty\) norm and the \(\mathcal{L}_{\mathrm{CE}}\) loss function).
NN Attacks and DFO

This section addresses the second goal of the paper, of designing a hybrid optimization algorithm that combines directional NN
attacks (as in Section 3) with techniques from
derivative-free optimization (DFO
) to tackle Problem 1 regardless of the structure of \(\Phi\). The field of DFO
[9], [16] studies optimization methods that assume little accessible structure on the problem. The books [9], [16] and the survey [29] give broad coverage of all classes of DFO
techniques. In this work, we focus on the direct search methods (dsm
) [9] because the covering dsm (cdsm) [28] variant possesses the
strongest convergence properties among DFO
methods.
Section 4.1 reviews the cdsm
[28] and its asymptotic convergence
guarantees towards a local solution. Section 4.2 then presents our hybrid algorithm (Algorithm 3) along with its convergence result (Theorem 1) inherited from those of the cdsm
.
Before going through this section, let us stress that our main motivation to design our hybrid method from DFO
methods is to address our lack of assumptions about the structure of \(\Phi\). The core idea of
our methodology is to hybridize NN
attacks with an optimization algorithm that is well-suited for the problem at hand. Then, in contexts where \(\Phi\) is a white-box NN
, it may be preferable to
consider another algorithm that exploits the explicit neural architecture of \(\Phi\), instead of the cdsm
or any DFO
algorithms that are oblivious to this structure. We leave this discussion about
Algorithm 3 to Section 6. However, we also stress that the independence of DFO methods from the structure of \(\Phi\) also provides numerical reliability, although at the cost of speed. Algorithm 3 may therefore serve as a baseline method to assess the performance of more sophisticated algorithms that exploit the structure of \(\Phi\).
cdsm Algorithm

All dsm
algorithms proceed iteratively, with each iteration comprising two steps. The first one is named the search
step. It is optional, but it allows for many user-defined strategies (e.g., globalization techniques) with few
restrictions. The second one is named the poll
step. This step is mandatory and has a more stringent definition, as it underpins all theoretical guarantees. Historically, the dsm
class has been split into two subclasses: mesh
dsm
and sufficient increase dsm
, each defining some specific restrictions on the search
and poll
steps. We refer to [9] for mesh dsm
, and to [16] for sufficient increase dsm
. Both subclasses of
dsm
ensure that some limit points they generate satisfy necessary optimality conditions for Problem 1 . Nevertheless, a recent advance on dsm unifies these two subclasses. The cdsm (covering dsm) [28] introduces a third step named the covering
step, which also has a rigid
definition but strengthens the convergence properties of the dsm. This step, when properly defined and under mild assumptions, ensures convergence to local solutions regardless of the implementations of the search
and poll
steps. To our knowledge, at the time of writing the cdsm
is the only DFO
algorithm with this general convergence guarantee (although [28] highlights that the covering
step may be added into most DFO
algorithms and provide them the same convergence properties). We now overview the search
and poll
steps as
allowed in the cdsm
, and then we introduce the covering
step.
The search
step offers a light framework for evaluating trial points chosen by the user. Its purpose in practice is to allow for globalization strategies, such as the variable neighbourhood search [50], [51]. At each iteration \(k \in \mathbb{N}\), the search
step usually relies on the current incumbent solution \(x^k\), the current poll
radius \(r^k\) (to be specified in the next paragraph), and the trial points
history \(\mathcal{H}^k\) that consists of the set of all points evaluated by the algorithm up to iteration \(k\); and the set \(\texttt{search}(x^k,r^k,\mathcal{H}^k) \subseteq \mathbb{R}^n\) is only required to be finite. Accordingly, we formalize a search
operator as follows.
Definition 4 (search
operator). A set-valued function \(\texttt{search}: \mathbb{R}^n \times \mathbb{R}_+ \times 2^{\mathbb{R}^n} \to 2^{\mathbb{R}^n}\) is a search
operator if \(\texttt{search}(x,r,\mathcal{H}) \subseteq \mathbb{R}^n\) is empty or finite for all \((x,r,\mathcal{H}) \in \mathbb{R}^n \times \mathbb{R}_+ \times
2^{\mathbb{R}^n}\).
The poll
step has a more restrictive definition. It consists, at all iterations \(k \in \mathbb{N}\), in a local search around \(x^k\) of some radius \(r^k > 0\). Most of the literature defines \(\texttt{poll}(x^k,r^k,\mathcal{H}^k)\) as a positive spanning set with all elements having a norm of at most \(r^k\), that is, \(\mathrm{PSpan}(\texttt{poll}(x^k,r^k,\mathcal{H}^k)) = \mathbb{R}^n\) with \(\texttt{poll}(x^k,r^k,\mathcal{H}^k) \subseteq
\mathcal{B}^n_{r^k}\). We formalize the poll
operator accordingly. A popular strategy [52] samples a unit vector \(v^k\), builds the Householder matrix \(M^k \triangleq I-2(v^k)(v^k)^\top\) (which is orthogonal), and sets \(\texttt{poll}(x^k,r^k,\mathcal{H}^k) \triangleq\{\pm
r^kM^ke_i\}_{i=1}^{n}\). In addition, popular approaches such as [53], [54] use the
set \(\mathcal{H}^k\) to construct a local quadratic model of \(f \circ \Phi\) near \(x^k\) and add the gradient of that model at \(x^k\) to the set of trial directions.
Definition 5 (poll
operator). A set-valued function \(\texttt{poll}: \mathbb{R}^n \times \mathbb{R}_+ \times 2^{\mathbb{R}^n} \to 2^{\mathbb{R}^n}\) is a poll
operator if \(\mathrm{PSpan}(\texttt{poll}(x,r,\mathcal{H})) = \mathbb{R}^n\) and \(\texttt{poll}(x,r,\mathcal{H}) \subseteq \mathcal{B}^n_r\) hold for all \((x,r,\mathcal{H}) \in \mathbb{R}^n \times \mathbb{R}_+ \times 2^{\mathbb{R}^n}\).
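The Householder-based poll set recalled before Definition 5 can be sketched as follows (a minimal sketch; the paper's implementation may differ in details such as the norm used for the ball).

```python
import numpy as np

def householder_poll(n, r, rng):
    """Poll set {±r * M e_i} with M = I - 2 v v^T for a random unit v.
    M is orthogonal (a Householder reflection), so the 2n returned
    directions positively span R^n and each has Euclidean norm r."""
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    M = np.eye(n) - 2.0 * np.outer(v, v)  # Householder reflection
    cols = [r * M[:, i] for i in range(n)]
    return [d for col in cols for d in (col, -col)]
```

Since the directions form a scaled orthogonal basis together with its negatives, any nonzero vector has a positive inner product with at least one of them.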
Finally, the covering
step [28] may be added to the dsm
on top of the search
and
poll
steps. This step consists, at all iterations \(k \in \mathbb{N}\), in evaluating points in the ball \(\mathcal{B}^n_1(x^k)\) that are far enough from the trial points
history \(\mathcal{H}^k\). The fixed radius (here set to \(1\) for simplicity) ensures that all accumulation points of \((x^k)_{k \in \mathbb{N}}\) are local
solutions. Below, we consider a concise definition of the covering
operator for simplicity, although we note that [28] formalizes alternatives that are easier to compute.
Definition 6 (covering operator). The covering operator is the set-valued function \(\texttt{covering}: \mathbb{R}^n \times 2^{\mathbb{R}^n} \to 2^{\mathbb{R}^n}\) defined by \(\texttt{covering}(\cdot,\emptyset) \equiv \{0\}\) and by \(\texttt{covering}(x,\mathcal{H}) \triangleq\mathop{\mathrm{argmax}}_{d \in \mathcal{B}^n_1} \mathrm{dist}(x+d,\mathcal{H})\) for all \((x,\mathcal{H}) \in \mathbb{R}^n \times 2^{\mathbb{R}^n}\) with \(\mathcal{H}\neq \emptyset\).
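The exact argmax in Definition 6 is itself a hard subproblem; a Monte-Carlo approximation (a sketch in the spirit of the cheaper variants that [28] formalizes, not the paper's implementation) samples candidates in the unit ball and keeps the one farthest from the history:

```python
import numpy as np

def approx_covering(x, history, rng, n_candidates=256):
    """Approximate covering(x, H): sample candidate directions
    uniformly in the unit ball and return the one whose trial point
    x + d maximizes the distance to the history H."""
    if len(history) == 0:
        return np.zeros_like(x)
    H = np.asarray(history)
    best_d, best_dist = None, -1.0
    for _ in range(n_candidates):
        d = rng.standard_normal(x.size)
        # rescale to a uniform draw in the unit ball
        d *= rng.random() ** (1.0 / x.size) / np.linalg.norm(d)
        dist = float(np.min(np.linalg.norm(H - (x + d), axis=1)))
        if dist > best_dist:
            best_d, best_dist = d, dist
    return best_d
```

The experiments in Section 5 use an even simpler variant that evaluates a single uniformly random point in \(\mathcal{B}_1(x^k)\).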
The cdsm
algorithm is summarized in Algorithm 2 below. As we now formalize in Proposition 2, its convergence properties established in [28] hold for Problem 1 .
Proposition 2 (Adapted from [28]). Let \((x^k)_{k \in \mathbb{N}}\) be the sequence of incumbent solutions generated by Algorithm 2 solving Problem 1 under Assumption 1. Then \((x^k)_{k \in \mathbb{N}}\) admits at least one accumulation point, and all of them are local solutions to Problem 1 .
NN Attacks and cdsm

The cdsm
described in Algorithm 2 is purely based on DFO
techniques, and thus it ignores the structure of Problem 1 . Consequently, the cdsm
remains applicable
to hard instances, but it usually converges slowly in practice. In contrast, our technique from Section 3.2.2, relying on directional NN
attacks, offers a
heuristic yet potentially efficient intensification strategy. Then, each of these two approaches offsets the other’s limitations. Motivated by this synergy, we propose a hybrid algorithm that combines directional NN
attacks with the
steps of the cdsm
. At each iteration \(k \in \mathbb{N}\), our algorithm first attempts a directional NN
attack via the attack
operator from Section 3.2. The outcome of this attack determines how the algorithm proceeds: (i) if the attack yields a sufficient increase (formalized in Definition 7 below), the iteration is accepted and the cdsm
steps are skipped, (ii) if it yields a simple increase, the
iteration continues with the cdsm
steps applied from the improved point, (iii) if the attack fails, the cdsm
steps proceed from the current point. This logic is formalized in Algorithm 3 given
below.
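The branching logic of the hybrid scheme can be sketched as one iteration with placeholder callables (`attack`, `cdsm_steps`, `phi_f` for \(f \circ \Phi\), and `rho` for the sufficient increase test); this is a sketch of the control flow, not the paper's implementation.

```python
import numpy as np

def hybrid_iteration(x, r, H, attack, cdsm_steps, phi_f, rho):
    """One iteration of the hybrid scheme: try an attack step first,
    (i) accept and skip the cdsm steps on sufficient increase,
    (ii) continue the cdsm steps from the improved point on simple
    increase, (iii) run the cdsm steps from x if the attack fails."""
    best = x
    for d in attack(x, r):
        t = x + d
        if rho(t, x):                 # (i) sufficient increase
            return t
        if phi_f(t) > phi_f(best):    # (ii) simple increase
            best = t
    return cdsm_steps(best, r, H)     # (ii)/(iii) cdsm steps
```

A successful attack that clears the sufficient increase threshold returns immediately; otherwise the cdsm machinery takes over from the best point seen.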
Definition 7 (Sufficient increase). Given a couple \((\tau,\varepsilon) \in \mathbb{R}_+^* \times \mathbb{R}_+^*\), the sufficient increase function is the function \[\rho:\left\{\begin{array}{ccl} \mathbb{R}^n \times \mathbb{R}^n & \to & \{0,1\}\\ (x_1, x_2) & \mapsto & 1 ~\text{if}~ \dfrac{f(\Phi(x_1))-f(\Phi(x_2))}{\left\lvert{f(\Phi(x_2))}\right\rvert+\varepsilon} \geq \tau, ~0~ \text{otherwise.} \end{array}\right.\] For all \((x_1, x_2) \in \mathbb{R}^n \times \mathbb{R}^n\), we say that \(x_1\) yields a sufficient increase over \(x_2\) if \(\rho(x_1, x_2) = 1\).
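The test of Definition 7 is a one-liner on the two objective values; the defaults below are the values \((\tau,\varepsilon) = (10^{-3}, 10^{-10})\) used in the experiments of Section 5.

```python
def sufficient_increase(fp1, fp2, tau=1e-3, eps=1e-10):
    """Sufficient increase test of Definition 7 on the objective values
    fp1 = f(Phi(x1)) and fp2 = f(Phi(x2)): returns 1 iff the relative
    improvement of x1 over x2 is at least tau."""
    return 1 if (fp1 - fp2) / (abs(fp2) + eps) >= tau else 0
```

For instance, a 10% relative improvement passes the default threshold, while a 0.05% improvement does not.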
This algorithm inherits the convergence analysis of the cdsm
, as formalized in the next Theorem 1.
Theorem 1. Let \((x^k)_{k \in \mathbb{N}}\) be the sequence of incumbent solutions generated by Algorithm 3 solving Problem 1 under Assumption 1. Then \((x^k)_{k \in \mathbb{N}}\) admits at least one accumulation point, and all such points are local solutions to Problem 1 .
Proof. This proof relies on [28], which states the following. Suppose that Assumption 1 holds, and define \(x^0 \in F\) and \(\mathcal{H}^0 \subseteq \mathbb{R}^n\). For all \(k \in \mathbb{N}\), set \(\mathcal{T}_{\tt{C}}^k \triangleq\{x^k+d,~ d \in \texttt{covering}(x^k,\mathcal{H}^k)\}\), select \(\mathcal{H}^{k+1}\) such that \(\mathcal{H}^k \cup \mathcal{T}_{\tt{C}}^k \subseteq \mathcal{H}^{k+1}\), and select \(x^{k+1} \in \mathop{\mathrm{argmax}}f(\Phi(\mathcal{H}^{k+1} \cap F))\). Then \((x^k)_{k \in \mathbb{N}}\) admits at least one accumulation point, and all of them are local solutions to Problem 1 .
Let \((x^{k+1}, \mathcal{H}^{k+1})_{k \in \mathbb{N}}\) be the sequence generated by Algorithm 3 under Assumption 1, and denote by \(K \subseteq \mathbb{N}\) the set of all iterations at which the covering step is executed. We show that \(K\) contains all sufficiently large integers. Indeed, \((x^{k+1}, \mathcal{H}^{k+1})_{k \in K}\) then satisfies the construction of [28] above, and \((x^k)_{k \in \mathbb{N}}\) and \((x^{k+1})_{k \in K}\) share the same set of accumulation points, so the result follows. By
Assumption 1, \(f(\Phi(F))\) is bounded above since \(F\) is compact and \(f \circ \Phi\) is continuous. Then, we get that \(\lim_{k \in \mathbb{N}} f(\Phi(x^k)) \leq \sup f(\Phi(F)) < +\infty\) since, by construction, \((f(\Phi(x^k)))_{k
\in \mathbb{N}}\) is increasing and \(x^k \in F\) for all \(k \in \mathbb{N}\). Moreover, by construction, \(\mathbb{N}\setminus K\) contains exactly
the iterations skipping the covering
step, so \(\mathbb{N}\setminus K = \{k \in \mathbb{N}: \rho(t_\texttt{atk}^k, x^k) = 1\) and \(x^{k+1} = t_\texttt{atk}^k\}\). Then,
each \(k \in \mathbb{N}\setminus K\) yields \(\rho(x^{k+1},x^k) = 1\), and then \(f(\Phi(x^{k+1})) \geq f(\Phi(x^k)) + \tau(\left\lvert{f(\Phi(x^k))}\right\rvert+\varepsilon) \geq f(\Phi(x^k)) + \tau\varepsilon\). We deduce that \(\mathbb{N}\setminus K\) contains at most \((\tau\varepsilon)^{-1}(\sup f(\Phi(F))-f(\Phi(x^0)))\) elements, which is finite, so \(K\) contains all sufficiently large integers, as desired. ◻
In this section, we evaluate the numerical performance and behaviour of Algorithm 3 on three problems with diverse structures. On each problem, we conduct the next three analyses.
General performance comparison. We evaluate the performance of Algorithm 3 against three baselines (detailed in Section 5.1): two from the
DFO
literature, and one consisting solely of repeated calls to the attack
operator. For each algorithm, we track the sequence of all evaluated points and we record the best objective value found as a function of the evaluation
budget.
Contribution of each step. We analyze the respective roles of the attack
and cdsm
steps within Algorithm 3. At each iteration, we record which step contributes to improving
the current incumbent solution. Precisely, we track whether the attack
step leads to a sufficient increase skipping the cdsm
steps, a simple increase, or a failure. In the latter two cases, we moreover track which
cdsm
step succeeds, if any. For comparison, we also conduct the same analysis for the baseline methods.
Alternative attack
operators. We evaluate the effectiveness of the two \(\texttt{attack}_\mathrm{S}\) operators proposed in Section 3.2.2, based on the Torchattacks [31] versions of
the fgsm
[48] and pgd
[49] algorithms respectively. To this end, we extract a sample \((x^{k(j)})_{j \in \mathbb{J}}\) (with \(\mathbb{J}\subset \mathbb{N}\)) of incumbent
solutions from the sequence \((x^k)_{k \in \mathbb{N}}\) generated by Algorithm 3. This sample spans a range of objective values for Problem 1 , enabling
evaluation across different optimization stages. For each \(j \in \mathbb{J}\) and several radii \(r \in \mathbb{R}_+\), we test whether \(\texttt{attack}_\mathrm{S}(x^{k(j)}, r)\) yields an ascent direction. We also track the runtime and the associated values \(f(\Phi(x^{k(j)}))\) to relate ascent potential with point quality.
The full numerical setup is presented in Section 5.1. Section 5.2 explores a proof-of-concept task to align the prediction returned by
the ResNet18
network with a fixed target. Section 5.3 addresses a counterfactual explanation task involving a NN
that generates Warcraft maps, as proposed in [27]. Finally, Section 5.4 tackles a chemical engineering optimization problem of maximizing
the production of a biodiesel, using a Physics-Informed NN
(PINN
) from [55] which models a
chemical reaction based on reaction time and the power of a heat flux.
Overall, our experiments show that Algorithm 3 outperforms the three baselines. It also appears that the attack
step contributes to the performance of Algorithm 3 in
its early iterations (when the incumbent solution remains far from the solution), since it often leads to a sufficient increase that allows skipping the cdsm
steps. However, the performance of the attack
step decreases in later
iterations, so the cdsm
steps are performed more often at the end of the optimization process. Finally, the two variants of the \(\texttt{attack}_\mathrm{S}\) operator have similar potential. Both are efficient
at low-quality points, and both exhibit a drop in performance as solutions converge. However, the fgsm algorithm is noticeably faster than the pgd
algorithm. This suggests implementing the \(\texttt{attack}_\mathrm{S}\) operator via fgsm
rather than pgd
, since fgsm
is faster than pgd
while its lower accuracy has little impact on the performance.
As discussed in Section 3.2.2, our experiments solve the relaxed Problem 6 rather than Problem 1 .
This relaxation enables the use of existing solvers for NN
attacks. The methods compared in our experiments are described below. They all adopt the relaxed problem, to ensure a fair comparison.
NN attacks only (\(\mathbb{M}_{\texttt{atk}}\)). This method uses the \(\texttt{attack}_\mathrm{S}\) operator from Definition 3,
instantiated with the Torchattacks [31] implementation of the fgsm
algorithm [48] under the \(\lVert{\cdot}\rVert_\infty\) norm and the \(\mathcal{L}_{\mathrm{CE}}\) loss function. We
conducted preliminary experiments with a variant relying on the pgd
algorithm [49] instead, but we observed that this stronger attack
yields similar results, so we favour fgsm
since it is faster. The algorithm reads as follows, given \(x^0 \in \mathbb{R}^n\) and \(r^0 \in \mathbb{R}_+^*\) and some fine-tuned
expansion and shrinking radius parameters: \[\forall k \in \mathbb{N}, \quad \left\{\begin{array}{lll} \mathcal{T}_{\tt{A}}^k & \triangleq& \{x^k+d,~ d \in \texttt{attack}_\mathrm{S}(x^k, r^k) \cup
\texttt{attack}_\mathrm{S}(x^k, \frac{11}{10}r^k)\}, \\ t_{\tt{A}}^k & \triangleq& \mathop{\mathrm{argmax}}\widetilde{f}(\widetilde{\Phi}(\mathcal{T}_{\tt{A}}^k)), \\ x^{k+1} & \triangleq&
\mathop{\mathrm{argmax}}\{\widetilde{f}(\widetilde{\Phi}(t_{\tt{A}}^k)), \widetilde{f}(\widetilde{\Phi}(x^k))\}, \\ r^{k+1} & \triangleq& \frac{11}{10}r^k ~\text{if}~ t_{\tt{A}}^k \neq x^k,~ \frac{2}{3}r^k ~\text{otherwise}.
\end{array}\right.\]
Random line search (\(\mathbb{M}_{\texttt{rls}}\)). This baseline follows the random line search (rls) method [56], known for its consistent empirical performance in
high-dimensional DFO
settings despite its theoretical guarantees weaker than those of cdsm
. We implement the method as follows, given \((x^0,r^0) \in \mathbb{R}^n \times \mathbb{R}_+^*\) and some
fine-tuned numerical values: \[\forall k \in \mathbb{N}, \quad \left\{\begin{array}{lll} d_{\tt{L}}^k & \triangleq& \text{random (uniform) draw on the unit sphere of}~ \mathbb{R}^n, \\ \mathcal{T}_{\tt{L}}^k &
\triangleq& \{x^k+rd_{\tt{L}}^k,~ r \in \{\frac{13}{10}r^k, r^k, \frac{10}{13}r^k\} \}, \\ t_{\tt{L}}^k & \triangleq& \mathop{\mathrm{argmax}}\widetilde{f}(\widetilde{\Phi}(\mathcal{T}_{\tt{L}}^k)), \\ x^{k+1} & \triangleq&
\mathop{\mathrm{argmax}}\{\widetilde{f}(\widetilde{\Phi}(t_{\tt{L}}^k)), \widetilde{f}(\widetilde{\Phi}(x^k))\}, \\ r^{k+1} & \triangleq& \left\lVert{t_{\tt{L}}^k-x^k}\right\rVert ~\text{if}~ t_{\tt{L}}^k \neq x^k,~ \frac{2}{3}r^k
~\text{otherwise}. \end{array}\right.\]
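One iteration of this specification can be sketched as follows, assuming a numpy objective `obj` standing in for \(\widetilde{f} \circ \widetilde{\Phi}\).

```python
import numpy as np

def rls_step(x, r, obj, rng):
    """One random-line-search iteration as specified above: draw a
    uniform unit direction, test the step lengths 13r/10, r, and
    10r/13, move to the best trial if it improves on x, and set the
    radius to the accepted step length (or shrink by 2/3 on failure)."""
    d = rng.standard_normal(x.size)
    d /= np.linalg.norm(d)  # uniform direction on the unit sphere
    trials = [x + s * d for s in (1.3 * r, r, r / 1.3)]
    t = max(trials, key=obj)
    if obj(t) > obj(x):
        return t, np.linalg.norm(t - x)
    return x, (2.0 / 3.0) * r
```

Whether a given draw improves the incumbent depends on the direction, but the update never decreases the objective and keeps the radius positive.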
cdsm baseline (\(\mathbb{M}_{\texttt{cdsm}}\)). This method runs Algorithm 2 on Problem 6 . For each \(k \in \mathbb{N}\), the covering step is simplified so that it evaluates a random point chosen uniformly in the ball \(\mathcal{B}_1(x^k)\), the poll
step follows the scheme from [53], and
the search
step evaluates one point in a random (uniform) direction with length \(r^0\sqrt{s^k}\), where \(s^k\) is the number of search
steps executed up to
iteration \(k\).
Hybrid method (\(\mathbb{M}_{\texttt{hyb}}\)). This method runs Algorithm 3 on Problem 6 . We define the attack
step via the attack
operator from Definition 3 and the cdsm
steps as in the \(\mathbb{M}_{\texttt{cdsm}}\) method. The
sufficient increase function for the attack
step is defined via \((\tau,\varepsilon) \triangleq(10^{-3},10^{-10})\).
Each algorithm terminates when all radii become smaller than \(10^{-5}\). We implement our methods, available at this GitHub repository, in Python \(3.12.2\) and run them on a single-thread Intel Xeon Gold 6258R CPU clocked at \(2.70\) GHz. The entire experimental process (each experiment run sequentially, with each method run sequentially within each experiment) takes seven hours for Section 5.2, two hours for Section 5.3, and ten minutes for Section 5.4. However, it is possible to parallelize most of these computations, as discussed in the repository.
This task builds upon the ResNet18
classifier discussed in Section 3.1.1 and on the barycentric image function \(\texttt{bary}: \mathbb{I}^n \times \mathbb{R}^n \to \mathbb{I}\) defined by \(\texttt{bary}(I;w) \triangleq\sum_{\ell=1}^{n} w_\ell I_\ell\) for all sets \(I
\triangleq(I_\ell)_{\ell=1}^{n} \in \mathbb{I}^n\) of \(n\) images and all vectors \(w \in \mathbb{R}^n\) of weights. We fix \(n \triangleq 100\) and
we select \(I \in \mathbb{I}^n\) such that the input images are mutually dissimilar. Then, we define \[\Phi:\left\{\begin{array}{ccl} \mathbb{R}^n & \to & \mathbb{P}^{1000}\\ x &
\mapsto & \texttt{ResNet18}(\texttt{bary}(I;\texttt{softmax}(x))) \end{array}\right. \quad \text{and} \quad f:\left\{\begin{array}{ccl} \mathbb{P}^{1000} & \to & \mathbb{R}\\ y & \mapsto & -\left\lVert{y-\texttt{ResNet18}(I_1)}\right\rVert
\end{array}\right.\] and \(c: x \mapsto \begin{bmatrix} (x_i-10)(x_i+10) \end{bmatrix}_{i=1}^{n}\), defining the feasible set \(F \triangleq[-10, 10]^n\). The solution is the
vector \(x^* \in \mathbb{R}^n\) with \(x^*_1 \triangleq 10\) and \(x^*_i \triangleq-10\) for all \(i \neq 1\), and we
initialize \(x^0 \triangleq 0\). Although synthetic, this setup enables an assessment of algorithmic behaviour in a simple and controlled setting.
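The barycentric composition at the core of this task can be sketched with numpy (the `ResNet18` classifier is omitted; images are placeholder arrays):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax, mapping R^n to the simplex."""
    z = np.exp(x - np.max(x))
    return z / z.sum()

def bary(I, w):
    """Barycentric image: the weighted sum sum_l w_l * I_l of the n
    stacked images I (shape (n, ...)) with weight vector w."""
    return np.tensordot(w, I, axes=1)
```

At the solution \(x^*\), the softmax places almost all mass on the first coordinate, so \(\texttt{bary}(I;\texttt{softmax}(x^*)) \approx I_1\) and the objective approaches its maximum of \(0\).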
Let us begin by analyzing the results of the general performance comparison on this problem. Figure 4 shows that among the three baselines, only the \(\mathbb{M}_{\texttt{cdsm}}\) method converges to a
near-optimal solution. The \(\mathbb{M}_{\texttt{rls}}\) method evaluates numerous points with little progress, as confirmed by Figure 5 showing that few of its \(1000\) iterations are successful. This is likely due to a narrow cone of ascent directions near the initial point, a known difficulty in DFO
. The \(\mathbb{M}_{\texttt{atk}}\)
method also underperforms, halting early on a suboptimal point with most iterations providing only marginal improvements. In contrast, the \(\mathbb{M}_{\texttt{cdsm}}\) method succeeds in closely approximating the optimum,
albeit with numerous trial points. The \(\mathbb{M}_{\texttt{hyb}}\) method outperforms all the baselines, as it requires significantly fewer points to evaluate to reach the solution.
We now turn to the contribution of each step, depicted in Figure 5. The \(\mathbb{M}_{\texttt{cdsm}}\) method mostly progresses with the poll
step, while the covering
and search
steps have a limited contribution to the process. This domination of the poll
step among the cdsm
components carries over to the \(\mathbb{M}_{\texttt{hyb}}\) method.
However, most of the performance of the \(\mathbb{M}_{\texttt{hyb}}\) method results from the attack
step, which succeeds roughly half of the time and often yields a sufficient increase skipping the
cdsm
steps. This highlights the strength of combining attack techniques with a local search. We believe that the poll
step assists the attack
step by repositioning the point at which the attack is performed in case of
failure. For example, the neighbourhood of \(x^0\) appears challenging for the attack
step (as shown by the stagnation of the \(\mathbb{M}_{\texttt{atk}}\) method in Figure 4), but the
poll
step identifies new incumbent solutions at which the attack
step performs better.
Finally, the analysis of alternative attack operators confirms the reliability of the attack
operator in identifying ascent directions throughout the optimization process. Figure 6 shows that fgsm
and pgd
attacks are similarly effective, so we favour fgsm
since it is six times faster on average.
Our second problem, adapted from [27], focuses on optimal counterfactual explanations for structured prediction models involving an NN that provides the parameters of a combinatorial optimization layer. We adapt the flagship application discussed in [27], which
searches for \(\varepsilon\)-relative counterfactual Warcraft maps with respect to the lightest path problem, in the sense defined in Section 1.
The video game Warcraft is a tactical board game where some areas of the maps (represented as images in the space \(\mathbb{W}\triangleq[0,1]^{3 \times 96 \times 96}\) endowed with some norm \(\lVert{\cdot}\rVert_\mathbb{W}\)) are lighter to cross than others. To reflect this, [27] designs a
NN
\(\texttt{costmap}: \mathbb{W}\to \mathbb{C}\triangleq\mathbb{R}_+^{12 \times 12}\) that splits any map \(\mathcal{W}\in \mathbb{W}\) into \(12
\times 12\) areas of \(8 \times 8\) pixels and estimates the cost for a character to enter all areas. We endow the space \(\mathbb{C}\) with the sum-of-entries norm \(\lVert{\cdot}\rVert_\mathbb{C}\) and the entry-wise inner product \(\odot\). Thus, for any Warcraft map \(\mathcal{W}\in \mathbb{W}\) and any path \(\mathrm{p}\in \{0,1\}^{12 \times 12}\) crossing the map, the cost of \(\mathrm{p}\) along \(\mathcal{W}\) reads as \[\texttt{costpath}(\mathcal{W};\mathrm{p}) \triangleq\left\lVert{\texttt{costmap}(\mathcal{W}) \odot \mathrm{p}}\right\rVert_\mathbb{C}.\] We consider a Warcraft map \(\mathcal{W}_\mathrm{ini}\in
\mathbb{W}\) and its cost map \(\mathcal{C}_\mathrm{ini}\triangleq\texttt{costmap}(\mathcal{W}_\mathrm{ini}) \in \mathbb{C}\), and we compute the lightest path \(\mathrm{p}^*_\mathrm{ini}\in
\{0,1\}^{12 \times 12}\) joining the Northwestern corner of \(\mathcal{W}_\mathrm{ini}\) to the Southeastern one. Then we pick an alternative path \(\mathrm{p}^\sharp \neq
\mathrm{p}^*_\mathrm{ini}\) joining the same corners. Our goal is to produce an alternative map that remains close to \(\mathcal{W}_\mathrm{ini}\) and favours this alternative path over the original one. More
specifically, we seek an \(\varepsilon\)-relative counterfactual explanation of \(\mathcal{W}_\mathrm{ini}\) with respect to the function \(\texttt{costpath}\) and the path \(\mathrm{p}^\sharp\), where we fix \(\varepsilon \triangleq 1\). We therefore search for a map \(\mathcal{W}_\mathrm{cfa}\in \mathbb{W}\) that solves the problem \[\underset{\begingroup\scriptstyle\renewcommand{\arraystretch}{1.0}\begin{array}{c}\mathcal{W}\in
\mathbb{W}\end{array}\endgroup}{\mathop{\mathrm{minimize}}} \quad \left\lVert{\mathcal{W}-\mathcal{W}_\mathrm{ini}}\right\rVert_\mathbb{W}^2 \quad \text{subject to} \quad (1+\varepsilon)\texttt{costpath}(\mathcal{W};\mathrm{p}^\sharp) \leq
\texttt{costpath}(\mathcal{W};\mathrm{p}^*_\mathrm{ini}).\] In other words, the goal is to find an alternative map \(\mathcal{W}_\mathrm{cfa}\in \mathbb{W}\) as close as possible to \(\mathcal{W}_\mathrm{ini}\), but on which the path \(\mathrm{p}^\sharp\) is at least twice as light as the path \(\mathrm{p}^*_\mathrm{ini}\). We illustrate this
problem in Figure 7.
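To illustrate the combinatorial layer, here is a small self-contained sketch (not the authors' code) that computes \(\texttt{costpath}\) as the sum-of-entries norm of \(\texttt{costmap}(\mathcal{W}) \odot \mathrm{p}\), and recovers the lightest corner-to-corner path by Dijkstra's algorithm on a toy \(3 \times 3\) cost map; the 4-neighbour connectivity is an assumption made for this sketch.

```python
import heapq
import numpy as np

def costpath(cost_map, path_mask):
    """Cost of a path: sum of the entries of the cost map selected by
    the binary path mask (the sum-of-entries norm of the product)."""
    return float(np.sum(cost_map * path_mask))

def lightest_path(cost_map):
    """Dijkstra on the grid: cheapest 4-connected path from the
    north-west corner to the south-east corner, paying the cost of
    every entered cell. Returns the path as a binary mask."""
    n, m = cost_map.shape
    dist = np.full((n, m), np.inf)
    prev = {}
    dist[0, 0] = cost_map[0, 0]
    heap = [(dist[0, 0], (0, 0))]
    while heap:
        d, (i, j) = heapq.heappop(heap)
        if d > dist[i, j]:
            continue
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            a, b = i + di, j + dj
            if 0 <= a < n and 0 <= b < m and d + cost_map[a, b] < dist[a, b]:
                dist[a, b] = d + cost_map[a, b]
                prev[(a, b)] = (i, j)
                heapq.heappush(heap, (dist[a, b], (a, b)))
    mask = np.zeros((n, m))
    node = (n - 1, m - 1)
    while True:
        mask[node] = 1.0
        if node == (0, 0):
            break
        node = prev[node]
    return mask

cmap = np.array([[1.0, 9.0, 9.0],
                 [1.0, 1.0, 9.0],
                 [9.0, 1.0, 1.0]])
p_star = lightest_path(cmap)
print(costpath(cmap, p_star))  # total cost of the lightest path
```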
However, the above problem has no constraints guaranteeing that the image encoded by \(\mathcal{W}_\mathrm{cfa}\) is visually similar to \(\mathcal{W}_\mathrm{ini}\), or more generally to
a real map from the game. To enforce this requirement, we rely on another NN from [27], which we denote by \(\texttt{warcraft}: \mathbb{X}\to \mathbb{W}\) (where \(\mathbb{X}\triangleq\mathbb{R}^n\) with \(n \triangleq 64\)). This NN
is designed to generate
credible Warcraft maps from abstract input vectors, where credible means that the images generated by \(\texttt{warcraft}\) may not represent maps that truly exist in-game, but look like real ones. By design, all \(x \in \mathbb{R}^n\) with \(\lvert{\lVert{x}\rVert_\mathbb{X}-\sqrt{n}}\rvert \leq 1\) yield a credible \(\texttt{warcraft}(x)\). We assume that we know
the vector \(x_\mathrm{ini}\in \mathbb{R}^n\) that generates \(\mathcal{W}_\mathrm{ini}\) as \(\mathcal{W}_\mathrm{ini}\triangleq\texttt{warcraft}(x_\mathrm{ini})\). By design, any \(x \approx x_\mathrm{ini}\) generates \(\mathcal{W}\triangleq\texttt{warcraft}(x)
\approx \mathcal{W}_\mathrm{ini}\), so \(\lVert{x-x_\mathrm{ini}}\rVert_\mathbb{X}\) acts as a surrogate of \(\lVert{\mathcal{W}-\mathcal{W}_\mathrm{ini}}\rVert_\mathbb{W}\).
Moreover, to enforce even further proximity between maps, we consider \(\lVert{\texttt{costmap}(\mathcal{W})-\mathcal{C}_\mathrm{ini}}\rVert_\mathbb{C}\) as another surrogate of \(\lVert{\mathcal{W}-\mathcal{W}_\mathrm{ini}}\rVert_\mathbb{W}\). Then, instead of searching for \(\mathcal{W}_\mathrm{cfa}\in \mathbb{W}\) directly, we seek a counterfactual \(x_\mathrm{cfa}\in \mathbb{X}\) and then obtain \(\mathcal{W}_\mathrm{cfa}\triangleq\texttt{warcraft}(x_\mathrm{cfa})\). We also use the surrogate norms instead of the true norm. We therefore
solve \[\begin{array}{cl} \underset{\begingroup\scriptstyle\renewcommand{\arraystretch}{1.0}\begin{array}{c}x \in \mathbb{X}\end{array}\endgroup}{\mathop{\mathrm{minimize}}} &
\begin{array}{l}\left\lVert{x-x_\mathrm{ini}}\right\rVert_\mathbb{X}^2 + \left\lVert{\texttt{costmap}(\texttt{warcraft}(x))-\mathcal{C}_\mathrm{ini}}\right\rVert_\mathbb{C}^2\end{array} \\ \\ \text{subject to} &
\begin{array}[t]{ll}\left\lVert{x}\right\rVert_\mathbb{X}\in \left[\sqrt{n}-1 , \sqrt{n}+1\right], \\ (1+\varepsilon)\texttt{costpath}(\texttt{warcraft}(x);\mathrm{p}^\sharp) \leq \texttt{costpath}(\texttt{warcraft}(x);\mathrm{p}^*_\mathrm{ini}).
\end{array} \end{array}\] This reformulation is easier to solve than the original problem, since \(\mathbb{X}\) is a usual vector space instead of a space of images, and its dimension \(n = 64\) is much smaller than that of \(\mathbb{W}\) (\(3 \times 96 \times 96 = 27648\)). Moreover, for any \(x \in \mathbb{X}\), we do not need to record \(\texttt{warcraft}(x)\): both the objective and the constraint involving the \(\texttt{costpath}\) function can be expressed in terms of \(\texttt{costmap}(\texttt{warcraft}(x))\) directly.
We fit this problem in the framework of Problem 1 as follows. First, the NN
\(\Phi\) is \[\Phi:\left\{\begin{array}{ccl} \mathbb{X} & \to &
\mathbb{Y}\triangleq\mathbb{X}\times \mathbb{C}\\ x & \mapsto & \left(x, \texttt{costmap}\left(\texttt{warcraft}\left(x\right)\right) \right) \end{array}\right.\] and the goal function \(f\) is given by \[f:\left\{\begin{array}{ccl} \mathbb{Y} & \to & \mathbb{R}\\ y \triangleq(x,c) & \mapsto & -\left( \left\lVert{x-x_\mathrm{ini}}\right\rVert_\mathbb{X}^2 +
\left\lVert{c-\mathcal{C}_\mathrm{ini}}\right\rVert_\mathbb{C}^2\right). \end{array}\right.\]
Next, we define the constraints function by \[c:\left\{\begin{array}{ccl} \mathbb{Y} & \to & \mathbb{R}^2\\ y \triangleq(x,c) & \mapsto & \begin{bmatrix} \left(\left\lVert{x}\right\rVert_\mathbb{X}-\sqrt{n}-1\right)\left(\left\lVert{x}\right\rVert_\mathbb{X}-\sqrt{n}+1\right) \\[0.25em] (1+\varepsilon)\left\lVert{c \odot \mathrm{p}^\sharp}\right\rVert_\mathbb{C}- \left\lVert{c \odot \mathrm{p}^*_\mathrm{ini}}\right\rVert_\mathbb{C} \end{bmatrix}, \end{array}\right.\] so that all \(x \in \mathbb{X}\) with \(c(\Phi(x)) \leq 0\) yield a credible and valid counterfactual explanation \(\texttt{warcraft}(x)\), since they satisfy \(\lvert{\lVert{x}\rVert_\mathbb{X}-\sqrt{n}}\rvert \leq 1\) and \((1+\varepsilon)\texttt{costpath}(\texttt{warcraft}(x),\mathrm{p}^\sharp) \leq \texttt{costpath}(\texttt{warcraft}(x),\mathrm{p}^*_\mathrm{ini})\). We initialize all methods with \(x^0 \triangleq x_\mathrm{ini}\), and the counterfactual maps calculated by each method are shown in Figure 8. The returned maps are all visually close. This suggests that the late-stage refinements in this problem consist of fine-tuning. For example, all generated maps exhibit a gorge through the mountain to lighten the path \(\mathrm{p}^\sharp\), though the precise placement of this gorge varies across solutions.
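The definitions above can be sketched as follows; the latent vector \(x_\mathrm{ini}\), cost map \(\mathcal{C}_\mathrm{ini}\), and the two paths below are random placeholders for illustration, not data from [27].

```python
import numpy as np

n, eps = 64, 1.0
rng = np.random.default_rng(1)

# hypothetical placeholders for the fixed problem data
x_ini = rng.normal(size=n)
C_ini = rng.uniform(1.0, 5.0, size=(12, 12))       # costmap(W_ini)
p_sharp = np.zeros((12, 12)); p_sharp[0, :] = 1.0  # alternative path
p_star = np.zeros((12, 12)); p_star[:, 0] = 1.0    # original lightest path

def f(y):
    """Goal function (maximized): negated sum of the squared distances
    to the initial latent vector and the initial cost map."""
    x, c = y
    return -(np.sum((x - x_ini)**2) + np.sum((c - C_ini)**2))

def c_fun(y):
    """Constraint vector with c_fun(y) <= 0: the credibility band
    | ||x|| - sqrt(n) | <= 1 (written in product form) and the
    counterfactual condition (1+eps) costpath(p_sharp) <= costpath(p_star)."""
    x, c = y
    nx = np.linalg.norm(x)
    band = (nx - np.sqrt(n) - 1.0) * (nx - np.sqrt(n) + 1.0)
    cfa = (1.0 + eps) * np.sum(c * p_sharp) - np.sum(c * p_star)
    return np.array([band, cfa])

y0 = (x_ini, C_ini)
print(f(y0))     # zero at the initial point
print(c_fun(y0))
```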
The results displayed in Figure 9 show that our hybrid method \(\mathbb{M}_{\texttt{hyb}}\) delivers the best overall performance. The DFO
methods \(\mathbb{M}_{\texttt{cdsm}}\) and \(\mathbb{M}_{\texttt{rls}}\) are slow, but this is expected since most DFO
methods are not tailored to settings beyond a few dozen variables [9] unless dedicated advanced techniques are used. The \(\mathbb{M}_{\texttt{atk}}\) method quickly approaches a high-quality
solution but fails to refine it further. Its progress curve in Figure 9 (barely visible in the top-left corner of the plot) quickly plateaus. Our hybrid method \(\mathbb{M}_{\texttt{hyb}}\) acts as a trade-off between all these methods, as it is only slightly slower than \(\mathbb{M}_{\texttt{atk}}\) but converges to a solution with quality similar to
that of \(\mathbb{M}_{\texttt{cdsm}}\) and \(\mathbb{M}_{\texttt{rls}}\).
The behaviour illustrated in Figure 10 shows that the \(\mathbb{M}_{\texttt{atk}}\) method stops early, with about half of its iterations ending in a successful
attack
. Since the map returned by this method is visually close to those produced by the other methods, this confirms the effectiveness of the attack
operator for early-stage progress, but also its limitations for late-stage
refinement. The \(\mathbb{M}_{\texttt{hyb}}\) method exhibits a similar pattern. The attack
step succeeds in roughly half of the first \(350\) iterations (oftentimes yielding
sufficient increases in the first \(100\) iterations). Beyond this point, however, the success rate drops sharply, and subsequent improvements are almost entirely driven by the cdsm steps.
Figure 11 corroborates these observations. It shows that the attack operator reliably identifies ascent directions when the current solution is far from optimal, while its effectiveness
drops as the objective approaches optimality. The fgsm
and pgd
variants achieve comparable success rates across the sample, yet fgsm
is substantially faster, confirming it as our preferred implementation.
This problem is constructed from the chemical engineering study of bio-diesel production from [55]. The chemical reaction they study involves five chemical species: the desired methyl ester (\(\mathrm{ME}\)) species, and four intermediate reactants: triglycerides (\(\mathrm{TG}\)), diglycerides (\(\mathrm{DG}\)), monoglycerides (\(\mathrm{MG}\)), and glycerol (\(\mathrm{G}\)). For each species, we denote by \([\cdot](t,Q)\) its concentration (in \(\mathrm{mol}.\mathrm{L}^{-1}\)) in the reactor tank after a reaction of duration \(t\) (in seconds) under constant heat input with power \(Q\) (in Watts). We also denote by \(T(t,Q)\) the reactor temperature (in Celsius degrees) under these conditions.
Our goal is to determine the pair \((t,Q) \in \mathbb{R}_+^2\) that maximizes the production of \(\mathrm{ME}\), measured by the average product-to-reactant conversion ratio over the reaction horizon, which quantifies the efficiency of converting the reactants into biodiesel. To ensure physical plausibility, we impose two constraints. First, we prevent evaporation of \(\mathrm{ME}\) (with boiling point at \(65^\circ\mathrm{C}\)) by requiring \(T(s,Q) \leq 65\) for all \(s \in [0,t]\). Second, we enforce a global energy budget, requiring that the total energy input \(Qt\) (in Joules) does not exceed \(500\). The resulting optimization problem reads as \[\begin{array}{cl} \underset{\begingroup\scriptstyle\renewcommand{\arraystretch}{1.0}\begin{array}{c}(t,Q) \in \mathbb{R}_+^2\end{array}\endgroup}{\mathop{\mathrm{maximize}}} & \begin{array}{l}\dfrac{1}{t}\displaystyle\int_{0}^{t} \dfrac{[\mathrm{ME}](s,Q)}{[\mathrm{TG}](s,Q) + [\mathrm{DG}](s,Q) + [\mathrm{MG}](s,Q) + [\mathrm{G}](s,Q)} ds\end{array} \\ \\ \text{subject to} & \begin{array}[t]{ll}T(s,Q) \leq 65,~ \forall s \in [0,t], \\ Qt \leq 500.\end{array} \end{array}\]
We solve this problem by a simulation-based optimization approach. To simulate the chemical process, we rely on the PINN
trained in [55], available at this GitHub repository. As discussed in Section 1, this setting corresponds to a trend in
simulation-based optimization where the chosen numerical model is a trained NN
instead of a conventional numerical simulator. The PINN
predicts, for any given reaction time \(t\) and heating power \(Q\), the concentrations of the five species involved in the reaction and the reactor temperature at the end of the process. Formally, the model reads as the function \[\texttt{PINN}:\left\{\begin{array}{ccl} \mathbb{R}_+^2 & \to & \mathbb{R}^6\\ (t,Q) & \mapsto & \begin{bmatrix} [\mathrm{TG}](t,Q) ~~ [\mathrm{DG}](t,Q) ~~ [\mathrm{MG}](t,Q) ~~ [\mathrm{G}](t,Q) ~~ [\mathrm{ME}](t,Q)
~~ T(t,Q) \end{bmatrix}. \end{array}\right.\] Since the PINN
only provides predictions at the queried time \(t\), we estimate intermediate values over \([0,t]\) by
discretizing time into \(N\) steps (fixed at \(N=100\)) and evaluating \(\texttt{PINN}(\tfrac{it}{N},Q)\) for all \(i \in
\llbracket 0,N\rrbracket\). We thus define \[\Psi:\left\{\begin{array}{ccl} \mathbb{R}_+^2 & \to & \mathbb{R}^{6(N+1)}\\ (t,Q) & \mapsto & \begin{bmatrix} \texttt{PINN}\!\left(\tfrac{it}{N}, Q\right)
\end{bmatrix}_{i=0}^{N}. \end{array}\right.\] To ensure physically meaningful predictions, additional constraints are imposed. First, the input domain is bounded by \(0 \leq t \leq 120\) and \(0 \leq Q \leq 12\), following [55], to remain near the training regime of the model and to reflect practical operating limits. Second, all concentrations are required to be nonnegative. While this holds physically, the constraint prevents occasional violations due to modelling imperfections in the PINN.
Then, the problem is expressed as an instance of Problem 1 by defining the input space \(\mathbb{X}\triangleq\mathbb{R}_+^2\), the predictions space \(\mathbb{O}\triangleq\mathbb{R}^{6(N+1)}\), and the neural mapping \[\Phi:\left\{\begin{array}{ccl} \mathbb{X} & \to & \mathbb{O}\times \mathbb{X}\\ x & \mapsto & (\Psi(x), x).
\end{array}\right.\] Then, considering that elements \(z \in \mathbb{O}\) have their components indexed starting from \(0\), we define the objective function \(f\) as \[f:\left\{\begin{array}{ccl} \mathbb{O}\times \mathbb{X} & \to & \mathbb{R}\\ (z,x) & \mapsto & \dfrac{1}{N+1} \sum\limits_{i=0}^{N} \dfrac{ z_{6i+4} }{ z_{6i} + z_{6i+1} +
z_{6i+2} + z_{6i+3} }, \end{array}\right.\] so that by design, for all \(x \triangleq(t,Q) \in \mathbb{X}\), \[f(\Phi(x)) = \dfrac{1}{N+1} \sum\limits_{i=0}^{N} \dfrac{
[\mathrm{ME}]\left(\frac{it}{N},Q\right) }{ [\mathrm{TG}]\left(\frac{it}{N},Q\right) + [\mathrm{DG}]\left(\frac{it}{N},Q\right) + [\mathrm{MG}]\left(\frac{it}{N},Q\right) + [\mathrm{G}]\left(\frac{it}{N},Q\right) }\] is a discretization of the
product-to-reactant conversion ratio in the objective of the initial problem. We define our constraint function by \[c:\left\{\begin{array}{ccl} \mathbb{O}\times \mathbb{X} & \to & \mathbb{R}^{6(N+1)} \times
\mathbb{R}^5\\ (z, x) & \mapsto & [c_\mathrm{outputs}(z),~ c_\mathrm{inputs}(x)], \end{array}\right.\] where the constraints function \(c_\mathrm{outputs}\) and \(c_\mathrm{inputs}\) enclose respectively output-based and input-based components. These two functions are given by \[c_\mathrm{outputs}:\left\{\begin{array}{ccl} \mathbb{O} & \to &
\mathbb{R}^{6(N+1)}\\ z & \mapsto & \begin{bmatrix} -z_{6i} \\ -z_{6i+1} \\ -z_{6i+2} \\ -z_{6i+3} \\ -z_{6i+4} \\ z_{6i+5}-65 \end{bmatrix}_{i=0}^{N} \end{array}\right. \quad \text{and} \quad c_\mathrm{inputs}:\left\{\begin{array}{ccl} \mathbb{X}
& \to & \mathbb{R}^5\\ x & \mapsto & \begin{bmatrix} -t \\ t-120 \\ -Q \\ Q-12 \\ Qt-500 \end{bmatrix}, \end{array}\right.\] so that for all \(x \triangleq(t,Q) \in \mathbb{X}\) and denoting by \(z \triangleq\Psi(x)\), the inequality \(c(\Phi(x)) \leq 0\) is equivalent to \(c_\mathrm{outputs}(z) \leq 0\) (enforcing nonnegativity of all concentrations and
admissible reactor temperatures) and \(c_\mathrm{inputs}(x) \leq 0\) (ensuring bounds and the energy budget). Note that this problem naturally falls within the framework of active subspaces (Remark 6), since for all \(y \triangleq(z,x) \in \mathbb{O}\times \mathbb{X}\), the value of \(f(y)\) is independent of \(x\) and of the temperature components \((z_{6i+5})_{i=0}^{N}\). Accordingly, we adapt the attack operator to account for this reduced subspace.
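A compact sketch of this instance follows; the pinn function is a smooth hypothetical stand-in for the trained PINN of [55], so the returned values are illustrative only.

```python
import numpy as np

N = 100

def pinn(t, Q):
    """Hypothetical stand-in for the trained PINN of [55]: returns the
    five concentrations [TG, DG, MG, G, ME] and the temperature T at
    time t under heating power Q (smooth, physically flavoured dummy)."""
    k = 0.02 * (1.0 + 0.1 * Q)
    tg = np.exp(-k * t)
    me = 1.0 - tg
    dg = mg = g = 0.05 * tg + 0.01
    temp = 20.0 + 0.3 * Q * (1.0 - np.exp(-t / 30.0))
    return np.array([tg, dg, mg, g, me, temp])

def Psi(t, Q):
    """Time discretization: stack PINN predictions at i*t/N, i=0..N."""
    return np.concatenate([pinn(i * t / N, Q) for i in range(N + 1)])

def f(z):
    """Average product-to-reactant conversion ratio over the horizon."""
    ratios = [z[6*i + 4] / (z[6*i] + z[6*i+1] + z[6*i+2] + z[6*i+3])
              for i in range(N + 1)]
    return sum(ratios) / (N + 1)

def c_outputs(z):
    """Nonnegative concentrations and temperature at most 65 C."""
    rows = [np.array([-z[6*i], -z[6*i+1], -z[6*i+2], -z[6*i+3],
                      -z[6*i+4], z[6*i+5] - 65.0]) for i in range(N + 1)]
    return np.concatenate(rows)

def c_inputs(t, Q):
    """Bounds 0 <= t <= 120, 0 <= Q <= 12 and the budget Q*t <= 500."""
    return np.array([-t, t - 120.0, -Q, Q - 12.0, Q * t - 500.0])

t, Q = 60.0, 8.0
z = Psi(t, Q)
feasible = np.all(c_outputs(z) <= 0) and np.all(c_inputs(t, Q) <= 0)
print(f(z), feasible)
```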
The results shown in Figure 12 highlight that all methods succeed on this problem. Both DFO
baselines converge, with the \(\mathbb{M}_{\texttt{cdsm}}\) method
being more efficient than the \(\mathbb{M}_{\texttt{rls}}\) method. Interestingly, the \(\mathbb{M}_{\texttt{atk}}\) method also converges, suggesting that the attack
operator
is particularly effective in this setting, despite this method being heuristic and prone to early failure in general. Finally, the \(\mathbb{M}_{\texttt{hyb}}\) method is nearly as inexpensive as \(\mathbb{M}_{\texttt{atk}}\), yet ultimately attains the best objective value among all methods. Taken together, these results strongly suggest that the attack
step plays a central role in the good performance
of \(\mathbb{M}_{\texttt{hyb}}\) on this problem.
Figure 13 confirms this observation. The \(\mathbb{M}_{\texttt{atk}}\) method begins with several successes from the attack
step,
which yield fast early progress. Similarly, the early iterations of \(\mathbb{M}_{\texttt{hyb}}\) are dominated by sufficient increases from the attack
successes bypassing the cdsm
step. As the
optimization advances, however, the contribution of the attack
step diminishes and the cdsm
component of \(\mathbb{M}_{\texttt{hyb}}\) takes over, driving the late-stage fine-tuning observed in
Figure 12.
Figure 14 further illustrates the potential of the attack operator in this context: the operator reliably finds dominating directions throughout most of the optimization
process, up until the final convergence phase. Both fgsm
and pgd
algorithms perform well overall, though pgd
appears slightly more robust in later stages, identifying ascent directions even near the optimum. However,
these directions require very small radii and come at a higher computational cost, which reinforces the practicality of fgsm
as the preferred default implementation.
To conclude this paper, we discuss some strengths and limitations of our hybrid method in Section 6.1, and we outline promising avenues for future research in Section 6.2.
The theoretical analysis of Algorithm 3, given by Theorem 1, is limited to an asymptotic convergence.
It is difficult to conduct a non-asymptotic analysis valid for all instances of Problem 1 enclosed by the broad framework resulting from Assumption 1. In particular, such analyses are scarce in the DFO
literature. Nevertheless, a strength of Theorem 1 is that Algorithm 3 is guaranteed to identify a local solution to Problem 1, even in hard instances. Another positive trait of the covering
step is that Algorithm 3 scans a dense set of points in a neighbourhood around that solution, which eventually gives a precise understanding of the landscape of
Problem 1 near that solution.
Our hybrid method enjoys a twofold flexibility. The core components of Algorithm 3 (the attack
operator and the cdsm
routine) are largely independent. Each may be chosen depending on the problem at hand. This flexibility allows our hybrid method to be adapted to a wide range of contexts, but a downside is the burden of choosing the most suitable components. Let us discuss some guidelines on how to define each component of our hybrid
method depending on the context.
First, the attack
operator may be defined from several algorithms for NN
attacks. An ideal choice depends on whether \(\Phi\) is given as a white-box NN
. Indeed, if \(\Phi\) is a white-box NN
, we recommend using an algorithm for NN
attacks that exploits this explicit structure, such as those in the \((\alpha,\beta)\)-CROWN
solver. If, instead, \(\Phi\) is a nonwhite-box NN
, then we suggest using the fgsm
algorithm since it is fast and our numerical
experiments hint that more accurate algorithms such as pgd yield a negligible difference in the performance of the attack
operator. However, this observation should not hide the fact that the attack
operator sometimes
lacks reliability in late stages of the optimization process.
Second, in Algorithm 3, we choose to hybridize a generic attack
operator with the cdsm
from DFO
. Other DFO
algorithms could be substituted for cdsm
,
but unfortunately, to the best of our knowledge, no consensus currently exists in the DFO literature regarding the most appropriate choice for a given problem. The cdsm
is suited for cases where either \(\Phi\) is a nonwhite-box NN
or \(f\) or \(c\) are generic nonlinear functions, so that little structure of the problem can be exploited. In such contexts, where
it is usual to consider DFO
methods, Algorithm 3 outperforms the two state-of-the-art DFO
baselines we considered. However, in other contexts, we suggest hybridizing the attack step with other algorithms instead. A DFO method is likely not the most efficient approach when \(\Phi\) is a white-box NN and both \(f\) and \(c\) are simple enough. For example, when methods from the literature (see Section 2) on optimization through trained NN
are applicable (that is,
when \(\Phi\) is a ReLU
-based white-box NN
and \(f\) is linear and \(c\) yields a polyhedral feasible set), they presumably should
be preferred. Similarly, when \(f\) is nonlinear but simple enough (for example, quadratic) and \(c\) yields a polyhedral feasible set, then it is presumably more efficient to adapt these
methods than to consider a DFO
method.
Numerous avenues for research stem from our work, on both the theoretical and practical sides.
First, we may strengthen the attack
component itself, both in the choice of ascent directions for \(f\) and in the way attacks are computed. Beyond gradients, Remark 5 suggests using alternative ascent proxies such as simplex or finite-difference gradients when \(f\) lacks smoothness. Adapting our analysis to these settings is straightforward, and the literature on simplex gradients and simplex Hessians [42]–[44] offers principled constructions that
could yield more reliable attack directions. On the numerical side, our experiments indicate that speed often matters more than ultimate attack accuracy. This makes fgsm
an appealing default option, although this choice is likely
problem-dependent. In particular, fgsm
is likely not sufficient in cases where the structure of \(\Phi\) makes it difficult to compute successful attacks. More broadly, we may also develop nonwhite-box attack
solvers that natively handle input constraints and possess guarantees of success whenever an attack exists. Such constrained attacks would close the gap with Assumption 2 and remove the workaround from Section 3.2.2, and would also be broadly useful beyond our
context.
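As a sketch of such an ascent proxy (our own illustration, using a simple quadratic test function rather than a problem from the paper), a forward-difference gradient estimate can stand in for \(\nabla f\) in the attack step:

```python
import numpy as np

def fd_gradient(f, y, h=1e-6):
    """Forward-difference estimate of grad f at y; usable as an ascent
    proxy in the attack step when an exact gradient is unavailable."""
    g = np.empty_like(y)
    fy = f(y)
    for i in range(y.size):
        e = np.zeros_like(y)
        e[i] = h
        g[i] = (f(y + e) - fy) / h
    return g

f = lambda y: -np.sum((y - np.array([1.0, 2.0]))**2)
g = fd_gradient(f, np.zeros(2))
print(g)  # close to the exact gradient (2, 4)
```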
Second, we plan to broaden the scope of problems addressed, with a particular emphasis on regularity and constraints. The cdsm
is designed to cope with possible discontinuities [28], and the assumptions underlying its convergence are tight. This makes the extension of Algorithm 3 to discontinuous \(f\), \(c\), or \(\Phi\) natural, provided the attack
follows the ascent-proxy adaptations discussed above. On the constraints side, we may replace
strict feasibility or global relaxation with mature mechanisms from DFO
such as the progressive barrier [57]. This could improve efficiency by
allowing controlled, temporary infeasibility, which is common in successful DFO
practice.
Third, we expect benefits from enhancing the DFO
engine that surrounds the attack
in our hybrid method. Within cdsm
, precision and speed can be improved by established tools, such as local quadratic models [53], [54] to inform poll
directions, Bayesian strategies [58] to steer the search
, and variable neighbourhood rules [50], [51]. These improvements are independent of the attack component and can be integrated without altering convergence guarantees, offering routes to better
late-stage refinement.
Fourth, we see opportunities to leverage problem structure more explicitly. When \(\Phi\) is a white-box NN
, hybridizing the attack
with methods tailored to optimization through trained
NN
s (see Section 2) should be preferred to structure-agnostic DFO
methods. Moreover, the composite form invites alternative formulations, e.g., introducing an auxiliary
variable \(y\) with the constraint that \(y=\Phi(x)\) and optimizing \(f(y)\) under \(c(y) \leq 0\), or relaxing this
coupling via penalties. These viewpoints connect naturally with partitioned [59] and parametric optimization [60], and with optimization on manifolds [61].
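As a sketch of this penalized reformulation (our notation; \(\mu > 0\) is a penalty parameter and \(\lVert{\cdot}\rVert_\mathbb{Y}\) an assumed norm on \(\mathbb{Y}\)), one may consider \[\underset{\begingroup\scriptstyle\renewcommand{\arraystretch}{1.0}\begin{array}{c}x \in \mathbb{X},~ y \in \mathbb{Y}\end{array}\endgroup}{\mathop{\mathrm{maximize}}} \quad f(y) - \mu \left\lVert{y - \Phi(x)}\right\rVert_\mathbb{Y}^2 \quad \text{subject to} \quad c(y) \leq 0,\] which heuristically recovers the exact coupling \(y = \Phi(x)\) as \(\mu \to \infty\).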
Combining attack-based steps with such dedicated methods may unlock further performance gains.
Finally, a broader experimental campaign across additional architectures (e.g., physics-informed models [3], [4] and digital twins [6]) would clarify when lightweight attacks suffice and when stronger attacks are warranted. Although the above fields are active, we identified few pre-trained and publicly available NN
surrogates that suit our
needs.
solar: A solar thermal power plant simulator for blackbox optimization benchmarking,” Optimization and Engineering, 2025, doi: 10.1007/s11081-024-09952-x.