R2 v2:
The Pareto-compliant R2 Indicator for Better Benchmarking in Bi-objective Optimization




Abstract

In multi-objective optimization, set-based quality indicators are a cornerstone of benchmarking and performance assessment. They capture the quality of a set of trade-off solutions by reducing it to a scalar number. One of the most commonly used set-based metrics is the R2 indicator, which describes the expected utility of a solution set to a decision-maker under a distribution of utility functions. Typically, this indicator is applied by discretizing the latter distribution, yielding a weakly Pareto-compliant indicator. In consequence, adding a nondominated or dominating solution to a solution set may – but does not have to – improve the indicator’s value.

In this paper, we reinvestigate the R2 indicator under the premise that we have a continuous, uniform distribution of (Tchebycheff) utility functions. We analyze its properties in detail, demonstrating that this continuous variant is indeed Pareto-compliant – that is, any beneficial solution will improve the metric’s value. Additionally, we provide efficient computational procedures that (a) compute this metric for bi-objective problems in \(\mathcal{O} (N \log N)\), and (b) can perform incremental updates to the indicator whenever solutions are added to (or removed from) the current set of solutions, without needing to recompute the indicator for the entire set. As a result, this work contributes to the state-of-the-art Pareto-compliant unary performance metrics, such as the hypervolume indicator, offering an efficient and promising alternative.

Performance assessment, multi-objective optimization, R2 indicator, benchmarking, utility functions, Pareto compliance.

1 Introduction↩︎

When optimizing any system, there is often not just one objective, but multiple criteria required to assess the quality of a solution. Rather than aggregating these different optimization objectives into one, e.g., by means of a linear combination of the individual objectives, the domain of multi-objective (MO) optimization aims to find a set of (Pareto-)optimal trade-off solutions to present to a decision-maker [1]. However, to benchmark MO optimizers and facilitate algorithm design, parameter tuning, and automated algorithm selection, quantifying the quality of trade-off solutions in a unary set-based performance indicator is often necessary.

To be reasonably interpretable, it is recommended that a MO performance indicator fulfills a property that is known as Pareto compliance [2]. More precisely, a set-based performance indicator is called Pareto-compliant, if and only if its indicator value for set A is better than for set B when set A dominates set B. In addition, an indicator is called weakly Pareto-compliant if its value for set A is not worse than for set B.

Up to now, the hypervolume (HV) indicator, and variants thereof, are the only set-based performance indicators that are recognized as truly Pareto-compliant [3][5]. The HV indicator computes the \(m\)-dimensional hypervolume dominated by the solution set w.r.t. to a user-specified anti-optimal reference point.

In contrast, there are multiple families of weakly Pareto-compliant indicators. Exemplary and widely used representatives are the IGD+ indicator [6], requiring an (approximated or known) reference Pareto front, or the R2 indicator [7], [8], requiring an ideal/utopian reference point as well as a sample of aggregation (or: utility) functions.

The R2 indicator, in particular, may be an attractive choice as it requires an ideal rather than an anti-optimal reference point. In many problems, the ideal point is easier to find, e.g., in multi-objective machine learning problems where optimal, but not always anti-optimal, values for loss functions are available. Also, solutions that do not dominate a chosen reference point may not contribute to the dominated hypervolume, and a reference point far away from the Pareto front (PF) tends to put a high weight on solutions at the PF’s boundary, which is also often undesirable. Another benefit arises when constructing test problems from objectives with known single-objective optima, which lack a natural upper bound for the reference point or a clear “region of interest”.

While the unary R2 indicator was initially defined as an integral over a continuum of utility functions by [7], it is usually only discussed and applied in an approximate manner for which the distribution of utility functions is discretized [8], [9]. The latter comes with the benefit of being flexible regarding the involved utility functions. It also provides a convenient way to compute the indicator as an average of multiple utility functions, however, sacrificing Pareto compliance in the process.

In this paper, we consider the most common R2 indicator definition with a uniform distribution of Tchebycheff utility functions [7], [8]. Our main contribution is the methodology to compute this indicator exactly for a set of solutions in the bi-objective case, thereby preserving its Pareto compliance. We achieve this by shifting perspective away from averaging over predefined utility functions towards computing the R2 indicator contributions of each individual nondominated point, thereby eliminating weaknesses in the R2 indicator’s properties. Additionally, we demonstrate the exact R2 indicator values for individual points and linear Pareto fronts, as well as the approximate nature of the discretization-based approach commonly used so far. This paper is an extended version of [10], which additionally provides the following contributions:

  • a correction regarding the exact R2 indicator values of concave fronts,

  • a refined derivation of the continuous R2 indicator,

  • the means to compute the exclusive contribution of a solution from the non-dominated set,

  • an incremental update variant suitable for benchmarking scenarios, and

  • a comparison between the iterative R2 and hypervolume indicators.

In the meantime, [11] have also published a more detailed analysis of the exact calculation of the R2 indicator. They show the Pareto compliance of the indicator for more than two objectives based on the quick hypervolume algorithm but do not discuss the derivation of exact indicator values, or benchmarking-critical aspects such as its incremental computation.

This paper is structured as follows: We introduce multi-objective optimization and set-based performance assessment in 2. Then, in 3, we derive the methodology for computing exact R2 indicator values, first for a single solution, then for a set of solutions, and finally present an efficient method to compute the indicator incrementally for a history of evaluated points. 4 presents some exemplary results regarding characteristics of the exact R2 indicator, and 5 concludes the paper with an outlook on future research avenues.

2 Background↩︎

We begin by introducing some fundamental aspects of multi-objective optimization and dominance relationships of multi-objective solutions. Then, we will cover core aspects of set-based performance assessment in multi-objective optimization and its best known representative, the hypervolume indicator. Finally, we introduce the R2 indicator with its most important properties for the discretized and the exact variant.

2.1 Multi-objective Optimization↩︎

In multi-objective (MO) optimization, we aim to (w.l.o.g.) minimize multiple conflicting objectives. Commonly, a MO optimization problem (MOP) with \(m\) objectives is given by an objective function \(F: \mathcal{X} \mapsto \mathbb{R}^m\) where \(\mathcal{X}\) represents the decision space. The individual objectives are denoted as \(f_i: \mathcal{X} \mapsto \mathbb{R}, i=1,\dots,m\) in this work. Further, we are primarily considering the bi-objective setting (\(m=2\)). Depending on the particular problem and decision space, there may be further constraints on admissible solutions.

A particular challenge posed by MOPs pertains to solution comparison. While in single-objective optimization, solutions can be compared directly (either they have identical objective values or one is better than another), such immediate comparisons are not possible for all solutions of a MOP. To solve this, we need the concept of dominance. A solution \(x\) dominates another solution \(y\) (\(x \prec y\)), iff \(f_i(x) \leq f_i(y)\) for all \(i\) and \(f_i(x) < f_i(y)\) for at least one \(i\). A solution \(x\) strongly dominates another solution \(y\) if the stronger condition \(f_i(x) < f_i(y)\) holds for all \(i\). A solution that dominates, but does not strongly dominate, another solution is also called weakly dominant. Finally, two solutions can be incomparable, that is, mutually nondominated, if either fulfills some objective better than the other.

Definitions of dominance can also be extended to sets of solutions. A set of solutions \(A\) (weakly) dominates another set \(B\), if each member of \(B\) is (weakly) dominated by a solution in \(A\), written as \(A \preceq B\) and \(A \prec B\), respectively [2], [12].

The set of all nondominated solutions \(P = \{x \in \mathcal{X} ~ | ~ \nexists y \in \mathcal{X}: y \prec x\}\) is known as the Pareto set, and its image under \(F\) is known as the Pareto front. The Pareto set contains the optimal trade-off solutions regarding the objectives, and aiming to obtain a close approximation to it is the prevalent approach of solving MOPs when no further constraints or preferences on the objectives are known, i.e., under black-box assumptions. Evolutionary algorithms are the most widespread approach for finding good Pareto set approximations in these conditions.

Finally, we call the vector of the optimal, individual function values ideal point, and refer to the best vector dominated by all Pareto optimal points as nadir point. Often, before computing indicator values and if the ideal and nadir points are available, the region between them is normalized to the \([0,1]^m\) box in objective space as a normalization technique.

2.2 Set-based Performance Assessment↩︎

As the Pareto set generally contains more than one solution, set-based performance measures are the norm in assessing the overall quality of an archive of evaluated points. This need to quantify Pareto set approximations has led to numerous performance measures being introduced. For a recent survey on MO performance indicators, we refer to [13]. The most prominent set-based performance measure is the dominated hypervolume (HV) indicator (or: \(S\)-metric) that measures the space dominated by the set of solutions w.r.t. an anti-optimal reference point [3][5]. An illustration of the HV indicator for two objectives is given in 1.

An important property of a set-based performance measure \(I: \mathbb{R}^m \rightarrow \mathbb{R}\) is Pareto compliance: If \(A \preceq B\) and \(B \npreceq A\), then \(I(A) < I(B)\) [12]. That is, a performance measure should improve if new non-dominated or (weakly) dominating solutions are added to a set of solutions. If only \(I(A) \leq I(B)\) can be guaranteed under the same circumstances, \(I\) is called weakly Pareto-compliant. Only the HV indicator and other indicators based on it are established to be Pareto-compliant [5]. The selection of weakly Pareto-compliant indicators is somewhat larger, including, for example, the (discretized) R2 [8] and the IGD+ [6] measures.

The necessity of the anti-optimal reference point for computing the HV indicator can, however, be a hindrance to achieving Pareto compliance in practice. Setting the reference point so far back that it is dominated by every feasible solution introduces a bias towards the edges of the PF, while a reference point close to the nadir point fails to consider all solutions outside of such a defined region of interest [14], cf. 1.

Figure 1: Left: Illustration of the HV indicator. Right: If no point dominating the HV is found, the minimal distance to the region of interest (dotted) can be used as an additional indicator. The combined indicator is, however, not Pareto-compliant anymore.

2.3 The R2 Indicator↩︎

In contrast to the HV indicator, the unary R2 indicator is ordinarily defined as the expected utility of the point set w.r.t. a distribution of utility functions \(U\) [7]. In the most general case, for a set of solutions \(Y\) of a MOP, we can define it as follows: \[R2(Y) := \int_{u \in U} \min_{y \in Y} u(y) \, du.\] The most common choice of a utility function is a Tchebycheff aggregation, which allows to reach all Pareto-optimal points depending on the chosen parametrization. For a weight vector \(w \in [0,1]^m\) with \(\sum_{i=1}^m w_i = 1\) and a utopian vector \(y^*\), it is given by \[\begin{align} u_w(y) &= \max_{i = 1,\dots,m}{w_i (y_i - y_i^*)} \\ &= \max_{i = 1,\dots,m}{w_i y'_i} \end{align}\] using \(y'_i = y_i - y_i^*\) to shift the utopian point w.l.o.g. to the origin. The distribution of utility functions is then usually chosen as uniform on the weight simplex.

In the bi-objective case, where \(w_2 = 1 - w_1\) holds, this yields the following formula for R2: \[\begin{align} R2(Y) &= \int_0^1 \min_{y \in Y} u_w(Y) \, dw \\ &= \int_0^1 \min_{y \in Y} \{\max(w y'_1, (1 - w) y'_2)\} \, dw. \end{align}\] As there is no apparent way to calculate this property directly, it is generally approximated in a discrete manner by discretizing \(U\) using \(n = |W|\) weight vectors \(w \in W\): \[\begin{align} R2(Y) \approx \frac{1}{|W|} \sum_{w \in W} \min_{y \in Y} u_w(y). \end{align}\] As an example, the uniform weight distribution with size \(n\) for the bi-objective case is given by [8] \[W = \left\{(0, 1), \left(\frac{1}{n-1}, \frac{n-2}{n-1}\right), \dots, \left(\frac{n-2}{n-1},\frac{1}{n-1}\right), \left(1, 0\right)\right\}.\]

The discretization is simultaneously a blessing and a curse: On the one hand, it provides an effective method of approximating the underlying exact R2 value with high precision. On the other hand, this weakens the indicators’ properties. To optimize the discretized R2 indicator, one can consider at most \(|W|\) points on the Pareto front, which optimize \(u_w\) for each \(w \in W\), respectively. Additional nondominated solutions cannot contribute to the indicator value. Further, an individual utility function \(u_w\) may be optimized by a point that is only weakly Pareto optimal, but has the same utility (for this set of weights) as another, Pareto optimal point. See, for example, the top left image of 2, where each point along the vertical lines would have identical utility values. This places the discretized R2 indicator among the weakly Pareto-compliant indicators.

Figure 2: Illustration of level sets of the Tchebycheff utility for five different weight vectors w. The utility value u_w(y) is determined by the surface that the level set touches first: At vertical surfaces (see left and center image in the top row), the u_w(y) is determined by the f_1 value while at horizontal surfaces (see bottom row) u_w(y) depends on the f_2 value of y. At y (the top right figure) both w_1 y_1 and w_2 y_2 are identical. Note: Weight vectors are illustrated to point towards their equilibrium between both objectives, i.e., w_1 f_1 = w_2 f_2.

A final property of the R2 indicator is that each solution \(y\) from a nondominated set of solutions \(Y\) is optimal in terms of utility compared to all other solutions from \(Y\) for a particular weight \((w_1^*, 1 - w_1^*)\) such that [8] \[\begin{align} w_1^* y'_1 &= (1 - w_1^*) y'_2 \\ \Rightarrow w_1^* &= \frac{y'_2}{y'_1 + y'_2}. \end{align}\] Consequently, we also have \(w_2^* = \frac{y'_1}{y'_1 + y'_2}\). Discussions around the R2 indicator and its application focus (almost) exclusively on its discrete variant [8], [15] rather than on the original continuous definition of R2 [7]. The remainder of this paper is dedicated to a better understanding of this original definition, analyzing its properties, and detailing methods for computing it.

3 The R2 Indicator for Continuous Utility Distributions↩︎

In this section, we derive how the R2 indicator can be computed under the assumption of a continuous distribution of Tchebycheff utility functions for bi-objective problems. We will start by analyzing the utility of an axis-parallel segment and a single solution point before extending the analysis to sets of solutions, including the exclusive contribution of individual points and an incremental procedure to efficiently compute the indicator for a history of evaluated points. Further, we derive the computational complexity of the presented approaches.

3.1 Tchebycheff Utility of Axis-parallel Pareto Front Segments↩︎

Figure 3: An axis-parallel (vertical) line segment between y and y' with utility \color{Orange} u(y_1, [y_2, y_2']). For horizontal segments, the computation is analogous.

We begin by demonstrating how we can compute the partial R2 utility for an axis-parallel segment of the PF, cf. 3. This will be the building block for deriving the R2 value for a single solution as well as a set of solutions.

Without loss of generality, let \(y^* = (0,0)\) be the utopian point. Let \([y_2,y'_2]\) with \(0 \leq y_2 < y'_2\) and \(y_1 \geq 0\) be an axis-parallel segment of the Pareto front. Let \(w_1 = \frac{y_2}{y_2 + y_1}\) and \(w'_1 = \frac{y'_2}{y'_2 + y_1}\) be the weights for which the respective end points of the segment are optimal w.r.t. the R2 utility. As \(y_2 < y'_2\), we also have \(w_1 < w'_1\). The segment then contributes the partial R2 utility for the weight range \([w_1,w'_1]\) w.r.t. \(y_1\): \[\begin{align} u(y_1, [y_2, y'_2]) &= \int_{w_1}^{w'_1} w y_1 dw \\ &= \frac{1}{2} y_1 ({w'_1}^2 - {w_1}^2) \\ &= \frac{1}{2} y_1 \left(\left(\frac{y'_2}{y_1 + y'_2}\right)^2 - \left(\frac{y_2}{y_1 + y_2}\right)^2\right). \end{align}\] If we are in the extreme regions of the PF where \(y'_2 \rightarrow \infty\), the integral and computation are still well-defined as \(\lim_{y'_2 \rightarrow \infty} {w'_1} = 1\).

The computation for an axis-parallel segment \([y_1,y'_1]\) with \(0 \leq y_1 < y'_1\) and \(y_2 \geq 0\) is completely analogous, so that we can write: \[\begin{align} u(y_2, [y_1, y'_1]) = \frac{1}{2} y_2 \left(\left(\frac{y'_1}{y'_1 + y_2}\right)^2 - \left(\frac{y_1}{y_1 + y_2}\right)^2\right). \end{align}\] Finally, we observe that for a given interval \(I = [y_2,y'_2]\), \(\lim_{y_1 \rightarrow \infty} u(y_1, I) = 0\), i.e., we can use \(u(\infty, I) = 0\).

3.2 R2 Utility for a Single Solution↩︎

Now, let \(Y = \{y\}\) be the set containing only one solution \(y = (y_1, y_2)\) that is dominated by \(y^* = (0,0)\), cf. [fig:r2-single]. The Pareto front corresponding to \(Y\) then only contains two axis-parallel segments with one unbounded end each, and the corresponding utility can be computed as follows: \[\begin{align} R2(\{y\}) &= \int_0^1 \max(y_1 w, y_2 (1 - w)) \, dw \\ &= u(y_1, [y_2,\infty)) + u(y_2, [y_1,\infty)) \\ &= \int_{\frac{y_2}{y_1 + y_2}}^{1} w y_1 dw + \int_{\frac{y_1}{y_1 + y_2}}^{1} w y_2 dw \\ &= \frac{1}{2} y_1 \left(1 - \left(\frac{y_2}{y_1 + y_2} \right)^2 \right) + \frac{1}{2} y_2 \left(1 - \left(\frac{y_1}{y_1 + y_2} \right)^2 \right) \end{align}\] This simple case demonstrates that computing the exact R2 indicator for a set of solutions consists of the following steps: First, we need to identify areas in which the utility value does not vary w.r.t. \(y\) and is only sensitive to \(w\). Then, we compute the contribution of each of these areas to the final R2 value using the corresponding integral and finally sum everything up.

We can build on these observations to derive a general procedure to compute \(R2(Y)\) for arbitrary solution sets \(Y\).

3.3 R2 for a Set of Solutions↩︎

Figure 4: Integration ranges surrounding a solution y^{(n)}. y^{(n-1)}, y^{(n)}, and y^{(n+1)} are consecutive points in the solution set. The utility of the vertical (orange) segment is determined by the \color{Orange} f_1 value of y^{(n)}, while the utility of points on the horizontal (blue) segment is dependent on its \color{NavyBlue} f_2 value. The length of the segments depends on the neighbors of y^{(n)} in the solution set.

Let us now consider what happens when our solution set contains \(N > 1\) solutions. Let \(Y = \{y^{(1)}, \dots, y^{(N)}\}\) be the set of nondominated solutions ordered by ascending \(y_1\) value. In order to compute \(R2(Y)\), we can consider the utilities from the respective axis-parallel segments that make up the Pareto front. We can identify all segments by considering each solution \(y^{(n)}\) and its respective neighbors \(y^{(n-1)}\) and \(y^{(n+1)}\), cf. 4 for a visual aid.

For the vertical segment neighboring \(y^{(n)}\), we get the utility \(\color{Orange} u(y_1^{(n)}, [y_2^{(n)}, y_2^{(n-1)}])\), while for the horizontal segment, we get \(\color{NavyBlue} u(y_2^{(n)}, [y_1^{(n)}, y_1^{(n+1)}])\). For the special cases \(n=1\) or \(n=N\) at the extreme ends of the Pareto front, where there is no neighbor in one direction, we can use \(y_1^{(N+1)} \rightarrow \infty\) and \(y_2^{(0)} \rightarrow \infty\), as explained above. Bringing this all together, we can derive the following formula for \(R2(Y)\): \[\begin{align} R2(Y) & = \int_{0}^{1} \min_{y \in Y} \{ \max(y_1 w, y_2 (1 - w)) \} \, dw \\ & = \sum_{n=1}^N \left(\color{Orange} u\left(y_1^{(n)}, \left[y_2^{(n)}, y_2^{(n-1)}\right]\right) \color{Black} + \color{NavyBlue} u\left(y_2^{(n)}, \left[y_1^{(n)}, y_1^{(n+1)}\right]\right) \color{Black} \right), \text{where} \end{align}\] \[\begin{align} \color{Orange} u\left(y_1^{(n)}, \left[y_2^{(n)}, y_2^{(n-1)}\right]\right) \color{Black} & = \frac{1}{2} y_1^{(n)} \left(\left(\frac{y_2^{(n-1)}}{y_1^{(n)} + y_2^{(n-1)}}\right)^2 - \left(\frac{y_2^{(n)}}{y_1^{(n)} + y_2^{(n)}}\right)^2\right), \text{and} \\ \color{NavyBlue} u\left(y_2^{(n)}, \left[y_1^{(n)}, y_1^{(n+1)}\right]\right) \color{Black} & = \frac{1}{2} y_2^{(n)} \left(\left(\frac{y_1^{(n+1)}}{y_1^{(n+1)} + y_2^{(n)}}\right)^2 - \left(\frac{y_1^{(n)}}{y_1^{(n)} + y_2^{(n)}}\right)^2\right). \end{align}\] The computational complexity of this approach is determined mostly by the condition that we require \(Y\) to contain only nondominated solutions and be sorted by \(y_1\). For this, we can first sort an archive of solutions (including dominated points) by \(y_1\) and pass over this list once, removing any dominated points and duplicates. This takes \(\mathcal{O}(N \log N)\) time with standard sorting algorithms. The computation of the indicator itself is then just a matter of another pass over the sorted list of nondominated points, which requires linear time: All required \(w\) and \(y\) values can be obtained on the fly based on any given point \(y^{(n)}\) and its immediate neighbors. To summarize, the computation of the exact R2 indicator on an archive of \(N\) points in bi-objective space requires a complexity of \(\mathcal{O}(N \log N)\). This improves upon the \(\mathcal{O} (N |W|)\) complexity of the discretized R2 for precise indicator values and large sets of solutions, i.e., large \(|W|\) and \(N\) values.

3.4 Exclusive R2 Contribution↩︎

When benchmarking multi-objective optimization algorithms, quality indicators are often not just evaluated after a specific budget of evaluations was spent, but more frequently. Ideally, the most relevant quality indicator is updated for each change to the nondominated set of points evaluated during a run. This enables (a) an optimizer to react to large changes in approximation set quality, or (b) a benchmarking framework to record when a target indicator value has been reached for the first time.

For example, the standard protocol of the bi-objective BBOB testbed records the first hitting times for a set of hypervolume indicator values [16], using an AVL tree data structure – i.e., a binary tree whose left and right subtrees differ by at most one level in height [17] – to represent the unconstrained nondominated set to represent the unconstrained nondominated set and efficiently updating the hypervolume value for each evaluated point on the test problem. Each time the approximated Pareto set is updated, the exclusive contributions of removed (added) points is subtracted from (added to) the total indicator value. Each update is performed in \(\mathcal{O} (\log N)\) time, leading to an overall running time of \(\mathcal{O} (N \log N)\) to compute the full history of bi-objective R2 indicator values, i.e., the same asymptotic runtime as the individual computation for the final value.

Figure 5: Illustration of the effect of removing y^{(n)} from Y to determine the exclusive contribution \Delta R2(y^{(n)},Y). Both PF line segments with constant \color{Orange} f_1 and \color{NavyBlue} f_2 values are shifted upwards to the dashed lines.

To implement an iterative version of the R2 indicator, we most importantly need to be able to compute the exclusive contribution \(\Delta R2(y^{(n)}, Y)\) for a solution \(y^{(n)} \in Y\) to the overall value. We consider the effect of removing \(y^{(n)}\) from \(Y\), cf. 5: \[\begin{align} \Delta R2(y^{(n)}, Y) & = R2(Y) - R2(Y \setminus \{y^{(n)}\}) \\ & = \color{Orange} u\left(y_1^{(n)},[y_2^{(n)},y_2^{(n-1)}]\right) - u\left(y_1^{(n+1)},[y_2^{(n)},y_2^{(n-1)}]\right) \color{Black} \\ & + \color{NavyBlue} u\left(y_2^{(n)},[y_1^{(n)},y_1^{(n+1)}]\right) - u\left(y_2^{(n-1)},[y_1^{(n)},y_1^{(n+1)}]\right) \color{Black}. \end{align}\] Note that a Pareto front \(A\) that (weakly) dominantes another Pareto front \(B\) will be closer to the ideal point and thus reduce \(R2\) as the corresponding Tchebycheff aggregation functions have smaller values. This is in contrast to the HV indicator, where better fronts yield in an increased value. In consequence, except for the special case \(Y = \{y\}\), \(R2(Y) < R2(Y \setminus \{y\})\) and thus \(\Delta R2(y^{(n)}, Y)\) will be negative. Further, this defines the region of the exclusive contribution of each solution in \(Y\), and so re-establishes Pareto compliance for the R2 indicator as it was described in its original publication [7].

3.5 Incremental Bi-objective R2 Indicator Computation↩︎

Using the exclusive contribution \(\Delta R2(y^{(n)}, Y)\), the procedure R2IndicatorUpdate in 6 demonstrates how an update of the nondominated set \(Y\) and the corresponding R2 indicator value can be performed. To start with, we check whether \(Y\) is empty, in which case we add \(y\) to it and compute the R2 indicator value for this individual point. Otherwise, if \(y\) is not dominated by some point in \(Y\), we remove all points dominated by \(y\) from \(Y\), and finally add \(y\) to \(Y\). Using the \(\Delta R2\) contribution, we update the indicator correspondingly for these removal and addition operations.

Figure 6: Bi-objective R2 Indicator Update

Storing the nondominated set \(Y\) with a suitable data structure for a sorted list (e.g., AVL trees), we can efficiently determine whether \(y\) is dominated by some point from \(Y\) in \(\mathcal{O}(\log N)\) time. Addition and deletion from the nondominated set can equally be performed in \(\mathcal{O}(\log N)\) for each operation, and as each point can only be added or deleted once, we get an overall runtime of \(\mathcal{O}(N \log N)\) to perform \(N\) successive R2IndicatorUpdate operations to compute the complete history of R2 indicator values across \(N\) evaluated solutions.

4 Properties of the Exact R2 Indicator↩︎

In this section, we present some empirical and theoretical results on our proposed exact R2 indicator. We start by demonstrating the approximation behavior of the discrete R2 in relation to the exact computation described in the previous sections. Then, we will calculate the optimal R2 indicator values for simple front shapes and demonstrate the convergence behavior of the indicator w.r.t. increasingly large nondominated sets.

In [10], we implemented the exact R2 indicator in the statistical software R. We subsequently implemented incremental variants of the R2 and hypervolume indicators in Python due to the easy access to an efficient sorted list in the sortedcontainers library [18]. For computations of the discrete R2 indicator, we utilize the unary_r2_indicator function from the emoa R package [19]. The scripts to reproduce these experimental results are published at https://github.com/schaepermeier/r2-revisited.

4.1 Comparison of Discrete and Exact R2 Values↩︎

a

b

Figure 7: Comparison of the approximated R2 indicator values using \(|W|\) weight vectors and the exact R2 indicator value computed by our methodology. Left: The exact value is shown by the dashed line. Right: The corresponding approximation error of the discretized approach. The solution set is given by \(1,\!001\) equidistant solutions evaluated on the Pareto set of a bi-sphere problem with centers \((-0.5, 0)\) and \((0.5, 0)\)..

We start by comparing the exact R2 indicator as computed using our method (see 3) with the discretized version found throughout the literature. As a basis for the comparison, we sample equidistant solutions on the Pareto set of a bi-sphere problem \(F(x) = \left(\sum (x - c_1)^2, \sum (x - c_2)^2\right)\) with centers \(c_1 = (-0.5, 0)\) and \(c_2 = (0.5, 0)\) using a step size of \(0.001\), resulting in \(1,\!001\) points. We evaluated the discretized R2 with uniformly distributed weights, and the number of weights (\(|W|\)) ranging from one to one million. The value of the discretized indicator as well as the approximation error are visualized in 7.

We can see that, in this example, the R2 indicator seems sufficiently well approximated at around \(1,\!000\) weights. Still, more weights yield a more accurate approximation: Empirically, there seems to be an exponential relationship between the number of weights chosen and the approximation error. This reflects analyses by [8] on the behavior of the discrete R2 with an increasing number of weights, albeit missing the exact R2 values for comparison.

4.2 Exact R2 Indicator Values↩︎

For the exact R2 indicator, we can provide the optimal indicator values for simple solution sets and corresponding test functions. This contributes to a better understanding of the R2 indicator values, as a geometric interpretation like with the hypervolume indicator is not possible.

4.2.1 Nadir and Ideal Points↩︎

Assuming we have normalized the objective space (\((0,0)\) being the ideal point and \((1,1)\) being the nadir point), we can compute the R2 indicator for the worst possible solution within the region of interest \([0,1] \times [0,1]\) by inserting the nadir point \((1,1)\) into the equation. According to the previous results, we can compute this value as follows: \[\begin{align} R2(\{(1,1)\}) &= \frac{1}{2} y_1 \left(1 - \left(\frac{y_2}{y_1 + y_2}\right)^2\right) + \frac{1}{2} y_2 \left(1 - \left(\frac{y_1}{y_1 + y_2}\right)^2\right) \\ & = 0.5 \cdot (1 - 0.25) + 0.5 \cdot (1 - 0.25) \;\mathbf{= 0.75} \end{align}\] This result is independent of the particular problem or PF, as it is only dependent on the normalized nadir and ideal points. Analogously, we can derive the R2 value for the ideal point as \(R2(\{(0,0)\}) \;\mathbf{= 0}\), as all utilities equal zero at the ideal point. For comparison, the HV w.r.t. the nadir point as the reference point is \(HV(\{(1, 1)\}) = 0\), while the HV of the ideal point is \(HV(\{(0, 0)\}) = 1\) in this situation.

4.2.2 Linear Front↩︎

Let us now consider a linear PF where \(y_2 = 1 - y_1\) and \(y_1 \in [0,1]\) with ideal point \((0,0)\). When computing R2, for each weight vector \(w\), we can find the optimal solution \(y\) on the PF, which we can derive as follows: \[\begin{align} w_1 y_1 = w_2 y_2 & \Rightarrow w_1 y_1 = (1 - w_1) (1 - y_1) \Leftrightarrow y_1 = 1 - w_1. \end{align}\] Note that when we use this, it does not matter which of the objectives we consider in the R2 computation, as they will always yield the same utility value. Integrating over the weights for this set \(Y_\texttt{lin}\), we get: \[\begin{align} R2(Y_\texttt{lin}) &= \int_0^1 \min_{y \in Y_\texttt{lin}} \{\max(w y_1, (1 - w) y_2)\} \, dw \\ &= \int_0^1 w y_1 \, dw = \int_0^1 w (1 - w) \, dw = \int_0^1 (w - w^2) \, dw \\ &= \left[\frac{1}{2} w^2 - \frac{1}{3} w^3 \right]^1_0 \\ &= \frac{1}{2} - \frac{1}{3} \;\mathbf{= \frac{1}{6} \approx 0.1667}. \end{align}\]

Based on this result, we can derive the optimal R2 indicator value for any problem with a linear PF. For example, the well-known DTLZ1 problem [20] possesses a linear PF with ideal point \((0, 0)\) and nadir point \((0.5, 0.5)\), which results in an optimal R2 value of \(\frac{1}{12} \approx 0.0833\).

4.2.3 Convex and Concave Fronts↩︎

Additionally, we can derive value ranges for general concave and convex PFs: A concave front \(Y_\texttt{conc}\) will always achieve worse utility values than a linear function, and a convex front \(Y_\texttt{conv}\) will always have better utility. Again considering the normalized objective space, a convex front \(Y_\texttt{conv}\) will always fall between the R2 values of the ideal point and the linear front, i.e., \(0 < R2(Y_\texttt{conv}) < \frac{1}{6}\). Analogously, a general concave front has an ideal R2 value between the value of the linear front and a front made up of only the extreme points \(\{(0,1), (1,0)\}\). This extreme concave front has an R2 value of \(u(1, [0, 1]) + u(1, [0, 1]) = 2 \cdot 0.5 \cdot (0.5^2 - 0^2) = 0.25\), so \(\frac{1}{6} < R2(Y_\texttt{conc}) < \frac{1}{4}\). All cases are illustrated in 8.

a

b

c

Figure 8: Schematic illustration of a convex (\(0 < R2 < \frac{1}{6}\)), linear (\(R2 = \frac{1}{6}\)), and concave (\(\frac{1}{6} < R2 < \frac{1}{4}\)) PF, respectively. The ideal point at the origin is denoted by +, and the gray area indicates the dominated area..

If none of the discussed conditions apply, the ideal R2 indicator value of the normalized objective space may lie anywhere between \(0\) and \(0.25\).

As shown above, exact R2 indicator values can be derived for certain analytical PF shapes. Following the same pattern, that is, resolving \(w_1 y_1 = (1 - w_1) y_2\) and integrating the utility function, we can compute the exact indicator values also for more complex PF shapes, albeit in a less straightforward manner. We limit ourselves to reporting the results for simple quadratic PF functions, which correspond to the PFs in 8:

  • Convex PF with \(y_2 = (1 - \sqrt{y_1})^2\): \(\frac{3 \pi - 8}{16} \approx 0.0890\)

  • Concave PF with \(y_2 = \sqrt{1 - y_1^2}\): \(\frac{1}{8} \left(3 \sqrt 2 \sinh^{-1}{(1)} - 2\right) \approx 0.2174\)

The result for the convex PF applies, e.g., for the classical bi-sphere problem, while the concave PF corresponds to DTLZ2-6 [20].

To summarize, we can compute exact R2 indicator values for different simple front shapes and individual points. While these values do not seem to have an intuitive (geometric) interpretation, they are rather given meaning by the expected utility to a decision-maker.

4.3 Iterative Indicator Comparison↩︎

a
b

Figure 9: Comparison of the hypervolume (HV) and R2 indicator gaps using the nadir and ideal point, respectively, on a bi-sphere problem with centers \((-0.5, 0)\) and \((0.5, 0)\). As the R2 indicator always considers all solutions dominated by the ideal point, it does not produce an initial plateau.. a — Random uniform sampling in \([-1,1]^2\)., b — Random uniform sampling in \([-5,5]^2\).

Finally, we compare the gaps of the hypervolume and R2 indicators, respectively, when evaluating a large amount of randomly sampled solutions, cf. 9. We use the same bi-sphere problem from above (see 4.1) with centers \((-0.5,0)\) and \((0.5,0)\), and sample \(10^5\) solutions from \([-5,5]^2\) and \([-1,1]^2\) uniformly at random, respectively. The nadir (\(1,1\)) and ideal points \((0,0)\) are used as the respective reference points, and average indicator gaps after \(100\) repetitions are shown.

Here, we can see how not considering all solutions in the indicator computation affects the HV indicator: It produces a noticeable plateau with few sampled solutions, which is all the more pronounced when the sampling area is larger. This emphasizes the practicality of the R2 indicator for benchmarking scenarios where an ideal point, but not necessarily a good anti-optimal reference point, can be generated automatically.

5 Conclusion↩︎

In this paper, we introduce a procedure to compute the exact R2 indicator value for a given solution set. In contrast to its widely-known and commonly used discrete counterpart, the exact R2 indicator is not just weakly Pareto-compliant, but a proper Pareto-compliant indicator. This is achieved by foregoing the discretization of the distribution of utility functions usually performed in the calculation of the indicator and taking a continuous, uniform distribution of Tchebycheff utility functions as the basis. Pareto compliance of the R2 indicator with this utility distribution was already described when it was introduced by [7], but missing instructions for its exact calculation. Further, we show how this indicator is implemented efficiently with a running time of \(\mathcal{O}(N \log N)\) for bi-objective solution sets of size \(N\). Further, we derive the exclusive contribution of an individual solution to the indicator value, which we then use in an incremental update procedure to determine the full history of indicator values within the same runtime of \(\mathcal{O}(N \log N)\). This positions the exact R2 indicator as a promising Pareto-compliant alternative to the hypervolume indicator, especially when a utopian rather than an anti-optimal reference point is naturally available, making it the more natural choice for benchmarking multi-objective optimizers in common problem settings. We demonstrate the approximation behavior of the commonly used, discretized R2 indicator in comparison with the exact computation, and provide optimal R2 indicator values for the ideal and nadir points, as well as a linear front in normalized objective space. We then use these values to derive R2 indicator value ranges for general convex and concave PFs as well.

The availability of the exact, and thus Pareto-compliant, R2 indicator along with the proposed approach to compute it efficiently offers multiple directions for further theoretical and empirical research. From a theoretical point of view, we see potential in analyzing the approximation quality of the discretized R2 depending on the number of weights used. This could be particularly interesting for giving quality guarantees for R2 in higher-dimensional objective spaces, where exact indicator computations may become computationally intractable. A deeper theoretically supported analysis of its properties, along the lines of [8], would also be interesting. Further, the effects of different utility functions as well as the integration of (decision-maker) preferences can be examined, as both directions have been studied in detail for the discretized R2 indicator [21]. Finally, connections between the R2 indicator and the integrated preference functional [22], [23], a parallel development of an indicator very similar to R2 in the operations research community, should be further investigated, and could yield improvements in understanding the R2 indicator’s properties and computation.

Differences and similarities in the preferred distributions between the exact R2 indicator and the hypervolume indicator when applying them in a benchmarking context could present a further promising research direction. Likewise, integrating the exact R2 in optimization heuristics, similar to R2-EMOA [9], may be the subject of future studies.

References↩︎

[1]
Miettinen, K. (1999). Nonlinear Multiobjective Optimization, volume 12. Springer Science & Business Media.
[2]
Grimme, C., Kerschke, P., Aspar, P., Trautmann, H., Preuss, M., Deutz, A. H., Wang, H., and Emmerich, M. (2021). . Computers & Operations Research, 136:105489.
[3]
Zitzler, E. and Thiele, L. (1998). . In Proceedings of the International Conference on Parallel Problem Solving From Nature (PPSN), pages 292–301. Springer.
[4]
Beume, N., Fonseca, C. M., López-Ibáñez, M., Paquete, L., and Vahrenhold, J. (2009). . IEEE Transactions on Evolutionary Computation, 13(5):1075–1082.
[5]
Guerreiro, A. P., Fonseca, C. M., and Paquete, L. (2022). . ACM Computing Surveys, 54(6):119:1–119:42.
[6]
Ishibuchi, H., Masuda, H., Tanigaki, Y., and Nojima, Y. (2015). . In Proceedings of the 8th International Conference on Evolutionary Multi-Criterion Optimization (EMO 2015), pages 110–125. Springer.
[7]
Hansen, M. P. and Jaszkiewicz, A. (1998). . Technical Report IMM-REP-1998-7, IMM, Department of Mathematical Modelling, Technical University of Denmark.
[8]
Brockhoff, D., Wagner, T., and Trautmann, H. (2012). . In Proceedings of the 14th Annual Conference on Genetic and Evolutionary Computation, pages 465–472.
[9]
Trautmann, H., Wagner, T., and Brockhoff, D. (2013). . In Proceedings of the 7th International Conference on Learning and Intelligent Optimization (LION 7), pages 70–74. Springer.
[10]
Schäpermeier, L. and Kerschke, P. (2024). . In Proceedings of the 18th International Conference on Parallel Problem Solving from Nature (PPSN XVIII), volume 15151 of Lecture Notes in Computer Science, pages 202–216. Springer.
[11]
Jaszkiewicz, A. and Zielniewicz, P. (2025). Exact calculation and properties of the R2 multiobjective quality indicator. IEEE Transactions on Evolutionary Computation, 29(4):1227–1238.
[12]
Zitzler, E., Thiele, L., Laumanns, M., Fonseca, C. M., and da Fonseca, V. G. (2003). . IEEE Transactions on Evolutionary Computation, 7(2):117–132.
[13]
Audet, C., Bigeon, J., Cartier, D., Le Digabel, S., and Salomon, L. (2021). . European Journal of Operational Research, 292(2):397–422.
[14]
Ishibuchi, H., Imada, R., Setoguchi, Y., and Nojima, Y. (2018). How to specify a reference point in hypervolume calculation for fair performance comparison. Evolutionary Computation, 26(3):411–440.
[15]
Zitzler, E., Knowles, J. D., and Thiele, L. (2008). . In Branke, J., Deb, K., Miettinen, K., and Slowinski, R., editors, Multiobjective Optimization, Interactive and Evolutionary Approaches [outcome of Dagstuhl seminars], volume 5252 of Lecture Notes in Computer Science, pages 373–404. Springer.
[16]
Brockhoff, D., Auger, A., Hansen, N., and Tušar, T. (2022). . Evolutionary Computation, 30(2):165–193.
[17]
Adelson-Velsky, G. M. and Landis, E. M. (1962). An algorithm for the organization of information. Doklady Akademii Nauk SSSR, 146(2):263–266. English translation in Soviet Mathematics Doklady, vol. 3, pp. 1259–1263, 1962.
[18]
Jenks, G. (2025). Python sorted containers.
[19]
Mersmann, O. (2023). emoa: Evolutionary Multiobjective Optimization Algorithms. package version 0.5-0.2.
[20]
Deb, K., Thiele, L., Laumanns, M., and Zitzler, E. (2005). . In Evolutionary Multiobjective Optimization: Theoretical Advances and Applications, pages 105–145. Springer.
[21]
Wagner, T., Trautmann, H., and Brockhoff, D. (2013). . In Proceedings of the 7th International Conference on Evolutionary Multi-Criterion Optimization (EMO 2013), pages 81–95. Springer.
[22]
Carlyle, W. M., Fowler, J. W., Gel, E. S., and Kim, B. (2003). . Decision Sciences, 34(1):63–82.
[23]
Bozkurt, B., Fowler, J. W., Gel, E. S., Kim, B., Köksalan, M., and Wallenius, J. (2010). . Operations Research, 58(3):650–659.