June 06, 2023
We prove a vector-valued inequality for the Gaussian noise stability (i.e. we prove a vector-valued Borell inequality) for Euclidean functions taking values in the two-dimensional sphere, for all correlation parameters at most \(1/10\) in absolute value. This inequality was conjectured (for all correlation parameters at most \(1\) in absolute value) by Hwang, Neeman, Parekh, Thompson and Wright. Such an inequality is needed to prove sharp computational hardness of the product state Quantum MAX-CUT problem, assuming the Unique Games Conjecture. In fact, assuming the Unique Games Conjecture, we show that the product state of Quantum MAX-CUT is NP-hard to approximate within a multiplicative factor of \(.9859\). In contrast, a polynomial time algorithm is known with approximation factor \(.956\ldots\).
The noise stability of a measurable Euclidean set \(A\) with correlation \(\rho\) is the probability that \((X,Y)\in A\times A\), where \(X,Y\) are standard Gaussian random vectors with correlation \(\rho\in(-1,1)\). Borell’s inequality asserts that half spaces have the largest noise stability among all Euclidean sets of fixed Gaussian measure. Borell’s inequality [1] generalizes the Gaussian isoperimetric inequality, since letting \(\rho\to1^{-}\) in Borell’s inequality recovers the Gaussian isoperimetric inequality [2]. The Gaussian isoperimetric inequality says that a half space has the smallest Gaussian surface area among all Euclidean sets of fixed Gaussian volume.
Besides its intrinsic interest, Borell’s inequality has been applied to social choice theory [3], the Unique Games Conjecture [3]–[5], to semidefinite programming algorithms such as MAX-CUT [4], [6], to learning theory [7], etc. For some surveys on this and related topics, see [8]–[10].
A Corollary of Borell’s inequality is the Majority is Stablest Theorem [3]. Moreover, Borell’s inequality was the main technical ingredient used to prove sharp computational hardness of the MAX-CUT problem, assuming the Unique Games Conjecture [4].
The MAX-CUT problem asks for the partition of the vertices of an undirected finite graph into two disjoint sets that maximizes the number of edges going between the two sets. The MAX-CUT problem is a well-studied NP-complete constraint satisfaction problem with well-understood hardness of approximation [4], [11]. The Unique Games Conjecture implies that the Goemans-Williamson semidefinite program is the best quality polynomial time algorithm for approximately solving MAX-CUT.
The quantum analogue of a constraint satisfaction problem is a local Hamiltonian problem, which is QMA-complete [12]. The Quantum MAX-CUT problem is a special case of the local Hamiltonian problem, that in some sense generalizes the usual MAX-CUT problem. As with MAX-CUT, it is natural to ask for approximation algorithms for Quantum MAX-CUT, and to try to prove sharp computational hardness of those algorithms [13], [14]. Below, we only discuss classical algorithms for Quantum MAX-CUT, i.e. we do not discuss quantum algorithms for Quantum MAX-CUT.
For the product state Quantum MAX-CUT problem, there is a conjecturally optimal approximation algorithm. Assuming the Unique Games Conjecture and Conjecture 1 below (a vector-valued Borell inequality), we would then have a sharp hardness of approximation for the product state of Quantum MAX-CUT. The vector-valued Borell inequality, Conjecture 1 [13] is a sphere-valued generalization of the Borell inequality [1].
The main result of this paper is a proof of the vector-valued Borell inequality (Conjecture 1) for all correlations \(\rho\) satisfying \(\left|\rho\right|<.104\). It remains a challenge to prove Conjecture 1 for all \(\left|\rho\right|<1\).
Our focus in this paper is proving the inequality conjectured in [13] for functions \(f\colon\mathbb{R}^{n}\to S^{2}\). Our methods work equally well for functions taking values in \(S^{k}\) for any \(k\geq2\), but the values of \(\rho\) for which our proof works would then depend on \(k\). Since the case \(k=2\) is the only one needed for the product state Quantum MAX-CUT problem, we have therefore only focused on the case \(k=2\) in this paper.
Below we describe the Quantum MAX-CUT problem by analogy with MAX-CUT. When \(M\) is a \(2\times 2\) matrix and \(j\) is a positive integer, we denote \[M^{\otimes j}\colonequals\underbrace{M\otimes \cdots\otimes M}_{j\rm\;times}.\] If \(n\) is a positive integer and \(1\leq j\leq n\), denote \[Z_{j}\colonequals I_{2}^{\otimes (j-1)}\otimes \begin{pmatrix} 1 & 0 \\ 0 & -1\end{pmatrix} \otimes I_{2}^{\otimes (n-j)},\qquad\forall\,1\leq j\leq n, \qquad I_{2}\colonequals\begin{pmatrix} 1 & 0 \\ 0 & 1\end{pmatrix}.\] The (weighted) MAX-CUT problem can be equivalently stated as [13], [15]: given \(w\colon\{1,\ldots,n\}^{2}\to[0,\infty)\) satisfying \(w_{ij}=w_{ji}\) and \(w_{ii}=0\) for all \(1\leq i,j\leq n\), compute the following quantity \[\max_{u\in(\mathbb{C}^{2})^{\otimes n}\colon\left\|u\right\|\leq1}u^{*}\Big(\sum_{i,j=1}^{n}w_{ij}(I_{2}^{\otimes n} - Z_{i}Z_{j})\Big) u.\] Define now
\[X_{j}\colonequals I_{2}^{\otimes (j-1)}\otimes \begin{pmatrix} 0 & 1 \\ 1 & 0\end{pmatrix} \otimes I_{2}^{\otimes (n-j)},\qquad\forall\,1\leq j\leq n,\] \[Y_{j}\colonequals I_{2}^{\otimes (j-1)}\otimes \begin{pmatrix} 0 & -\sqrt{-1} \\ \sqrt{-1} & 0\end{pmatrix} \otimes I_{2}^{\otimes (n-j)},\qquad\forall\,1\leq j\leq n.\]
The Quantum MAX-CUT problem is [13]–[15]: given \(w\colon\{1,\ldots,n\}^{2}\to[0,\infty)\) satisfying \(w_{ij}=w_{ji}\) and \(w_{ii}=0\) for all \(1\leq i,j\leq n\), compute the following quantity \[\max_{u\in(\mathbb{C}^{2})^{\otimes n}\colon \left\|u\right\|\leq 1}u^{*}\Big(\sum_{i,j=1}^{n}w_{ij}(I_{2}^{\otimes n}- X_{i} X_{j} - Y_{i} Y_{j} - Z_{i} Z_{j})\Big) u.\] The product state Quantum MAX-CUT problem is the more restricted optimization problem of computing \[\max_{\substack{u=u_{1}\otimes\cdots\otimes u_{n}\colon\\ u_{i}\in \mathbb{C}^{2},\,\left\|u_{i}\right\|\leq1,\,\forall\,1\leq i\leq n}}u^{*}\Big(\sum_{i,j=1}^{n}w_{ij}(I_{2}^{\otimes n} - X_{i} X_{j} - Y_{i} Y_{j} - Z_{i} Z_{j})\Big) u.\]
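To make the gap between the two problems concrete, consider a single edge: \(n=2\) and \(w_{12}=w_{21}=1/2\), so the symmetric sum counts the edge once. The unrestricted maximum is attained by the entangled singlet state, while the best product state achieves only half that value. The following numerical sketch (not from the paper; it assumes only numpy) checks both values:

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Single-edge Hamiltonian (n = 2, w_{12} = w_{21} = 1/2): the ordered
# sum over (1,2) and (2,1) gives H = I - X⊗X - Y⊗Y - Z⊗Z.
H = np.kron(I2, I2) - np.kron(X, X) - np.kron(Y, Y) - np.kron(Z, Z)

# Unrestricted optimum over unit vectors in C^4: the largest eigenvalue,
# attained by the singlet state.
qmc = max(np.linalg.eigvalsh(H))

# Product-state optimum: for u = u1 ⊗ u2 with Bloch vectors a, b,
# u* H u = 1 - <a, b>, so anti-aligned Bloch vectors give the value 2.
# Brute-force check over a grid of Bloch angles (global phase fixed).
def bloch(theta, phi):
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

best = 0.0
for t1 in np.linspace(0, np.pi, 41):
    for t2 in np.linspace(0, np.pi, 41):
        for p2 in np.linspace(0, 2 * np.pi, 61):
            u = np.kron(bloch(t1, 0.0), bloch(t2, p2))
            best = max(best, (u.conj() @ H @ u).real)

print(qmc, best)  # entangled optimum 4, product-state optimum 2
```

The factor-of-two gap already visible on a single edge is the reason the product state problem has its own, separate approximation landscape.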
For any \(x=(x_{1},\ldots,x_{n}),y=(y_{1},\ldots,y_{n})\in\mathbb{C}^{n}\), denote \[\langle x,y\rangle\colonequals\sum_{i=1}^{n}x_{i}\overline{y_{i}},\qquad\left\|x\right\|\colonequals\langle x,x\rangle^{1/2}.\] Denote the \((n-1)\)-dimensional sphere in \(\mathbb{R}^{n}\) as \[S^{n-1}\colonequals\{x\in\mathbb{R}^{n}\colon \left\|x\right\|=1\}.\] Define the Gaussian density function in \(\mathbb{R}^{n}\) as \[\gamma_{n}(x)\colonequals(2\pi)^{-n/2}e^{-\left\|x\right\|^{2}/2},\qquad\forall\,x\in\mathbb{R}^{n}.\] When \(A\subseteq\mathbb{R}^{n}\) is a measurable set, denote the Gaussian measure of \(A\) as \(\gamma_{n}(A)\colonequals\int_{A}\gamma_{n}(x)\,\mathrm{d}x\).
Definition 1 (Ornstein-Uhlenbeck Operator). Let \(-1<\rho<1\). Let \(f\colon\mathbb{R}^{n}\to[0,1]\) be measurable. Define the Ornstein-Uhlenbeck operator applied to \(f\) by \[T_{\rho}f(x)\colonequals \int_{\mathbb{R}^{n}}f(\rho x+y\sqrt{1-\rho^{2}})\gamma_{n}(y)\,\mathrm{d}y,\qquad\forall\,x\in\mathbb{R}^{n}.\]
Definition 2 (Noise Stability). Let \(-1<\rho<1\). Let \(\Omega\subseteq\mathbb{R}^{n}\) be measurable. Define the noise stability of \(\Omega\) with correlation \(\rho\), to be\[\int_{\mathbb{R}^{n}}1_{\Omega}(x)T_{\rho}1_{\Omega}(x)\,\gamma_{n}(x)\,\mathrm{d}x.\] More generally, for any bounded measurable \(f\colon\mathbb{R}^{n}\to\mathbb{R}^{k}\), define its noise stability with correlation \(\rho\), to be\[\int_{\mathbb{R}^{n}}\langle f(x), T_{\rho}f(x)\rangle\gamma_{n}(x)\,\mathrm{d}x.\]
For any positive integer \(k\), denote \[\label{foptdef} f_{\rm opt}(x)\colonequals\frac{x}{\left\|x\right\|},\qquad\forall\,x\in\mathbb{R}^{k}\setminus\{0\}.\tag{1}\]
Conjecture 1 ([13]). Let \(n\geq k\) be positive integers. Let \(f\colon \mathbb{R}^{n}\to S^{k-1}\) be measurable. Then
If \(0<\rho<1\) and if \(\int_{\mathbb{R}^{n}}f(x)\gamma_{n}(x)\, \mathrm{d}x=0\), then \[\int_{\mathbb{R}^{n}}\langle f(x), T_{\rho}f(x)\rangle \gamma_{n}(x)\, \mathrm{d}x\leq \int_{\mathbb{R}^{k}}\langle f_{\rm opt}(x), T_{\rho}f_{\rm opt}(x)\rangle \gamma_{k}(x)\, \mathrm{d}x.\]
If \(-1<\rho<0\), then \[\int_{\mathbb{R}^{n}}\langle f(x), T_{\rho}f(x)\rangle \gamma_{n}(x)\, \mathrm{d}x\geq \int_{\mathbb{R}^{k}}\langle f_{\rm opt}(x), T_{\rho}f_{\rm opt}(x)\rangle \gamma_{k}(x)\, \mathrm{d}x.\]
Moreover, equality holds only when there exists a real orthogonal \(k\times n\) matrix \(M\) (\(MM^{T}=I\)) such that \(f(x)=f_{\rm opt}(Mx)\) for a.e. \(x\in\mathbb{R}^{n}\).
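The right-hand side above has a closed form: by [13], the noise stability of \(f_{\rm opt}\) equals the function \(F^{*}(k,\rho)\) given in (3) below. A Monte Carlo sanity check of this for \(k=3\) and \(\rho=1/2\) (a sketch, not from the paper, assuming numpy and scipy):

```python
import numpy as np
from scipy.special import hyp2f1
from math import gamma, sqrt

rho, n, N = 0.5, 3, 400_000
rng = np.random.default_rng(0)
X = rng.standard_normal((N, n))
# Y = rho*X + sqrt(1-rho^2)*Z is a standard Gaussian rho-correlated with X.
Y = rho * X + sqrt(1 - rho**2) * rng.standard_normal((N, n))
fX = X / np.linalg.norm(X, axis=1, keepdims=True)
fY = Y / np.linalg.norm(Y, axis=1, keepdims=True)
mc = float(np.mean(np.sum(fX * fY, axis=1)))  # E <X/|X|, Y/|Y|>

# Closed form F*(3, rho) from equation (3).
closed = (2 / n) * (gamma((n + 1) / 2) / gamma(n / 2))**2 * rho \
    * hyp2f1(0.5, 0.5, n / 2 + 1, rho**2)
print(mc, closed)  # both ≈ 0.436
```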
Remark 2. The case \(k=1\) of Conjecture 1 is known to be true, since it is Borell’s inequality [1].
Theorem 3 (Main). Conjecture 1 holds for all \(n\geq k=3\) and for all \(-.104<\rho<.104\).
In fact, we prove a stronger “stable” version of Conjecture 1 when \(n=k=3\), in 22 and 29 below:
If \(0<\rho<.104\) and if \(\int_{\mathbb{R}^{k}}f(x)\gamma_{k}(x)\, \mathrm{d}x=0\), then \[\begin{align} \int_{\mathbb{R}^{k}}\langle f(x), T_{\rho}f(x)\rangle \gamma_{k}(x)\, \mathrm{d}x &\leq\int_{\mathbb{R}^{k}}\langle f_{\rm opt}(x), T_{\rho}f_{\rm opt}(x)\rangle \gamma_{k}(x)\, \mathrm{d}x\\ &\qquad\quad+(9.4\rho-.98)\int_{\mathbb{R}^{k}}\phi(x)\left\|\int_{y\in\left\|x\right\|S^{k-1}}f(y) \,\mathrm{d}\sigma(y)\right\|^{2}\gamma_{k}(x)\,\mathrm{d}x. \end{align}\]
If \(-.104<\rho<0\), then \[\begin{align} \int_{\mathbb{R}^{k}}\langle f(x), T_{\rho}f(x)\rangle \gamma_{k}(x)\, \mathrm{d}x &\geq \int_{\mathbb{R}^{k}}\langle f_{\rm opt}(x), T_{\rho}f_{\rm opt}(x)\rangle \gamma_{k}(x)\, \mathrm{d}x\\ &\qquad\quad+(.98-9.4\left|\rho\right|)\int_{\mathbb{R}^{k}}\phi(x)\left\|\int_{y\in\left\|x\right\|S^{k-1}}f(y) \,\mathrm{d}\sigma(y)\right\|^{2}\gamma_{k}(x)\,\mathrm{d}x. \end{align}\]
Here \(\phi(x)=1-\frac{1}{\rho \left\|x\right\|}+\frac{2}{e^{2\rho \left\|x\right\|}-1}\) for all \(x\in\mathbb{R}^{k}\setminus\{0\}\) and \(\mathrm{d}\sigma\) denotes normalized (Haar) measure on the sphere.
We can also prove Theorem 3 when \(k>3\), but with different dependence on \(\rho\). Since the \(k=3\) case is the only relevant case for computational hardness of the product state of Quantum MAX-CUT, we only state a result for \(k=3\).
As shown in [13] by adapting the argument of [16], the case \(n=k=3\) of Theorem 3 (when \(\rho>0\)) implies the case \(n>k=3\) of Theorem 3 (when \(\rho>0\)). That is, the dimension of the domain can be reduced a priori to the case \(n=k\), at least when \(\rho>0\). In this paper, we show that a similar statement holds when \(\rho<0\), although the “dimension reduction” argument works differently there: for technical reasons, we instead deal with a bilinear version of the noise stability, and we show that this bilinear noise stability admits a similar dimension reduction. That is, we verify that the case \(n=k=3\) of Theorem 3 (when \(\rho<0\)) implies the case \(n>k=3\) of Theorem 3 (when \(\rho<0\)). The bilinear inequality we prove in the case \(n=k=3\) is the following. Suppose \(0<\rho<.104\) and \(n=k=3\). We then show in 29 below that \[\int_{\mathbb{R}^{k}}\langle f(x), T_{\rho}g(x)\rangle \gamma_{k}(x)\, \mathrm{d}x\geq -\int_{\mathbb{R}^{k}}\langle f_{\rm opt}(x), T_{\rho}f_{\rm opt}(x)\rangle \gamma_{k}(x)\, \mathrm{d}x\] for all \(f,g\colon\mathbb{R}^{k}\to S^{k-1}\) (with \(k=3\)), provided that \(\int_{\mathbb{R}^{k}}f(x)\gamma_{k}(x)\,\mathrm{d}x=\int_{\mathbb{R}^{k}}g(x)\gamma_{k}(x)\,\mathrm{d}x\), or more generally, \[\begin{align} &\int_{\mathbb{R}^{k}}\langle f(x), T_{\rho}g(x)\rangle \gamma_{k}(x)\, \mathrm{d}x \geq -\int_{\mathbb{R}^{k}}\langle f_{\rm opt}(x), T_{\rho}f_{\rm opt}(x)\rangle \gamma_{k}(x)\, \mathrm{d}x\\ &\quad+(.98-9.4\rho)\int_{\mathbb{R}^{k}}\phi(x)\frac{1}{2}\Big(\left\|\int_{y\in\left\|x\right\|S^{k-1}}f(y) \,\mathrm{d}\sigma(y)\right\|^{2} +\left\|\int_{y\in\left\|x\right\|S^{k-1}}g(y) \,\mathrm{d}\sigma(y)\right\|^{2}\Big)\gamma_{k}(x)\,\mathrm{d}x. \end{align}\]
Without considering this bilinear noise stability, [13] proved only the case \(n=k\) of Conjecture 1 when \(\rho<0\). The case \(n=k\) in Conjecture 1 does not imply the cases \(n>k\), and this is why we considered the more general bilinear noise stability inequality above. (Although one can e.g. regard a function \(f\colon \mathbb{R}^{4}\to S^{2}\) as a function \(\overline{f}\colon \mathbb{R}^{4}\to S^{3}\) by considering \(S^{2}\) as a subset of \(S^{3}\), the inequality satisfied by \(\overline{f}\) proven in [13] is not a sharp inequality for \(f\).)
Applications to Quantum MAX-CUT require Conjecture 1 to hold for all \(n\geq k\). So, we have managed to prove a result that is independent of the dimension of the domain, since it holds for any \(n\geq k\) when \(k=3\), though not for the full range of \(\rho\) parameters. Also, the proof method of [13] does not directly apply in the case \(\rho>0\), and attempting to prove Conjecture 1 in that case led us towards Theorem 3, although the case \(\rho>0\) is irrelevant for applications to Quantum MAX-CUT.
Conjecture 1 should hold for all \(-1<\rho<1\) and for all \(n\geq k=3\). If such a conjecture holds, then we would be able to conclude sharp hardness of approximation for the product state Quantum MAX-CUT problem, assuming the Unique Games Conjecture.
Conjecture 4 (Sharp Hardness for Quantum MAX-CUT, [13]). Assume that the Unique Games Conjecture is true [17]–[19]. Assume Conjecture 1 holds for all \(n\geq k=3\). Then, for any \(\varepsilon>0\), it is NP-hard to approximate the product state of Quantum MAX-CUT within a multiplicative factor of \(\alpha_{\rm BOV}+\varepsilon\).
As shown in [13], [20], we have \[\label{abovdef} \alpha_{\rm BOV} \colonequals \inf_{-1\leq\rho\leq 1}\frac{1-F^{*}(3,\rho)}{1-\rho}\approx .956.\tag{2}\] \[\label{fstdef} F^{*}(3,\rho)\colonequals\frac{2}{3}\Big(\frac{\Gamma((3+1)/2)}{\Gamma(3/2)}\Big)^{2}\rho\cdot\,\,_{2}F_{1}(1/2,1/2,3/2 +1, \rho^{2}),\qquad\forall\,-1<\rho<1.\tag{3}\] Here \(_{2}F_{1}(\cdot,\cdot,\cdot,\cdot)\) is the Gaussian hypergeometric function.
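The infimum in (2) is attained at a negative correlation; a quick numerical evaluation (a sketch assuming scipy) reproduces the stated value:

```python
import numpy as np
from scipy.special import hyp2f1
from math import gamma

def F_star(rho, n=3):
    """F*(n, rho) from equation (3); hyp2f1 is the Gauss hypergeometric function."""
    c = (2 / n) * (gamma((n + 1) / 2) / gamma(n / 2))**2
    return c * rho * hyp2f1(0.5, 0.5, n / 2 + 1, rho**2)

rhos = np.linspace(-0.999, 0.999, 200_001)
vals = (1 - F_star(rhos)) / (1 - rhos)
i = int(np.argmin(vals))
alpha_bov = float(vals[i])
print(rhos[i], alpha_bov)  # minimizer near rho ≈ -0.59, alpha_BOV ≈ 0.956
```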
The semidefinite programming algorithm of [20] shows that Conjecture 4 is sharp, since the polynomial time algorithm of [20] approximates the product state Quantum MAX-CUT problem with a multiplicative factor of \(\alpha_{\rm BOV}-\varepsilon\), for any \(\varepsilon>0\).
A corollary of our main result Theorem 3 is a weaker version of Conjecture 4 that replaces the sharp constant \(.956\ldots\) with a larger constant.
Theorem 5 (Unique Games Hardness for Product State Quantum MAX-CUT). Assume that the Unique Games Conjecture is true. Then it is NP-hard to approximate the product state of Quantum MAX-CUT within a multiplicative factor of \(.9859\).
To the author’s knowledge, Theorem 5 is the only computational hardness result for the product state of Quantum MAX-CUT besides Conjecture 4.
Definition 3 (Correlated Gaussians). Let \(-1<\rho<1\). Let \(G_{\rho}(x,y)\) denote the joint probability density function on \(\mathbb{R}^{n}\times\mathbb{R}^{n}\) such that \[\label{gdef} G_{\rho}(x,y)\colonequals\frac{1}{(2\pi)^{n}(1-\rho^{2})^{n/2}}e^{\frac{-\left\|x\right\|^{2}-\left\|y\right\|^{2}+2\rho \langle x,y\rangle}{2(1-\rho^{2})}},\qquad\forall\,x,y\in \mathbb{R}^{n}.\tag{4}\]
We denote \(X\sim_{\rho} Y\) when \((X,Y)\in\mathbb{R}^{n}\times \mathbb{R}^{n}\) have joint probability density function \(G_{\rho}\).
Definition 4 (Correlated Random Variables on the Sphere). Let \(-1<\rho<1\) and let \(r,s>0\). Let \(G_{\rho}^{r,s}(u,v)\) denote the probability density function on \(S^{n-1}\times S^{n-1}\) such that the first variable is uniform on \(S^{n-1}\) and such that the second variable conditioned on the first has conditional density \[G_{\rho}^{r,s}(v|u)\colonequals\frac{1}{z_{\rho,r,s}}e^{\frac{\rho r s\langle u,v\rangle}{1-\rho^{2}}},\qquad\forall\,v\in S^{n-1}.\] Here \(z_{\rho,r,s}\) is a normalizing constant, chosen so that \(\int_{S^{n-1}}G_{\rho}^{r,s}(v|u)\mathrm{d}\sigma(v)=1\), where \(\sigma\) denotes the uniform probability (Haar) measure on \(S^{n-1}\).
We let \(N_{\rho}^{r,s}\) denote the above distribution on \(S^{n-1}\times S^{n-1}\) and we denote \((U,V)\sim N_{\rho}^{r,s}\) when \((U,V)\in S^{n-1}\times S^{n-1}\) have the distribution \(N_{\rho}^{r,s}\).
Definition 5 (Spherical Noise Stability). Let \(\rho\in(-1,1)\), \(r,s>0\). Let \(f\colon S^{n-1}\to[0,1]\) be measurable. Define \(g=g_{\rho,r,s}\colon[-1,1]\to\mathbb{R}\) by (7). Define the smoothing operator \(U_{g}\) applied to \(f\) by \[U_{g}f(x)\colonequals \int_{S^{n-1}}g(\langle x,y\rangle)f(y)\,\mathrm{d}\sigma(y),\qquad\forall\,x\in S^{n-1}.\] Here \(\sigma\) denotes the (normalized) Haar probability measure on \(S^{n-1}\). The spherical noise stability of a set \(\Omega\subseteq S^{n-1}\) with parameters \(\rho,r,s\) is \[\int_{S^{n-1}}1_{\Omega}(x)U_{g}1_{\Omega}(x)\,\mathrm{d}\sigma(x).\]
The spherical noise stability has a decomposition into spherical harmonics by the Funk-Hecke Formula [13] \[\label{nine7} \int_{S^{n-1}}f(x)U_{g}f(x)\,\mathrm{d}\sigma(x) =\Big\|\int_{S^{n-1}}f(x)\mathrm{d}\sigma(x)\Big\|^{2}+\sum_{d=1}^{\infty}\lambda_{d,n}^{r,s}\|\mathrm{Proj}_{d}(f)\|^{2}.\tag{5}\] Here \(\mathrm{Proj}_{d}(f)\) is the \(L_{2}(\mathrm{d}\sigma)\) projection of \(f\) onto spherical harmonics of degree \(d\), and the eigenvalues \(\lambda_{d,n}^{r,s}\) are specified in (10).
Fix \(r,s>0\) and let \(0<\rho<1\). Define \(g\colon[-1,1]\to\mathbb{R}\) by \[\label{one0} g(t)=g_{\rho,r,s}(t)\colonequals\sqrt{\pi}\frac{\Gamma((n-1)/2)}{\Gamma(n/2)}\frac{e^{\frac{\rho rst}{1-\rho^{2}}}}{\int_{-1}^{1}(1-a^{2})^{\frac{n}{2}-\frac{3}{2}}e^{\frac{\rho rsa}{1-\rho^{2}}}\mathrm{d}a},\qquad\forall\,t\in[-1,1].\tag{6}\] Recall that, if \(h\colon\mathbb{R}\to\mathbb{R}\) is continuous, then
\[\begin{align} \frac{1}{\mathrm{Vol}(S^{n-1})}\int_{S^{n-1}}h(y_{1})\mathrm{d}y &=\frac{\mathrm{Vol}(S^{n-2})}{\mathrm{Vol}(S^{n-1})}\int_{-1}^{1}(1-t^{2})^{\frac{n}{2}-\frac{3}{2}}h(t)\mathrm{d}t\\ &=\frac{2\pi^{(n-1)/2}/\Gamma((n-1)/2)}{2\pi^{n/2}/\Gamma(n/2)}\int_{-1}^{1}(1-t^{2})^{\frac{n}{2}-\frac{3}{2}}h(t)\mathrm{d}t\\ &=\frac{1}{\sqrt{\pi}}\frac{\Gamma(n/2)}{\Gamma((n-1)/2)}\int_{-1}^{1}(1-t^{2})^{\frac{n}{2}-\frac{3}{2}}h(t)\mathrm{d}t. \end{align}\] We have chosen the constants so that \(1=\frac{1}{\mathrm{Vol}(S^{n-1})}\int_{S^{n-1}}h(y_{1})\mathrm{d}y\), when \(h=g\). When \(h\colonequals1\), we have \(\int_{-1}^{1}(1-t^{2})^{\frac{n}{2}-\frac{3}{2}}\mathrm{d}t=\sqrt{\pi}\frac{\Gamma((n-1)/2)}{\Gamma(n/2)}\), so \[\label{one0p} g(t)=g_{\rho,r,s}(t)\stackrel{\eqref{one0}}{=}e^{\frac{\rho rst}{1-\rho^{2}}}\cdot \frac{\int_{-1}^{1}(1-a^{2})^{\frac{n}{2}-\frac{3}{2}}\mathrm{d}a}{\int_{-1}^{1}(1-a^{2})^{\frac{n}{2}-\frac{3}{2}}e^{\frac{\rho rsa}{1-\rho^{2}}}\mathrm{d}a},\qquad\forall\,t\in[-1,1].\tag{7}\]
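The constant \(\int_{-1}^{1}(1-t^{2})^{\frac{n}{2}-\frac{3}{2}}\mathrm{d}t=\sqrt{\pi}\,\Gamma((n-1)/2)/\Gamma(n/2)\) used here can be confirmed numerically (a sketch using only the Python standard library; `simpson` is a hypothetical helper implementing the composite Simpson rule):

```python
from math import gamma, pi, sqrt

def simpson(f, a, b, m=20_000):
    """Composite Simpson rule on [a, b] with m (even) subintervals."""
    h = (b - a) / m
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, m))
    return s * h / 3

checks = []
for n in [3, 4, 5, 8]:
    lhs = simpson(lambda t: (1 - t * t)**(n / 2 - 1.5), -1.0, 1.0)
    rhs = sqrt(pi) * gamma((n - 1) / 2) / gamma(n / 2)
    checks.append((n, lhs, rhs))
    print(n, lhs, rhs)  # for n = 3 both sides equal 2
```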
Notation: Rising Factorial. For any \(x\in\mathbb{R}\) and for any integer \(d\geq1\), we denote \((x)_{d}\colonequals \prod_{j=0}^{d-1}(x+j)\).
Let \(C_{d}^{(\alpha)}\colon[-1,1]\to\mathbb{R}\) denote the index \(\alpha\) degree \(d\) Gegenbauer polynomial, which satisfies a Rodrigues formula [21] \[(1-t^{2})^{\alpha-1/2}C_{d}^{(\alpha)}(t)=\frac{(-2)^{d}(\alpha)_{d}}{d!(d+2\alpha)_{d}}\frac{\mathrm{d}^{d}}{\mathrm{d}t^{d}}(1-t^{2})^{\alpha+d-1/2},\qquad\forall\,t\in[-1,1].\] Letting \(\alpha\colonequals\frac{n}{2}-1\), we have \[\label{one1} (1-t^{2})^{\frac{n}{2}-\frac{3}{2}}C_{d}^{(\frac{n}{2}-1)}(t)=\frac{(-2)^{d}\Big(\frac{n}{2}-1\Big)_{d}}{d!(d+n-2)_{d}}\frac{\mathrm{d}^{d}}{\mathrm{d}t^{d}}(1-t^{2})^{\frac{n}{2}+d-\frac{3}{2}},\qquad\forall\,t\in[-1,1].\tag{8}\] From [21], \[\label{one2} C_{d}^{(\frac{n}{2}-1)}(1)=\frac{(n-2)_{d}}{d!}.\tag{9}\] Then [13] defines \[\label{lamdef} \begin{align} \lambda_{d,n}^{r,s} &\colonequals\frac{\int_{-1}^{1}\frac{C_{d}^{(\frac{n}{2}-1)}(t)}{C_{d}^{(\frac{n}{2}-1)}(1)}(1-t^{2})^{\frac{n}{2}-\frac{3}{2}} g(t)\mathrm{d}t}{\int_{-1}^{1}(1-t^{2})^{\frac{n}{2}-\frac{3}{2}}\mathrm{d}t} \stackrel{\eqref{one0p}}{=}\frac{\int_{-1}^{1}\frac{C_{d}^{(\frac{n}{2}-1)}(t)}{C_{d}^{(\frac{n}{2}-1)}(1)}(1-t^{2})^{\frac{n}{2}-\frac{3}{2}} e^{\frac{\rho rst}{1-\rho^{2}}}\mathrm{d}t}{\int_{-1}^{1}(1-t^{2})^{\frac{n}{2}-\frac{3}{2}}e^{\frac{\rho rst}{1-\rho^{2}}}\mathrm{d}t}\\ &\stackrel{\eqref{one1}\wedge\eqref{one2}}{=} \frac{(-2)^{d}(\frac{n}{2}-1)_{d}}{d!(d+n-2)_{d}}\frac{d!}{(n-2)_{d}} \frac{\int_{-1}^{1} \Big[\frac{\mathrm{d}^{d}}{\mathrm{d}t^{d}}(1-t^{2})^{\frac{n}{2}+d-\frac{3}{2}}\Big]e^{\frac{\rho rst}{1-\rho^{2}}}\mathrm{d}t}{\int_{-1}^{1}(1-t^{2})^{\frac{n}{2}-\frac{3}{2}}e^{\frac{\rho rst}{1-\rho^{2}}}\mathrm{d}t}. 
\end{align}\tag{10}\] Integrating by parts \(d\) times, \[\begin{align} \lambda_{d,n}^{r,s} &=\Big(\frac{\rho rs}{1-\rho^{2}}\Big)^{d} \frac{(-2)^{d}(\frac{n}{2}-1)_{d}}{(n-2)_{2d}} \frac{(-1)^{d}\int_{-1}^{1} (1-t^{2})^{\frac{n}{2}+d-\frac{3}{2}}e^{\frac{\rho rst}{1-\rho^{2}}}\mathrm{d}t}{\int_{-1}^{1}(1-t^{2})^{\frac{n}{2}-\frac{3}{2}}e^{\frac{\rho rst}{1-\rho^{2}}}\mathrm{d}t}. \end{align}\]
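For \(n=3\) (so \(\alpha=1/2\), the Gegenbauer polynomials are the Legendre polynomials, and \(C_{d}^{(1/2)}(1)=1\)), the formula after integration by parts simplifies, writing \(a\colonequals\rho rs/(1-\rho^{2})\), to \(\lambda_{d,3}^{r,s}=\frac{(a/2)^{d}}{d!}\int_{-1}^{1}(1-t^{2})^{d}e^{at}\mathrm{d}t\big/\int_{-1}^{1}e^{at}\mathrm{d}t\). A numerical comparison with the defining Gegenbauer integral in (10) (a sketch assuming scipy):

```python
from math import exp, factorial
from scipy.integrate import quad
from scipy.special import eval_legendre

a = 1.3  # stands for rho*r*s/(1 - rho^2)
den = quad(lambda t: exp(a * t), -1, 1)[0]
pairs = []
for d in [1, 2, 3]:
    # Defining integral (10) with n = 3: index-1/2 Gegenbauer = Legendre, C_d(1) = 1.
    direct = quad(lambda t: eval_legendre(d, t) * exp(a * t), -1, 1)[0] / den
    # The same eigenvalue after integrating by parts d times.
    ibp = (a / 2)**d / factorial(d) \
        * quad(lambda t: (1 - t * t)**d * exp(a * t), -1, 1)[0] / den
    pairs.append((direct, ibp))
    print(d, direct, ibp)  # the two columns agree
```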
In the case \(d=1\) we have \[\label{one6} \begin{align} \lambda_{1,n}^{r,s} &=\Big(\frac{\rho rs}{1-\rho^{2}}\Big) \frac{(2)(\frac{n}{2}-1)}{(n-1)}\frac{1}{(n-2)} \frac{\int_{-1}^{1} (1-t^{2})^{\frac{n}{2}-\frac{1}{2}}e^{\frac{\rho rst}{1-\rho^{2}}}\mathrm{d}t}{\int_{-1}^{1}(1-t^{2})^{\frac{n}{2}-\frac{3}{2}}e^{\frac{\rho rst}{1-\rho^{2}}}\mathrm{d}t}\\ &=\Big(\frac{\rho rs}{1-\rho^{2}}\Big)\Big(\frac{1}{n-1}\Big) \frac{\int_{-1}^{1} (1-t^{2})^{\frac{n}{2}-\frac{1}{2}}e^{\frac{\rho rst}{1-\rho^{2}}}\mathrm{d}t}{\int_{-1}^{1}(1-t^{2})^{\frac{n}{2}-\frac{3}{2}}e^{\frac{\rho rst}{1-\rho^{2}}}\mathrm{d}t}. \end{align}\tag{11}\]
The following lemma follows from the Cauchy-Schwarz inequality and from the spherical harmonic decomposition (5).
Lemma 1 ([13]). Let \(r,s>0\). Let \(\rho>0\). Let \(f_{r},f_{s}\colon S^{n-1}\to S^{n-1}\). Then \[\mathbb{E}_{(U,V)\sim N_{\rho}^{r,s}}\langle f_{r}(U),f_{s}(V)\rangle\leq\langle \mathbb{E}f_{r},\mathbb{E}f_{s}\rangle+\lambda_{1,n}^{r,s}\sqrt{\mathbb{E}\left\|f_{r}-\mathbb{E}f_{r}\right\|^{2}}\sqrt{\mathbb{E}\left\|f_{s}-\mathbb{E}f_{s}\right\|^{2}}.\] Equality holds only when \(f_{r}(x)=f_{s}(x)=Mx\) for all \(x\in S^{n-1}\), where \(M\) is an \(n\times n\) real orthogonal matrix. \[\mathbb{E}_{(U,V)\sim N_{\rho}^{r,s}}\langle f_{r}(U),f_{s}(V)\rangle\geq\langle \mathbb{E}f_{r},\mathbb{E}f_{s}\rangle-\lambda_{1,n}^{r,s}\sqrt{\mathbb{E}\left\|f_{r}-\mathbb{E}f_{r}\right\|^{2}}\sqrt{\mathbb{E}\left\|f_{s}-\mathbb{E}f_{s}\right\|^{2}}.\] Equality holds only when \(f_{r}(x)=-f_{s}(x)=Mx\) for all \(x\in S^{n-1}\), where \(M\) is an \(n\times n\) real orthogonal matrix.
\(\mathbb{E}\) with no subscript denotes expected value on a sphere with respect to the uniform (Haar) probability measure.
\(\mathbb{E}_{(U,V)\sim N_{\rho}^{r,s}}\) denotes expected value with respect to \((U,V)\) from Definition 4.
\(\underset{X\sim_{\rho}Y}{\mathbb{E}}\) denotes expected value with respect to \((X,Y)\) from Definition 3.
\(\mathbb{E}_{R,S}\) denotes expected value with respect to \(R,S\) where \(R=\left\|X\right\|,S=\left\|Y\right\|\), and \(X,Y\) are two standard \(\rho\)-correlated Gaussians, as in Definition 3.
\(\mathbb{E}_{\gamma}\) denotes expected value with respect to the Gaussian density \(\gamma_{n}\).
Our first step towards proving Theorem 3 with \(\rho>0\) will be modifying Lemma 1. Lemmas 2 and 3 below demonstrate that optimizing the noise stability \(\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f(X),f(Y)\rangle\) involves an interplay between the cases where \(\mathbb{E}f_{\left\|X\right\|}\) has norm \(0\) and norm \(1\).
Lemma 2. Let \(f\colon \mathbb{R}^{n}\to S^{n-1}\) be continuous. For any \(r>0\), denote \(f_{r}\colonequals f|_{rS^{n-1}}\), and let \(\mathbb{E}f_{r}\) denote the expected value of \(f_{r}\) on \(rS^{n-1}\) with respect to the uniform probability (Haar) measure on \(S^{n-1}\). Then \[\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f(X),f(Y)\rangle \leq\underset{X\sim_{\rho}Y}{\mathbb{E}}\Big(\langle \mathbb{E}f_{\left\|X\right\|}, \mathbb{E}f_{\left\|Y\right\|}\rangle+\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}\sqrt{1-\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}}\sqrt{1-\left\|\mathbb{E}f_{\left\|Y\right\|}\right\|^{2}}\Big).\] Equality holds only when \(f(x)=Mx/\left\|Mx\right\|\) for a.e. \(x\in\mathbb{R}^{n}\), where \(M\) is an \(n\times n\) real orthogonal matrix.
Proof. We first write (recalling the notation of Section 1.6) \[\mathbb{E}_{X\sim_{\rho}Y}\langle f(X),f(Y)\rangle =\mathbb{E}_{R,S}\mathbb{E}_{(U,V)\sim N_{\rho}^{R,S}}\langle f_{R}(U),f_{S}(V)\rangle.\] Applying Lemma 1 and averaging over \(R,S\), and also using that \(f\) takes values in \(S^{n-1}\), so \(\mathbb{E}\left\|f_{S}-\mathbb{E}f_{S}\right\|^{2}=1+\left\|\mathbb{E}f_{S}\right\|^{2}-2\langle\mathbb{E}f_{S},\mathbb{E}f_{S}\rangle=1-\left\|\mathbb{E}f_{S}\right\|^{2}\), \[\begin{align} &\underset{R,S}{\mathbb{E}}\underset{(U,V)\sim N_{\rho}^{R,S}}{\mathbb{E}}\langle f_{R}(U),f_{S}(V)\rangle\\ &\qquad\leq\underset{R,S}{\mathbb{E}}\langle \mathbb{E}f_{R},\mathbb{E}f_{S}\rangle+\underset{R,S}{\mathbb{E}}\lambda_{1,n}^{R,S}\sqrt{\mathbb{E}\left\|f_{R}-\mathbb{E}f_{R}\right\|^{2}}\sqrt{\mathbb{E}\left\|f_{S}-\mathbb{E}f_{S}\right\|^{2}}\\ &\qquad=\underset{R,S}{\mathbb{E}}\langle \mathbb{E}f_{R},\mathbb{E}f_{S}\rangle+\underset{R,S}{\mathbb{E}}\lambda_{1,n}^{R,S}\sqrt{\mathbb{E}(1-\left\|\mathbb{E}f_{R}\right\|^{2})}\sqrt{\mathbb{E}(1-\left\|\mathbb{E}f_{S}\right\|^{2})}\\ &\qquad=\underset{X\sim_{\rho}Y}{\mathbb{E}}\Big(\langle \mathbb{E}f_{\left\|X\right\|}, \mathbb{E}f_{\left\|Y\right\|}\rangle+\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}\sqrt{1-\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}}\sqrt{1-\left\|\mathbb{E}f_{\left\|Y\right\|}\right\|^{2}}\Big). \end{align}\] The equality case follows from the equality case of Lemma 1 ◻
Using again the notation of Lemma 2, we have
Lemma 3. Let \(f\colon \mathbb{R}^{n}\to S^{n-1}\). Then \[\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f(X),f(Y)\rangle \leq\underset{X\sim_{\rho}Y}{\mathbb{E}}\Big(\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|} +\langle \mathbb{E}f_{\left\|X\right\|}, \mathbb{E}f_{\left\|Y\right\|}\rangle -\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}\Big).\] Equality holds only when \(f(x)=Mx/\left\|Mx\right\|\) for a.e. \(x\in\mathbb{R}^{n}\), where \(M\) is an \(n\times n\) real orthogonal matrix.
Proof. Applying the inequality \(\left|ab\right|\leq (1/2)(a^{2}+b^{2})\) \(\forall\) \(a,b\in\mathbb{R}\), we have \[\begin{align} &\underset{X\sim_{\rho}Y}{\mathbb{E}}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}\sqrt{1-\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}}\sqrt{1-\left\|\mathbb{E}f_{\left\|Y\right\|}\right\|^{2}}\\ &\qquad\qquad\leq\underset{X\sim_{\rho}Y}{\mathbb{E}}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}(1-\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2})\\ &\qquad\qquad=\underset{X\sim_{\rho}Y}{\mathbb{E}}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}-\underset{X\sim_{\rho}Y}{\mathbb{E}}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}. \end{align}\] Combining with Lemma 2 concludes the proof. ◻
As in the previous section, our first step towards proving Theorem 3 with \(\rho<0\) will be modifying Lemma 1. Lemmas 4 and 5 below demonstrate that optimizing the bilinear noise stability \(\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f(X),g(Y)\rangle\) involves an interplay between the cases where \(\mathbb{E}f_{\left\|X\right\|}\) has norm \(0\) and norm \(1\), and similarly for \(\mathbb{E}g_{\left\|Y\right\|}\).
Lemma 4. Let \(f,g\colon \mathbb{R}^{n}\to S^{n-1}\) be continuous. For any \(r>0\), denote \(f_{r}\colonequals f|_{rS^{n-1}}\), and let \(\mathbb{E}f_{r}\) denote the expected value of \(f_{r}\) on \(rS^{n-1}\) with respect to the uniform probability (Haar) measure on \(S^{n-1}\). Then \[\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f(X), g(Y)\rangle \geq\underset{X\sim_{\rho}Y}{\mathbb{E}}\Big(\langle \mathbb{E}f_{\left\|X\right\|}, \mathbb{E}g_{\left\|Y\right\|}\rangle-\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}\sqrt{1-\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}}\sqrt{1-\left\|\mathbb{E}g_{\left\|Y\right\|}\right\|^{2}}\Big).\] Equality holds only when \(f(x)=-g(x)=Mx/\left\|Mx\right\|\) for a.e. \(x\in\mathbb{R}^{n}\), where \(M\) is an \(n\times n\) real orthogonal matrix.
Proof. We first write (recalling the notation of Section 1.6) \[\mathbb{E}_{X\sim_{\rho}Y}\langle f(X),g(Y)\rangle =\mathbb{E}_{R,S}\mathbb{E}_{(U,V)\sim N_{\rho}^{R,S}}\langle f_{R}(U),g_{S}(V)\rangle.\] Applying Lemma 1 and averaging over \(R,S\), and also using that \(f\) takes values in \(S^{n-1}\), so that \(\mathbb{E}\left\|f_{R}-\mathbb{E}f_{R}\right\|^{2}=1+\left\|\mathbb{E}f_{R}\right\|^{2}-2\langle\mathbb{E}f_{R},\mathbb{E}f_{R}\rangle=1-\left\|\mathbb{E}f_{R}\right\|^{2}\), and similarly for \(g\), we get \[\begin{align} &\underset{R,S}{\mathbb{E}}\underset{(U,V)\sim N_{\rho}^{R,S}}{\mathbb{E}}\langle f_{R}(U),g_{S}(V)\rangle\\ &\qquad\geq\underset{R,S}{\mathbb{E}}\langle \mathbb{E}f_{R},\mathbb{E}g_{S}\rangle-\underset{R,S}{\mathbb{E}}\lambda_{1,n}^{R,S}\sqrt{\mathbb{E}\left\|f_{R}-\mathbb{E}f_{R}\right\|^{2}}\sqrt{\mathbb{E}\left\|g_{S}-\mathbb{E}g_{S}\right\|^{2}}\\ &\qquad=\underset{R,S}{\mathbb{E}}\langle \mathbb{E}f_{R},\mathbb{E}g_{S}\rangle-\underset{R,S}{\mathbb{E}}\lambda_{1,n}^{R,S}\sqrt{\mathbb{E}(1-\left\|\mathbb{E}f_{R}\right\|^{2})}\sqrt{\mathbb{E}(1-\left\|\mathbb{E}g_{S}\right\|^{2})}\\ &\qquad=\underset{X\sim_{\rho}Y}{\mathbb{E}}\Big(\langle \mathbb{E}f_{\left\|X\right\|}, \mathbb{E}g_{\left\|Y\right\|}\rangle-\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}\sqrt{1-\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}}\sqrt{1-\left\|\mathbb{E}g_{\left\|Y\right\|}\right\|^{2}}\Big). \end{align}\] The equality case follows from the equality case of Lemma 1. ◻
Using again the notation of Lemma 4, we have
Lemma 5. Let \(f,g\colon \mathbb{R}^{n}\to S^{n-1}\). Then \[\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f(X),g(Y)\rangle \geq\underset{X\sim_{\rho}Y}{\mathbb{E}}\Big(-\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|} +\langle \mathbb{E}f_{\left\|X\right\|}, \mathbb{E}g_{\left\|Y\right\|}\rangle +\frac{1}{2}(\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}+\left\|\mathbb{E}g_{\left\|Y\right\|}\right\|^{2})\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}\Big).\] Equality holds only when \(f(x)=-g(x)=Mx/\left\|Mx\right\|\) for a.e. \(x\in\mathbb{R}^{n}\), where \(M\) is an \(n\times n\) real orthogonal matrix.
Proof. Applying the inequality \(\left|ab\right|\leq (1/2)(a^{2}+b^{2})\) \(\forall\) \(a,b\in\mathbb{R}\), we have \[\begin{align} &-\underset{X\sim_{\rho}Y}{\mathbb{E}}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}\sqrt{1-\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}}\sqrt{1-\left\|\mathbb{E}g_{\left\|Y\right\|}\right\|^{2}}\\ &\qquad\geq-\underset{X\sim_{\rho}Y}{\mathbb{E}}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}(1-[\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}+\left\|\mathbb{E}g_{\left\|Y\right\|}\right\|^{2}]/2)\\ &\qquad=-\underset{X\sim_{\rho}Y}{\mathbb{E}}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}+\underset{X\sim_{\rho}Y}{\mathbb{E}}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}(\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}+\left\|\mathbb{E}g_{\left\|Y\right\|}\right\|^{2})/2. \end{align}\] Combining with Lemma 4 concludes the proof. ◻
In this section, we derive some bounds on the first eigenvalue \(\lambda_{1,n}^{r,s}\) appearing in (5) and defined in (10). From (11), we therefore need to control the function
\[\label{two1} \frac{\int_{-1}^{1} (1-t^{2})^{\frac{n}{2}-\frac{1}{2}}e^{\frac{\rho rst}{1-\rho^{2}}}\mathrm{d}t}{\int_{-1}^{1}(1-t^{2})^{\frac{n}{2}-\frac{3}{2}}e^{\frac{\rho rst}{1-\rho^{2}}}\mathrm{d}t}.\tag{12}\]
This is a ratio of modified Bessel functions of the first kind. To recall their definition, first recall the Bessel function \(J_{\alpha}\) of the first kind of order \(\alpha\geq0\) is [21] \[J_{\alpha}(x)\colonequals\frac{1}{\sqrt{\pi}\Gamma(\alpha+1/2)}(x/2)^{\alpha}\int_{-1}^{1}e^{ixt}(1-t^{2})^{\alpha-1/2}\mathrm{d}t,\qquad\forall\,x\in\mathbb{R}.\] The modified Bessel function \(I_{\alpha}\) of the first kind of order \(\alpha\geq0\) is then [21] \[\begin{align} I_{\alpha}(x) \colonequals i^{-\alpha}J_{\alpha}(ix) &=\frac{1}{\sqrt{\pi}\Gamma(\alpha+1/2)}(x/2)^{\alpha}\int_{-1}^{1}e^{-xt}(1-t^{2})^{\alpha-1/2}\mathrm{d}t\\ &=\frac{1}{\sqrt{\pi}\Gamma(\alpha+1/2)}(x/2)^{\alpha}\int_{-1}^{1}e^{xt}(1-t^{2})^{\alpha-1/2}\mathrm{d}t,\qquad\forall\,x\in\mathbb{R}. \end{align}\] \[\frac{I_{\alpha+1}(a)}{I_{\alpha}(a)} =\frac{(a/2)}{\alpha+1/2}\frac{\int_{-1}^{1}e^{at}(1-t^{2})^{\alpha+1/2}\mathrm{d}t}{\int_{-1}^{1}e^{at}(1-t^{2})^{\alpha-1/2}\mathrm{d}t},\qquad\forall\,a\in\mathbb{R},\,\forall\,\alpha\geq0.\]
So, setting \(\alpha=(n/2)-1\) here, the ratio from 12 is equal to \[\frac{n-1}{2}\Big(\frac{2(1-\rho^{2})}{\rho rs}\Big)\frac{I_{n/2}(\rho rs/[1-\rho^{2}])}{I_{(n/2)-1}(\rho rs/[1-\rho^{2}])}.\] Combining with 11 , we have \[\label{lameq} \lambda_{1,n}^{r,s} =\frac{\rho rs}{1-\rho^{2}}\frac{1}{n-1}\frac{n-1}{2}\Big(\frac{2(1-\rho^{2})}{\rho rs}\Big)\frac{I_{n/2}(\rho rs/[1-\rho^{2}])}{I_{(n/2) -1}(\rho rs/[1-\rho^{2}])} =\frac{I_{n/2}(\rho rs/[1-\rho^{2}])}{I_{(n/2)-1}(\rho rs/[1-\rho^{2}])}.\tag{13}\]
We have [22] \[\label{two2} \frac{a}{\alpha+1+\sqrt{(\alpha+1)^{2}+a^{2}}}\leq\frac{I_{\alpha+1}(a)}{I_{\alpha}(a)}\leq\frac{a}{\alpha+\sqrt{\alpha^{2}+a^{2}}},\qquad\forall\,a,\alpha\geq0.\tag{14}\] Therefore, \[\label{two3} \lambda_{1,n}^{r,s}\geq\frac{\rho rs/[1-\rho^{2}]}{n/2+\sqrt{(n/2)^{2}+[\rho rs/[1-\rho^{2}]]^{2}}},\qquad\forall\,r,s>0,\,\forall\,\rho\in(0,1).\tag{15}\]
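As a numerical sanity check (ours, not part of the proof), the sandwich bounds 14 can be probed in Python by summing the power series \(I_{\alpha}(x)=\sum_{m\geq0}(x/2)^{2m+\alpha}/(m!\,\Gamma(m+\alpha+1))\); the function names below are our own:

```python
import math

def bessel_i(alpha, x, terms=60):
    # power series I_alpha(x) = sum_{m>=0} (x/2)^(2m+alpha) / (m! * Gamma(m+alpha+1))
    return sum((x / 2.0) ** (2 * m + alpha)
               / (math.factorial(m) * math.gamma(m + alpha + 1))
               for m in range(terms))

def ratio_bounds_hold(alpha, a):
    # check the sandwich bounds (14) on I_{alpha+1}(a) / I_alpha(a)
    r = bessel_i(alpha + 1, a) / bessel_i(alpha, a)
    lower = a / (alpha + 1 + math.sqrt((alpha + 1) ** 2 + a ** 2))
    upper = a / (alpha + math.sqrt(alpha ** 2 + a ** 2))
    return lower <= r <= upper

# probe a small grid of orders and arguments
ok = all(ratio_bounds_hold(alpha, a)
         for alpha in (0.5, 1.0, 1.5, 3.0)
         for a in (0.1, 1.0, 5.0, 20.0))
print(ok)
```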
In order to use our lower bounds on \(\lambda_{1,n}^{r,s}\) in Lemma 3, we must average \(\lambda_{1,n}^{r,s}\) over \(r>0\). Let \(a>0\). Then
\[\int_{0}^{\infty}\frac{r^{3}e^{-r^{2}/(2a^{2})}}{1+\sqrt{1+r^{2}}}\mathrm{d}r=e^{1/(2a^{2})}a^{3}\int_{1/a}^{\infty}e^{-t^{2}/2}\mathrm{d}t.\]
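This identity can be checked numerically; the following Python sketch (ours, using a crude midpoint rule and the identity \(\int_{c}^{\infty}e^{-t^{2}/2}\,\mathrm{d}t=\sqrt{\pi/2}\,\mathrm{erfc}(c/\sqrt{2})\)) agrees to within quadrature error:

```python
import math

def gauss_lhs(a, n=100_000):
    # midpoint-rule quadrature of the left-hand integral on [0, 15a + 15]
    upper = 15.0 * a + 15.0
    h = upper / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += r ** 3 * math.exp(-r * r / (2.0 * a * a)) / (1.0 + math.sqrt(1.0 + r * r))
    return total * h

def gauss_rhs(a):
    # e^{1/(2a^2)} a^3 * Integral_{1/a}^infty e^{-t^2/2} dt, written via erfc
    tail = math.sqrt(math.pi / 2.0) * math.erfc(1.0 / (a * math.sqrt(2.0)))
    return math.exp(1.0 / (2.0 * a * a)) * a ** 3 * tail

print(all(abs(gauss_lhs(a) - gauss_rhs(a)) < 1e-5 for a in (0.5, 1.0, 2.0)))
```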
Applying the lower bound 14 and then setting \(n=3\), we get
\[\label{two9} \begin{align} &\int_{0}^{\infty}r^{n-1}\frac{I_{n/2}(ar)}{I_{(n/2)-1}(ar)}e^{-r^{2}/2}\mathrm{d}r \stackrel{\eqref{two2}}{\geq} \int_{0}^{\infty}r^{n-1}\frac{ar}{n/2+\sqrt{(n/2)^{2}+a^{2}r^{2}}}e^{-r^{2}/2}\mathrm{d}r\\ &= \int_{0}^{\infty}r^{n-1}\frac{2ar/n}{1+\sqrt{1+4a^{2}r^{2}/n^{2}}}e^{-r^{2}/2}\mathrm{d}r =(n/2a)\int_{0}^{\infty}(nr/[2a])^{n-1}\frac{r}{1+\sqrt{1+r^{2}}}e^{-n^{2}r^{2}/(8a^{2})}\mathrm{d}r\\ &=(3/2a)^{3}\int_{0}^{\infty}\frac{r^{3}}{1+\sqrt{1+r^{2}}}e^{-9r^{2}/(8a^{2})}\mathrm{d}r =(3/2a)^{3}e^{9/(8a^{2})}(2/3)^{3}a^{3}\int_{3/2a}^{\infty}e^{-t^{2}/2}\mathrm{d}t\\ &=e^{9/(8a^{2})}\int_{3/2a}^{\infty}e^{-t^{2}/2}\mathrm{d}t. \end{align}\tag{16}\]
When \(d=1\) and \(n=3\), we have an explicit expression for \(\lambda_{1,n}^{r,s}\). \[\label{one9} \begin{align} \lambda_{1,n}^{r,s} &\stackrel{\eqref{lameq}}{=}\Big(\frac{\rho rs}{1-\rho^{2}}\Big)\Big(\frac{1}{2}\Big) \frac{\int_{-1}^{1} (1-t^{2})e^{\frac{\rho rst}{1-\rho^{2}}}\mathrm{d}t}{\int_{-1}^{1}e^{\frac{\rho rst}{1-\rho^{2}}}\mathrm{d}t} =1-\frac{1}{\Big(\frac{\rho rs}{1-\rho^{2}}\Big)}+\frac{2}{e^{\frac{2\rho rs}{1-\rho^{2}}}-1}, \end{align}\tag{17}\] for all \(r,s>0\), and for all \(0<\rho<1\). Here we used, with \(a=\rho rs/(1-\rho^{2})\), \[\int_{-1}^{1}e^{at}\mathrm{d}t=\frac{1}{a}[e^{a}-e^{-a}].\] \[\begin{align} \int_{-1}^{1}(1-t^{2})e^{at}\mathrm{d}t &=\frac{1}{a}\int_{-1}^{1}(1-t^{2})\frac{\mathrm{d}}{\mathrm{d}t}e^{at}\mathrm{d}t =\frac{1}{a}\int_{-1}^{1}2te^{at}\mathrm{d}t\\ &=\frac{1}{a^{2}}\int_{-1}^{1}2t\frac{\mathrm{d}}{\mathrm{d}t}e^{at}\mathrm{d}t =\frac{1}{a^{2}}\Big[2te^{at}|_{t=-1}^{t=1}-\int_{-1}^{1}2e^{at}\mathrm{d}t\Big]\\ &=\frac{1}{a^{2}}\Big[2(e^{a}+e^{-a})-\frac{2}{a}[e^{a}-e^{-a}]\Big].\\ \end{align}\] \[\frac{\int_{-1}^{1}(1-t^{2})e^{at}\mathrm{d}t}{\int_{-1}^{1}e^{at}\mathrm{d}t} =\frac{1}{a}\Big(2\frac{e^{a}+e^{-a}}{e^{a}-e^{-a}}-\frac{2}{a}\Big) =\frac{1}{a}\Big(2+4\frac{e^{-a}}{e^{a}-e^{-a}}-\frac{2}{a}\Big) =\frac{1}{a}\Big(2+4\frac{1}{e^{2a}-1}-\frac{2}{a}\Big).\]
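As a quick numerical check of 17 (ours, not from the paper), one can compare the ratio \((a/2)\int_{-1}^{1}(1-t^{2})e^{at}\,\mathrm{d}t\big/\int_{-1}^{1}e^{at}\,\mathrm{d}t\) against the closed form \(1-1/a+2/(e^{2a}-1)\):

```python
import math

def ratio_quad(a, n=100_000):
    # midpoint quadrature on [-1, 1]; the step size cancels in the ratio
    h = 2.0 / n
    num = den = 0.0
    for i in range(n):
        t = -1.0 + (i + 0.5) * h
        w = math.exp(a * t)
        num += (1.0 - t * t) * w
        den += w
    return (a / 2.0) * num / den

def ratio_closed(a):
    # closed form from (17), with a = rho*r*s/(1 - rho^2)
    return 1.0 - 1.0 / a + 2.0 / (math.exp(2.0 * a) - 1.0)

print(all(abs(ratio_quad(a) - ratio_closed(a)) < 1e-6 for a in (0.3, 1.0, 4.0)))
```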
The following inequality allows us to show that the second term in Lemma 3 is smaller than the last term. The proof amounts to an elementary truncated heat kernel bound, though with a change of measure using the \(\lambda_{1,n}^{r,s}\) term from 11 (i.e. 17 ).
Lemma 6. Let \(f\colon\mathbb{R}^{3}\to S^{2}\) be a radial function (for any \(r>0\), the function \(f|_{rS^{2}}\) is constant). Let \(\phi\colon\mathbb{R}^{3}\to(0,\infty)\). Then \[\begin{align} &\left|\int_{\mathbb{R}^{3}}\langle f(x)-\mathbb{E}_{\gamma}f,\, T_{\rho}[f-\mathbb{E}_{\gamma}f](x)\rangle\gamma_{3}(x)\,\mathrm{d}x\right|\\ &\leq \int_{\mathbb{R}^{3}}\phi(x)\left\|f(x)\right\|^{2}\gamma_{3}(x)\,\mathrm{d}x \sum_{d\geq2\colon d\,\mathrm{even}}\rho^{d}\sum_{k\in(2\mathbb{N})^{3}\colon \left\|k\right\|_{1}=d}\int_{\mathbb{R}^{3}} \left|\sqrt{k!}h_{k}(x)\frac{1}{\sqrt{\phi(x)}}\right|^{2}\gamma_{3}(x)\,\mathrm{d}x. \end{align}\] In particular, if \(\phi(x)=1-\frac{1}{\rho\left\|x\right\|}+\frac{2}{e^{2\rho\left\|x\right\|}-1}\) for all \(x\in\mathbb{R}^{3}\), and if \(0<\rho<1/9\), then \[\left|\int_{\mathbb{R}^{3}}\langle f(x)-\mathbb{E}_{\gamma}f,\, T_{\rho}[f-\mathbb{E}_{\gamma}f](x)\rangle\gamma_{3}(x)\,\mathrm{d}x\right| \leq(9.4\rho)\int_{\mathbb{R}^{3}}\phi(x)\left\|f(x)\right\|^{2}\gamma_{3}(x)\mathrm{d}x.\]
Proof. Let \(h_{0},h_{1},\ldots\colon\mathbb{R}\to\mathbb{R}\) be the Hermite polynomials with \(h_{m}(x)=\sum_{k=0}^{\lfloor m/2\rfloor}\frac{x^{m-2k}(-1)^{k}2^{-k}}{k!(m-2k)!}\) for all integers \(m\geq0\). It is well known [23] that \(\{\sqrt{m!}h_{m}\}_{m\geq0}\) is an orthonormal basis of the Hilbert space of functions \(\mathbb{R}\to\mathbb{R}\) that are square-integrable with respect to \(\gamma_{1}\), equipped with the inner product \(\langle g,h\rangle\colonequals\int_{\mathbb{R}}g(x)h(x)\gamma_{1}(x)\,\mathrm{d}x\). For any \(k\in\mathbb{N}^{3}\), define \(k!\colonequals k_{1}!\cdot k_{2}!\cdot k_{3}!\), \(\left\|k\right\|_{1}\colonequals\left|k_{1}\right|+\left|k_{2}\right|+\left|k_{3}\right|\), and \(h_{k}(x)\colonequals h_{k_{1}}(x_{1})h_{k_{2}}(x_{2})h_{k_{3}}(x_{3})\) for all \(x\in\mathbb{R}^{3}\). \[\begin{align} &\int_{\mathbb{R}^{3}}\langle f(x)-\mathbb{E}_{\gamma}f,\, T_{\rho}[f-\mathbb{E}_{\gamma}f](x)\rangle\gamma_{3}(x)\,\mathrm{d}x\\ &=\sum_{d\geq2\colon d\,\mathrm{even}}\rho^{d}\sum_{k\in(2\mathbb{N})^{3}\colon \left\|k\right\|_{1}=d}\left\|\int_{\mathbb{R}^{3}} \sqrt{k!}h_{k}(x) (f(x)-\mathbb{E}_{\gamma}f)\gamma_{3}(x)\,\mathrm{d}x\right\|^{2}\\ &=\sum_{d\geq2\colon d\,\mathrm{even}}\rho^{d}\sum_{k\in(2\mathbb{N})^{3}\colon \left\|k\right\|_{1}=d}\left\|\int_{\mathbb{R}^{3}} \sqrt{k!}h_{k}(x) f(x)\gamma_{3}(x)\,\mathrm{d}x\right\|^{2}\\ &=\sum_{d\geq2\colon d\,\mathrm{even}}\rho^{d}\sum_{k\in(2\mathbb{N})^{3}\colon \left\|k\right\|_{1}=d}\left\|\int_{\mathbb{R}^{3}} \sqrt{k!}h_{k}(x)\frac{1}{\sqrt{\phi(x)}} \sqrt{\phi(x)}f(x)\gamma_{3}(x)\,\mathrm{d}x\right\|^{2}\\ &\leq\sum_{d\geq2\colon d\,\mathrm{even}}\rho^{d}\sum_{\substack{k\in(2\mathbb{N})^{3}\colon\\ \left\|k\right\|_{1}=d}}\int_{\mathbb{R}^{3}} \left|\sqrt{k!}h_{k}(x)\frac{1}{\sqrt{\phi(x)}}\right|^{2}\gamma_{3}(x)\,\mathrm{d}x \cdot\int_{\mathbb{R}^{3}}\phi(x)\left\|f(x)\right\|^{2}\gamma_{3}(x)\,\mathrm{d}x\\ &=\int_{\mathbb{R}^{3}}\phi(x)\left\|f(x)\right\|^{2}\gamma_{3}(x)\,\mathrm{d}x\cdot \sum_{d\geq2\colon d\,\mathrm{even}}\rho^{d}\sum_{\substack{k\in(2\mathbb{N})^{3}\colon\\ \left\|k\right\|_{1}=d}}\int_{\mathbb{R}^{3}} \left|\sqrt{k!}h_{k}(x)\frac{1}{\sqrt{\phi(x)}}\right|^{2}\gamma_{3}(x)\,\mathrm{d}x. \end{align}\]
We will use the inequality \[\frac{r^{2}}{\phi(r)}\leq \frac{3}{\rho}r+r^{2},\qquad\forall\,r>0,\] which can e.g. be verified in Matlab:
rho=.1;
r=linspace(.1,20,1000);
rsqphi = r.^2 ./ (1 - 1./(rho*r) + 2./(exp(2*rho*r) -1));
plot(r, rsqphi , r, 3*r/rho + r.^2);
legend('r^2 / phi','upper bound');
if sum( rsqphi - (3*r/rho + r.^2)>0)==0, fprintf('Verified\n'), end
We then have \[\begin{align} &\sum_{d\geq2\colon d\,\mathrm{even}}\rho^{d}\sum_{k\in(2\mathbb{N})^{3}\colon \left\|k\right\|_{1}=d}\int_{\mathbb{R}^{3}} \left|\sqrt{k!}h_{k}(x)\frac{1}{\sqrt{\phi(x)}}\right|^{2}\gamma_{3}(x)\,\mathrm{d}x\\ &\qquad=\int_{\mathbb{R}^{3}}\frac{1}{\phi(x)}\Big[(2\pi)^{3/2}\frac{G_{\rho}(x,x)+G_{\rho}(x,-x)}{2e^{-\left\|x\right\|^{2}/2}} - \gamma_{3}(x)\Big]\, \mathrm{d}x\\ &\qquad=\int_{\mathbb{R}^{3}}\frac{1}{\phi(x)}\Big[\frac{1}{(2\pi)^{3/2}}\frac{1}{(1-\rho^{2})^{3/2}} e^{-\frac{\left\|x\right\|^{2}}{1-\rho^{2}}} \frac{e^{\frac{\rho\left\|x\right\|^{2}}{1-\rho^{2}}}+e^{-\frac{\rho\left\|x\right\|^{2}}{1-\rho^{2}}}}{2e^{-\left\|x\right\|^{2}/2}} - \gamma_{3}(x)\Big]\, \mathrm{d}x\\ &\qquad\leq\sqrt{\frac{2}{\pi}}\int_{r=0}^{\infty}\Big(\frac{3}{\rho}r+r^{2}\Big)\frac{1}{2(1-\rho^{2})^{3/2}}\\ &\qquad\qquad\qquad\qquad\cdot\Big(-2(1-\rho^{2})^{3/2}e^{-r^{2}/2}+e^{-r^{2}\frac{1-\rho}{1-\rho^{2}}}e^{r^{2}/2}+e^{-r^{2}\frac{1+\rho}{1-\rho^{2}}}e^{r^{2}/2}\Big)\, \mathrm{d}r\\ &\qquad=\sqrt{\frac{2}{\pi}}\int_{r=0}^{\infty}\Big(\frac{3}{\rho}r+r^{2}\Big)\frac{1}{2(1-\rho^{2})^{3/2}}\\ &\qquad\qquad\qquad\qquad\cdot\Big(-2(1-\rho^{2})^{3/2}e^{-r^{2}/2}+e^{-r^{2}\left(\frac{1}{1+\rho}-\frac{1}{2}\right)}+e^{-r^{2}\left(\frac{1}{1-\rho}-\frac{1}{2}\right)}\Big)\, \mathrm{d}r\\ &\qquad=\sqrt{\frac{2}{\pi}}\frac{1}{2(1-\rho^{2})^{3/2}}\frac{3}{\rho}\Big(-2(1-\rho^{2})^{3/2}+\frac{1}{2[\frac{1}{1+\rho}-\frac{1}{2}]}+ \frac{1}{2[\frac{1}{1-\rho}-\frac{1}{2}]} \Big)\\ &\qquad\qquad+ \sqrt{\frac{2}{\pi}}\frac{1}{(1-\rho^{2})^{3/2}}\frac{\sqrt{\pi}}{8}\Big(-2\cdot 2^{3/2}(1-\rho^{2})^{3/2}+\frac{1}{[\frac{1}{1+\rho}-\frac{1}{2}]^{3/2}}+\frac{1}{[\frac{1}{1-\rho}-\frac{1}{2}]^{3/2}}\Big). \end{align}\]
When \(\rho<1/9\), this quantity is upper bounded by \(9.4\rho\). ◻
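The final bound can be reproduced numerically; the following Python sketch (ours, not part of the proof) evaluates the closed-form expression above and checks it against \(9.4\rho\) on a grid in \((0,1/9)\):

```python
import math

def hermite_sum(rho):
    # closed-form value of the weighted Hermite sum computed above
    c1 = 1.0 / (1.0 + rho) - 0.5   # = (1 - rho) / (2(1 + rho))
    c2 = 1.0 / (1.0 - rho) - 0.5   # = (1 + rho) / (2(1 - rho))
    p = (1.0 - rho * rho) ** 1.5
    first = math.sqrt(2.0 / math.pi) / (2.0 * p) * (3.0 / rho) * (
        -2.0 * p + 1.0 / (2.0 * c1) + 1.0 / (2.0 * c2))
    second = math.sqrt(2.0 / math.pi) / p * math.sqrt(math.pi) / 8.0 * (
        -2.0 * 2.0 ** 1.5 * p + c1 ** -1.5 + c2 ** -1.5)
    return first + second

rhos = [i / 1000.0 for i in range(1, 112)]   # 0.001 .. 0.111 < 1/9
print(all(hermite_sum(r) <= 9.4 * r for r in rhos))
```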
For the negative correlation case of the main result, we require a bilinear version of Lemma 6 above. Lemma 7 follows from Lemma 6 and the Cauchy-Schwarz inequality.
Lemma 7. Let \(f,g\colon\mathbb{R}^{3}\to S^{2}\) be radial functions (for any \(r>0\), the function \(f|_{rS^{2}}\) is constant). Let \(\phi(x)=1-\frac{1}{\rho\left\|x\right\|}+\frac{2}{e^{2\rho\left\|x\right\|}-1}\) for all \(x\in\mathbb{R}^{3}\). If \(0<\rho<1/9\), then \[\left|\int_{\mathbb{R}^{3}}\langle f(x)-\mathbb{E}_{\gamma}f,\, T_{\rho}[g-\mathbb{E}_{\gamma}g](x)\rangle\gamma_{3}(x)\,\mathrm{d}x\right| \leq(9.4\rho)\int_{\mathbb{R}^{3}}\phi(x)\frac{1}{2}[\left\|f(x)\right\|^{2}+\left\|g(x)\right\|^{2}]\gamma_{3}(x)\mathrm{d}x.\]
Proof of Theorem 3 when \(\rho>0\). The dimension reduction Theorem [13] implies that we may assume \(n=k=3\). From Lemma 3, \[\label{four0} \underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f(X),f(Y)\rangle -\underset{X\sim_{\rho}Y}{\mathbb{E}}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|} \leq \underset{X\sim_{\rho}Y}{\mathbb{E}}\Big(\langle \mathbb{E}f_{\left\|X\right\|}, \mathbb{E}f_{\left\|Y\right\|}\rangle -\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}\Big).\tag{18}\] It remains to show that the right side is nonpositive. Since \(\mathbb{E}_{\gamma}f\colonequals\int_{\mathbb{R}^{n}}f(x)\gamma_{n}(x)\,\mathrm{d}x=0\), we have by Lemma 6 that \[\label{four1} \begin{align} \underset{X\sim_{\rho}Y}{\mathbb{E}}\langle \mathbb{E}f_{\left\|X\right\|}, \mathbb{E}f_{\left\|Y\right\|}\rangle &=\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle \mathbb{E}f_{\left\|X\right\|}-\mathbb{E}_{\gamma}f, \mathbb{E}f_{\left\|Y\right\|}-\mathbb{E}_{\gamma}f\rangle\\ &\leq9.4\rho\int_{\mathbb{R}^{n}}\phi(x)\left\|\mathbb{E}f_{\left\|x\right\|}\right\|^{2}\gamma_{n}(x)\,\mathrm{d}x. \end{align}\tag{19}\] Here \(\phi(x)\colonequals 1 - \frac{1}{\rho\left\|x\right\|}+\frac{2}{e^{2\rho\left\|x\right\|}-1}\stackrel{\eqref{one9}}{=}\lambda_{1,n}^{\left\|x\right\|,1}\). 
Meanwhile, the last term in 18 satisfies \[\label{four2} \begin{align} &\underset{X\sim_{\rho}Y}{\mathbb{E}}\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|} \stackrel{\eqref{oudef}}{=}\int_{\mathbb{R}^{3}}\left\|\mathbb{E}f_{\left\|x\right\|}\right\|^{2}\Big(\int_{\mathbb{R}^{3}}\lambda_{1,n}^{\|\rho x+y\sqrt{1-\rho^{2}}\|,\left\|x\right\|}\gamma_{n}(y)\,\mathrm{d}y\Big)\gamma_{n}(x)\,\mathrm{d}x\\ &\qquad=\int_{\mathbb{R}^{3}}\left\|\mathbb{E}f_{\left\|x\right\|}\right\|^{2}\Big(\sqrt{\frac{2}{\pi}}\frac{1}{4\pi}\int_{u\in S^{2}}\int_{r=0}^{\infty}r^{2}\lambda_{1,n}^{\|\rho x+ru\sqrt{1-\rho^{2}}\|,\left\|x\right\|}e^{-r^{2}/2}\,\mathrm{d}r \mathrm{d}u\Big)\gamma_{n}(x)\,\mathrm{d}x\\ &\qquad\stackrel{\eqref{one9}}{\geq}\int_{\mathbb{R}^{3}}\left\|\mathbb{E}f_{\left\|x\right\|}\right\|^{2}\Big(\sqrt{\frac{2}{\pi}}\frac{1}{4\pi}\int_{u\in S^{2}}\int_{r=0}^{\infty}r^{2}\lambda_{1,n}^{\|ru\sqrt{1-\rho^{2}}\|,\left\|x\right\|}e^{-r^{2}/2}\,\mathrm{d}r \mathrm{d}u\Big)\gamma_{n}(x)\,\mathrm{d}x\\ &\qquad=\int_{\mathbb{R}^{3}}\left\|\mathbb{E}f_{\left\|x\right\|}\right\|^{2}\Big(\sqrt{\frac{2}{\pi}}\int_{r=0}^{\infty}r^{2}\lambda_{1,n}^{r\sqrt{1-\rho^{2}},\left\|x\right\|}e^{-r^{2}/2}\,\mathrm{d}r\Big)\gamma_{n}(x)\,\mathrm{d}x. \end{align}\tag{20}\] The inequality used that \(\lambda_{1,n}^{a,b}\) is an increasing function of \(a>0\) by 17 , so the average over \(r,u\) is smallest when \(\rho x=0\).
From 16 when \(n=3\) and \(a=\rho s/\sqrt{1-\rho^{2}}\) with 13 , if \(s>0\) and \(0<\rho<1/5\), then \[\begin{align} \sqrt{\frac{2}{\pi}}\int_{r=0}^{\infty}r^{2}\lambda_{1,n}^{r\sqrt{1-\rho^{2}},s}e^{-r^{2}/2}\,\mathrm{d}r &\stackrel{\eqref{two9}\wedge\eqref{lameq}}{\geq} \sqrt{\frac{2}{\pi}}e^{\frac{9(1-\rho^{2})}{8s^{2}\rho^{2}}}\int_{\frac{3\sqrt{1-\rho^{2}}}{2\rho s}}^{\infty}e^{-t^{2}/2}\mathrm{d}t\\ &\geq(.98)\Big(1 - \frac{1}{\rho s}+\frac{2}{e^{2\rho s} -1}\Big). \end{align}\]
The last inequality can be verified e.g. with Matlab:
rho=.03;
r=linspace(.01,1000,1000); % start above 0 to avoid division by zero
phi=.98*(1 - 1./(rho * r ) +2./(exp(2* rho *r ) -1));
y=exp( 9*(1-rho^2) ./ (8* r.^2 * rho^2)) ...
.* erfc( 3*sqrt(1-rho^2) ./ (2*sqrt(2) * rho *r));
plot(r, phi, r, y);
legend('lambda lower bd','integral quant (larger)');
if sum(y-phi<0)==0, fprintf('Verified\n'), end
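The Matlab snippet checks the single value \(\rho=.03\). Using the identity \(\sqrt{2/\pi}\int_{c}^{\infty}e^{-t^{2}/2}\,\mathrm{d}t=\mathrm{erfc}(c/\sqrt{2})\), the quantity on the left equals \(\mathrm{erfcx}(z)\colonequals e^{z^{2}}\mathrm{erfc}(z)\) at \(z=3\sqrt{1-\rho^{2}}/(2\sqrt{2}\rho s)\), so the inequality can also be swept over several \(\rho\) below \(1/5\) with a Python sketch of our own (the asymptotic branch avoids overflow for small \(\rho s\)):

```python
import math

def erfcx(z):
    # scaled complementary error function e^{z^2} erfc(z)
    if z < 6.0:
        return math.exp(z * z) * math.erfc(z)
    # asymptotic expansion for large z (relative error ~ 1e-5 at z = 6)
    u = 1.0 / (2.0 * z * z)
    return (1.0 - u + 3.0 * u * u - 15.0 * u ** 3) / (z * math.sqrt(math.pi))

def integral_quantity(rho, s):
    z = 3.0 * math.sqrt(1.0 - rho * rho) / (2.0 * math.sqrt(2.0) * rho * s)
    return erfcx(z)

def phi_bound(rho, s):
    return 0.98 * (1.0 - 1.0 / (rho * s) + 2.0 / (math.exp(2.0 * rho * s) - 1.0))

ok = all(integral_quantity(rho, s) >= phi_bound(rho, s)
         for rho in (0.01, 0.03, 0.1, 0.19)
         for s in [j / 10.0 for j in range(1, 5001)])
print(ok)
```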
Substituting this into 20 , we get
\[\label{four3} \underset{X\sim_{\rho}Y}{\mathbb{E}}\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|} \geq.98\int_{\mathbb{R}^{3}}\phi(x)\left\|\mathbb{E}f_{\left\|x\right\|}\right\|^{2}\gamma_{n}(x)\,\mathrm{d}x.\tag{21}\]
Combining 18 , 21 and 19 , we have \[\label{four4} \underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f(X),f(Y)\rangle -\underset{X\sim_{\rho}Y}{\mathbb{E}}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|} \leq(9.4\rho - .98)\int_{\mathbb{R}^{n}}\phi(x)\left\|\mathbb{E}f_{\left\|x\right\|}\right\|^{2}\gamma_{n}(x)\,\mathrm{d}x.\tag{22}\] If \(\rho<.104\), the right side is nonpositive, and it is equal to zero only when \(\mathbb{E}f_{\left\|x\right\|}=0\) for a.e. \(x\in\mathbb{R}^{3}\). That is, \[\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f(X),f(Y)\rangle \stackrel{\eqref{four4}}{\leq}\underset{X\sim_{\rho}Y}{\mathbb{E}}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|} \stackrel{\eqref{foptdef}\wedge\eqref{nine7}}{=}\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f_{\rm opt}(X),f_{\rm opt}(Y)\rangle,\] with equality only when \(\mathbb{E}f_{\left\|x\right\|}=0\) for a.e. \(x\in\mathbb{R}^{3}\). Finally, if \(\mathbb{E}f_{\left\|x\right\|}=0\) for a.e. \(x\in\mathbb{R}^{3}\), then Lemma 2 implies that we must have \(f=f_{\rm opt}(M\cdot)\) for some real \(3\times 3\) orthogonal matrix \(M\). ◻
Proof of Theorem 3 when \(\rho<0\). In order to prove Theorem 3 for negative \(\rho\), we consider instead \(\rho>0\) but with a bilinear variant of the noise stability over functions \(f,g\colon\mathbb{R}^{n}\to S^{2}\) under the constraint that \(\mathbb{E}_{\gamma}f=\mathbb{E}_{\gamma}g=0\).
So, within the proof below, we have \(\rho>0\), and we will show that \[\label{six1} \underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f(X),g(Y)\rangle\geq -\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f_{\rm opt}(X),f_{\rm opt}(Y)\rangle.\tag{23}\] Dimension Reduction (Theorem 6 below) implies that we may assume \(n=k=3\) in order to prove 23 . Equation 23 proves Theorem 3 for negative correlations, since \[\begin{align} \underset{X\sim_{(-\rho)}Y}{\mathbb{E}}\langle f(X),f(Y)\rangle &\stackrel{\eqref{gdef}}=\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f(X),f(-Y)\rangle \stackrel{\eqref{six1}}{\geq}-\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f_{\rm opt}(X),f_{\rm opt}(Y)\rangle\\ &\stackrel{\eqref{gdef}}=-\underset{X\sim_{(-\rho)}Y}{\mathbb{E}}\langle f_{\rm opt}(X),f_{\rm opt}(-Y)\rangle \stackrel{\eqref{foptdef}}{=}-\underset{X\sim_{(-\rho)}Y}{\mathbb{E}}\langle f_{\rm opt}(X),-f_{\rm opt}(Y)\rangle =\underset{X\sim_{(-\rho)}Y}{\mathbb{E}}\langle f_{\rm opt}(X),f_{\rm opt}(Y)\rangle. \end{align}\]
So, it remains to prove 23 . From Lemma 5, \[\label{five0} \underset{X\sim_{\rho}Y}{\mathbb{E}}\Big(\langle f(X),g(Y)\rangle +\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}\Big) \geq \underset{X\sim_{\rho}Y}{\mathbb{E}}\Big(\langle \mathbb{E}f_{\left\|X\right\|}, \mathbb{E}g_{\left\|Y\right\|}\rangle +\frac{1}{2}(\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}+\left\|\mathbb{E}g_{\left\|X\right\|}\right\|^{2})\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}\Big).\tag{24}\] It remains to show that the right side is nonnegative. Since \(\mathbb{E}_{\gamma}f=\mathbb{E}_{\gamma}g\) by assumption, \[\begin{align} \underset{X\sim_{\rho}Y}{\mathbb{E}}\langle \mathbb{E}f_{\left\|X\right\|}, \mathbb{E}g_{\left\|Y\right\|}\rangle &=\left\|\mathbb{E}_{\gamma}f\right\|^{2}+\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle \mathbb{E}f_{\left\|X\right\|}- \mathbb{E}_{\gamma}f, \mathbb{E}g_{\left\|Y\right\|}-\mathbb{E}_{\gamma}g\rangle\\ &\geq \underset{X\sim_{\rho}Y}{\mathbb{E}}\langle \mathbb{E}f_{\left\|X\right\|}- \mathbb{E}_{\gamma}f, \mathbb{E}g_{\left\|Y\right\|}-\mathbb{E}_{\gamma}g\rangle. \end{align}\] So, by Lemma 7, \[\label{five1} \begin{align} \underset{X\sim_{\rho}Y}{\mathbb{E}}\langle \mathbb{E}f_{\left\|X\right\|}, \mathbb{E}g_{\left\|Y\right\|}\rangle &\geq\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle \mathbb{E}f_{\left\|X\right\|}-\mathbb{E}_{\gamma}f, \mathbb{E}g_{\left\|Y\right\|}-\mathbb{E}_{\gamma}g\rangle\\ &\geq-9.4\rho\int_{\mathbb{R}^{n}}\phi(x)\frac{1}{2}(\left\|\mathbb{E}f_{\left\|x\right\|}\right\|^{2}+\left\|\mathbb{E}g_{\left\|x\right\|}\right\|^{2})\gamma_{n}(x)\,\mathrm{d}x. \end{align}\tag{25}\] Here \(\phi(x)\colonequals 1 - \frac{1}{\rho\left\|x\right\|}+\frac{2}{e^{2\rho\left\|x\right\|}-1}\stackrel{\eqref{one9}}{=}\lambda_{1,n}^{\left\|x\right\|,1}\). Meanwhile, 21 says that
\[\label{five3} \underset{X\sim_{\rho}Y}{\mathbb{E}}\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|} \geq.98\int_{\mathbb{R}^{3}}\phi(x)\left\|\mathbb{E}f_{\left\|x\right\|}\right\|^{2}\gamma_{n}(x)\,\mathrm{d}x.\tag{26}\] The same argument applied to \(g\) gives \[\label{five3p2} \underset{X\sim_{\rho}Y}{\mathbb{E}}\left\|\mathbb{E}g_{\left\|X\right\|}\right\|^{2}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|} \geq.98\int_{\mathbb{R}^{3}}\phi(x)\left\|\mathbb{E}g_{\left\|x\right\|}\right\|^{2}\gamma_{n}(x)\,\mathrm{d}x.\tag{27}\] Averaging these two inequalities, we get \[\label{five3p3} \underset{X\sim_{\rho}Y}{\mathbb{E}}\frac{1}{2}(\left\|\mathbb{E}f_{\left\|X\right\|}\right\|^{2}+\left\|\mathbb{E}g_{\left\|X\right\|}\right\|^{2})\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|} \geq.98\int_{\mathbb{R}^{3}}\phi(x)\frac{1}{2}(\left\|\mathbb{E}f_{\left\|x\right\|}\right\|^{2}+\left\|\mathbb{E}g_{\left\|x\right\|}\right\|^{2})\gamma_{n}(x)\,\mathrm{d}x.\tag{28}\]
Combining 24 , 28 and 25 , we have \[\label{five4} \begin{align} &\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f(X),g(Y)\rangle +\underset{X\sim_{\rho}Y}{\mathbb{E}}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|}\\ &\qquad\qquad\qquad\qquad\geq(.98-9.4\rho)\int_{\mathbb{R}^{n}}\phi(x)\frac{1}{2}(\left\|\mathbb{E}f_{\left\|x\right\|}\right\|^{2}+\left\|\mathbb{E}g_{\left\|x\right\|}\right\|^{2})\gamma_{n}(x)\,\mathrm{d}x. \end{align}\tag{29}\] If \(\rho<.104\), the right side is nonnegative, and it is equal to zero only when \(\mathbb{E}f_{\left\|x\right\|}=\mathbb{E}g_{\left\|x\right\|}=0\) for a.e. \(x\in\mathbb{R}^{3}\). That is, \[\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f(X),g(Y)\rangle \stackrel{\eqref{five4}}{\geq}-\underset{X\sim_{\rho}Y}{\mathbb{E}}\lambda_{1,n}^{\left\|X\right\|,\left\|Y\right\|} \stackrel{\eqref{foptdef}\wedge\eqref{nine7}}{=}-\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f_{\rm opt}(X),f_{\rm opt}(Y)\rangle,\] with equality only when \(\mathbb{E}f_{\left\|x\right\|}=\mathbb{E}g_{\left\|x\right\|}=0\) for a.e. \(x\in\mathbb{R}^{3}\).
If \(\mathbb{E}f_{\left\|x\right\|}=\mathbb{E}g_{\left\|x\right\|}=0\) for all \(x\in\mathbb{R}^{3}\), then Lemma 4 implies that we must have \(f=-g=f_{\rm opt}(M\cdot)\) almost surely, for some real \(3\times 3\) orthogonal matrix \(M\). So, 23 is proven, and the negative correlation case of Theorem 3 follows. ◻
Proof of Theorem 5. From [13]: assuming the Unique Games Conjecture, for any \(-1<\rho<0\) and for any \(\varepsilon>0\), it is NP-hard to approximate the product state of Quantum MAX-CUT within a multiplicative factor of \(\alpha_{\rho,\rm BOV}+\varepsilon\), where \[\alpha_{\rho,\rm BOV} \colonequals \lim_{n\to\infty}\sup_{f\colon\mathbb{R}^{n}\to S^{2}}\frac{1-\int_{\mathbb{R}^{n}}\langle f(x), T_{\rho}f(x)\rangle\,\gamma_{n}(x)\,\mathrm{d}x}{1-\rho}.\] Theorem 3 says that, if \(-.104<\rho<0\), then \[\label{labeq} \alpha_{\rho,\rm BOV}=\frac{1-\int_{\mathbb{R}^{3}}\langle f_{\rm opt}(x), T_{\rho}f_{\rm opt}(x)\rangle\,\gamma_{3}(x)\,\mathrm{d}x}{1-\rho}.\tag{30}\] An explicit formula for the noise stability of \(f_{\rm opt}\) from [13] implies \[\alpha_{\rho,\rm BOV} \stackrel{\eqref{foptdef}\wedge\eqref{labeq}}{=}\frac{1-F^{*}(3,\rho)}{1-\rho} \stackrel{\eqref{fstdef}}{=}\frac{1-\frac{2}{3}\Big(\frac{\Gamma((3+1)/2)}{\Gamma(3/2)}\Big)^{2}\rho\cdot\,\,_{2}F_{1}(1/2,1/2,3/2 +1, \rho^{2})}{1-\rho}.\] Since the NP-hardness from [13] applies for any \(-.104<\rho<0\), we obtain NP-hardness for the infimum of \(\alpha_{\rho,\rm BOV}\) over all \(-.104<\rho<0\). That is, it is NP-hard to approximate the product state of Quantum MAX-CUT within a multiplicative factor of \[\begin{align} \inf_{-.104<\rho<0}\alpha_{\rho,\rm BOV} &=\alpha_{(-.104),\rm BOV} =\frac{1-\frac{2}{3}\Big(\frac{\Gamma((3+1)/2)}{\Gamma(3/2)}\Big)^{2}(-.104)\cdot\,\,_{2}F_{1}(1/2,1/2,3/2 +1, (.104)^{2})}{1-(-.104)}\\ &=.98584579\ldots. \end{align}\]
The above function of \(\rho\) is monotone on \((-.104,0)\), with its minimum at the endpoint \(\rho=-.104\). ◻
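The numerical value of \(\alpha_{(-.104),\rm BOV}\) can be reproduced with a short Python computation of our own, summing the Gauss hypergeometric series directly:

```python
import math

def hyp2f1(a, b, c, z, terms=80):
    # Gauss hypergeometric series 2F1(a, b; c; z) for |z| < 1
    total, coeff = 0.0, 1.0
    for n in range(terms):
        total += coeff
        coeff *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
    return total

def alpha_bov(rho):
    # (1 - (2/3)(Gamma(2)/Gamma(3/2))^2 rho 2F1(1/2,1/2;5/2;rho^2)) / (1 - rho)
    const = (2.0 / 3.0) * (math.gamma(2.0) / math.gamma(1.5)) ** 2
    return (1.0 - const * rho * hyp2f1(0.5, 0.5, 2.5, rho * rho)) / (1.0 - rho)

print(abs(alpha_bov(-0.104) - 0.98584579) < 1e-6)
```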
Theorem 6 (Bilinear Dimension Reduction). Let \(\rho>0\). Define \[s_{n}\colonequals \inf_{\substack{f,g\colon\mathbb{R}^{n}\to S^{k-1}\\\mathbb{E}_{\gamma}f=\mathbb{E}_{\gamma}g=0 }}\quad \underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f(X),g(Y)\rangle.\] Then \[s_{n}=s_{k},\qquad\forall\, n\geq k.\]
The inequality \(s_{n}\leq s_{k}\) easily holds for all \(n\geq k\), so the content of Theorem 6 is that \(s_{n}\geq s_{k}\) for all \(n\geq k\).
Theorem 6 is a fairly straightforward adaptation of [13] to the bilinear setting (itself an adaptation of [16] to the sphere-valued setting). In order to emphasize the relation between the argument below and that of [13], we will match the notation from [13], where appropriate.
We say a vector field \(W\colon\mathbb{R}^{n}\to\mathbb{R}^{k}\) is a tame vector field if \(W\) is bounded and \(C^{\infty}\), and all of the partial derivatives of \(W\) of any order exist and are bounded.
For any \(f\colon\mathbb{R}^{n}\to\mathbb{R}^{k}\), for any \(t\in\mathbb{R}\), denote \[\label{vdef} \mathcal{V}_{t,W}f(x)\colonequals\frac{f(x)+tW(x)}{\max(1,\left\|f(x)+tW(x)\right\|)},\qquad\forall\,x\in\mathbb{R}^{n}.\tag{31}\] Denote also \[B^{k}\colonequals\{x\in\mathbb{R}^{k}\colon\left\|x\right\|\leq 1\}.\]
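For concreteness, here is a minimal Python illustration (ours, not from the paper) of the truncation map 31 at a single point: move by \(tW(x)\), then renormalize only if the result leaves \(B^{k}\):

```python
import math

def v_t(f_x, w_x, t):
    # V_{t,W} f at one point: perturb by t*W(x), then project back into the unit ball
    v = [fi + t * wi for fi, wi in zip(f_x, w_x)]
    norm = math.sqrt(sum(vi * vi for vi in v))
    scale = max(1.0, norm)
    return [vi / scale for vi in v]

# a point on S^2, perturbed outward, stays in the closed unit ball
f_x = [1.0, 0.0, 0.0]
w_x = [1.0, 2.0, 0.0]
y = v_t(f_x, w_x, 0.5)
print(math.sqrt(sum(c * c for c in y)) <= 1.0 + 1e-12)
```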
Definition 6. Let \(f,g\colon\mathbb{R}^{n}\to B^{k}\) satisfy \(\mathbb{E}_{\gamma}f=\mathbb{E}_{\gamma}g=0\). Fix \(0<\rho<1\). We say that \(f,g\) are optimally stable with correlation \(\rho\) if, for all \(h,\ell\colon\mathbb{R}^{n}\to B^{k}\) with \(\mathbb{E}_{\gamma}h=\mathbb{E}_{\gamma}\ell=0\), \[\int_{\mathbb{R}^{n}}\langle f(x), T_{\rho}g(x)\rangle \gamma_{n}(x)\, \mathrm{d}x\leq \int_{\mathbb{R}^{n}}\langle h(x), T_{\rho}\ell(x)\rangle \gamma_{n}(x)\, \mathrm{d}x.\]
Lemma 8. If \(f,g\) are optimally stable with correlation \(\rho\), then \(g=-f\).
Proof. Since \(f=g=0\) has \(\mathbb{E}_{\gamma}\langle f,T_{\rho}g\rangle=0\), if \(f,g\) are optimally stable, then \(\mathbb{E}_{\gamma}\langle f,T_{\rho}g\rangle\leq0\). So, the Cauchy-Schwarz inequality implies that \[\begin{align} \mathbb{E}_{\gamma}\langle f,T_{\rho}g\rangle &=-\left|\mathbb{E}_{\gamma}\langle f,T_{\rho}g\rangle\right|\geq -\left|\mathbb{E}_{\gamma}\langle f,T_{\rho}f\rangle\right|^{1/2}\left|\mathbb{E}_{\gamma}\langle g,T_{\rho}g\rangle\right|^{1/2}\\ &\geq-\max\Big(\left|\mathbb{E}_{\gamma}\langle f,T_{\rho}f\rangle\right|,\left|\mathbb{E}_{\gamma}\langle g,T_{\rho}g\rangle\right|\Big). \end{align}\] All of these inequalities become equalities when \(g=-f\), since \(\mathbb{E}_{\gamma}\langle f,T_{\rho}f\rangle\geq0\), so \[\mathbb{E}_{\gamma}\langle f,T_{\rho}(-f)\rangle =-\mathbb{E}_{\gamma}\langle f,T_{\rho}f\rangle =-\left|\mathbb{E}_{\gamma}\langle f,T_{\rho}f\rangle\right|.\] ◻
Lemma 9 (Lemma 6.8, [13]). Let \(f\colon\mathbb{R}^{n}\to\mathbb{R}^{k}\) be measurable. Then there exist vector fields \(W_{1},\ldots,W_{k}\colon\mathbb{R}^{n}\to\mathbb{R}^{k}\) such that \[\mathrm{span}\Big\{\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma} [\mathcal{V}_{t,W_{i}}f]\colon 1\leq i\leq k\Big\}=\mathbb{R}^{k}.\]
Lemma 10 (Lemma 6.9, [13]). Let \(f,g\colon\mathbb{R}^{n}\to\mathbb{R}^{k}\) be optimally stable. Then, for any bounded measurable vector fields \(W,Z\colon\mathbb{R}^{n}\to\mathbb{R}^{k}\) such that \(\frac{\mathrm{d}}{\mathrm{d}t}|_{t=0}\mathbb{E}_{\gamma} \mathcal{V}_{t,W}f=\frac{\mathrm{d}}{\mathrm{d}t}|_{t=0}\mathbb{E}_{\gamma} \mathcal{V}_{t,Z}g=0\), we have \[\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle\mathcal{V}_{t,W}f(X),\, \mathcal{V}_{t,Z}g(Y)\rangle=0.\]
Proof. Let \(W_{1},\ldots,W_{k}\) be the vector fields guaranteed by Lemma 9. For any \(\alpha\in\mathbb{R}^{k}\), \(\beta\in\mathbb{R}\) and \(x\in\mathbb{R}^{n}\), define \[f_{\alpha,\beta}\colonequals\frac{f(x)+\beta W(x)+\sum_{i=1}^{k}\alpha_{i}W_{i}(x) }{\max\Big(1,\|f(x)+\beta W(x)+\sum_{i=1}^{k}\alpha_{i}W_{i}(x)\|\Big)}.\] \[g_{\alpha,\beta}\colonequals\frac{g(x)+\beta Z(x)+\sum_{i=1}^{k}\alpha_{i}W_{i}(x) }{\max\Big(1,\|g(x)+\beta Z(x)+\sum_{i=1}^{k}\alpha_{i}W_{i}(x)\|\Big)}.\]
Define \(L\colon\mathbb{R}^{2k+1}\to\mathbb{R}^{2k}\) by \[L(\alpha,\theta,\beta)\colonequals \left(\mathbb{E}_{\gamma}f_{\alpha,\beta},\,\, \mathbb{E}_{\gamma}g_{\theta,\beta}\right),\qquad\forall\,\alpha,\theta\in\mathbb{R}^{k},\,\forall\,\beta\in\mathbb{R}.\] Then by assumption, we have \[\label{three1} \frac{\partial L}{\partial\beta}(0,0,0)\stackrel{\eqref{vdef}}{=}\left(\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{V}_{t,W}f,\,\,\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{V}_{t,Z}g\right) =0.\tag{32}\] And by definition of \(L\), we have, for all \(1\leq i\leq k\), \[\label{three2} \begin{align} \frac{\partial L}{\partial\alpha_{i}}(0,0,0)&=\left(\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{V}_{t,W_{i}}f,\,\, 0 \right).\\ \frac{\partial L}{\partial\theta_{i}}(0,0,0)&=\left( 0,\,\, \frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{V}_{t,W_{i}}g \right).\\ \end{align}\tag{33}\] From Lemma 9, we conclude that the matrix of partial derivatives \(DL\) of \(L\) is a \(2k\times(2k+1)\) matrix of rank \(2k\). So, by the Implicit Function Theorem, there exist \(\varepsilon>0\) and a differentiable curve \(\eta\colon(-\varepsilon,\varepsilon)\to\mathbb{R}^{2k+1}\) with \(\eta(0)=0\), \(\eta'(0)\neq0\) and \(L(\eta(t))=0\) for all \(t\in(-\varepsilon,\varepsilon)\). The last property implies, by the Chain Rule, that \[\label{three3} 0=\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}L(\eta(t)) =\sum_{i=1}^{k}\frac{\partial L}{\partial\alpha_{i}}(0,0,0)\eta_{i}'(0) +\sum_{i=1}^{k}\frac{\partial L}{\partial\theta_{i}}(0,0,0)\eta_{k+i}'(0) +\frac{\partial L}{\partial \beta}(0,0,0)\eta_{2k+1}'(0).\tag{34}\] The last term is zero by 32 . Lemma 9 and 33 imply that the set \(\Big\{\frac{\partial L}{\partial\alpha_{i}}(0,0,0),\frac{\partial L}{\partial\theta_{i}}(0,0,0)\Big\}_{i=1}^{k}\) consists of \(2k\) linearly independent vectors. 
We conclude from 34 that \(\eta_{i}'(0)=\eta_{i+k}'(0)=0\) for all \(1\leq i\leq k\). Since \(\eta'(0)\neq0\), we conclude that \(\eta_{2k+1}'(0)\neq0\). Let \(J(t)\colonequals \underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f_{(\eta_{i}(t))_{i=1}^{k},\eta_{2k+1}(t)}(X),\, g_{(\eta_{i+k}(t))_{i=1}^{k},\eta_{2k+1}(t)}(Y)\rangle\) for all \(t\in(-\varepsilon,\varepsilon)\). Since \(L(\eta(t))=0\) for all \(t\in(-\varepsilon,\varepsilon)\), the perturbed functions \(f_{(\eta_{i}(t))_{i=1}^{k},\eta_{2k+1}(t)}\) and \(g_{(\eta_{i+k}(t))_{i=1}^{k},\eta_{2k+1}(t)}\) have zero Gaussian mean for all \(t\in(-\varepsilon,\varepsilon)\). Since \((f,g)\) are optimally stable, \(t=0\) minimizes \(J\), so \(\frac{\mathrm{d}}{\mathrm{d}t}|_{t=0}J(t)=0\). Using the previous facts and the chain rule, \[\begin{align} 0&=\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}J(t) =\sum_{i=1}^{k}\eta_{i}'(0)\frac{\partial }{\partial\alpha_{i}}\Big|_{(0,0,0)}\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f_{\alpha,\beta}(X),\, g_{\theta,\beta}(Y)\rangle +\sum_{i=1}^{k}\eta_{i+k}'(0)\frac{\partial }{\partial\theta_{i}}\Big|_{(0,0,0)}\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f_{\alpha,\beta}(X),\, g_{\theta,\beta}(Y)\rangle\\ &\qquad+\eta_{2k+1}'(0)\frac{\partial}{\partial\beta}\Big|_{(0,0,0)}\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f_{\alpha,\beta}(X),\, g_{\theta,\beta}(Y)\rangle\\ &=\eta_{2k+1}'(0)\frac{\partial}{\partial\beta}\Big|_{\beta=0}\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f_{0,\beta}(X),\, g_{0,\beta}(Y)\rangle \stackrel{\eqref{vdef}}{=}\eta_{2k+1}'(0)\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle\mathcal{V}_{t,W}f(X),\, \mathcal{V}_{t,Z}g(Y)\rangle. \end{align}\] Since \(\eta_{2k+1}'(0)\neq0\), it follows that \(\frac{\mathrm{d}}{\mathrm{d}t}\big|_{t=0}\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle\mathcal{V}_{t,W}f(X),\, \mathcal{V}_{t,Z}g(Y)\rangle=0\), concluding the proof. ◻
Lemma 11 ([13]). Let \(f,g\) be optimally stable. Then there exists \(\lambda\in\mathbb{R}^{2k}\) such that \[\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle\mathcal{V}_{t,W}f(X),\, \mathcal{V}_{t,Z}g(Y)\rangle =\Big\langle \Big(\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{V}_{t,W}f ,\, \frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{V}_{t,Z}g\Big),\lambda\Big\rangle.\]
Proof. Let \(W,Z\colon\mathbb{R}^{n}\to\mathbb{R}^{k}\) be bounded measurable vector fields, and define \(\phi_{1},\phi_{2},\psi\) by \[\phi_{1}(W)\colonequals\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{V}_{t,W}f.\] \[\phi_{2}(Z)\colonequals\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{V}_{t,Z}g.\] \[\psi(W,Z)\colonequals\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle\mathcal{V}_{t,W}f(X),\, \mathcal{V}_{t,Z}g(Y)\rangle.\] Note that \(\phi_{1},\phi_{2},\psi\) are linear functions of \((W,Z)\). Let \(\mathcal{X}\) be a finite-dimensional subspace of bounded measurable vector fields such that \(\{(\phi_{1}(W),\phi_{2}(Z))\colon W,Z\in\mathcal{X}\}\) spans \(\mathbb{R}^{2k}\). Define \(L\colon \mathcal{X}\times\mathcal{X}\to \mathbb{R}^{2k+1}\) by
\[L(W,Z)\colonequals\Big(\phi_{1}(W),\phi_{2}(Z),\psi(W,Z)\Big).\] From Lemma 10, \((0,\ldots,0,1)\) is not in the range of \(L\). So, there exists \(\lambda=\lambda(\mathcal{X})\in\mathbb{R}^{2k}\) such that \((-\lambda,1)\) is orthogonal to the range of \(L\) (with respect to the standard inner product in \(\mathbb{R}^{2k+1}\)). That is, \(\langle L(W,Z), (-\lambda,1)\rangle=0\) for all \(W,Z\in\mathcal{X}\). That is, \(\psi(W,Z)=\langle(\phi_{1}(W),\phi_{2}(Z)),\lambda\rangle\). As in [13], \(\lambda\) does not depend on \(\mathcal{X}\). ◻
For any \(t\in\mathbb{R}\), denote \(t_{+}\colonequals\max(t,0)\).
Lemma 12 (Lemma 6.7, [13]). Let \(f\colon\mathbb{R}^{n}\to B^{k}\) be measurable. Let \(W\colon\mathbb{R}^{n}\to\mathbb{R}^{k}\) be a bounded measurable vector field. For any \(x\in\mathbb{R}^{n}\), \[\mathcal{V}_{t,W}f(x)= f(x)+tW(x)-t\langle W(x),f(x)\rangle_{+}f(x)+O(t^{2}),\] where the \(O(t^{2})\) term is uniform in \(x\). Also, \[\frac{\mathrm{d}}{\mathrm{d}t}|_{t=0}\mathbb{E}_{\gamma}\mathcal{V}_{t,W}f =\mathbb{E}_{\gamma}[W - \langle f,W\rangle_{+}f1_{\{\left\|f\right\|=1\}}].\]
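A pointwise numerical sanity check of the first expansion (ours, not from [13]): at a point with \(\left\|f(x)\right\|=1\), the error after subtracting the linear term should shrink like \(t^{2}\):

```python
import math

def v_t(f_x, w_x, t):
    # the truncation map V_{t,W} f at one point (cf. the definition of V)
    v = [fi + t * wi for fi, wi in zip(f_x, w_x)]
    scale = max(1.0, math.sqrt(sum(vi * vi for vi in v)))
    return [vi / scale for vi in v]

def expansion_error(f_x, w_x, t):
    # || V_{t,W}f - f - t(W - <W,f>_+ f) ||  at one point with ||f|| = 1
    ip = max(sum(fi * wi for fi, wi in zip(f_x, w_x)), 0.0)
    lin = [fi + t * (wi - ip * fi) for fi, wi in zip(f_x, w_x)]
    v = v_t(f_x, w_x, t)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, lin)))

f_x, w_x = [0.0, 0.0, 1.0], [1.0, 1.0, 2.0]
print(all(expansion_error(f_x, w_x, t) <= 10.0 * t * t for t in (0.1, 0.01, 0.001)))
```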
We write \(\lambda\) from Lemma 11 as \(\lambda=(\lambda^{(1)},\lambda^{(2)})\in\mathbb{R}^{2k}\) so that \(\lambda^{(i)}\in\mathbb{R}^{k}\) for \(i=1,2\).
Lemma 13 (First Variation, Lemma 6.11, [13]). Let \(f,g\) be optimally stable. Then \[\|T_{\rho}f-\lambda^{(1)}\|g=T_{\rho}f-\lambda^{(1)},\qquad\|T_{\rho}g-\lambda^{(2)}\|f=T_{\rho}g-\lambda^{(2)},\qquad\text{a.s.}\]
Proof. Combining Lemmas 12 and 11, Definition 1, Lemma 12 again, and Definition 3, \[\begin{align} &\Big\langle \Big( \mathbb{E}_{\gamma}[W - \langle f,W\rangle_{+}f1_{\{\left\|f\right\|=1\}}],\, \mathbb{E}_{\gamma}[Z - \langle g,Z\rangle_{+}g1_{\{\left\|g\right\|=1\}}]\Big),\lambda\Big\rangle\\ &=\Big\langle \Big(\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{V}_{t,W}f ,\, \frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{V}_{t,Z}g\Big),\lambda\Big\rangle\\ &=\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle\mathcal{V}_{t,W}f(X),\, \mathcal{V}_{t,Z}g(Y)\rangle\\ &=\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f(X),\, Z(Y)-\langle Z(Y),g(Y)\rangle_{+}g(Y)\rangle +\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle W(X)-\langle W(X),f(X)\rangle_{+}f(X),\, g(Y)\rangle\\ &=\mathbb{E}_{\gamma}\langle T_{\rho}f,\, Z-\langle Z,g\rangle_{+}g\rangle +\mathbb{E}_{\gamma}\langle W-\langle W,f\rangle_{+}f,\, T_{\rho}g\rangle. \end{align}\] The above equality can be rearranged to be \[\mathbb{E}_{\gamma}\langle T_{\rho}f -\lambda^{(1)},\, Z-\langle Z,g\rangle_{+}g\rangle +\mathbb{E}_{\gamma}\langle T_{\rho}g - \lambda^{(2)},\,W-\langle W,f\rangle_{+}f\rangle =0.\] The proof then concludes as in [13]. ◻
Let \(W\colon\mathbb{R}^{n}\to\mathbb{R}^{n}\) be a tame vector field. Define the flow \(F_{t}\) of \(W\) by \[F_{0}(x)\colonequals x,\quad \frac{\mathrm{d}}{\mathrm{d}t}F_{t}(x)=W(F_{t}(x)),\qquad\forall\,t\in\mathbb{R},\,\forall\,x\in\mathbb{R}^{n}.\] We write \(F_{t}=F_{t,W}\) where appropriate. Define \[\label{sdef} \mathcal{S}_{t,W}f(x)\colonequals f(F_{t}^{-1}(x)),\qquad\forall\,t\in\mathbb{R},\,\forall\,x\in\mathbb{R}^{n},\tag{35}\] and the Gaussian divergence \[\mathrm{div}_{\gamma}W(x)\colonequals\mathrm{div}W(x)-\langle W(x),x\rangle.\]
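The operator \(\mathrm{div}_{\gamma}\) is the (negative) adjoint of the directional derivative with respect to \(\gamma\): \(\mathbb{E}_{\gamma}\langle\nabla h,W\rangle=-\mathbb{E}_{\gamma}[h\,\mathrm{div}_{\gamma}W]\), the Gaussian integration by parts used repeatedly below. A minimal one-dimensional numerical check via Gauss-Hermite quadrature; the choices \(h(x)=x^{2}\) and \(W(x)=x^{3}\) are illustrative.

```python
import numpy as np

# Probabilists' Gauss-Hermite quadrature: expectation under the
# standard Gaussian measure gamma on R.
nodes, weights = np.polynomial.hermite_e.hermegauss(60)
wts = weights / weights.sum()
Eg = lambda F: wts @ F(nodes)   # E_gamma[F]

# Illustrative test functions (any smooth, polynomially bounded pair works).
h,  dh = (lambda x: x**2), (lambda x: 2 * x)
W,  dW = (lambda x: x**3), (lambda x: 3 * x**2)

# Gaussian integration by parts: E[h' W] = -E[h * div_gamma W],
# where div_gamma W(x) = W'(x) - x W(x) in one dimension.
lhs = Eg(lambda x: dh(x) * W(x))
rhs = -Eg(lambda x: h(x) * (dW(x) - x * W(x)))
assert abs(lhs - rhs) < 1e-8
```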
Lemma 14 (First Variation, Lemma 6.13, [13]). Let \(f,g\colon\mathbb{R}^{n}\to\mathbb{R}^{k}\) and let \(W,Z\colon \mathbb{R}^{n}\to\mathbb{R}^{n}\) be tame vector fields. Then \[\begin{align} \frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle\mathcal{S}_{t,W}f(X),\, \mathcal{S}_{t,Z}g(Y)\rangle &=\mathbb{E}_{\gamma}[\langle f,T_{\rho}g\rangle\mathrm{div}_{\gamma}W + \langle f, D_{W}T_{\rho}g\rangle]\\ &\quad+\mathbb{E}_{\gamma}[\langle g,T_{\rho}f\rangle\mathrm{div}_{\gamma}Z + \langle g, D_{Z}T_{\rho}f\rangle]. \end{align}\]
Let \(\lambda\in\mathbb{R}^{2k}\) be as in Lemma 11. Define \[\label{qdef} Q(W,Z)\colonequals\frac{\mathrm{d}^{2}}{\mathrm{d}t^{2}}\Big|_{t=0}\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle\mathcal{S}_{t,W}f(X),\, \mathcal{S}_{t,Z}g(Y)\rangle -\Big\langle\lambda,\,\Big(\frac{\mathrm{d}^{2}}{\mathrm{d}t^{2}}\Big|_{t=0}\mathbb{E}_{\gamma} \mathcal{S}_{t,W}f ,\, \frac{\mathrm{d}^{2}}{\mathrm{d}t^{2}}\Big|_{t=0}\mathbb{E}_{\gamma} \mathcal{S}_{t,Z}g \Big) \Big\rangle.\tag{36}\]
Lemma 15 (Second Variation, Lemma 6.15, [13]). Let \(W,Z\colon\mathbb{R}^{n}\to\mathbb{R}^{n}\) be tame vector fields such that \[\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{S}_{t,W}f=\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{S}_{t,Z}g=0.\] Let \(f,g\) be optimally stable. Then \[Q(W,Z)\geq0.\]
Proof. As in the proof of Lemma 10, for any \(\alpha,\theta\in\mathbb{R}^{k}\) and \(\beta\in\mathbb{R}\), define
\[f_{\alpha,\beta}\colonequals\frac{f(F_{\beta,W}(x))+\sum_{i=1}^{k}\alpha_{i}W_{i}(x) }{\max\Big(1,\|f(F_{\beta,W}(x))+\sum_{i=1}^{k}\alpha_{i}W_{i}(x)\|\Big)},\qquad g_{\theta,\beta}\colonequals\frac{g(F_{\beta,Z}(x))+\sum_{i=1}^{k}\theta_{i}W_{i}(x) }{\max\Big(1,\|g(F_{\beta,Z}(x))+\sum_{i=1}^{k}\theta_{i}W_{i}(x)\|\Big)}.\]
Define \(L\colon\mathbb{R}^{2k+1}\to\mathbb{R}^{2k}\) by \[L(\alpha,\theta,\beta)\colonequals \left(\mathbb{E}_{\gamma}f_{\alpha,\beta},\,\, \mathbb{E}_{\gamma}g_{\theta,\beta}\right),\qquad\forall\,\alpha,\theta\in\mathbb{R}^{k},\,\forall\,\beta\in\mathbb{R}.\] Then by assumption, we have \[\label{three1z} \frac{\partial L}{\partial\beta}(0,0)=\left(\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{S}_{t,W}f,\,\, \frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{S}_{t,Z}g\right) =0.\tag{37}\] And by definition of \(L\), we have, for all \(1\leq i\leq k\), \[\label{three2z} \begin{align} \frac{\partial L}{\partial\alpha_{i}}(0,0)&=\left(\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{V}_{t,W_{i}}f, \,\, 0 \right),\\ \frac{\partial L}{\partial\theta_{i}}(0,0)&=\left( 0, \,\, \frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{V}_{t,W_{i}}g \right). \end{align}\tag{38}\] From Lemma 9, we conclude that the matrix of partial derivatives \(DL\) of \(L\) is a \(2k\times(2k+1)\) matrix of rank \(2k\). So, by the Implicit Function Theorem, there exist \(\varepsilon>0\) and a differentiable curve \(\eta\colon(-\varepsilon,\varepsilon)\to\mathbb{R}^{2k+1}\) with \(\eta(0)=0\), \(\eta'(0)\neq0\) and \(L(\eta(t))=0\) for all \(t\in(-\varepsilon,\varepsilon)\). The last property implies, by the Chain Rule, that \[\label{three3z} 0=\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}L(\eta(t)) =\sum_{i=1}^{k}\frac{\partial L}{\partial\alpha_{i}}(0,0)\eta_{i}'(0) +\sum_{i=1}^{k}\frac{\partial L}{\partial\theta_{i}}(0,0)\eta_{k+i}'(0) +\frac{\partial L}{\partial \beta}(0,0)\eta_{2k+1}'(0).\tag{39}\] The last term is zero by 37 . Lemma 9 and 38 imply that the set \(\Big\{\frac{\partial L}{\partial\alpha_{i}}(0,0),\frac{\partial L}{\partial\theta_{i}}(0,0)\Big\}_{i=1}^{k}\) consists of \(2k\) linearly independent vectors. We conclude from 39 that \(\eta_{i}'(0)=\eta_{i+k}'(0)=0\) for all \(1\leq i\leq k\). 
Since \(\eta'(0)\neq0\), we conclude that \(\eta_{2k+1}'(0)\neq0\). (Since \(L(\eta(t))=0\) for all \(t\in(-\varepsilon,\varepsilon)\), the perturbed functions along \(\eta\) satisfy \(\mathbb{E}_{\gamma}f_{\alpha,\beta}=\mathbb{E}_{\gamma}g_{\theta,\beta}=0\), so the desired constraints on \(f,g\) hold along the perturbation for all \(t\in(-\varepsilon,\varepsilon)\).)
Taking another derivative of 39 and using \(\eta_{i}'(0)=\eta_{i+k}'(0)=0\) for all \(1\leq i\leq k\), along with \(L(\eta(t))=0\) for all \(t\in(-\varepsilon,\varepsilon)\) and 37 , \[\label{three4} 0=\frac{\mathrm{d}^{2}}{\mathrm{d}t^{2}}\Big|_{t=0}L(\eta(t)) =\sum_{i=1}^{k}\frac{\partial L}{\partial\alpha_{i}}(0,0)\eta_{i}''(0) +\sum_{i=1}^{k}\frac{\partial L}{\partial\theta_{i}}(0,0)\eta_{k+i}''(0) +\frac{\partial^{2} L}{\partial \beta^{2}}(0,0)(\eta_{2k+1}'(0))^{2}.\tag{40}\]
Define \(J_{0}(\alpha,\theta,\beta)\colonequals \underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f_{\alpha,\beta}(X),\, g_{\theta,\beta}(Y)\rangle\) and \(J_{1}(t)\colonequals J_{0}(\eta(t))\). Then, since \(f,g\) are optimally stable and \(L(\eta(t))=0\) for all \(t\in(-\varepsilon,\varepsilon)\), we have \(J_{1}'(0)=0\). Since \(\eta_{i}'(0)=\eta_{i+k}'(0)=0\) for all \(1\leq i\leq k\), we have \(0=J_{1}'(0)=\eta_{2k+1}'(0)\frac{\partial J_{0}}{\partial\beta}(0,0)\). Since \(\eta_{2k+1}'(0)\neq0\), we have \(\frac{\partial J_{0}}{\partial\beta}(0,0)=0\). Taking another derivative of \(J_{1}\) and using optimality of \(f,g\), we have \[\label{three5} 0\leq J_{1}''(0)=\sum_{i=1}^{k}\frac{\partial J_{0}}{\partial\alpha_{i}}(0,0)\eta_{i}''(0) +\sum_{i=1}^{k}\frac{\partial J_{0}}{\partial\theta_{i}}(0,0)\eta_{k+i}''(0) +\frac{\partial^{2} J_{0}}{\partial \beta^{2}}(0,0)(\eta_{2k+1}'(0))^{2}.\tag{41}\] Here we again used \(\eta_{i}'(0)=\eta_{i+k}'(0)=0\) for all \(1\leq i\leq k\). Lemma 14 says \(\frac{\partial}{\partial\alpha_{i}}J_{0}=\langle\lambda, \frac{\partial}{\partial\alpha_{i}}L\rangle\) and \(\frac{\partial}{\partial\theta_{i}}J_{0}=\langle\lambda, \frac{\partial}{\partial\theta_{i}}L\rangle\) for all \(1\leq i\leq k\). So, we can rewrite 41 as \[\label{three6} \begin{align} 0\leq J_{1}''(0) &=\sum_{i=1}^{k}\eta_{i}''(0)\langle\lambda, \frac{\partial}{\partial\alpha_{i}}L\rangle +\sum_{i=1}^{k}\eta_{k+i}''(0)\langle\lambda, \frac{\partial}{\partial\theta_{i}}L\rangle +\frac{\partial^{2} J_{0}}{\partial \beta^{2}}(0,0)(\eta_{2k+1}'(0))^{2}\\ &\stackrel{\eqref{three4}}{=}-(\eta_{2k+1}'(0))^{2}\Big\langle\lambda,\frac{\partial^{2} L}{\partial \beta^{2}}(0,0)\Big\rangle+\frac{\partial^{2} J_{0}}{\partial \beta^{2}}(0,0)(\eta_{2k+1}'(0))^{2}. \end{align}\tag{42}\] Since \((\eta_{2k+1}'(0))^{2}>0\), we can factor it out of 42 to get \[0\leq -\Big\langle\lambda,\frac{\partial^{2} L}{\partial \beta^{2}}(0,0)\Big\rangle+\frac{\partial^{2} J_{0}}{\partial \beta^{2}}(0,0).\] This inequality concludes the proof, since the right side is \(Q(W,Z)\). ◻
Lemma 16 (Second Variation of Translations, adapted from Lemma 6.17, [13]). Fix \(w\in\mathbb{R}^{n}\). Let \(W\colon\mathbb{R}^{n}\to\mathbb{R}^{n}\) be the constant vector field \(W\colonequals w\). Let \(f,g\) be optimally stable. Assume that \[\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{S}_{t,W}f=\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{S}_{t,W}g=0.\] Then \[Q(W,W)=2(1-1/\rho)\mathbb{E}_{\gamma}\|D_{w}T_{\sqrt{\rho}}f\|^{2}.\]
Proof. Let \(z\in\mathbb{R}^{n}\). Let \(Z\colon\mathbb{R}^{n}\to\mathbb{R}^{n}\) be the constant vector field \(Z\colonequals z\). From 35 , \(\mathcal{S}_{t,W}f(x)=f(x-tw)\) for all \(t\in\mathbb{R}\), \(x\in\mathbb{R}^{n}\). So, \[\label{three7} \frac{\mathrm{d}^{2}}{\mathrm{d}t^{2}}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{S}_{t,W}f=\mathbb{E}_{\gamma}[D^{2}_{w,w}f] ,\qquad\frac{\mathrm{d}^{2}}{\mathrm{d}t^{2}}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{S}_{t,Z}g=\mathbb{E}_{\gamma}[D^{2}_{z,z}g].\tag{43}\] Using the product rule and Definition 1, \[\begin{align} &\frac{\mathrm{d}^{2}}{\mathrm{d}t^{2}}\Big|_{t=0}\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle\mathcal{S}_{t,W}f(X),\, \mathcal{S}_{t,Z}g(Y)\rangle\\ &\qquad=\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle D^{2}_{w,w}f(X), g(Y)\rangle +\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle f(X), D^{2}_{z,z}g(Y)\rangle +2\underset{X\sim_{\rho}Y}{\mathbb{E}}\langle D_{w}f(X), D_{z}g(Y)\rangle\\ &\qquad=\mathbb{E}_{\gamma}\langle D^{2}_{w,w}f, T_{\rho}g\rangle +\mathbb{E}_{\gamma}\langle T_{\rho}f, D^{2}_{z,z}g\rangle +2\mathbb{E}_{\gamma}\langle D_{w}f, T_{\rho}D_{z}g\rangle\\ &\qquad=\mathbb{E}_{\gamma}\langle D^{2}_{w,w}f, T_{\rho}g\rangle +\mathbb{E}_{\gamma}\langle T_{\rho}f, D^{2}_{z,z}g\rangle +\frac{2}{\rho}\mathbb{E}_{\gamma}\langle D_{w}f, D_{z}T_{\rho}g\rangle. \end{align}\]
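The last equality above uses the commutation \(D_{z}T_{\rho}g=\rho\,T_{\rho}D_{z}g\), which follows by differentiating \(T_{\rho}g(x)=\mathbb{E}\,g(\rho x+\sqrt{1-\rho^{2}}\,Z)\) under the expectation. A one-dimensional numerical sanity check; the choices of \(\rho\), \(x\), and \(g\) are illustrative.

```python
import numpy as np

def T(rho, g, x, m=60):
    # Ornstein-Uhlenbeck semigroup T_rho g(x) = E g(rho x + sqrt(1-rho^2) Z),
    # Z standard Gaussian, via probabilists' Gauss-Hermite quadrature.
    nodes, weights = np.polynomial.hermite_e.hermegauss(m)
    return (weights / weights.sum()) @ g(rho * x + np.sqrt(1 - rho**2) * nodes)

rho, x, h = 0.3, 0.7, 1e-5
g = lambda y: y**3            # illustrative choice of g

# Left side: D T_rho g(x), by central difference.
lhs = (T(rho, g, x + h) - T(rho, g, x - h)) / (2 * h)
# Right side: rho * T_rho(g')(x), with g'(y) = 3 y^2.
rhs = rho * T(rho, lambda y: 3 * y**2, x)
assert abs(lhs - rhs) < 1e-6   # D_z T_rho g = rho * T_rho D_z g
```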
Substituting the definition of \(Q\) 36 , using 43 , then integrating by parts twice, \[\label{three8} \begin{align} Q(W,Z) &=\mathbb{E}_{\gamma}\langle D^{2}_{w,w}f, T_{\rho}g - \lambda^{(2)}\rangle +\mathbb{E}_{\gamma}\langle T_{\rho}f - \lambda^{(1)}, D^{2}_{z,z}g\rangle +\frac{2}{\rho}\mathbb{E}_{\gamma}\langle D_{w}f, D_{z}T_{\rho}g\rangle\\ &=-\sum_{i=1}^{k}\mathbb{E}_{\gamma}D_{w}f_{i}\,\mathrm{div}_{\gamma}(w(T_{\rho}g_{i} - \lambda_{i}^{(2)})) -\sum_{i=1}^{k}\mathbb{E}_{\gamma}D_{z}g_{i}\,\mathrm{div}_{\gamma}(z(T_{\rho}f_{i} - \lambda_{i}^{(1)}))\\ &\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad -\frac{2}{\rho}\sum_{i=1}^{k}\mathbb{E}_{\gamma}f_{i}\,\mathrm{div}_{\gamma}(wD_{z}T_{\rho}g_{i})\\ &=\sum_{i=1}^{k}\mathbb{E}_{\gamma}f_{i}\,\mathrm{div}_{\gamma}(w\,\mathrm{div}_{\gamma}(w(T_{\rho}g_{i} - \lambda_{i}^{(2)}))) +\sum_{i=1}^{k}\mathbb{E}_{\gamma}g_{i}\,\mathrm{div}_{\gamma}(z\,\mathrm{div}_{\gamma}(z(T_{\rho}f_{i} - \lambda_{i}^{(1)})))\\ &\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad -\frac{2}{\rho}\sum_{i=1}^{k}\mathbb{E}_{\gamma}f_{i}\,\mathrm{div}_{\gamma}(wD_{z}T_{\rho}g_{i}). \end{align}\tag{44}\]
We examine the first two terms of 44 . From the First Variation Lemma 13, a.s. \[\|T_{\rho}f-\lambda^{(1)}\|g=T_{\rho}f-\lambda^{(1)},\qquad\|T_{\rho}g-\lambda^{(2)}\|f=T_{\rho}g-\lambda^{(2)}.\] So, using first the product rule and then Lemma 13, \[\begin{align} \sum_{i=1}^{k}\mathbb{E}_{\gamma}f_{i}\,\mathrm{div}_{\gamma}(w\,\mathrm{div}_{\gamma}(w(T_{\rho}g_{i} - \lambda_{i}^{(2)}))) &=\sum_{i=1}^{k}\mathbb{E}_{\gamma}f_{i}\,\mathrm{div}_{\gamma}\big((T_{\rho}g_{i} - \lambda_{i}^{(2)})w\,\mathrm{div}_{\gamma}(w)+w D_{w}T_{\rho}g_{i} \big) \\ &=\sum_{i=1}^{k}\mathbb{E}_{\gamma}f_{i}\,\mathrm{div}_{\gamma}\big(\|T_{\rho}g - \lambda^{(2)}\|f_{i}w\,\mathrm{div}_{\gamma}(w)+w D_{w}T_{\rho}g_{i} \big). \end{align}\] The term involving \(\mathrm{div}_{\gamma}(w)\) is zero by [13], and a similar calculation for the second term in 44 allows us to rewrite 44 as \[\begin{align} Q(W,Z) &=\sum_{i=1}^{k}\mathbb{E}_{\gamma}f_{i}\,\mathrm{div}_{\gamma}(w D_{w}T_{\rho}g_{i}) +\sum_{i=1}^{k}\mathbb{E}_{\gamma}g_{i}\,\mathrm{div}_{\gamma}(z D_{z}T_{\rho}f_{i}) -\frac{2}{\rho}\sum_{i=1}^{k}\mathbb{E}_{\gamma}f_{i}\,\mathrm{div}_{\gamma}(wD_{z}T_{\rho}g_{i}). \end{align}\] Lemma 17 rewrites this as \[Q(W,Z)=-\mathbb{E}_{\gamma}\langle D_{w}T_{\sqrt{\rho}}f,\, D_{w}T_{\sqrt{\rho}}g\rangle -\mathbb{E}_{\gamma}\langle D_{z}T_{\sqrt{\rho}}f,\, D_{z}T_{\sqrt{\rho}}g\rangle +\frac{2}{\rho}\mathbb{E}_{\gamma}\langle D_{w}T_{\sqrt{\rho}}f,\, D_{z}T_{\sqrt{\rho}}g\rangle.\]
Using now Lemma 8, we may assume that \(g=-f\), so that \[Q(W,Z)=\mathbb{E}_{\gamma}\langle D_{w}T_{\sqrt{\rho}}f,\, D_{w}T_{\sqrt{\rho}}f\rangle +\mathbb{E}_{\gamma}\langle D_{z}T_{\sqrt{\rho}}f,\, D_{z}T_{\sqrt{\rho}}f\rangle -\frac{2}{\rho}\mathbb{E}_{\gamma}\langle D_{w}T_{\sqrt{\rho}}f,\, D_{z}T_{\sqrt{\rho}}f\rangle.\]
Choosing \(w=z\), we get \[Q(W,W)=2(1-1/\rho)\mathbb{E}_{\gamma}\|D_{w}T_{\sqrt{\rho}}f\|^{2}.\] ◻
Lemma 17 (Lemma 6.6, [13]). Let \(w,z\in\mathbb{R}^{n}\). Then \[\sum_{i=1}^{k}\mathbb{E}_{\gamma}f_{i}\,\mathrm{div}_{\gamma}(w D_{z}T_{\rho}g_{i}) =-\mathbb{E}_{\gamma}\langle D_{w}T_{\sqrt{\rho}}g,\, D_{z}T_{\sqrt{\rho}}f\rangle.\]
Proof of Theorem 6. Let \(f,g\colon\mathbb{R}^{n}\to S^{k-1}\) be optimally stable. Let \(w\in\mathbb{R}^{n}\). Define \(W\colon\mathbb{R}^{n}\to\mathbb{R}^{n}\) so that \(W(x)\colonequals w\) for all \(x\in\mathbb{R}^{n}\). Assume that \(W\) satisfies \[\label{ten1} \frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{S}_{t,W}f=\frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{S}_{t,W}g=0.\tag{45}\] Since \(f=-g\) by Lemma 8, 45 is equivalent to \[\label{ten2} \frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\mathbb{E}_{\gamma}\mathcal{S}_{t,W}f=0.\tag{46}\] Define \(L\colon \mathbb{R}^{n}\to \mathbb{R}^{k}\) by \(L(w)\colonequals\frac{\mathrm{d}}{\mathrm{d}t}|_{t=0}\mathbb{E}_{\gamma}\mathcal{S}_{t,W}f\). Let \(\mathrm{ker}(L)\colonequals\{w\in\mathbb{R}^{n}\colon L(w)=0\}\) denote the kernel of \(L\). Since \(L\) is a linear map, the rank-nullity theorem implies that \(\mathrm{ker}(L)\) has dimension at least \(n-k\).
Let \(w\in\mathrm{ker}(L)\). Then Lemma 15 says \(Q(W,W)\geq0\), while Lemma 16 says \[Q(W,W)=2(1-1/\rho)\mathbb{E}_{\gamma}\|D_{w}T_{\sqrt{\rho}}f\|^{2}\leq0.\] We conclude that there exists a linear subspace \(\mathrm{ker}(L)\subseteq\mathbb{R}^{n}\) of dimension at least \(n-k\) with \[\mathbb{E}_{\gamma}\|D_{w}T_{\sqrt{\rho}}f\|^{2}=0,\qquad\forall\,w\in\mathrm{ker}(L).\] That is, \(f\) is constant in the directions of \(\mathrm{ker}(L)\), as noted e.g. in [13], so we may assume a priori that \(f\colon\mathbb{R}^{k}\to S^{k-1}\). Theorem 6 follows. ◻
Acknowledgement. We acknowledge and thank Yeongwoo Hwang, Joe Neeman, Ojas Parekh, Kevin Thompson and John Wright for helpful discussions.