October 02, 2025
In this paper, we provide details on the proofs of the quantum polynomial time algorithm of Biasse and Song (SODA 16) for computing the \(S\)-unit group of a number field. This algorithm directly implies polynomial time methods to calculate class groups, \(S\)-class groups, relative class group and the unit group, ray class groups, solve the principal ideal problem, solve certain norm equations, and decompose ideal classes in the ideal class group. Additionally, combined with a result of Cramer, Ducas, Peikert and Regev (Eurocrypt 2016), the resolution of the principal ideal problem allows one to find short generators of a principal ideal. Likewise, methods due to Cramer, Ducas and Wesolowski (Eurocrypt 2017) use the resolution of the principal ideal problem and the decomposition of ideal classes to find so-called “mildly short vectors” in ideal lattices of cyclotomic fields.
Let \(K\) be a number field of degree \(n\) and \(\mathcal{O}\) be an order in \(K\) with discriminant \(\Delta\). The set of elements \(\alpha\in K\) such that \(\exists (e_i)_{i\leq |S|}\in \mathbb{Z}^ {|S|},\;\;(\alpha) = \mathfrak{p}^ {e_1}\cdots \mathfrak{p}^{e_{|S|}}\) is a multiplicative group called the \(S\)-unit group of \(K\). This notion generalizes the units of \(\mathcal{O}\) which are \(S\)-units for \(S=\varnothing\), and computing the \(S\)-unit group is an important task in computational number theory. Most notably it applies to the computation of the ideal class group of \(\mathcal{O}\), the resolution of the principal ideal problem in \(\mathcal{O}\), and the resolution of norm equations of the form \(\mathcal{N}_{L/K}(x) = \theta\) where \(\theta\in K\), as shown by Simon [1] and Fieker [2], [3].
The ideal class group \(\operatorname{Cl}(\mathcal{O})\) is the finite abelian group consisting of the invertible fractional ideals of \(\mathcal{O}\) up to principal factors and has order \(|\Delta|^{O(1)}\). Computing the ideal class group is an essential task in number theory that occurs in particular in the resolution of unproven heuristics such as the Cohen-Lenstra heuristics [4] on class groups of quadratic number field, Littlewood’s bounds [5] on \(L(1,\chi)\), or Bach’s bound [6] on the maximum norm of the generators required to generate the class group. Besides being a fundamental problem, computing the ideal class group is also strongly connected to number theoretic problems occurring in cryptography. For example, it is at the heart of the only known unconditional classical subexponential algorithm for integer factorization [7]. Finding relations between elements in \(\operatorname{Cl}(\mathcal{O})\) also occurs in curve-based cryptography. Indeed, both classical [8], [9] and quantum [10] subexponential methods for computing isogenies between elliptic curves depend on it.
Given an ideal \(\mathfrak{a}\subseteq\mathcal{O}\), deciding whether or not \(\mathfrak{a}\) is principal, and if so, finding \(\alpha\in\mathcal{O}\) such that \(\mathfrak{a}= (\alpha)\) is called the Principal Ideal Problem. It has direct applications to the computation of relative class groups and unit groups, and computing the \(S\)-class group of a number field. It is is also relevant to lattice-based cryptography, which has received a considerable attention since it allows quantum-safe cryptosystems and homomorphic encryption schemes. For efficiency reasons, there have been many proposals of schemes using lattices arising from ideals in the ring of integers of a number field, and in particular principal ideals generated by a small element (for example, see the homomorphic encryption scheme of Smart and Vercauteren [11] and the multilinear maps of Garg, Gentry and Halevi [12]). It was subsequently proved that solving the principal ideal problem in polynomial time directly induces a polynomial time attack on schemes relying on the hardness of finding the short generator of a principal ideal [13].
Computing the ideal class group and the unit group is a problem that has been extensively studied in both the classical and quantum setting. Despite these efforts, there are no known polynomial time algorithms for these tasks. On the other hand, there are quantum polynomial time algorithms for several hard computational problems in number theory based on quantum algorithms for the Hidden Subgroup Problem (HSP). Shor showed that integer factorization and the discrete logarithm problem could be solved in polynomial time [14], and Hallgren described a polynomial time algorithm for solving the Pell’s equation [15]. Similar methods were used to compute the class group and the unit group in polynomial time in classes of number fields of fixed degree [16], [17]. The approach of [16] relies on the resolution of the HSP in a bounded and discretized approximation of \(\mathbb{R}^m\), which does not seem to apply when the degree of the fields grows to infinity. In a recent breakthrough, Eisenträger, Hallgren, Kitaev and Song [18] described a polynomial time algorithm for computing the unit group in classes of number fields of arbitrary degree. One of the main tools they developed is a continuous HSP definition on \(\mathbb{R}^m\) and an efficient quantum algorithm solving it. In essence, their new HSP definition enforces stringent continuity properties on the function that hides the subgroup. This makes the function more amenable to quantum Fourier sampling.
In this paper, we present a quantum algorithm to compute the \(S\) unit group of a number field of arbitrary degree in polynomial time. It readily applies to the computation of the ideal class group and to the resolution of the principal ideal problem, and well as to other related tasks in computational number theory. We follow a different framework than the previous work in constant-degree number fields due to Hallgren [16]. We show that both the ideal class group computation and PIP reduce to a more general problem of computing the \(S\)-unit group for suitable set of prime ideals \(S\). For example, for the ideal class group computation \(S\) is chosen to be a succinct generating set of \(\operatorname{Cl}(\mathcal{O})\). Then we give an efficient quantum algorithm for computing the \(S\)-unit group by extending the work by Eisenträger, Hallgren, Kitaev and Song [18]. We show an efficient quantum reduction from the \(S\)-unit group problem to HSP on \(\mathbb{R}^m\) as defined in [18], which then can be solved efficiently by the quantum HSP algorithm in [18]. We also show how to get exact compact representations of the desired field elements with respect to a given integral basis for \(\mathcal{O}\), while [18] only returns fixed point rational approximations of the units. Compact representations are usually easier for further algebraic processing. Our main results are summarized in the next theorem.
Theorem 1 (\(S\)-unit group computation). There is a quantum algorithm for computing the \(S\)-unit group of a number field \(K\) in compact representation which runs in polynomial time in the parameters \(n=\deg(K)\), \(\log(|\Delta|)\), \(|S|\) and \(\max_{\mathfrak{p}\in S}\{\log(\mathcal{N}(\mathfrak{p}))\}\), where \(\Delta\) is the discriminant of the ring of integers of \(K\).
Corollary 1. There are quantum polynomial time algorithms for the resolution of the following tasks in computational number theory:
Ideal class group computation (under GRH),
\(S\)-class group computation (under GRH),
Relative class group and unit group computation (under GRH),
Ideal class decomposition in the ideal class group (under GRH),
Principal Ideal Problem,
Ray class group computation (under GRH),
Norm equation resolution,
where GRH denotes the Generalized Rieman Hypothesis.
As an important corollary, combining recent works in lattice cryptanalysis [13], [19], our results induce a quantum polynomial-time attack on an entire family of cryptosystems relying on the hardness of finding a short generator of a principal ideal.
In this section we review some useful background in number theory and introduce some definitions and notations. The notions of ideal class group and \(S\)-unit group are standard, and can be found in many books. We suggest Neukirch’s book [20] for the fundamental aspects of this theory and Cohen’s book [21] for the algorithmic aspects. We invite the reader who is already familiar to these topics to pay attention to the non-standard notion of \(E\)-ideal that we introduce in the following.
A number field \(K\) is a finite extension of \(\mathbb{Q}\). Its ring of integers \({\mathcal{O}_K}\) has the structure of a \(\mathbb{Z}\)-lattice of degree \(n=[K:\mathbb{Q}]\), and the orders \(\mathcal{O}\subseteq{\mathcal{O}_K}\) are the sublattices of \({\mathcal{O}_K}\) which have degree \(n\) and which are equipped with a ring structure. Throughout this paper, we assume that \(\mathcal{O}\) is an order in a number field \(K\), and we denote by \(\omega_1,\ldots,\omega_n\) a \(\mathbb{Z}\)-basis, that is \(\mathcal{O}= \mathbb{Z}\omega_1\oplus\ldots\oplus\mathbb{Z}\omega_n.\) A number field has \(n_1\) real embeddings and \(n_2\) pairs of complex embeddings which we denote \((\sigma_j:K\to \mathbb{R})_{j\leq n_1},((\sigma_j,\overline{\sigma_j}):K\to \mathbb{C})_{j\leq n_2}\) with \(n_1+n_2 = n = \mathrm{deg}(K)\). These embeddings define two essential maps, namely the norm and trace maps which are given by \(\mathcal{T}(x):=\sum_\sigma \sigma(x)\in\mathbb{Q}\) and \(\mathcal{N}(x):= \prod_\sigma \sigma(x)\in\mathbb{Q}\). The trace map is additive while the norm map is multiplicative. Note that \(\mathcal{T}(\mathcal{O})\subseteq\mathbb{Z}\) and \(\mathcal{N}(\mathcal{O})\subseteq\mathbb{Z}\). We measure the size of the ring \(\mathcal{O}\) by \(\log|\Delta|\) where \(\Delta := \left( \det(\sigma_j(\omega_k))\right)^2\) is its discriminant, and it equals the volume of the fundamental domain of \(\mathcal{O}\). Equivalently, the discriminant can be defined from the trace map by \(\Delta:= \det(\mathcal{T}(\omega_i\omega_j))_{i,j\leq n}\).
The fractional ideals of \(\mathcal{O}\) generalize the notion of ring ideals of \(\mathcal{O}\). They are the subsets of \(K\) of the form \(\mathfrak{a}= \frac{1}{d}I\) where \(d\in\mathbb{Z}^+\) and \(I\subseteq \mathcal{O}\) is an (integral) ideal of \(\mathcal{O}\). A fractional ideal \(\mathfrak{a}\) is invertible if \(\mathfrak{a}^{-1} := \{ x \in K \mid \;x\mathfrak{a}\subseteq \mathcal{O}\}\) is also a fractional ideal. The invertible fractional ideals have a multiplicative group structure, and the principal fractional ideals are one of its subgroups. The ideal class group is defined by \[\operatorname{Cl}(\mathcal{O}) := \mathcal{I}/\mathcal{P},\] where \(\mathcal{I}\) is the multiplicative group of fractional invertible ideals of \(\mathcal{O}\) and \(\mathcal{P}\) is the subgroup of elements of \(\mathcal{I}\) that are principal. This means that we identify \(\mathfrak{a}\) and \(\mathfrak{b}\) in \(\operatorname{Cl}(\mathcal{O})\) if there is \(\alpha\in K\) such that \(\mathfrak{a}=(\alpha)\mathfrak{b}\). Ideals are sublattices of \(\mathcal{O}\) of rank \(n\), and we define their norm by \(\mathcal{N}(I):= \left| \mathcal{O}/ I\right|\). This notion naturally extends to fractional ideals using the multiplicative rule \(\mathcal{N}(\mathfrak{a}/\mathfrak{b}) := \mathcal{N}(\mathfrak{a})/\mathcal{N}(\mathfrak{b})\). This notion of norm extends the norm on \(K\) in the sense that if \(\mathfrak{a}= (\alpha)\), then \(\mathcal{N}(\mathfrak{a})=\mathcal{N}(\alpha)\).
The \(S\)-units are a generalization of the units \(\mathcal{O}^*\), which are the invertible elements of \(\mathcal{O}\). The unit group can alternatively be defined as the \(\alpha\in\mathcal{O}\) with \(|\mathcal{N}(\alpha)|=1\), or the \(\alpha\in\mathcal{O}\) such that \((\alpha) = \mathcal{O}\). The unit group \(\mathcal{O}^*\) satisfies \(\mathcal{O}^* \simeq \mu\times \langle \varepsilon_1\rangle \times \ldots \times \langle \varepsilon_r\rangle,\) where \(r:= n_1 + n_2 - 1\), \(\mu\) is the set of roots of unity and the \(\varepsilon_i\) are torsion-free units. Let \(S = \{\mathfrak{p}_i\}\) be a finite set of prime ideals of \(\mathcal{O}\), the \(S\)-units are the elements \(\alpha\in K\) such that there is \((v_i(\alpha))_{i\leq |S|}\in \mathbb{Z}^ {|S|}\) with \((\alpha) = \mathfrak{p}_1^ {v_1(\alpha)}\cdots \mathfrak{p}_{|S|}^ {v_{|S|}(\alpha)}.\) Note that the \(S\)-units are elements of \(K\). They form a multiplicative group \(U(S)\) satisfying \(U(S) \simeq \mu\times \langle \varepsilon_1\rangle \times \ldots \times \langle \varepsilon_{r+|S|}\rangle,\) where \(r:= n_1 + n_2 - 1\), \(\mu\) is the set of roots of unity and the \(\varepsilon_i\) are torsion-free \(S\)-units.
The number field \(K\) can be naturally embedded into \(E:=\mathbb{R}^{n_1}\times\mathbb{C}^{n_2}\) by setting \(z\in \mathcal{O}\mapsto(\sigma_1(z),\ldots,\sigma_{n_1+n_2}(z))\). As in [18], we denote by \(\underline{\mathcal{O}}\) the image of \(\mathcal{O}\) via this embedding. The set \(\underline{\mathcal{O}}\) inherits from the lattice structure of \(\mathcal{O}\), i.e. it can be identified as a lattice in \(\mathbb{R}^n\), as well as from the multiplication between elements (which is performed component-wise). The image of the fractional ideals of \(K\) in \(E\) are lattices \(\Lambda\subseteq E\) with the property that \(x\Lambda\subseteq\Lambda\) for all \(x\in\underline{\mathcal{O}}\). We define the \(E\)-ideals as all the lattices in \(E\) satisfying this property. When there is no ambiguity, we identify a fractional ideal of \(\mathcal{O}\) and the corresponding \(E\)-ideal.
Definition 1 (\(E\)-ideals). Let \(E:=\mathbb{R}^{n_1}\times\mathbb{C}^{n_2}\) and \(\underline{\mathcal{O}}\) the image of \(\mathcal{O}\) via the embedding \(K\rightarrow E\). An \(E\)-ideal is a lattice \(\Lambda\subseteq E\) such that \(\forall x\in \underline{\mathcal{O}}, \;x\Lambda\subseteq\Lambda.\)
We review the definition of continuous HSP proposed by Eisenträger et al. [18], for which they have shown an efficient quantum algorithm.
Definition 2 (Continuous HSP over \(\mathbb{R}^m\)). The unknown subgroup \(L\subseteq \mathbb{R}^m\) is a full-rank lattice satisfying some promise: the norm of the shortest vector is at least \(\lambda\) and the unit cell volume is at most \(d\). The oracle has parameters \((a,r,\varepsilon)\). Let \(f: \mathbb{R}^m \rightarrow \mathcal{S}\) be a function, where \(\mathcal{S}\) is the set of unit vectors in some Hilbert space. We assume that \(f\) hides \(L\) in the following way.
\(f\) is periodic on \(L\), i.e. \(f(x)=f(x+v)\) for all \(x\in \mathbb{R}^m\) and \(v\in L\);
\(f\) is Lipschitz with constant \(a\), i.e. \(\bigl\||f(x)\rangle-|f(y)\rangle\bigr\| \le a\|x-y\|\) for all \(x,y\in \mathbb{R}^m\);
If the distance between the cosets (\(x\bmod L\)) and (\(y\bmod L\)) is greater or equal to \(r\), i.e.if \(\min_{v\in L}\|x-y-v\|\ge r\), then \(\bigl|\langle{f(x)}|{f(y)}\rangle\bigr|\le\varepsilon\).
Under these conditions, the problem is to compute a basis of \(L\) by a quantum algorithm that can make oracle calls \(|x\rangle\mapsto|x\rangle\otimes|f(x)\rangle\).
Actually, the definition also applies more generally to other topological groups \(G = \mathbb{R}^k/\Lambda \times D\) with a proper metric on \(G\) [18]. Here \(G\) is decomposed to a continuous part, which is the quotient of \(\mathbb{R}^k\) over some lattice \(\Lambda\), and a discrete part that is finitely generated. It is nonetheless sufficient to consider HSP on \(\mathbb{R}^m\), because the more general case can be reduced to HSP on \(\mathbb{R}^m\) [18], and hence can be solved efficiently. In the following, we define a control group \(G\) on which a first version of our HSP oracle will be defined. We prove HSP properties on \(G\), and then extend it to \(\mathbb{R}^m\).
Suppose \(\sigma_1,\ldots,\sigma_{n_1}\) are the real embeddings of \(K\), and that \(\sigma_{n_1+1},\ldots,\sigma_{n_1+n_2}\) are the (non-pairwise conjugate) complex embeddings of \(K\). Assume also that \(S = \{\mathfrak{p}_1,\ldots,\mathfrak{p}_s\}\) where \(\mathcal{N}(\mathfrak{p}_i) = p_i^{e_1}\). An element \(x\in U_S\) satisfies \(\prod_{i=1}^{n_1+2n_2} \sigma_i(x) = \mathcal{N}(x) = \prod_i p_i^{e_iv_i(x)}\). This means that we know that \[\log|\sigma_{1}(x)| = -\sum_{i=2}^{n_1}\log|\sigma_i(x)| - 2\sum_{i=n_1+1}^{n_2}\log|\sigma_i(x)| + \sum_{i\leq s} e_iv_i(x)\log p_i.\] Therefore, \(x\in U_S\) corresponding to \((x_1,\ldots,x_n)=(\sigma_1(x),\ldots,\sigma_n(x))\in\mathbb{R}^{n_1}\times\mathbb{C}^{n_2}\) is uniquely identified by the element \(x^G\in G := \mathbb{R}^{n_1 + n_2 - 1}\times \mathbb{Z}_2^{n_1} \times \left(\mathbb{R}/\mathbb{Z}\right)^{n_2}\times \mathbb{Z}^s\) where
\(x_i^G = \log(|x_{i+1}|)\) for \(1\leq i< n_1+n_2\),
\(x_i^{G} = \delta_i\in\mathbb{Z}_2\) where \(x_{i-n_1-n_2+1} = (-1)^{\delta_i}|x_{i-n_1-n_2+1}|\) for \(n_1+n_2\leq i< 2n_1 + n_2\).
\(x_i^{G} = \theta_i\in\mathbb{R}/\mathbb{Z}\) where \(x_{i-2n_1-n_2+1} = e^{2i\pi\theta_i}|x_{i-2n_1-n_2+1}|\) for \(2n_1+n_2\leq i< 2n_1 + 2n_2\).
\(x_i^G = v_{\mathfrak{p}_{i-2n_1 + 2n_2}}(x)\) for \(2n_1+2n_2\leq i< 2n_1+2n_2+s\).
Conversely, we have a map \(\phi:G\rightarrow \mathbb{R}^{n_1}\times\mathbb{C}^{n_2}\) such that \(\phi(x^G)=x\) by choosing \(|x_i| = e^{x_{i+1}^{G}}\) for \(1\leq i<n_1+n_2\), and \[\label{eq:last-coordinate} |x_1| = \frac{\prod_{i=1}^{s}p_{i}^{e_ix_{i+2n_1+2n_2}^G }}{\prod_{i=2}^{n_1}|x_i|\prod_{i=n_1+1}^{n_1+n_2}|x_i|^2}.\tag{1}\] Then we do
\(x_i\leftarrow (-1)^{x^G_{i+x_1+x_2-1}}|x_i|\) for \(1\leq i\leq n_1\) and
\(x_i\leftarrow e^{2i\pi x^G_{i + 2x_1+x_2 - 1}}|x_i|\) for \(n_1 < i \leq n_1+n_2\).
Definition 3 (Control group \(G\)). Let \(K\) be a number field of signature \((n_1,n_2)\), and \(S\) a set of primes above \((p_i)_{i\leq s}\). We define the following groups:
\(G = \mathbb{R}^{n_1 + n_2 - 1}\times \mathbb{Z}_2^{n_1} \times \left(\mathbb{R}/\mathbb{Z}\right)^{n_2}\times \mathbb{Z}^s\) the control group.
\(L = U_S^G\subseteq G\) the image of the \(S\)-unit group of \(K\), which is a lattice.
The map \(\phi\) is readily extended beyond elements of \(G\) that correspond to an \(S\)-unit. In this case, \(\phi(u,v)\in\mathbb{R}^{n_1}\times \mathbb{C}^{n_2}\) for \(u\in\mathbb{R}^{n_1+n_2-1}\times \mathbb{Z}_2^{n_1}\times (\mathbb{R}/\mathbb{Z})^{n_2}\) and \(v\in\mathbb{Z}^s\) does not necessarily correspond to an element \(x\in K\) with \(\mathcal{N}(x) = \prod_i p_i^{e_iv_i}\). On the other hand, in general, there is no canonical way to map an element of \(\mathbb{R}^{n_1}\times \mathbb{C}^{n_2}\) that is not an \(S\)-unit to an element of \(G\).
The control group can be seen as the projection of \(\tilde{G} = \mathbb{R}^k\times \mathbb{Z}^l\) where
\(k = n_1 + 2n_2 -1\).
\(l = n_1 + s\).
We denote by \(\gamma:\tilde{G}\rightarrow G\) the projection map, and by \(\tilde{L}\subseteq\tilde{G}\) the pre-image of \(L\) by \(\gamma\). It is a lattice in \(\tilde{G}\). We also construct an oracle \(g = f_q\circ f_c: G/L\rightarrow \mathcal{H}\) where
\(f_c(t,v) = e^{\boldsymbol{t}}\underline{\mathcal{O}}\prod_{\mathfrak{p}\in S}\mathfrak{p}^{-v_i}\), which is a lattice in \(E:= \mathbb{R}^{n_1}\times \mathbb{C}^{n_2}\).
\(f_q(L_{E}) = |L_E\rangle: = \gamma \sum_{v\in L_E} g_s(v) |\mathrm{str}_{\nu,n}(v)\rangle\) which is a quantum state (see Section 4.4 for a definition of the straddle encoding \(|\mathrm{str}_{\nu,n}(v)\rangle\) instroduced in [18].
To prove the HSP properties of \(f: G\rightarrow \mathcal{H}\), we need a notion of distance between ideals of \(E = \mathbb{R}^{n_1}\times \mathbb{C}^{n_2}\). An ideal in \(E\) is a lattice that is stable by multiplication by elements in \(\underline{\mathcal{O}}\) (the embedding of \(\mathcal{O}\) in \(E\)). We deal with elements in \(E\) by embedding them in \(\mathbb{R}^n = \mathbb{R}^{n_1 + 2n_2}\) (via \(z\in\mathbb{C}\mapsto \mathfrak{Re}(z),\mathfrak{Im}(z)\)). Each \(E\)-ideal \(\mathcal{L}\) can be defined by a matrix \(M_\mathcal{L}\in\mathbb{R}^{n\times n}\) whose rows are a \(\mathbb{Z}\)-basis of \(\mathcal{L}\). Note that \(E\)-ideals \(\mathcal{L},\mathcal{L'}\) can be multiplied, but \(M_{\mathcal{L}\mathcal{L'}}\) is in general not equal to \(M_\mathcal{L}M_\mathcal{L'}\).
Notation 1. The Euclidean norm is used in different spaces. When there is a potential ambiguity, we use a subscript to specify the space. More specifically, suppose there is a group \(H\) and \(s,t\) such that \(\alpha:H\hookrightarrow \mathbb{R}^s\times \mathbb{C}^t\), then for \(x\in H\), we denote by \(\|x\|_H = \|\alpha(x)\|\), i.e. if \(x = (x_1,\ldots,x_{s+t})\), then \(\|x\|_H = \sqrt{\sum_{i\leq s} |x_i|^2 + 2\sum_{i> s} |x_i|^2}\).
Definition 4 (Matrix distance between \(E\)-ideals). Let \(\mathcal{L},\mathcal{L}'\) be two \(E\)-ideals. We define the matrix distance* between \(\mathcal{L}\) and \(\mathcal{L}'\) by \[\operatorname{dist}(\mathcal{L},\mathcal{L}') = \inf_{A,M_\mathcal{L},M_{\mathcal{L}'}}\{\|A\|_2:\;M_\mathcal{L}= M_{\mathcal{L}'}e^A,\;A\in \operatorname{Gl}_n(\mathbb{R})\}\]*
As in [22], given an element \(x\in E\), we define the matrix \(\operatorname{diag}(x)\in\mathbb{R}^{n\times n}\) which is not exactly a diagonal matrix. \[\operatorname{diag}(x) := \begin{pmatrix} x_1 & & & & & \\ & \ddots & & & & \\ & & x_{n_1} & & & \\ & & & \Xi(x_{n_1+1}) & & \\ & & & & \ddots & \\ & & & & & \Xi(x_{n_2}) \end{pmatrix}\text{ where } \Xi(z) := \begin{pmatrix} \mathfrak{Re}(z) & -\mathfrak{Im}(z)\\ \mathfrak{Im}(z) & \mathfrak{Re}(z) \end{pmatrix}.\] Given \(x\in E\), the above matrix has the important property that \(M_{(x)\cdot\mathcal{L}} = M_\mathcal{L}\cdot\operatorname{diag}(x)\) where \(\mathcal{L}\) is an \(E\)-ideal, and \((x)\) denotes the \(E\)-ideal \(x\cdot \underline{\mathcal{O}}\) (a principal ideal generated by \(x\)). This is a case where ideal multiplication corresponds to a product of matrices (although \(\operatorname{diag}(x)\) is not \(M_{(x)}\)).
Lemma 1. Matrices of the form \(\operatorname{diag}(x)\) have the following properties:
\(\forall x_1,x_2\in E\), \(\operatorname{diag}(x_1) + \operatorname{diag}(x_2) = \operatorname{diag}(x_1 + x_2)\).
\(\forall x_1,x_2\in E\), \(\operatorname{diag}(x_1)\cdot\operatorname{diag}(x_2) = \operatorname{diag}(x_1\cdot x_2)\).
\(\forall x\in E\), \(e^{\operatorname{diag}(x)} = \operatorname{diag}(e^x)\) where \(e^x = (e^{x_1},\ldots,e^{x_{n_1+n_2}})\).
\(\forall x\in E\), if \(\|\operatorname{diag}(x)-I\|<1\), then \(\log(\operatorname{diag}(x)) = \operatorname{diag}(\log(x))\) where \(\log(x) = (\log(x_1),\ldots,\log(x_{n_1+n_2}))\).
Proof. For 1) and 2), it suffices to check that \(\forall z,z'\in \mathbb{C}\), \(\Xi(z) + \Xi(z') = \Xi(z + z')\), and \(\Xi(z)\cdot\Xi(z') = \Xi(zz')\). Then, since \(e^A = \sum_{k=0}^\infty \frac{A^k}{k!}\), we have \[e^{\operatorname{diag}(x)} = \sum_{k=0}^\infty \frac{\operatorname{diag}(x)^k}{k!} = \operatorname{diag}\left( \sum_{k=0}^\infty \frac{x^k}{k!}\right) = \operatorname{diag}(e^x).\] Likewise, to prove 4), we simply use the convergence of \(\sum_{k=1}^\infty (-1)^{k+1}\frac{(B-I)^k}{k}\) to \(\log(B)\) whenever \(\|B-I\|<1\). ◻
Our algorithms for the Class Group Problem () and the Principal Ideal Problem () consist of reductions to the continuous hidden subgroup problem in two steps, and invoking the quantum HSP algorithm [18] at the end. \[\begin{align} \text{\sf CGP}& \leq_C S_{\text{\sf CGP}}\text{-units} \leq_Q \text{HSP}(\mathbb{R}^{O(n)}), \\ \text{\sf PIP}& \leq_Q S_{\text{\sf PIP}}\text{-units} \leq_Q \text{HSP}(\mathbb{R}^{O(n)}) \, . \end{align}\] Specifically, we first reduce them to \(S\)-unit problems with proper choices of \(S\), which are almost entirely classical except that we apply a quantum algorithm for factoring ideals in the case of 1. We describe these reductions to \(S\) problems in Sect. [sect_red2su]. Next we show a quantum reduction from \(S\) problem for any \(S\) to \(\text{HSP}(\mathbb{R}^m)\), with \(m = O(|S|,n)\). This is the main technical contribution of this work and it generalizes the reduction from (ordinary) unit-group problem to by Eisenträger et al. [18]. The details will appear in Section 8, and we give an overview below.
Given \(S=\{\mathfrak{p}_1,\ldots,\mathfrak{p}_k\}\), we want to establish a function that hides the \(S\)-unit group according to Definition 2. To warm up, we review the reduction for the ordinary unit group (i.e., \(S=\emptyset\)) [18].
Observe that the unit group can be identified as a subgroup of \(G := \mathbb{R}^{n_1+n_2}\times\mathbb{Z}_2^{n_1}\times(\mathbb{R}/\mathbb{Z})^{n_2}\), and the mapping \[\begin{align} \varphi:\quad& (u_1,\ldots,u_{n_1+n_2},\mu_1,\ldots,\mu_{n_1},\theta_1,\ldots,\theta_{n_2})\\ \mapsto &(\ldots,(-1)^{\mu_i}e^{u_i},\ldots,\ldots,e^{2\pi i \theta_i}e^{u_i},\ldots) \, . \end{align}\] translates between the so-called log coordinates and the conjugate vector representation. To see this, note that under canonical embeddings, any \(z\in\mathcal{O}\) has the conjugate vector representation \((\ldots,\sigma_i(z),\ldots)\in \mathbb{R}^{n_1}\times\mathbb{C}^{n_2}\). If in addition \(z\) is invertible, then \(\sigma_i(z) \neq 0\). Therefore, we can write \(\sigma_i (z) = (-1)^{\mu_i}e^{u_i}\) with \(\mu_i \in \mathbb{Z}_2\) and \(u_i \in \mathbb{R}\) if \(\sigma_i\) is real, or \(\sigma_i(z) = e^{2\pi i \theta_i}e^{u_i}\) with \(\theta_i \in \mathbb{R}/\mathbb{Z}\) and \(u_i \in \mathbb{R}\) if \(\sigma_i\) is complex.
Now one defines \(f\) in [18] as composition of two mappings: \[f: G \stackrel{g}{\longrightarrow} \{E\text{-ideals}\} \stackrel{f_q}{\longrightarrow} \{\text{quantum states}\} \, .\] Given \(x\in G\), \(g(x) : = \varphi(x)\underline{\mathcal{O}}\subseteq E\) produces an \(E\)-ideal which is a transformed lattice of \(\underline{\mathcal{O}}\). This is motivated by the fact that \(\alpha \mathcal{O}=\mathcal{O}\) for any unit \(\alpha\in \mathcal{O}^*\). Actually, one can verify easily that \(g(x) = g(y)\) iff. \(\varphi(x-y) \in \mathcal{O}^*\). Namely \(g\) is periodic on \(\mathcal{O}^*\). For lacking of a canonical basis to represent real-valued lattices uniquely, which is needed to apply the quantum HSP algorithm, a quantum mapping \(f_q\) follows. It encodes a lattice \(L\) into a quantum state \(|L\rangle\) that is roughly composed of quantum superposition over all lattice points, and hence provides a canonical representation for lattices. We will give more details of the quantum encoding in Sect. 4.
Very informally, one can show that small shift on an input to \(g\) causes small variance on the output lattice, but two inputs that are far apart modulo any unit will be mapped to lattices that have small overlap. Moreover, \(f_q\) preserves the “closeness” of lattices. Namely, quantum encodings of two lattices will have substantial inner product if and only if the lattices are very well lined up. To formalize these statements and thus proving the HSP properties, nonetheless, turn out to be highly non-trivial. It involves for example defining proper distance measures on various input and output spaces, and analyzing the continuity properties of \(f\) with respect to these metrics. This has been a great amount of efforts in [18] with further details in [22]
Other than these analytic properties, to make an efficient reduction, one needs to implement \(f=f_q\circ g\) efficiently. In fact, \(f_q\) can be implemented efficiently on a quantum computer by standard techniques. Computing \(g\), on the other hand, is much more tricky. For instance \(e^{u_i}\) will involve doubly-exponential numbers if we manipulate them naively. Instead one splits the computation into small pieces, in the spirit of repeated squaring, and carefully controls the precision. There is one key observation that guarantees that the size of any intermediate step does not blow up. That is \(\mathcal{N}(z) = \pm 1\) for any unit \(z\) and hence \(\prod_{i=1}^{n_1} e^{u_i} \prod_{j=1}^{n_2} e^{2 u_{n_1+j}} = 1\). This indicates one redundant coordinate, and we can hence restrict \(f\) on \(\mathbb{R}^{n_1+n_2-1}\times\mathbb{Z}_2^{n_1}\times(\mathbb{R}/\mathbb{Z})^{n_2}\) instead. This characterization is also essential to show a suitable bound on the volume of the unit cell of \(\mathcal{O}^*\).
It is now easier to describe our generalized reduction for \(S\). Let \(S=\{\mathfrak{p}_1,\ldots,\mathfrak{p}_k\}\). By definition, if \(\alpha\in \mathcal{O}\) is an \(S\)-unit, we have \[\alpha\cdot \mathcal{O}\cdot \mathfrak{p}_1^{-v_{\mathfrak{p}_1}(\alpha)}\cdots \mathfrak{p}_{k}^{-v_{\mathfrak{p}_{k}}(\alpha)} = \mathcal{O},\] where \(v_\mathfrak{p}(\alpha)\) is the coefficient of \(\mathfrak{p}\) in the power of \((\alpha)\mathcal{O}\) (the valuation of \(\alpha\) at \(\mathfrak{p}\)). Therefore the group of \(S\)-units \(U_S\) corresponds to the subgroup of \(G = \mathbb{R}^ {n_1+n_2-1}\times \mathbb{Z}_2^{n_1}\times (\mathbb{R}/\mathbb{Z})^{n_2}{\times \mathbb{Z}^{s}}\) such that \(\phi(y,v)\cdot \underline{\mathcal{O}}\cdot \mathfrak{p}_1^{-v_1}\cdots \mathfrak{p}_{|S|}^{-v_{|S|}} = \underline{\mathcal{O}}\). This motivates us to define the function \(f_c: G \to \{E\text{-ideals}\}\) by: \[f_c:(y,v_1,\ldots,v_{|S|})\longmapsto \phi(y,v)\cdot \underline{\mathcal{O}}\cdot \mathfrak{p}_1^{-v_1}\cdots \mathfrak{p}_{|S|}^{-v_{|S|}}\, .\]
We can show that \({\hat{g}}\) is periodic on \(U_S\). We then apply the same quantum encoding \(f_q\) on the output of \({\hat{g}}\). Namely, our oracle function behaves like: \[f: G \stackrel{f_c}{\longrightarrow} \{E\text{-ideals}\} \stackrel{f_q}{\longrightarrow} \{\text{quantum states}\} \, .\]
While the classical mappings \(g\) and \(f_c\) bear some similar motivation and we reuse \(f_q\), to prove HSP properties of our function \(f\) is not straightforward. We need to define new metrics tailored to the specific group structure that the \(S\)-units belong and the \(E\)-ideals (lattices in \(\mathbb{R}^n\)) that our \(f_c\) may possibly generate. Then we show quantitatively that under these metrics, small variance in inputs induces slightly perturbed lattices, whereas large variance of inputs modulo any \(S\)-units will induce with high fraction of mismatch. Finally we relate the new metrics to the analysis of [18] and conclude the properties. We further extend the function \(f\) to obtain an instance on \(\mathbb{R}^m\) and work out the necessary bounds \((\lambda,d)\) as required, which allows us to invoke the quantum HSP algorithm to recover \(U_S\).
Our algorithm relies on a classical oracle that takes an element in \(G\) and maps it to \[f_c(y,v_1,\cdots.v_{|S|}) = \phi(y,v)\cdot \underline{\mathcal{O}}\mathfrak{p}_1^{-v_1}\cdots\mathfrak{p}_{|S|}^{-v_{|S|}} \, .\] Then the corresponding lattice is encoded by an approximation of the superposition of all its points denoted by \(f_q\). As \(G = \mathbb{R}^ {n_1+n_2-1}\times \mathbb{Z}_2^{n_1}\times (\mathbb{R}/\mathbb{Z})^{n_2}\times \mathbb{Z}^{|S|}\), we need to work with approximations of real numbers. To perform the necessary arithmetic operations between \(E\)-ideals presented in Section 4.2, we use the results of Buchmann and Pohst [23] and of Buchmann and Kessler [24] which rely on fixed point approximations. More specifically, they use the rounding of the \(2\)-adic expansion of real numbers. The approximation of \(a\in\mathbb{R}\) of precision \(q\in\mathbb{Z}_{>0}\) is \(\widehat{a}\in\mathbb{Z}\) such that \(\left| \frac{\widehat{a}}{2^q} - a\right| \leq \frac{1}{2^{q+1}}\). However, it seems that this notion of approximation is not stable when we multiply two approximate numbers together. We made a slight adjustment to their claims to incorporate the case of approximations such that \(\left| \frac{\widehat{a}}{2^{q_0}} - a\right| \leq \frac{1}{2^{q}}\) for some \(q_0\geq q\). Then in Section 4.3 we show that the classical oracle runs in polynomial time with respect to the size of the input.
Let \((y,v_1,\cdots,v_|S|)\in \mathbb{R}^ {n_1+n_2-1}\times \mathbb{Z}_2^{n_1}\times (\mathbb{R}/\mathbb{Z})^{n_2}\times \mathbb{Z}^{|S|}\). The naive computation of \[f_c(y,v_1,\cdots,v_{|S|}) = \phi(y,v)\cdot \underline{\mathcal{O}}\cdot \mathfrak{p}_1^{-v_1}\cdots \mathfrak{p}_{|S|}^{-v_{|S|}}\] involves computing \((e^{u_i})_{i\leq n_1+n_2}\), where \(y = (u_1,\cdots,n_{n_1+n_2},\theta)\) and \(u_1\) is computed by the rule given by 1 with a phase \(\theta\in\mathbb{Z}_2^{n_1}\times (\mathbb{R}/\mathbb{Z})^{n_2}\). Any rational approximation of \(e^{u_i}\) has at least \(\left\lceil \log_2(e^{u_i}) \right\rceil \in O(u_i)\) bits where \(\log_2\) denotes the base \(2\) logarithm. As this is exponential in the bit size of the entry, we need to proceed differently to evaluate \(f_c\). The authors of [18] described a way to split up the computation ensuring that we only manipulate values of polynomial size. We adapt this method to our specific classical oracle that differs by a term of the form \(\prod_{\mathfrak{p}_i\in S}\mathfrak{p}_i^{-v_i}\) from the one described in [18].
Our input can be split between \((u_1,\cdots,u_{n_1+n_2},v_1,\cdots,v_{|S|})\in \mathbb{R}^{n_1+n_2}\times \mathbb{Z}^{|S|}\) and a phase \(\theta\in \mathbb{Z}_2^{n_1}\times (\mathbb{R}/\mathbb{Z})^{n_2}\). As mentioned in [18], the phase can be dealt with separately and is not computationally problematic. To make our presentation simpler, we show how to split up the computation in the absence of phase. To avoid the expensive computations with the \(e^ {u_i}\), we use \(E\)-ideal arithmetic which we analyze in Section 4.2. Our main concern when splitting up the computation is that we want to reduce it to operations between \(E\)-ideals of determinant \(\sqrt{|\Delta|}\). This gives us upper and lower bounds on the vectors in play, which in turns bounds the computational complexity of arithmetic operations as we see in Section 4.3.
Let \((u_1,\cdots,u_{n_1+n_2},v_1,\cdots,v_{|S|})\in\mathbb{R}^{n_1+n_2}\times\mathbb{Z}^{|S|}\) be an input vector where \(u_1\) satisfies the condition given by 1 . We can separate the evaluation of the oracle in two steps by rewriting it as \[\left(u_1,\cdots,u_{n_1+n_2-1},u'_{n_1+n_2},0,\cdots,0\right) + \left(0,\cdots,0,\frac{1}{2}\sum_je_jv_j\log(p_j),v_1,\cdots,v_{|S|}\right).\] where \(u'_{n_1+n_2} = -\frac{1}{2}\sum_{j\leq n_1}u_j - \sum_{n_1<j< n_1+n_2}u_j\). The first term is evaluated the same way as [18]. More specifically, we separate real numbers between integer and fractional part. We define \((r_j)_{j\leq n_1+n_2}\in\mathbb{Z}^{n_1+n_2}\) and \((s_j)_{j\leq n_1+n_2}\in [0,1)^{n_1+n_2}\) by \(u_j = r_j + s_j\) for \(j<n_1+n_2\), \(r_{n_1+n_2} := -\sum_{j<r_1+r_2}r_j\) and \(s_{n_1+n_2} := u'_{n_1+n_2}-r_{n_1+n_2}\). As \(s_i < 1\), we calculate \(e^{s_i}\) to a given precision \(q\) by using the formula \(e^x = \sum_{k\leq M}\frac{x^ k}{k!} + O(x^{M+1})\). The number of terms in the sum has to satisfy \(M\in O(q)\). This way, we can compute \(\phi(s_1,\cdots,s_{n_1+n_2})=(e^ {s_1},\cdots.e^{s_{n_1+n_2}})\) and the corresponding \(E\)-ideal \(A_{-1}:=(e^ {s_1},\cdots.e^{s_{n_1+n_2}})\cdot\underline{\mathcal{O}}\) by multiplication with each generator of \(\underline{\mathcal{O}}\). Let \((a_k^{(j)})\in\{-1,0,1\}\) be such that \(r_j = \sum_{k\leq \lceil\log_2(r_j)\rceil}a_j^{(k)}2^k\) is the binary decomposition of \(r_j\) for \(j < n_1+n_2\) and \(a_k^{(n_1+n_2)} := -\sum_{j<n_1+n_2}a_k^{(j)}\) and \(\log_2(r) := \max_j\lceil\log_2(r_j)\rceil\). Note that we have \(u'_{n_1+n_2} = \sum_k a_k^{(n_1+n_2)}2^k\), but the \(a_k^{(n_1+n_2)}\) are not its binary decomposition. They take values in \([-n_1-n_2,n_1+n_2]\). The \(E\)-ideal generated by the integer part of the \(u_i\) satisfies \[\begin{align} (e^{r_1},\cdots,e^{r_{n_1+n_2}})\cdot\underline{\mathcal{O}} & = \prod_{k\leq \log_2(r)}\left( e^{a_1^{(k)}2^k},\cdots,e^{a_{n_1+n_2}^{(k)}2^k}\right)\cdot\underline{\mathcal{O}}\nonumber\\ &= \prod_{k\leq \log_2(r)}\left[ \underbrace{\left( e^{a_1^{(k)}},\cdots,e^{a_{n_1+n_2}^{(k)}}\right)\cdot\underline{\mathcal{O}}}_{A_k}\right] ^{2^k}.\label{eq:def95Aj} \end{align}\tag{2}\] The norm of the \(E\)-ideals \(A_k\) for \(k\leq \log_2(r)\) is \(\mathcal{N}(A_k) = e^{\sum_j a_j^{(k)}}\mathcal{N}(\underline{\mathcal{O}})=1\). Therefore \(\det(A_k) = \sqrt{|\Delta|}\).
Likewise, the bit size of \(e^{e_1v_i\log(p_i)}\) is at least proportional to \(v_i\), and therefore exponential in the bit size of \(v_i\) which is part of the input. Therefore, we need to split up the computation of the \(E\)-ideal \[\left(0,\cdots,0,\frac{1}{2}\sum_je_jv_j\log(p_j),v_1,\cdots,v_{|S|}\right) \longmapsto \left( 1 , \cdots , 1 , e^{\frac{1}{2}\sum_je_jv_j\log(p_j)}\right)\cdot\underline{\mathcal{O}} \cdot\prod_{j}\mathfrak{p}_j^{-v_j}.\] Let \((b_j^{(k)})\) such that \(v_j = \sum_{k\leq \lceil\log_2(v_j)}b_j^{(k)}2^k\) and \(\log_2(v):= \max_j\log_2(v_j)\). Then we have the decomposition \[\begin{align} \left( 1 , \cdots , 1 , e^{\frac{1}{2}\sum_je_jv_j\log(p_j)}\right)\cdot\underline{\mathcal{O}} \cdot\prod_{j}\mathfrak{p}_j^{-v_j} & = \prod_{j\leq |S|} \left( \left( 1 , \cdots , 1 , e^{e_j\log(p_j)}\right)\cdot\underline{\mathcal{O}}\cdot\mathfrak{p}_j^{-1}\right)^{v_j}\nonumber\\ &= \prod_{j\leq |S|} \prod_{k\leq \log_2(v)} \left( \left( 1,\cdots,1,e^{e_j\log(p_j)}\right)\cdot\underline{\mathcal{O}}\cdot \mathfrak{p}_j^{-1}\right)^{b_j^{(k)}2^k}.\nonumber\\ &= \prod_{k\leq \log_2(v)}\left( \prod_{j\leq|S|}\left( \underbrace{(1,\cdots,1,p^{e_j})\cdot\underline{\mathcal{O}}\cdot \mathfrak{p}_j^{-1}}_{B_{j,k}}\right)^{b_j^{(k)}}\right)^{2^k}\label{eq:def95Bj} \end{align}\tag{3}\] The calculation is decomposed the following way: first compute \(B_k := \prod_{j\leq |S|}B^{b_j^{(k)}}_{j,k}\) which involves \(\log_2(v)\cdot|S|\) multiplications between the \(E\)-ideals \(B_{j,k}\) which have determinant \(\sqrt{|\Delta|}\), and then return \(\prod_{k\leq \log_2(v)}B_{k}^{2^k}\) which requires at most \(\log_2(v)^2\) multiplications between the \(E\)-ideals \(B_k\) which also have determinant \(\sqrt{|\Delta|}\).
Proposition 1. Algorithm 1 is correct and involves a polynomial number of multiplications between \(E\)-ideals of determinant \(\sqrt{|\Delta|}\).
The arithmetic between \(E\)-ideals is directly inspired from the arithmetic between ideals in a number field. To evaluate our classical oracle, we need an efficient implementation of the \(E\)-ideal multiplication. Let \(A = \oplus_{j\leq n} \mathbb{Z}a_j\) and \(B = \oplus_{k\leq n} \mathbb{Z}b_k\) be \(E\)-ideals generated by the \(a_j,b_k\in E\). Then the \(E\)-ideal \(A\cdot B\) is the lattice generated by the \(n^2\) elements \((a_j\cdot b_k)_{j,k\leq n}\). The multiplication of two \(E\)-ideals can be described by the two following steps:
Calculate all the cross terms \(a_j\cdot b_k\) for \(j,k\leq n\).
Compute a basis \((c_j)_{j\leq n}\) of \(\sum_{j,k}\mathbb{Z}a_j\cdot b_k\).
The main challenge of \(E\)-ideal multiplication is that we need to deal with rational approximations of lattices. We need to estimate how much precision is needed to ensure accuracy, and how much precision is lost after each operation. We use the same strategy as in [18], but we need to make a slight adaptation since rational approximations such that \(\left| \frac{\widehat{a}}{2^{q}}-a\right|\leq \frac{1}{2^{q+1}}\) do not seem to be stable by multiplication.
Let \(\Lambda\) be an \(E\)-ideal for which we want to find a basis of short vectors in polynomial time. As the Euclidean norm is preserved by the mapping of \(\Lambda\) in \(\mathbb{R}^m\), this problem boils down to computing a short basis of an ideal in \(R^m\). Since the original description of the LLL reduction algorithm [25], the problem of finding a short basis (up to an approximation factor) of a lattice in polynomial time is well understood. The difficulty in this context is that we are dealing with rational approximations of real numbers. Let \(a = (a_1,\cdots,a_m)\in\mathbb{R}^m\) and \(q_0\geq q\geq 0\), we say that \(\widehat{a} = (\widehat{a}_1,\cdots,\widehat{a}_m)\in\mathbb{Z}^m\) is an approximation of \(a\) with precision \(q\) if \(\forall j\leq m, \;\left| \frac{\widehat{a}_j}{2^{q_0}} - a_j\right| \leq \frac{1}{2^q}\) (we showed in the previous paragraph that we can assume \(q_0=q\)). Given an approximate generating set for the lattice \(\Lambda\in\mathbb{R}^m\), we want to compute a basis of short vectors that approximates a basis of short vectors for \(\Lambda\in\mathbb{R}^m\). We rely on a result from Buchmann and Kessler [24] and its modification by Eisenträger, Halgren, Kitaev and Song [18]. These methods were applied to to the case of approximations where \(q_0 = q-1\). However, as we mentioned above, this type or rational approximation is not stable by multiplication, thus forcing us to relax a little bit the definition of rational approximation and extend the results of [24] and [18] to the case \(q_0\geq q\).
Let \(\widehat{a}_1,\cdots,\widehat{a}_k\in\mathbb{Z}^m\) be rational approximations of \(a_1,\cdots,a_k\in\mathbb{R}^m\) of precision \(q\) (and denominator \(q_0\geq q\)). Let \(r\leq k\) be the rank of the lattice generated by \((a_j)_{j\leq k}\). The approach described in [24] consists of applying the LLL reduction algorithm to the rank \(k\) lattice generated by the independent vectors \(\tilde{a}_j := (e_j,\widehat{a}_j), j\leq k\) where \(e_j\) is the \(j\)-th unit vector of \(\mathbb{Z}^k\). The LLL algorithm outputs vectors \(\tilde{b}_j = (m_j,\widehat{b}_j), j\leq k\) such that if the input precision \(q\) is large enough, \(m_1,\cdots,m_{k-r}\) are independent relations for \(a_1,\cdots,a_k\) (i.e. \(\sum_l m_l^{(j)}a_j^{(l)} = 0\)) and the vectors \(b_j = \sum_j m_{k-r+j}^{(l)} a_l, j\leq r\) are a basis for the lattice \(\sum_j \mathbb{Z}a_j\).
The following proposition states our modification of the result of [24] incorporating the cases where \(q_0\geq q\).
Proposition 2 (Theorem 4.1 of [24]). Let \(L = \sum_j \mathbb{Z}a_j\), \(\mu\leq \lambda_1(L) = \min\{ \| v\| \;\mid v\in L\setminus\{0\}\}\), \(\alpha \geq \max\{ \| a_j\| \;\mid 1\leq j \leq k\}\) and \(q\in\mathbb{Z}_{>0}\) such that \[2^q > \frac{(\sqrt{mk} + 1)\lambda 2^{\frac{k-1}{2}}}{\mu},\] where \(\lambda := \left( k\sqrt{m}2^{q_0-q} + \sqrt{k}\right) \frac{\alpha^r}{\det(L)}\). Then the vectors \(m_1,\cdots,m_{k-r}\) are linearly independent relations for the \((a_j)_{j\leq k}\) and the vectors \(b_j = \sum_{l\leq k} m_{k-r+j}^{(l)} a_l\) form a basis of \(L\) satisfying \[\| b_j\| \leq \left( \sqrt{km}+1\right) 2^{\frac{k-1}{2}+1+q_0-q}\lambda_{j}(L_r),\] where \(\lambda_j(L)\) is the \(j\)-th minima of the lattice \(L\) and \(L_r\) is the lattice generated by \(a_1,\cdots,a_r\) which, without loss of generality, are assumed to be linearly independent. Moreover, the Euclidean norm of the \(m_j\) satisfies \[\begin{align} \| m_j\| &\leq 2^{\frac{k-1}{2}}\lambda \;\text{for } 1\leq j\leq k-r\\ \| m_j\| &\leq 2^{\frac{k-1}{2} + q_0 + 1}\lambda_{j-k+r}(L_r)\;\text{for } k-r + 1\leq j\leq k. \end{align}\]
Prop. 2 gives upper bounds for \(\|m_j\|\) in terms of the \(\lambda_j(L_r)\). To obtain upper bounds in terms of the minimas of \(L\), we need to repeat the operation in the rank \(r\) lattice generated by \(b_1,\cdots,b_r\). Before doing so, Proposition 2 needs to be refined. Indeed, the bit size of the coefficients of the \(m_j\) is greater than \(q\). As these are multiplied to the coefficients of the \(\widehat{a}_j\), which are \(q\)-bit approximations or the \(a_j\), the precision deteriorates too much. In [18], a bound on the \(\|m_j\|\) that does not depend on \(q\) is proved. As we use a slight different definition of rational approximation, we had to adapt this result, which we omit since it is rather straightforward.
To estimate the asymptotic complexity of the classical oracle, we need to combine the results of Section 4.1 and Section 4.2. Let \((y,v_1,\cdots,v_|S|)\in \mathbb{R}^ {n_1+n_2-1}\times \mathbb{Z}_2^{n_1}\times (\mathbb{R}/\mathbb{Z})^{n_2}\times \mathbb{Z}^{|S|}\). We want to compute a poly-size basis of \[f_c(y,v_1,\cdots,v_{|S|}) = \phi(y)\cdot \underline{\mathcal{O}}\cdot \mathfrak{p}_1^{-v_1}\cdots \mathfrak{p}_{|S|}^{-v_{|S|}},\] in polynomial time in \(\max_j\{\log(|y_j|)\}\), \(|S|\), \(\max_j\{\log(p_j)\}\), \(\max_j\{\log|v_j|\}\), \(m\), and \(\log|\Delta|\). According to [24], if the precision \(q\) is chosen minimal before each \(E\)-ideal multiplication, then the complexity of each multiplication is in \(O\left( (m + k)^6 \cdot k^3 \log(\alpha/\mu)^3\right)\) where \(k = n^2\). As we mentioned earlier, we can ensure at each round that \(q_0=q\) to minimize the loss of precision. However, we need to take \(q\) large enough in input of the classical oracle to account for the loss of precision induced by all the arithmetic operations between \(E\)-ideals involved.
Theorem 2. The complexity of the classical oracle is in \[\tilde{O}\left( \|(y,v)\|^2 n^{5+\varepsilon} \left( \left(n\log\left( |\Delta|\right) + n^2 + \|(y,v)\|^2\right)^{1+\varepsilon}\right) + |S|\max_j(\log(p_j)^3)\right),\] where \(\varepsilon > 0\) is arbitrarily small.
Proof. Here again, we omit the proof which is tedious but rather straghtforward. ◻
Let \(g_s(\cdot)\) be the Gaussian function \(g_s(x): = e^{-\pi \|x\|^2/s^2}, x\in \mathbb{R}^n\). For any set \(S\subset \mathbb{R}^n\), denote \(g_s(S): =\sum_{x\in S} g_s(x)\). Given a lattice \(L\), the quantum encoding maps \(L\) to the lattice Gaussian state via \[\begin{CD} \{\text{Lattices over }E\} \quad @>{f_q}>> \mathcal{S}\text{ (unit vectors in a Hilbert space)}\\ L @>>> |L\rangle: = \gamma \sum_{v\in L} g_s(v) |\mathrm{str}_{\nu,n}(v)\rangle \end{CD},\] where \(\gamma\) is a factor that normalized the state. Here \(|\mathrm{str}_{\nu,n}(v)\rangle\) is the straddle encoding of a real-valued vector \(v\in\mathbb{R}^n\), as defined in [18]. Intuitively, we discretize the space \(\mathbb{R}^n\) by a grid \(\nu\mathbb{Z}^n\), and we encode the information about \(v\) by a superposition over all grid nodes surrounding \(v\). Specifically, for the one-dimensional case, the straddle encoding of a real number is \[x \in \mathbb{R}\mapsto |\mathrm{str}_\nu(x)\rangle: = \cos(\frac{\pi}{2} t) |k\rangle + \sin(\frac{\pi}{2} t) |k+1\rangle \, ,\] where \(k: = \lfloor x/\nu \rfloor\) denotes the nearest grid point no bigger than \(x\) and \(t: = x/\nu - k\) denotes the (scaled) offset. Repeat this for each coordinate of \(v=(v_1,\ldots, v_n)\) we get \(|\mathrm{str}_{\nu,n}(v)\rangle := \bigotimes_{i=1}^n |\mathrm{str}_\nu(v_i)\rangle\). We recall some properties about straddle encoding from [18]. This will be useful to prove the HSP properties of our function.
Fact 1. Let \(v,w\in \mathbb{R}^n\). The following hold
\(\| |\mathrm{str}_{\nu,n}(v)\rangle - |\mathrm{str}_{\nu,n}(w)\rangle \| \leq \frac{\pi}{2\nu} \sqrt n \cdot \| v - w\|\).
If \(\| v - w \| \geq 2\sqrt n \nu\), then \(\langle \mathrm{str}_{\nu,n}(v) | \mathrm{str}_{\nu,n}(w)\rangle = 0\).
In our lattice Gaussian states, we will always make sure \(\lambda_1(L)> 2\sqrt n \nu\) so that \[\langle \mathrm{str}_{\nu,n}(v) | \mathrm{str}_{\nu,n}(u)\rangle = 0 \text{ whenever } v \neq u \, .\] In this case we can compute the normalization factor \(\gamma = \left( g_{\frac{s}{\sqrt 2}}(L)\right)^{-1/2}\). As shown in [18], one can efficiently compute \(f_q\) if the lattice satisfies certain conditions and a good basis is given (e.g., \(L\) is LLL-reduced). Namely there is an efficient quantum circuit creating lattice Gaussian states. We state this result as a fact below and will invoke it as a black-box.
Fact 2. Let \(L\) be an LLL-reduced basis. Assume that \(\lambda_1(L)\geq \lambda\), \(\det(L) \leq d\) and \(s\geq n^{n/2+1} 2^n \lambda^{-n+1} d\). There is a quantum algorithm that takes \(L\) as input and produces a state that is \(2^{-\Omega(n)}\)-close to \(|L\rangle= \gamma \sum_{v\in L} g_s(v) |\mathrm{str}_{\nu,n}(v)\rangle\) within time \(\mathrm{poly}(n, \log s, \log \frac{1}{\nu})\).
Theorem 3. Let \(f\) be the function \(G = \mathbb{R}^{n_1 + n_2 - 1}\times \mathbb{Z}_2^{n_1} \times \left(\mathbb{R}/\mathbb{Z}\right)^{n_2}\times \mathbb{Z}^s \rightarrow \mathcal{H}\) defined by \((\boldsymbol{y},\boldsymbol{v})\mapsto |\phi(\boldsymbol{y}) \underline{\mathcal{O}}\prod_{\mathfrak{p}\in S}\mathfrak{p}^{-v_i}\rangle\). There is \(r,\varepsilon>0\) such that \[d_{G/L}(x,y) := \min_{v\in U^G_S}\|x-y-v\|\ge r \Rightarrow \bigl|\langle{f(x)}|{f(y)}\rangle\bigr|\le\varepsilon\]
Our proof relies on some statements on lattices available in [22]. As in [22], we first introduce a central notion called the approximate intersecting sublattice of two lattices \(L\) and \(L'\) in \(\mathbb{R}^m\).
Definition 5 (\(\delta\)-approximate intersecting sublattice). Let \(L\) and \(L'\) be two lattices of dimension \(n\) in \(\mathbb{R}^m\). Let \(Y:= \{ (x,x'): x\in L_R, x' \in L'_R, \|x - x'\|\leq \delta\}\) and \(X: = Y|_1\) (\(X': = Y|_2\)) be the corresponding set of points \(x\) (resp. \(x'\)). Define \(\Lambda:=\langle X\rangle\) (\(\Lambda':=\langle X' \rangle\) resp.) be the sublattice generated by points in \(X\) (\(X'\) resp.). We call \(\Lambda\) (\(\Lambda'\)) the \(\delta\)-approximate intersecting sublattice of \(L\) (resp. \(L'\)) between \(L\) and \(L'\).
Here \(L_R = L \cap \mathbf{B}_R\) are the lattice points inside a sphere of radius \(R=\sqrt n s\), where \(s\) is the Gaussian width in the lattice Gaussian state. This definition indeed captures the overlap (up to \(\delta\)-approximation) between two lattices. Intuitively, \(\Lambda\) and \(\Lambda'\) can be paired up that are “close”, and all the other pairs of points will be “far” apart. This overlap is the main contribution to the inner product between the quantum encoding of two lattices, and we show that if it generates a proper sublattice, we can bound the scalar product. This is formalized below as shown in [18]. We sketch a proof for completeness.
Fact 3 (Lemma E.6 of [22]). Let \(L,L',\Lambda\) and \(\Lambda'\) be as in Definition 5. Suppose that: \(\lambda_1\geq \lambda\), \(\lambda_1' \geq \lambda\). Then there is a one-one correspondence \(h:\Lambda\to \Lambda'\) such that
\(\forall x \in \Lambda, \| x - h(x)\| \leq \beta \|x\|\) with \(\beta:= n(\sqrt n R/\lambda)^n \cdot \frac{\delta}{R}\);
For any \(x \in L_R\) and any \(x' \in L'_R\), if \(x' \neq h(x)\) (in particular if \(x\notin \Lambda\) or \(x'\notin \Lambda'\), \(\|x - x'\| > \delta\).
Proof. (Sketch) Pick \(x_i \in X: i = 1,\ldots, n\) that are linearly independent and let \(x_i'\) be the corresponding points in \(X'\). Let \(h: x_i \mapsto x_i'\) and this extends to a linear map from \(\bar\Lambda\) to \(\bar\Lambda'\). The second property holds immediately by definition. To show the first one, let \(x\in \Lambda\) and write it as \(x= \sum_i \alpha_i x_i, \alpha_i \in \mathbb{R}\). Using Cramer’s rule, Hadamard inequality and Minkowski’s second theorem, one can get \(|\alpha_i| \leq (\sqrt n R /\lambda)^n \frac{\|x\|}{R}\). Therefore \(\|x - h(x)\| = \|\sum_i \alpha_i (x_i - h(x_i))\| \leq \sum_i |\alpha_i| \delta \leq \beta \|x\|\) with \(\beta=n(\sqrt n R/\lambda)^n\cdot \frac{\delta}{R}\). ◻
If we pick the straddle encoding fine enough such that \(2\sqrt n \nu < \delta\), it follows that the inner product between their quantum encodings will be solely contributed by \(\Lambda\) and \(\Lambda'\). In particular:
Fact 4 (Lemma E.7 of [22]). Let \(\mathcal{L}\) and \(\mathcal{L}'\) be two \(E\)-ideals with \(\max\{\det(\mathcal{L}), \det(\mathcal{L}')\} \leq d\) and \(\min{\lambda_1(\mathcal{L}), \lambda_1(\mathcal{L}')}\ge \lambda\). Let \(\Lambda\) and \(\Lambda'\) be the \(\delta\)-intersecting sublattices of \(\mathcal{L}\) and \(\mathcal{L}'\) respectively, as defined in Definition 5. If \(\Lambda\subsetneq \mathcal{L}\) (which implies \(\Lambda' \subsetneq \mathcal{L}'\)), then \(\langle \mathcal{L} | \mathcal{L}'\rangle \leq 3/4\) whenever \(s\geq 4\pi n^{n/2+3} d /\lambda^{n-1}\).
The two previous claims give us a sufficient condition for \(\langle \mathcal{L} | \mathcal{L}'\rangle\leq 3/4\). To prove the \((r,\varepsilon)\)-condition, we need to relate the properties of \(\Lambda\) to our notion of distance between the preimages in \(G\). We first prove a sufficient condition on \(\operatorname{dist}(\mathcal{L},\mathcal{L}')\) in Lemma [lemma:psub], which ensures that the approximate intersecting sublattices \(\Lambda\) and \(\Lambda'\) be proper.
Lemma 2. If \(\operatorname{dist}(\mathcal{L},\mathcal{L}') \geq r= \frac{1}{2\sqrt n |\Delta|}\) and \(\beta < \frac{1}{20n^{n+2}|\Delta|}\), then the \(\delta\)-intersecting sublattices \(\Lambda\) and \(\Lambda'\) of \(\mathcal{L}\) and \(\mathcal{L}'\) respectively, as defined in Definition 5, become proper sublattices. Namely \(\Lambda\subsetneq \mathcal{L}\) and \(\Lambda' \subsetneq \mathcal{L}'\).
On the other hand, if \(\Lambda= \mathcal{L}\) and \(\Lambda'= \mathcal{L}'\), then there is \(W\) satisfying \(M_{\mathcal{L}'}=M_\mathcal{L}W\) for any bases \(M_{\mathcal{L}'},M_\mathcal{L}\) of \(\mathcal{L}',\mathcal{L}\) that is of the form \(W = e^{\operatorname{diag}(a)}\) for some \(a\) with \(\|a\|\leq \frac{1}{4\sqrt{n}|\Delta|}\).
Proof. Suppose for contradiction that \(\Lambda= \mathcal{L}\) and \(\Lambda' = \mathcal{L}'\). Let \(M_h\) be the matrix induced by \(h\) (wrt to some choice of basis for \(\mathcal{L}\) and \(\mathcal{L}'\)). First we claim that \(\|M_h - I\|_\infty \leq \beta^{(1)} : = n^{n+1} \beta\). To show this, we pick a short basis \((v_1, \ldots, v_n)\) for \(\mathcal{L}\) such that \(\|v_k\| \leq \sqrt k \lambda_k(\mathcal{L})\) for \(k\leq n\), which always exists. Then any \(w \in \mathbb{R}^n\) with \(\|w\| = 1\) can be written as \(w = \sum_i \alpha_i v_i, \alpha_i \in \mathbb{R}\). By Cramer’s rule we have \[|\alpha_i| = \left|\frac{\det(v_1, \ldots, v_{i-1}, w, v_{i+1}, \ldots, v_n)}{\det(v_1,\ldots, v_n)} \right| \leq \frac{(\sqrt n)^n \Pi_{j\neq i} \lambda_j(\mathcal{L})}{\sqrt i \det(\mathcal{L})} \leq n^{n}/{\sqrt i \lambda_i(\mathcal{L})}\, .\] The first inequality uses Hadamard’s inequality and the second inequality invokes Minkowski’s second theorem \(\Pi_j \lambda_j(\mathcal{L}) \leq n^{n/2}\det(\mathcal{L})\). Then \[\|w(M_h - I)\| = \left\|{\sum_i \alpha_i (h(v_i) - v_i)}\right\| \leq \sum_i |\alpha_i|\cdot \|h(v_i) - v_i\| \leq n\cdot \frac{n^n}{\sqrt i \lambda_i(\mathcal{L})} \cdot \beta \|v_i\|\leq n^{n+1}\beta \, .\] This implies that \(\|M_h - I\|_\infty \leq \beta^{(1)}\).
Next, by choosing \(W:= M_h\), we have \(\|W - I\|_2 \leq \beta^{(2)}:=\sqrt{n}\beta^{(1)} = n^{n+3/2}\beta\), and \(M_{\mathcal{L}'} = M_{\mathcal{L}}W\) where \(M_{\mathcal{L}'}\) (resp. \(M_{\mathcal{L}}\)) are matrices for the choice of basis of \(\mathcal{L}'\) (resp. \(\mathcal{L}\)) that corresponds to \(M_h\) (i.e. \(M_{\mathcal{L}'} = M_{\mathcal{L}}M_h\)).
Then, since \(\beta^{(2)} < \left( 20\sqrt{n}|\Delta|\right)^{-1}\), \(W\) is necessarily diagonal (see Claim 1), and hence \(M_{\mathcal{L}'} = M_{\mathcal{L}}e^{\operatorname{diag}(a_i)}\) with \(\|e^{\operatorname{diag}(a_i)} - I\|_2\leq \beta^{(2)}\). This implies that2 \[\|a\|_E =\|\operatorname{diag}(a_i)\|_2 \leq 5\beta^{(2)} < \frac{1}{4\sqrt n |\Delta|}\] when \(\beta < \frac{1}{2n^{n+2}|\Delta|}\), and hence since \(\|\operatorname{diag}(a_i)\|_2\geq \operatorname{dist}(\mathcal{L},\mathcal{L}')\), it contradicts the hypothesis that \(\operatorname{dist}(\mathcal{L},\mathcal{L}') \ge r\). ◻
The following claim is taken from an unpublished version of [22].
Claim 1 (Sections E.2 and E.3 of [22]). Let \(\mathcal{L}\) (resp. \(\mathcal{L}'\)) be \(E\)-ideals of norm 1 admitting a basis represented by the matrix \(M_\mathcal{L}\) (resp. \(M_{\mathcal{L}'}\)) satisfying \(M_{\mathcal{L}'} = M_\mathcal{L}W\) for some matrix \(W\). If \(\|W-I\| < \left( 2\sqrt{n}|\Delta|\right)^{-1}\), then \(W = \operatorname{diag}(z)\) for some \(z\in E\).
Proof. For completeness, we reproduce the proof of this statement as it is presented in [22]. The matrix \(W\) is of the form \(\operatorname{diag}(z)\) if and only if it commutes with all matrices of the form \(\operatorname{diag}(z)\). To check that, it suffices to show that \(M_\mathcal{L}(W\operatorname{diag}(\omega_j) - \operatorname{diag}(\omega_j) W) = 0\) where \((\omega_k)_{k\leq n}\) is an integral basis of \(\mathcal{O}\). Indeed both the \(\omega_k\) and the rows \(b_1,\cdots,b_n\) of \(M_\mathcal{L}\) are linearly independent. We can assume that \(\|\omega_j\|_E\leq \lambda_n(\mathcal{O})\) and \(\|b_j\|_E\leq \lambda_n(\mathcal{L})\). Moreover, we know that \[\lambda_1(\mathcal{L}')\geq \sqrt{n}\mathcal{N}(\mathcal{L}')^{1/n},\;\;\lambda_n(\mathcal{O})\leq \sqrt{n|\Delta|},\;\; \lambda_n(\mathcal{L})\leq \sqrt{n|\Delta|}\mathcal{N}(\mathcal{L})^{1/n}\]. Therefore, since \(\mathcal{N}(\mathcal{L}),\mathcal{N}(\mathcal{L}')=1\), we get \[\forall k ,\;\|b_k(W\operatorname{diag}(\omega_j) - \operatorname{diag}(\omega_j)W)\|\leq 2\|b_k\|_2\|W-I\|_2\|\omega_j\|_E< \sqrt{n}\mathcal{N}(\mathcal{L}')^{1/n}\leq \lambda_1(\mathcal{L}').\] Since \(M_{\mathcal{L}'} = M_\mathcal{L}W\), each \(b_k(W\operatorname{diag}(\omega_j) - \operatorname{diag}(\omega_j)W)\) is a vector of \(\mathcal{L}'\), therefore they have to be \(0\). ◻
Proof of Theorem 3. We need to show that there are \(r,\varepsilon>0\) such that \[d_{G/L}(x,y)\geq r \Rightarrow |\langle f(x)|f(y)\rangle|<\varepsilon,\] where \(d_{G/L}\) is the regular Euclidean distance in \(G/L\), i.e. \(d_{G/L}(x,y) = \min_{u\in L}\| x-y-u\|\). Let \(\mathcal{L}=f_c(x)\) be the lattice corresponding to \(x\) and \(\mathcal{L}'=f_c(y)\) be the one corresponding to \(y\). With the notations of Definition 5, whenever \(\Lambda\subsetneq\mathcal{L}\) (and \(\Lambda'\subsetneq\mathcal{L}'\)), we necessarily have \(|\langle f(x)|f(y)\rangle|\leq 3/4\). Hence, by contraposition, we assume that \(|\langle f(x)|f(y)\rangle| > 3/4\) (which implies \(\Lambda = \mathcal{L}\) and \(\Lambda' = \mathcal{L}'\)), and we prove that this implies that \(d_{G/L}(x,y)\) must be less than a certain bound \(r\).
First, Lemma 2 implies that there is \(\operatorname{diag}(a_i)_{i\leq n}\) (in the sense of the diagonal matrices discussed in Lemma 1) such that \(M_{\mathcal{L}'} = M_{\mathcal{L}}W\) for \(W = e^{\operatorname{diag}(a_i)}\) where \(\|a\|\leq \frac{1}{4\sqrt{n}|\Delta|}\). This means that the matrix distance \(\operatorname{dist}(\mathcal{L},\mathcal{L}')\) is necessarily less than \(\frac{1}{4\sqrt{n}|\Delta|}\).
Next, we want to prove that if \(\operatorname{dist}(\mathcal{L},\mathcal{L}') = \|A\|\) for some \(A\in\operatorname{GL}_n(\mathbb{R})\) with \(M_{\mathcal{L'}} = M_{\mathcal{L}}e^A\), then \(A\) is necessarily of the form \(\operatorname{diag}(a'_i)\). We know that \(\|A\|\leq \|\operatorname{diag}(a_i)\|\leq \frac{1}{4\sqrt{n}|\Delta|}\). Moreover, for all \(A\) close to the zero matrix, the expansion of the matrix exponential tells us that \[\|e^{A} - I\| = \|\sum_{k\geq 1}\frac{A^k}{k!}\| \leq \|A\|\sum_{k\geq 0}\frac{\|A\|^k}{k!}\leq (e-1)\|A\| < 2\|A\|.\] Hence \(\|e^A-I\|< \frac{1}{2\sqrt{n}|\Delta|}\) and we can apply Claim 1 to argue that \(e^A\) is diagonal. Therefore, Since all \(e^A\) with \(M_{\mathcal{L'}} = M_{\mathcal{L}}e^A\) and \(\operatorname{dist}(\mathcal{L},\mathcal{L}') = \|A\|\) must be diagonal, we have that the matrix distance satisfies \(\operatorname{dist}(\mathcal{L},\mathcal{L}') = \|a\|\) for some \(a\) with \(M_{\mathcal{L}'} = M_{\mathcal{L}}e^{\operatorname{diag}(a)}\) (where \(\operatorname{diag}\) of matrices in \(\mathbb{R}^{n\times n}\) is still understood as in Lemma 1).
In terms of \(E\)-lattices, this means that \(\mathcal{L}' = \phi(a^G)\cdot \mathcal{L}\) where \(a^G\) is an element of \(G\) corresponding to \(a\). To construct such an element, we first notice that \(\det(e^{\operatorname{diag}(a)})=1\), which means that the element \(x^a\in\mathbb{R}^{n_1}\times\mathbb{C}^{n_2}\), corresponding to \(e^{\operatorname{diag}(a)}\) satisfies \(|x^a_1| = \frac{1}{\prod_{i=2}^{n_1} |x^a|_i\prod_{i=n_1+1}^{n_2}|x^a_i|^2}\). We can therefore follow the construction of elements of \(G\) from \(S\)-units by treating \(x^a\in \mathbb{R}^{n_1}\times\mathbb{C}^{n_2}\) as if it were in \(U_S\) with all valuations according to primes in \(S\) being \(0\). (i.e. all coordinates of \(a^G\) according to \(\mathbb{Z}^s\) are set to 0). Since \(M_{\phi(a^G)} = e^{\operatorname{diag}(a)}\) is close to the identity matrix, we notice that this construction also directly implies that the real entries of \(\phi(a^G)=x^a\) are close to \(1\), i.e. they are positive, and therefore all entries of \(a^G\) according to \(\mathbb{Z}_2^{n_1}\) are zero. Moreover, each diagonal block \(\Xi_i\) corresponding to a complex coordinate of \(\phi(a^G) = x^a\) is close to the identity block: \[\Xi_i = \begin{pmatrix} \mathfrak{Re} (x^a_i) & -\mathfrak{Im}(x^a_i)\\ \mathfrak{Im}(x^a_i) & \mathfrak{Re} (x^a_i) \end{pmatrix} \sim \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.\] More specifically, since \(\|e^{\operatorname{diag}(a)}-I\|<\frac{1}{2\sqrt{n}|\Delta|}\), we know that
\(|\mathfrak{Im}(x^a_i)|\leq \frac{1}{2\sqrt{n}|\Delta|}\)
\(| \mathfrak{Re} (x^a_i) - 1|\leq \frac{1}{2\sqrt{n}|\Delta|}\).
Hence, if \(\theta_i\in\mathbb{R}/\mathbb{Z}\) satisfies \(x^a = |x^a|e^{2i\pi\theta_i}\), we have \[|\theta_i|\leq \frac{\pi}{2}|\sin(\theta_i)| = \frac{\pi}{2}\frac{|\mathfrak{Im}(x^a_i)|}{|\mathfrak{Re} (x^a_i)|}\leq \pi |\mathfrak{Im}(x^a_i)| \leq \frac{\pi}{2\sqrt{n}|\Delta|}\] Let \(r = \frac{\pi}{2|\Delta|}\). We have \(\|a^G\|\leq r\). Hence \(d_{G/L}(x,y) \leq r\). This proves by contraposition that if \(d_{G/L}(x,y)\geq r\), then \(|\langle f(x)|f(y)\rangle|<\varepsilon = 3/4\). ◻
Proposition 3 (Lipschitz property of \(f\)). There is \(a > 0\) such that \[\left| |f(x)\rangle - | f(y) \rangle\right| < a\cdot d_{G/L}(x,y)\]
Proof. Let \(z\in G\) such that \(z = x-y-u\) where \(u\in L\) is such that \(d_{G/L}(x,y) = \|x-y-u\|\). If one of the components of \(z\) according to \(\mathbb{Z}_2^{n_1}\), or \(\mathbb{Z}^s\) is non-zero, then \(d_{G/L}(x,y)\geq 1\), and then by the triangle inequality \(\left| \langle f(x) | f(y) \rangle\right|\leq 2\leq 2d_{G/L}(x,y)\). Now we assume that all components of \(z\) according to \(\mathbb{Z}_2^{n_1}\) and \(\mathbb{Z}^s\) are zero. In particular, this means that \(\mathcal{L} = (e^z)\mathcal{L'}\) where \(\mathcal{L} = f_c(x)\), \(\mathcal{L'}=f_c(y)\), and \(z\in E\) correspond to the canonical mapping of the components of \(z\) according to \(\mathbb{R}^{n_1+n_2-1}\times (\mathbb{Z}_2)^{n_1}\times \left(\mathbb{R}/\left(\frac{1}{n^2}\mathbb{Z}\right)\right)^{n_2}\). Therefore, we have \(M_{\mathcal{L}} = M_\mathcal{L'}\cdot e^{\operatorname{diag}(z)}\), and thus: \[\begin{align} d_{G/L}(x,y) &= \|z\| \geq \|\phi(z)\|= \|\operatorname{diag}(\phi(z))\|_2\\ &\geq \inf\{\|A\|_2: M_{\mathcal{L}} = M_\mathcal{L'}\cdot e^{A}\}\\ &\geq a_0 \left| |f(x)\rangle - | f(y) \rangle\right| \;\;\text{by~\cite{EHKS14long}} \end{align}\] Then we obtain the desired result with \(a = \max\{2,1/a_0\}\). ◻
We have demonstrated the HSP property of \(f\). We will now use this to derive the HSP property of \(\tilde{f}\) which is a function from \(\tilde{G}\) to \(\mathcal{H}\) obtained from \(f\).
In the previous sections we described an oracle \(f:G\rightarrow \mathcal{H}\) which satisfies the HSP properties of Definition 2 (in particular: pseudoinjectivity and Lipschitz property). We now show how to construct an oracle over \(\mathbb{R}^m\) that hides the \(S\)-unit groups and that inherits the HSP properties of \(f\). The control group \(G\) can be seen as the projection of \(\tilde{G} = \mathbb{R}^k\times \mathbb{Z}^l\) where
\(k = n_1 + 2n_2 -1\).
\(l = n_1 + s\).
We denote by \(\gamma:\tilde{G}\rightarrow G\) the projection map, and by \(\tilde{L}\subseteq\tilde{G}\) the pre-image of \(L\) by \(\gamma\).
Definition 6 (Oracle on \(\tilde{G}\)). We define \(\tilde{f}:\tilde{G}\rightarrow \mathcal{H}\) by \[\tilde{f}(\tilde{x}) = f\circ\gamma(\tilde{x}).\]
We have the following diagram:
\[\begin{tikzcd} \mathbb{R}^m & \tilde{G} = \mathbb{R}^k\times \mathbb{Z}^l \arrow[l] \arrow[r,"\gamma"] \arrow[dl,"\tilde{f}"] \arrow[d,"\tilde{\alpha}"] & G = \mathbb{R}^{n_1+n_2-1}\times \mathbb{Z}_2^{n_1}\times \left(\mathbb{R}/\mathbb{Z}\right)^{n_2}\times \mathbb{Z}^s \arrow[d,"\alpha"] \arrow[dr,"f = f_q\circ f_c"] & \\ \mathcal{H} & \tilde{G}/\tilde{L} \arrow[r,"\pi"] & G/L & \mathcal{H} \end{tikzcd}\]
We proceed by first showing that \(\tilde{f}\) satisfies the HSP properties, and then we use techniques from [18] and [22] to derive an oracle on \(\mathbb{R}^m\) that satisfies the HSP properties and that hides the \(S\)-unit group.
In this section, we show that \(\tilde{f}\) is an a,\(r,\varepsilon\)-oracle on \(\tilde{G}\). Following the framework of [18], [22], we use the following distance on \(\tilde{G}/\tilde{L}\).
Definition 7 (Distance on \(\tilde{G}/\tilde{L}\)). Let \(\tilde{x},\tilde{y}\in\tilde{G}\). We define \(d_{\tilde{G}}(x,y) = \|x-y\|\) if \(x-y\) does not have any non-zero components on \(\mathbb{Z}^l\), and \(d_{\tilde{G}}(x,y) = \infty\) otherwise. Then \[d_{\tilde{G}/\tilde{L}} = \inf_{\tilde{u}\in\tilde{L}} d_{\tilde{G}}(x , y+u).\]
Proposition 4 (Lipschitz property of \(\tilde{f}\)). Assume \(f\) is an \(a,r,\varepsilon\)-oracle. Then \[\forall \tilde{x},\tilde{y}\in\tilde{G},\;\| | \tilde{f}(\tilde{x})\rangle - | \tilde{f}(\tilde{y})\rangle\| \leq a\cdot d_{\tilde{G}/\tilde{L}}(\tilde{x},\tilde{y}).\]
Proof. Suppose \(d_{\tilde{G}/\tilde{L}}=\infty\), then the inequality holds trivially. Otherise, Let \(\tilde{u}\in \tilde{L}\) such that \(d_{\tilde{G}/\tilde{L}}(\tilde{x},\tilde{y}) = \|\tilde{x}-\tilde{y}-\tilde{u}\|\). In particular, all coordinates of \(\tilde{z}:= \tilde{x}-\tilde{y}-\tilde{u}\) with respect to \(\mathbb{Z}^l\) are \(0\). Let \(u=\gamma(\tilde{u})\), \(x=\gamma(\tilde{x})\), and \(y=\gamma(\tilde{y})\). We have \[\|\tilde{x}-\tilde{y}-\tilde{u}\| \geq \|x - y - u\| \geq \min_{u\in L} \|x-y-u\| = d_{G/L}(x,y).\] Hence \(a\cdot d_{\tilde{G}/\tilde{L}}(\tilde{x},\tilde{y}) \geq a\cdot d_{G/L}(x,y) \geq \| | \tilde{f}(\tilde{x})\rangle - | \tilde{f}(\tilde{y})\rangle\|\) ◻
Proposition 5 (Pseudoinjectivity of \(\tilde{f}\)). Assume \(f\) is an \(a,r,\varepsilon\)-oracle for \(r\ll 1\). Then \[d_{\tilde{G}/\tilde{L}}(\tilde{x},\tilde{y})\geq r \Rightarrow \left| \langle \tilde{f}(\tilde{x}) | \tilde{f}(\tilde{y}) \rangle\right| < \varepsilon\]
Proof. Let \(x = \gamma(\tilde{x})\), \(y = \gamma(\tilde{y})\), and \(u\in L\) such that \(d_{G/L}(x,y) = \|x-y-u\|\). If \(z = x-y-u\) has no component on \(\mathbb{Z}_2^{n_1}\) or \(\mathbb{Z}^s\), then \(d_{\tilde{G}/\tilde{L}}(\tilde{x},\tilde{y}) = d_{G/L}(x,y)\) and therefore, if \(d_{\tilde{G}/\tilde{L}}(\tilde{x},\tilde{y}) \geq r\), then \[|\langle \tilde{f}(\tilde{x}) | \tilde{f}(\tilde{y})\rangle| = |\langle f(x) | f(y) \rangle| \leq \varepsilon.\]
On the other hand, if for such a \(u\), we have components on \(\mathbb{Z}_2^{n_1}\) or \(\mathbb{Z}^s\), then either \(d_{\tilde{G}/\tilde{L}}(\tilde{x},\tilde{y}) = d_{\tilde{G}}(\tilde{x},\tilde{y}+\tilde{u})=\infty\), where \(\tilde{u}\in\tilde{L}\) is the corresponding preimage, or \(d_{\tilde{G}/\tilde{L}}(\tilde{x},\tilde{y})\geq 1\). So we only know that \(d_{\tilde{G}/\tilde{L}}(\tilde{x},\tilde{y}) \geq d_{G/L}(x,y)\) in this case. However, we also have that \(d_{G/L}(x,y)\geq 1\) because of the integer components. Since \(r\ll 1\) we necessarily have \(d_{G/L}(x,y)\geq r\), and therefore \(|\langle \tilde{f}(\tilde{x}) | \tilde{f}(\tilde{y})\rangle| = |\langle f(x) | f(y) \rangle| \leq \varepsilon.\) ◻
Assume we have an \(\tilde{a},\tilde{r},\tilde{\varepsilon}\)-oracle \(\tilde{f}\) that hides \(U_S\) on \(\tilde{G} = \mathbb{R}^{k}\times \mathbb{Z}^l\). Following [18] and [22] we derive an oracle \(g:\mathbb{R}^m\rightarrow\mathcal{H}\) for \(m = k+l\) defined by \[|g(\boldsymbol{x},y_1,\ldots,y_l)\rangle = \sum_{z_1,\ldots,z_l\in\{0,1\}}\left( \bigotimes_{j=1}^l|\psi(y_j,z_j\rangle \right) \otimes|\tilde{f}(\boldsymbol{x},s(y_1,z_1),\ldots,s(y_l,z_l)\rangle,\] where \(s(y,z) = \lfloor y/\lambda\rfloor+z\), and \(|\psi(y,z)\rangle = \cos(\frac{\pi t}{2})|\mathrm{str}_\nu(t)\rangle\) with \(t = y/\lambda - s(y,z)\).
Theorem 4 (Theorem 6.1 of [18]). If \(\tilde{f}\) is an \(\tilde{a},\tilde{r},\tilde{\varepsilon}\)-oracle, then \(g\) is an \(a',r',\varepsilon'\)-oracle with the following identities: \[\begin{align} a'^2 &= \tilde{a}^2 + l\left(\frac{\pi}{2\nu\lambda}(1+\nu)\right)^2\\ r'^2 &= \tilde{r}^2 + l(2\nu\lambda)^2\\ \varepsilon' &= \tilde{\varepsilon}. \end{align}\]
Finally, we need to bound the first minima and the fundamental volume of the lattice of \(S\)-units. In the following, we show that these values have polynomial size with respect to the input. To bound the first minima of \(U_S\subseteq G\) and the volume of \(G/U_S\) (which are preserved by the embedding of \(U_S\) into \(\tilde{G}\)), we rely on an analogue of Dirichlet unit theorem that applies to \(S\)-units. The classical results are known for the case where the lattice of \(S\)-units is embedded in \(\mathbb{R}^{r+|S|}\) (where \(r\) is the rank of the unit group of \(\mathcal{O}\)) via the logarithm embedding \[\operatorname{Log}(\alpha) := \left( \log(|\alpha|_1),\cdots,\log(|\alpha|_r),\log(|\alpha|_{\mathfrak{p}_1}),\cdots, \log(|\alpha|_{\mathfrak{p}_{|S|}}\right),\] where \(|\alpha|_j := |\sigma_j(\alpha)|\) and \(|\alpha|_{\mathfrak{p}_j}:= p_j^{-e_jv_{\mathfrak{p}_j}(\alpha)}\). In this case, we know from [26] that \(\|\operatorname{Log}(\alpha)\|_{\infty} \geq \frac{\log(n)}{6n^4}\) where \(\|v\|_{\infty}\) denote the usual infinity norm on the vector v, and \[\operatorname{Vol}\left(\mathbb{R}^{r+|S|}/\operatorname{Log}(U_S)\right) \leq \left( 300 \log(P) \sqrt{|\Delta|} \left( \frac{e}{2} \log(|\Delta|) \right)^{n-1}\right)^{|S|+r-\frac{n}{2}},\] where \(P = \max_j\mathcal{N}(\mathfrak{p}_j)\) (see [26]).
Proposition 6. The first minima of \(U_S\subseteq G\) satisfies \(\lambda_1(U_S)\geq \frac{\log(n)}{6n^4}\) where the norm on elements of \(G\) is defined by \[\|(z,v_1,\cdots,v_{|S|})\| := \sqrt{\sum_j z_j^2} + \sum_j |v_j|e_j\log(p_j).\] Moreover, the volume of the lattice of \(S\)-units satisfies \[\operatorname{Vol}(G/U_S) \leq \frac{2^{n_1}}{\log(2)^{|S|}} \left( 300 \log(P) \sqrt{|\Delta|} \left( \frac{e}{2} \log(|\Delta|) \right)^{n-1}\right)^{|S|+r-\frac{n}{2}},\] where \(P = \max_j\mathcal{N}(\mathfrak{p}_j)\).
Proof. Let \(((z_j),(v_k))\in G\) corresponding to an \(S\)-unit \(\alpha\). We immediately see that \[\|((z_j),(v_k))\| \geq \|\operatorname{Log}(\alpha)\|_{\infty},\] which proves the lower bound on \(\lambda_1(U_S)\).
To compute an upper bound on the volume of \(G/U_S\), we follow the same approach as [22]. First, we consider the exact sequence \(0\rightarrow \mathbb{Z}^{n_1}\times(\mathbb{R}/\mathbb{Z})^{n_2} \rightarrow G\rightarrow\mathbb{R}^{m_1+n_2-1}\times\mathbb{Z}^s\rightarrow 0\). Let \(\mu(K)\) be the group of torsion units, and \(L_S\subseteq \mathbb{R}^{n_1+n_2-1}\times\mathbb{Z}^s\) be the rank-\(n_1+n_2+s-1\)-lattice that is the projection of \(U_S\). Then we have the exact sequence \[0\rightarrow (\mathbb{Z}^{n_1}\times(\mathbb{R}/\mathbb{Z})^{n_2})/\mu(K) \rightarrow G/U_S\rightarrow (\mathbb{R}^{n_1+n_2-1}\times\mathbb{Z}^s)/L_S.\] Hence \(\operatorname{Vol}(G/U_S) = \operatorname{Vol}\left((\mathbb{Z}^{n_1}\times(\mathbb{R}/\mathbb{Z})^{n_2})/\mu(K)\right) \operatorname{Vol}\left( (\mathbb{R}^{n_1+n_2-1}\times\mathbb{Z}^s)/L_S\right)\)
The volume of \((\mathbb{R}^{n_1+n_2-1}\times\mathbb{Z}^s)/L_S\) is equal to the absolute value of determinant of the matrix of a basis of \(L_S\). Let \((\alpha_j)_{j\leq r+|S|}\) be a minimal generating set for \(U_S/\mu(K)\). Its matrix \(M\) with respect to the embedding in \(\mathbb{R}^{r}\times Z^{|S|}\) is related to the matrix \(M':=(\operatorname{Log}(\alpha_j))\) by the relation \(M = D\cdot M'\) where
\[D= \left( \begin{array}{cccccc} 1 & (0) & & & & \\ & \ddots & & & & \\ & (0) & 1 & & & \\ & & & 1/e_1\log(p_1) & & (0) \\ & & & & \ddots & \\ & & & (0) & & 1/e_{|S|}\log(p_{|S|}) \end{array} \right).\] We therefore have \[\begin{align} \operatorname{Vol}\left( (\mathbb{R}^{n_1+n_2-1}\times\mathbb{Z}^s)/L_S\right) &= \det(M) = \prod_j \frac{1}{e_j\log(p_j)} \det(M')\\ &= \prod_j \frac{1}{e_j\log(p_j)}\operatorname{Vol}(R^{r+|S|}/\operatorname{Log}(U_S)). \end{align}\] Additionally, we have \(\operatorname{Vol}\left((\mathbb{Z}^{n_1}\times(\mathbb{R}/\mathbb{Z})^{n_2})/\mu(K)\right) = \frac{2^{n_1}}{|\mu(K)|}\), therefore, by using the upper bound on \(\operatorname{Vol}(R^{r+|S|}/\operatorname{Log}(U_S))\), we get \[\operatorname{Vol}(G/U_S) \leq \frac{2^{n_1}}{\log(2)^{|S|}} \left( 300 \log(P) \sqrt{|\Delta|} \left( \frac{e}{2} \log(|\Delta|) \right)^{n-1}\right)^{|S|+r-\frac{n}{2}}.\] ◻
The solution of is given to us as approximations of generators of the hidden subgroup. For many applications, an exact (and polynomially bounded) representation is preferable. Therefore, we process the solutions to the \(S\) problem classically to produce a compact representation of the generators of the \(S\)-unit group.
Definition 8 (Compact representation). Let \(l>0\) be a constant, a compact representation of \(\alpha\in\mathcal{O}\) with respect to the integral basis \((\omega_j)_{j\leq n}\) of \(\mathcal{O}\) is a set of exact representations of polynomial size algebraic numbers \(\gamma_j\) satisfying \(\alpha = \gamma_0\gamma_1^{l}\cdots\gamma_k^{l^k},\) where \(k\) is polynomial in the size of the input.
Biasse and Fieker [27] described an efficient method based on [28] to classically compute a compact representation of an algebraic number in polynomial time. These methods rely on the knowledge of an exact representation of the algebraic number we wish to represent (which is not the case here). A modification of [28] using the approximation of the vector corresponding to an algebraic number yields a compact representation of that number. Our algorithm for the compact representation of an \(S\)-unit takes as input \(l>0\) and a rational approximation (to an arbitrary polynomial precision \(q\)) of a vector of the form \[(\log(|\alpha|_1,\cdots ,\log(|\alpha|_{n_1+n_2}),\theta,v_{\mathfrak{p}_1}(\alpha),\cdots,v_{\mathfrak{p}_{|S|}}(\alpha)),\] where \(\alpha\) is an \(S\)-unit. We can assume that \(\prod_{j}\mathfrak{p}_j^{v_{\mathfrak{p}_j}(\alpha)} \subseteq\mathcal{O}\). If not, we replace each \(\log(|\alpha|_j)\) by \[\left( \sum_{v_{\mathfrak{p}_k}(\alpha) < 0} |v_{\mathfrak{p}_k}(\alpha)|e_k\log(p_k)\right) + \log(|\alpha|_j)\] (where \(\mathcal{N}(\mathfrak{p}_k) = p_k^{e_k}\)), thus calculating a compact representation of \(\alpha\prod_{v_{\mathfrak{p}_k}(\alpha) < 0} p_k^{|v_{\mathfrak{p}_k}(\alpha)|e_k}\). From that, we can easily derive a compact representation of \(\alpha\). Then, we choose \(k_0\) minimal such that \(\frac{\log(|\alpha|_j}{l^{k_0}} \leq \log(|\Delta|)\), initiate an ideal \(I\) to \(\prod_{j}\mathfrak{p}_j^{\left\lfloor v_{\mathfrak{p}_j}(\alpha)/l^k\right\rfloor}\), and we compute rational approximations \(v_j\) of \(\left(|\alpha|_j\right)^{1/l^k}\). Then at each step, \(I\) is replaced by \(I^l\) and we compute an LLL-reduced element \(\delta_k\) of the ideal \(C\subseteq\mathcal{O}\) such that \(I^{-1} = \frac{1}{d_k} C\) for the scaled \(T_2\) norm \(T_{2, (v_j)_j}(\delta) := \sum |\delta|^2_i \frac{v_j^2}{v^2}\) where \(v := \sqrt[n]{\prod v_j}\). The ideal \(I\) is then replaced by \(\beta_k I\) where \(\beta_k := \frac{\delta_k}{d_k}\), and \(v_j\leftarrow v_j\cdot|\beta_k|_j\). At every step \(k\) from \(k_0\) to \(0\), we know that
\(\beta_k\) has polynomial size,
\(\beta_{k_0}^{l^{k}}\cdots\beta_k\left(\alpha\right)^{\frac{1}{l^{k_0-k}}}\) has polynomial size,
\(I\subseteq \mathcal{O}\) and has polynomial size (i.e. \(\log(I)\) is polynomial),
\(\prod_j v_j \geq \mathcal{N}(I)\geq 1\).
At the end of this process, we have polynomial size algebraic numbers \((\beta_j)_{j\leq k_0}\) such that \(\beta_{-1} := \alpha\prod_k \beta_k^{l^k}\) has polynomial size. Finding \(\beta_{-1}\) is the main difference between our approach and that of [27] and [28] since we have no exact representation for \(\alpha\). We find the minimal \(d>0\) such that \(\beta := d\beta_{-1}\in \mathcal{O}\) and from approximates of the \(\log(|\beta_k|_j)\), \(\log(|\alpha|)_j\), and the phase vector of each of the corresponding algebraic numbers, we find a rational approximation \(\widehat{\beta}\in\mathbb{R}^m\) under the rule [eq:map95Rm] with a polynomial number of bits of precision. Likewise, we can get approximations \(\widehat{\omega_j}\in\mathbb{R}^m\) of the integral basis vectors \(\omega_j\), and solve the linear system (over the rationals) \(\widehat{\beta} := \sum_j \frac{b_j}{c_j}\widehat{\omega_j}\). The nearest lattice point \(\sum_j a_j\widehat{\omega_j}\) in \(\sum_j\mathbb{Z}\widehat{\omega_j}\) can be retrieved if the precision is larger than \(n\) by using Babai’s algorithm [29]. Then we know that \(\beta = \sum_j a_j\omega_j\), and \[\alpha = \frac{\beta_{-1}}{\beta_0}\left(\frac{1}{\beta_1}\right)^ l\cdots \left(\frac{1}{\beta_{k_0}}\right)^{l^{k_0}}.\]
Proposition 7. Algorithm 2 is correct and returns a compact representation of the input \(\alpha\) in polynomial time.
Proof. The invariant properties on the size of the elements are deduced in the same way as in the proof of [27]. The only important different is the way we compute an exact representation of \(\beta_{-1}\). Barbai’s algorithm allows us to find in polynomial time a lattice element \(\tilde{\beta}\) in \(\widehat{\mathcal{L}}:=\sum_j\mathbb{Z}\widehat{\omega_j}\) such that \(d(\widehat{\beta},\tilde{\beta})\leq 2^n d(\widehat{\beta},\widehat{\mathcal{L}})\). If the precision is larger than \(n\), then the coefficients of \(\tilde{\beta}\) on the basis \(\widehat{\omega}_j\) are those of \(\beta = d\beta_{-1}\) on the integral basis \(\omega_j\) of \(\mathcal{O}\). ◻
Let \(\mathcal{B} = \{ \mathfrak{p}_1,\cdots,\mathfrak{p}_N\}\) be a set of invertible prime ideals of an order \(\mathcal{O}\) whose classes generate \(\operatorname{Cl}(\mathcal{O})\). We have a surjective morphism \[\begin{CD} \mathbb{Z}^N @>{\varphi}>> \mathcal{I} @>{\pi}>> \operatorname{Cl}(\mathcal{O})\\ (e_1, \ldots, e_N) @>>> \prod_i\mathfrak{p}_i^{e_i} @>>> \prod_i[\mathfrak{p}_i]^{e_i} \end{CD},\] and the class group \(\operatorname{Cl}(\mathcal{O})\) is isomorphic to \(\mathbb{Z}^N/\ker(\pi\circ\varphi).\) Therefore, computing the class group boils down to computing \(\ker(\pi\circ\varphi)\), which is the lattice of \((e_1,...,e_N)\in\mathbb{Z}^N\) such that \(\mathfrak{p}_1^{e_1},\ldots , \mathfrak{p}_N^{e_N} = (\alpha)\) for some \(\alpha\in K\). These \(\alpha\) are \(S\)-units for \(S = \mathcal{B}\), and the exponent vectors of a generating set of \(U_S\) give us a generating set for \(\ker(\pi\circ\varphi)\) from which we derive \(\operatorname{Cl}(\mathcal{O})\)
The best unconditional bounds on \(|S|\) are exponential in \(\log(|\Delta|)\). As the complexity of the computation of the \(|S|\)-unit group is polynomial in \(|S|\), we cannot achieve a polynomial complexity unconditionally that way. However, under the Generalized Riemann Hypothesis (GRH), the classes of all prime ideals of \(\mathcal{O}\) of norm up to \(48\log(|\Delta|)^2\) generate \(\operatorname{Cl}(\mathcal{O})\). The size of \(S := \{ \mathfrak{p}\subseteq\mathcal{O}\text{ prime } \mid \mathcal{N}(\mathfrak{p})\leq 48\log(|\Delta|)^2\}\) is polynomial in \(\log(|\Delta|)\), and calculating the corresponding \(S\)-unit group is therefore polynomial in \(n\) and \(\log(|\Delta|)\).
Proposition 8. Under the Generalized Riemann Hypothesis, Algorithm 3 is correct and runs in polynomial time.
Our work also has direct applications in computational number theory. Indeed, the \(S\)-unit group is a central object that can be used in a lot of algorithms. It usually is computed together with the so-called \(S\)-class group, which is the quotient of the group of ideals in the ring of \(S\)-integers by the subgroup of principal ideals. The \(S\)-class group can easily be derived from the ideal class group and an oracle for the PIP by quotienting the class group by extra relations. A description of this method can be found in Simon’s PhD thesis [1].
Another direct consequence of our work is that it directly implies a polynomial time algorithm for computing the relative class group and the relative unit group of an arbitrary extension of number fields. Algorithms for these tasks are already known Cohen2?[Ch. 7], but their run time is exponential in the degree of the fields. As for the \(S\)-class group, they also consist of using a complete set of relations for the ideal class group and of enriching it with new relations that are obtained by solving instances of the PIP.
Let \(\mathfrak{a}\subseteq\mathcal{O}\) be an ideal of \(\mathcal{O}\). We want our algorithm to run in polynomial time in the size of the input, that is \(\log|\Delta|\), \(n\), and \(\log(\mathcal{N}(\mathfrak{a}))\) (which quantifies the size of \(\mathfrak{a}\)). The ideal \(\mathfrak{a}\) is principal if and only if \(\mathfrak{a}= (\alpha)\) for \(\alpha\) an \(S\)-unit where \(S\) is the set of prime divisors of \(\mathfrak{a}\). We calculate a generating set for the \(S\)-units, which gives us a generating set for all the possible principal ideals only divisible by elements of \(S\). The resolution of a linear system tells us if \(\mathfrak{a}\) belongs to this set, and if so, what is its generator.
Under the GRH, the set of prime ideals \[S := \{\mathfrak{p}\subseteq\mathcal{O}\text{ prime } \mid \mathcal{N}(\mathfrak{p})\leq 48\log(|\Delta|)^2\}\cup \{ \mathfrak{p}\subseteq\mathcal{O}\text{ prime } \mid \mathfrak{p}\mid \mathfrak{a}\}\] generate the ideal class group. Ideal class decomposition consists of finding exponents \(x_1,\ldots,x_s\) and \(\alpha\in K\) such that \[\mathfrak{a}= (\alpha)\mathfrak{p}_1^{x_1}\ldots \mathfrak{p}_s^{x_s}.\] We want our algorithm to run in polynomial time in the size of the input, that is \(\log|\Delta|\), \(n\), and \(\log(\mathcal{N}(\mathfrak{a}))\) (which quantifies the size of \(\mathfrak{a}\)). Our strategy is the following:
Decompose \(\mathfrak{a}\) as a product of prime ideals \(\mathfrak{a}= \prod \mathfrak{q}\).
For each \(\mathfrak{q}_j\notin\mathcal{B},j\leq k\) in the decomposition of \(\mathfrak{a}\), find \(\beta_k\in K\) such that \(\mathfrak{q}= (\beta_k) \cdot \prod_{\mathfrak{p}_j\in S}\mathfrak{p}^{x_{j,k}}\).
Deduce \(\boldsymbol{v}\in\mathbb{Z}^{N}\) such that \(\mathfrak{a}= \prod_k(\beta_k) \cdot \prod_{\mathfrak{p}_j\in\mathcal{B}} \mathfrak{p}^{v_j}\).
deciding if an input ideal \(\mathfrak{a}\subseteq\mathcal{O}\) is principal, and if so, compute an element \(\alpha\in \mathcal{O}\) such that \(\mathfrak{a}=(\alpha)\). The first step consist of finding the prime ideal decomposition of \(\mathfrak{a}\). Then we define \(S\) by \[S := \{\mathfrak{p}\subseteq\mathcal{O}\text{ prime } \mid \mathcal{N}(\mathfrak{p})\leq 48\log(|\Delta|)^2\}\cup \{ \mathfrak{p}\subseteq\mathcal{O}\text{ prime } \mid \mathfrak{p}\mid \mathfrak{a}\},\] and compute the \(S\)-unit group. Then we deduce the solution to the principal ideal problem by performing a linear algebra step on the matrix of the valuations, as described in Section [subsec:high_level_PIP].
Proposition 9. Under the Generalized Riemann Hypothesis, Algorithm 5 is correct and runs in polynomial time.
Our algorithms also directly imply a quantum algorithm for computing the ray class group of an arbitrary number field. The computation of the ray class group is an essential task in computational class field theory, and both classical and quantum algorithms have been described to solve this task. A classical method due to Cohen can be found in Cohen2?[3.2] and has an exponential run time with respect to the degree (but runs in subexponential time for classes of fixed degree number fields). A quantum algorithm was described by Eisenträger and Hallgren Hallgren10? with a polynomial run time in classes of fixed degree number fields. As for the afortmentioned tasks, computing the ray class group essentially relies on subroutines for computing the ideal class group and solving the PIP, for which we provide polynomial time algorithms in arbitrary number fields. It also relies on algorithms for factoring ideals (which can be easily derived from Shor’s factoring algorithm), and efficient methods for solving the discrete logarithm problem (which is also a well known consequence of Shor’s work [14]).
Finally, our work allows us to describe polynomial time algorithms for solving relative norm equations of the form \(\mathcal{N}_{L/K}(x) = \theta\) where \(L/K\) is an arbitrary Galois extension. Norm equations are an important example of Diophantine equations which are a major topic in number theory. The resolution of the Pell equation (for which there is a quantum algorithm [15]) can be seen as a special case where \(L = \mathbb{Q}(\sqrt{\Delta})\), \(K = \mathbb{Q}\) and \(\theta = 1\) (when we restrict our attention to integer solutions). Solving norm equations in general is an important task in computational number theory. A classical method was described by Simon [1] (based on the work of Fieker [2] for Galois extensions) that solves general extensions in exponential time in the degree of the fields. For the Galois case, it simply uses the knowledge of the \(S\)-unit group and the relative class group, which we can provide in polynomial time for number fields of arbitrary degree. However, the general method uses the Galois closure, whose degree can be exponential in the degree of the field, thus restricting the direct application of our work to arbitrary Galois extensions.
These reductions are straightforward. But classical algorithms typically compute the \(S\)-unit group by solving and solving instances of first. Our quantum algorithm tackles these problems in the reverse order.↩︎
Let \(A=\operatorname{diag}{a_i}\). Observe that \(\|A\|_2\leq 1\) and in this case \(\sum_{k=2}^\infty \frac{\|A^k\|_2}{k!}\leq \|A\|_2\cdot \sum_{k=2}^\infty \frac{1}{k!} = (e-2)\|A\|_2\). Hence \(\|e^A - I\|_2 \geq \|A\|_2 - \sum_{k=2}^\infty \frac{\|A^k\|_2}{k!}\geq 0.2 \|A\|_2\).↩︎