On sampling determinantal and Pfaffian
point processes on a quantum computer

Rémi Bardenet1, Michaël Fanuel, and Alexandre Feller
Université de Lille, CNRS, Centrale Lille
UMR 9189 - CRIStAL, F-59000 Lille, France


Abstract

DPPs were introduced by Macchi as a model in quantum optics the 1970s. Since then, they have been widely used as models and subsampling tools in statistics and computer science. Most applications require sampling from a DPP, and given their quantum origin, it is natural to wonder whether sampling a DPP on a quantum computer is easier than on a classical one. We focus here on DPPs over a finite state space, which are distributions over the subsets of \(\{1,\dots,N\}\) parametrized by an \(N\times N\) Hermitian kernel matrix. Vanilla sampling consists in two steps, of respective costs \(\mathcal{O}(N^3)\) and \(\mathcal{O}(Nr^2)\) operations on a classical computer, where \(r\) is the rank of the kernel matrix. A large first part of the current paper consists in explaining why the state-of-the-art in quantum simulation of fermionic systems already yields quantum DPP sampling algorithms. We then modify existing quantum circuits, and discuss their insertion in a full DPP sampling pipeline that starts from practical kernel specifications. The bottom line is that, with \(P\) (classical) parallel processors, we can divide the preprocessing cost by \(P\) and build a quantum circuit with \(\mathcal{O}(Nr)\) gates that sample a given DPP, with depth varying from \(\mathcal{O}(N)\) to \(\mathcal{O}(r\log N)\) depending on qubit-communication constraints on the target machine. We also connect existing work on the simulation of superconductors to Pfaffian point processes, which generalize DPPs and would be a natural addition to the machine learner’s toolbox. In particular, we describe “projective” Pfaffian point processes, the cardinality of which has constant parity, almost surely. Finally, the circuits are empirically validated on a classical simulator and on 5-qubit IBM machines.

Keywords— Determinantal and Pfaffian point processes, fermionic systems, quantum circuits, Givens rotations.

1 Introduction↩︎

Determinantal point processes (DPPs) were introduced in the thesis of [1], recently translated and reprinted as [2]. Macchi’s motivation was the design of probabilistic models for free fermions in quantum optics; see the preface of [2] for a history of DPPs, and [3] for an extended discussion of the links between free fermions and DPPs. DPPs have known another surge of interest since the 90s for their application to random matrix theory [4]. More recently, they have been adopted as models or sampling tools in fields like spatial statistics [5], Monte Carlo methods [6], machine learning [7], or numerical linear algebra [8]. In the latter two fields, the considered DPPs are often finite, in the sense that a DPP is a probability measure over subsets of a ground set of finite cardinality \(N\gg 1\). Such a finite DPP is specified by an \(N\times N\) matrix called its kernel matrix, which we assume here to be Hermitian.

In machine learning as in numerical linear algebra, it is crucial to be able to sample from the considered finite DPPs. For instance, a famous DPP is the subset of edges of a uniform spanning tree in a connected graph [9]. Sampling these uniform spanning trees is a necessary step for building the randomized preconditioners of Laplacian systems in [10]. As another example, DPPs have been used as randomized summaries of large collections of items, such as a corpus of texts. Sampling the corresponding DPP then allows to extract a small representative subset of sentences [11]. Other machine learning applications include constructing coresets [12], kernel matrix approximation [13], [14] and feature extraction for linear regression [15].

Much research has thus been devoted to sampling finite DPPs on a classical computer, either exactly or approximately. The default generic exact sampler is the ‘HKPV’ sampler [16]. To fix ideas, when applied to a projection DPP, i.e., a DPP that puts all its mass on subsets of some fixed cardinality \(r\leq N\), and assuming the kernel matrix is given in diagonalized form, HKPV has complexity \(\mathcal{O}(Nr^2)\). For DPPs on graphs such as uniform spanning trees, there also exist dedicated algorithms, such as the loop-erased random walks of [17], with an expected number of steps equal to the mean commute time to a chosen root node of the graph.

Given that DPPs are originally a model in quantum electronic optics, and are still the default mathematical object used to describe a quantum physical system known as free fermions [18], it is natural to ask whether finite DPPs can be sampled more efficiently on a quantum computer than on a classical computer. Somewhat implicitly, the question has actually already been tackled in a string of physics papers whose goal is the more ambitious quantum simulation of fermionic systems [19][22]. In a reverse cross-disciplinary direction, and still rather implicitly, the quantum algorithms therein are reminiscent of parallel QR decompositions, a key topic in numerical algebra [23], [24]. While finishing this work, we also realized that in the computer science literature, and independently of the aforementioned physics works, [25] recently proposed similar quantum algorithms to sample projection DPPs, as a building block for quantum data analysis pipelines. For our purpose, their main contribution is a quantum circuit with depth logarithmic in \(N\), when [22] only discuss depths linear in \(N\).

On our side, motivated by applications of finite DPPs in data science, we initiated in [3] a programme of reconnection of DPPs to their physical fermionic roots, to foster cross-disciplinary research between mathematics, computer science, and physics on the topic, even if our languages and lore are quite different. In particular, physicists have developed tools for the analysis and the construction of fermionic systems that we would like to apply to DPPs in data science, without reinventing the wheel. The current paper is part of this programme, and sums up our understanding of what the state of the art in physics tells us on sampling finite DPPs, after we follow in the footsteps of [1] and map a given DPP to a fermionic density operator. This cross-disciplinary, self-contained survey is our first contribution.

As an example of what our disciplines can bring to each other, our second contribution is to relate the Pfaffian point processes as defined by [26] – a generalization of DPPs that is natural in theoretical physics but has not yet been used in data science – to a quantum algorithm by [22] for solving the Bogoliubov-de Gennes Hamiltonian. As another example of the fertility of cross-disciplinary work, after we make the link between the quantum circuits of [20], [22] and parallel QR decompositions [24], many new variants of the quantum circuits in [22] become immediately available, adapting to a range of qubit-communication and hardware constraints. In particular, we exhibit a variant of the quantum circuits in [22] with the same dimensions as the best circuit in [25].

Overall, our conclusions on quantum DPP sampling are that if a projection kernel is given in diagonalized form \(\mathbf{K}= \mathbf{Q}^*\mathbf{Q}\), with \(\mathbf{Q}\in\mathbb{C}^{r\times N}\) a matrix with orthonormal rows, one can build quantum circuits that sample DPP(\(\mathbf{K}\)) with \(\mathcal{O}(rN)\) one- and two-qubit gates, and depth depending on what hardware constraints we put on which qubits can be jointly operated. Acting only on neighbouring qubits, depth is \(\mathcal{O}(N)\) [20], [22], while acting on arbitrary pairs of qubits can take the depth down to \(\mathcal{O}(r\log N)\); see our variant of [22] in 5.2.2 and the logarithmic depth Clifford loaders of [25]. Such depths (i.e., the largest number of gates applied to any single qubit) favourably compare to the time complexity \(\mathcal{O}(Nr^2)\) of the classical HKPV algorithm, or the expected complexity in \(\mathcal{O}(Nr + r^3 \log r)\) of the randomized version of HKPV in [27], [28].

That being said, diagonalizing \(\mathbf{K}\) on a classical computer as a preprocessing step remains a \(\mathcal{O}(N^3)\) bottleneck, or at least \(\mathcal{O}(Nd^2)\) in the common case where the diagonalization of \(\mathbf{K}\) can be reduced to the SVD of an \(N\times d\) matrix. This bottleneck thus seems to cancel the advantage of using a quantum circuit if one insists on starting with \(\mathbf{K}\) stored on a classical computer. Yet, while the projection kernel \(\mathbf{K}\) may not be available in diagonalized form, it is common in data science applications [10], [15] to specify it implicitly, as a set of vectors spanning its range. As noted by [28], using a (classical) parallel QR algorithm and \(P\) processors, we can reduce the classical preprocessing step to \(\mathcal{O}(Nd^2/P)\) flops. Importantly for our quantum pipeline, we discuss here how to further reuse the intermediate steps of this preprocessing in the design of the quantum circuit to apply next. This yields a hybrid parallel/quantum algorithm to sample projection DPPs. Compared to the classical HKPV sampler, our pipeline thus provides a linear speedup. Compared to the expected complexity of the randomized classical algorithm discussed in [27], [28], we show a gain in the sampling step, but we arguably share the same bottleneck of classical parallel QR preprocessing. Finally, the necessity for classical preprocessing may disappear in the future, once \(\mathbf{K}\) can be assumed to be initially available as a quantum state, stored on a quantum computer.

The rest of the paper is organized as follows. In 2, we define DPPs and one of their generalizations, Pfaffian PPs (PfPPs). In 3, we introduce the vocabulary of quantum field theory, at the basis of the connection between PfPPs and free fermions. By sticking to the case of a finite-dimensional state space, we can avoid technical difficulties and provide a rigorous, stand-alone introduction, mostly following [29]. Section 4 is devoted to building a Hamiltonian starting from a DPP or a PfPP, so that a simple quantum measurement yields a sample from the corresponding point process. In Section 5, we show how [20], [22] build circuits to simulate the fermionic systems corresponding to our point processes. Our presentation insists on the implicit links with parallel QR algorithms, which allow us to introduce variants of the circuits with smaller complexity under assumptions on the qubit communication constraints of the target machine. Still in Section 5, we show that the projective PfPPs that we simulate generate sets of points with a fixed parity of their cardinality. Finally, we investigate in Section 6 the implementation of the circuits with the library Qiskit [30], and the noise when running the circuits on 5-qubit IBMQ machines [31].

Appendix 8 contains a few detailed proofs that we extracted from the main body of the paper. Appendix 9 contains a discussion on gate details to implement the basic operations introduced in the circuits of 5. Appendix 10 is an overview of the sources of error in current quantum computers and their orders of magnitude.

Notations.

The complex conjugate of a complex number \(z\) is denoted by \(\overline{z}\). Similarly, \(\overline{\mathbf{M}}\) denotes the entrywise complex conjugate of a matrix \(\mathbf{M}\). The transpose of \(\mathbf{M}\) reads \(\mathbf{M}^\top\) and its Moore-Penrose pseudo-inverse is \(\mathbf{M}^+\). For two Hermitian matrices \(\mathbf{M}\) and \(\mathbf{N}\) of the same size, we write \(\mathbf{M}\preceq \mathbf{N}\) if for all complex vectors \(\mathbf{v}\) we have \(\mathbf{v}^* \mathbf{M} \mathbf{v} \leq \mathbf{v}^* \mathbf{N}\mathbf{v}\). The adjoint of an operator \(A\) is written \(A^*\). Also, we denote the canonical basis elements of \(\mathbb{C}^N\) by \(\mathbf{e}_k\), \(1\leq k \leq N\).

For any positive integer \(k\), we write \([k] \triangleq \{1,\dots,k\}\). The matrix obtained by selecting the first \(k\) columns on \(\mathbf{M}\) is denoted by \(\mathbf{M}_{:, [k]}\). Similarly, the \(\ell\)th column of \(\mathbf{M}\) reads \(\mathbf{M}_{:, \ell}\). A point process on \([N]\) is a probability measure over subsets of \([N]\). When talking about a point process \(Y\), we use \(\mathbb{P}\) and \(\mathbb{E}_Y\) to denote the corresponding probability and expectation, like in \(\mathbb{P}(\{1,2\}\subseteq Y)\) or \(\mathbb{E}_Y f(Y)\). Finally, the sigmoid function is \(\sigma(x) = 1/(1+\exp(-x))\).

2 Determinantal and Pfaffian point processes↩︎

In this section, we introduce discrete determinantal point processes (DPPs), and refer to [7] for their elementary properties. We also introduce Pfaffian point processes (PfPPs, [32], [33]), a generalization of DPPs that has not yet been used in machine learning, to the best of our knowledge. As we shall see in 4.4, both DPPs and PfPPs naturally appear when modeling physical particles known as fermions.

2.1 Determinantal point processes↩︎

A determinantal point process is determined by the so-called inclusion probabilities.

Definition 1 (DPP). Let \(\mathbf{K}\in \mathbb{C}^{N\times N}\). A random subset \(Y \subseteq [N]\) is drawn from the DPP of marginal kernel \(\mathbf{K}\), denoted by \(Y\sim \mathrm{DPP}(\mathbf{K})\) if and only if \[\label{eq:def95dpp} \forall S \subseteq [N],\quad \mathbb{P}(S \subseteq Y) = \det (\mathbf{K}_{S}),\qquad{(1)}\] where \(\mathbf{K}_{S} = [\mathbf{K}_{i,j}]_{i,j \in S}\). We take as convention \(\det (\mathbf{K}_{\emptyset}) = 1\).

Note that the matrix \(\mathbf{K}\) is called the marginal kernel since it determines the inclusion probabilities of subsets of items, in the same way the adjective marginal is used for the distribution of a subset of random variables in probability theory. In particular, the one-item inclusion probabilities are given by the diagonal of the kernel, namely \(\mathbb{P}(\{i\} \subseteq Y) = \mathbf{K}_{i,i}\) for all \(i \in [N]\). For all pairs \(\{i,j\}\), \(|\mathbf{K}_{i,j}|\) is interpreted as the similarity of \(i\) and \(j\) – similar items having a low probability to be jointly sampled; see 3 below for more details. A priori, it is not obvious that a given complex matrix \(\mathbf{K}\) defines a DPP.

Theorem 1 ([1], [34]). When \(\mathbf{K}\) is Hermitian, existence of \(\mathrm{DPP}(\mathbf{K})\) is equivalent to the spectrum of \(\mathbf{K}\) being included in \([0,1]\).

As a first comment, note that if \(\mathbf{K}\) is a Hermitian kernel associated with a DPP, the complex conjugate kernel \(\overline{\mathbf{K}}\) defines a DPP with the same law. This is because the eigenvalues of any principal submatrix of \(\mathbf{K}\) are real. Second, as a particular case of 1, when the spectrum of \(\mathbf{K}\) is included in \(\{0,1\}\), we call \(\mathbf{K}\) a projection kernel, and the corresponding DPP a projection DPP. Letting \(r\) be the number of unit eigenvalues of its kernel, samples from a projection DPP have fixed cardinality \(r\) with probability 1 [16]. In applications, projection kernels of rank \(r\) are often available in one of two forms: either \[\label{e:projection95kernel95gram95schmidt} \mathbf{K}= \mathbf{A}(\mathbf{A}^*\mathbf{A})^{+} \mathbf{A}^*,\tag{1}\] where \(\mathbf{A}\in \mathbb{C}^{N \times M}\) is any matrix with rank \(r\leq \min(N, M)\), or in diagonalized form \[\label{e:projection95kernel95diagonalized} \mathbf{K}= \mathbf{U}\mathbf{U}^*,\tag{2}\] where \(\mathbf{U}\in \mathbb{C}^{N\times r}\) has orthonormal columns. We give an example application for each form.

Example 1 (Uniform spanning trees). Consider a finite connected graph \(G\) with \(M\) vertices and \(N\) edges, encoded by its oriented edge-vertex incidence matrix \(\mathbf{A}\in\{-1,0,1\}^{N\times M}\). There are a finite number of spanning trees of \(G\), and we draw one uniformly at random. The edges in that random tree correspond to a subset \(Y\) of the indices \([N]\) of the rows of \(\mathbf{A}\). It turns out [9] that \(Y\) is a projection DPP with kernel 1 .

Uniform spanning trees are useful in many contexts, e.g.to build preconditioners for certain linear systems [10]. Another example of application of DPPs is column-subset selection.

Example 2 (Column subset selection). [15] propose to select \(k\) columns of a “fat" matrix2 \(\mathbf{X}\in\mathbb{R}^{n\times N}\), \(N\gg n\), using the projection DPP with rank-\(k\) kernel \[\label{e:ayoub95kernel} \mathbf{K}= \mathbf{V}_{:,[k]}\mathbf{V}_{:,[k]}^\top,\qquad{(2)}\] where \(\mathbf{X}= \mathbf{U}\mathbf{\Sigma} \mathbf{V}^\top\) is the singular value decomposition of \(\mathbf{X}\), and \(\mathbf{V}_{:,[k]}\) is the matrix given by the first \(k\) columns of \(\mathbf{V}\). This is an example of DPP with a kernel specified by 2 . [15] prove that the projection of \(\mathbf{X}\) onto the subspace spanned by the selected columns is essentially an optimal low-rank approximation of \(\mathbf{X}\). This ensures statistical guarantees in sketched linear regression.

Because we assume that the kernel is Hermitian, a DPP can be seen as a repulsive distribution, in the sense that for all distinct \(i,j\in [N]\), \[\begin{align} \mathbb{P}(\{i,j\} \subseteq Y) &= \mathbf{K}_{i,i} \mathbf{K}_{j,j} - \mathbf{K}_{i,j}\overline{\mathbf{K}_{i,j}}\label{e:proba95DPP95pair}\\ &= \mathbb{P}(\{i\} \subseteq Y)\mathbb{P}(\{j\} \subseteq Y) - \vert\mathbf{K}_{i,j}\vert^2\nonumber\\ &\leq \mathbb{P}(\{i\} \subseteq Y)\mathbb{P}(\{j\} \subseteq Y).\nonumber \end{align}\tag{3}\]

This repulsiveness enforces diversity in samples, and is particularly adequate in applications where a DPP is used to extract a small diverse subset of a large collection of \(N\) items. Beyond column subset selection, this diversity is natural in machine learning tasks such as summary extraction [7] or experimental design [35], [36].

2.2 Pfaffian point processes↩︎

Similarly to the determinant, the Pfaffian of a \(2k\times 2k\) matrix is a polynomial of the matrix entries \[\begin{align} \mathrm{Pf}(\mathbf{A}) = \frac{1}{2^k k!}\sum_{\sigma } \mathrm{sgn}(\sigma) \mathbf{A}_{\sigma(1)\sigma(2)}\dots \mathbf{A}_{\sigma(2k-1)\sigma(2k)}. \end{align}\] It is easy to see that the Pfaffian of \(\mathbf{A}\) is equal to the Pfaffian of its antisymmetric part \((\mathbf{A}-\mathbf{A}^\top)/2\). For \(\mathbf{A}\) skew-symmetric, the definition simplifies to \[\begin{align} \mathrm{Pf}(\mathbf{A}) = \sum_{\sigma \text{ contraction}} \mathrm{sgn}(\sigma) \mathbf{A}_{\sigma(1)\sigma(2)}\dots \mathbf{A}_{\sigma(2k-1)\sigma(2k)}. \end{align}\] Recall that a contraction of order \(m\) (\(m\) even) is a permutation such that \(\sigma(1)< \sigma(3) < ... < \sigma(m - 1)\), and \(\sigma(2i - 1) < \sigma(2i)\) for \(i \leq m/2\). To relate to determinants, note that the Pfaffian of a skew-symmetric matrix \(\mathbf{A}\) of even size satisfies \(\det \mathbf{A} = (\mathrm{Pf}\mathbf{A})^2\).

Definition 2 (PfPP). Let \(\mathbb{K}:[N]\times [N] \rightarrow \mathbb{C}^{2\times 2}\) satisfy \(\mathbb{K}(i,j)^\top = -\mathbb{K}(j,i)\) for all \(1\leq i,j\leq N\). A random subset \(Y \subseteq [N]\) is drawn from the PfPP of marginal kernel \(\mathbb{K}\), denoted by \(Y\sim \mathrm{PfPP}(\mathbb{K})\) if and only if \[\label{eq:def95pfpp} \forall S \subseteq [N],\quad \mathbb{P}(S \subseteq Y) = \mathrm{Pf}(\mathbf{K}_S),\qquad{(3)}\] where \(\mathbf{K}_S = [\mathbb{K}(i,j)]_{i,j\in S}\) is a complex matrix made of \(|S|\) blocks of size \(2\times 2\).

Sufficient conditions on \(\mathbb{K}\) for the existence of \(\mathrm{PfPP}(\mathbb{K})\) were given by [37] when \(\big(\begin{smallmatrix} 0 & -1\\ 1 & 0 \end{smallmatrix}\big)\mathbb{K}(i,j)\) can be mapped to a self-adjoint quaternionic kernel taking values in the set of \(2\times 2\) complex matrices. Later [38] gave an equivalent of the Macchi-Soshnikov Theorem 1 for this type of processes; see [39]. This class of Pfaffian PPs was also studied by [40] in the continuous setting.

More recently, [26] gave another sufficient condition for the existence of a Pfaffian point process on a discrete ground set, which is well-suited to the processes considered in our paper. The PfPPs of [26] correspond to the case where \(\big(\begin{smallmatrix} 0 & 1\\ 1 & 0 \end{smallmatrix}\big)\mathbb{K}(i,j)\) is a self-adjoint complex kernel. The intersection of the classes of PfPPs studied by [37] and [26] is simply the set of PfPPs for which the \(2\times 2\) matrix \(\mathbb{K}(i,j)\) has a vanishing diagonal, i.e., DPPs with Hermitian kernels; see 3 below.

Before going further, we introduce a few useful notations. Consider a \(2N\times 2N\) complex matrix \(\mathbf{S},\) viewed as made of four \(N\times N\) blocks. Define the following transformation of \(\mathbf{S}\), called here particle-hole transformation, which consists in taking the complex conjugation and exchanging blocks along diagonals, i.e. \[\mathrm{ph}(\mathbf{S}) = \mathbf{C} \overline{\mathbf{S}} \mathbf{C}, \text{ with } \mathbf{C}= \begin{pmatrix} \mathbf{0} & \mathbf{I}\\ \mathbf{I}& \mathbf{0} \end{pmatrix}.\]

Proposition 2 ([26]). Let \(\mathbf{S}= \left(\begin{smallmatrix} \mathbf{S}_{11} & \mathbf{S}_{12}\\ \mathbf{S}_{21} & \mathbf{S}_{22} \end{smallmatrix}\right)\) be a Hermitian \(2N\times 2N\) matrix such that \(0 \preceq \mathbf{S}\preceq \mathbf{I}\) and \(\mathrm{ph}(\mathbf{S}) = \mathbf{I}- \mathbf{S}\). There exists a Pfaffian point process with the marginal kernel \[\mathbb{K}_{\textcolor{black}{\mathbf{S}}}(i,j) = \begin{pmatrix} \mathbf{S}_{21}(i,j) & \mathbf{S}_{22}(i,j)\\ \mathbf{S}_{11}(i,j) - \delta_{ij} & \mathbf{S}_{12}(i,j) \end{pmatrix}, \quad 1\leq i,j \leq N.\]

A few remarks are in order. First, the properties of \(\mathbf{S}\) allow to simplify the expression of the kernel in \[\begin{align} \mathbb{K}_{\textcolor{black}{\mathbf{S}}}(i,j) = \left(\begin{matrix} \mathbf{S}_{21}(i,j) & \mathbf{S}_{22}(i,j)\\ -\mathbf{S}_{22}(j,i) & \overline{\mathbf{S}_{21}}(j,i) \end{matrix}\right), \label{eq:K95pfaffian95kernel} \end{align}\tag{4}\] where \(\mathbf{S}_{21}\) is skew-symmetric and \(\mathbf{S}_{22}\) is Hermitian. Second, for a matrix \(\mathbf{S}\) as in 2, we easily see that \(\mathbb{K}_{\textcolor{black}{\mathbf{S}}}\) and \(\mathbb{K}_{\textcolor{black}{\overline{\mathbf{S}}}}\) yield a PfPP with the same law. This is a consequence of the identity \[\begin{align} \mathbb{K}_{\overline{\mathbf{S}}}(i,j) = -\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} \mathbb{K}_{\mathbf{S}}(i,j) \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}.\label{e:pf95kernel95conjugate} \end{align}\tag{5}\] When no ambiguity is possible, we suppress the dependence on \(\mathbf{S}\) for simplifying expressions. Third, DPPs with Hermitian kernels appear as particular instances of the PfPPs of 2.

Example 3 (vanishing diagonal). Let \(\mathbf{S}\) satisfy the conditions of 2, and let \(\mathbb{K}_{\textcolor{black}{\mathbf{S}}}\) be the corresponding Pfaffian kernel. If \(\mathbf{S}_{21} = 0\), \(Y \sim \mathrm{PfPP}(\mathbb{K}_{\textcolor{black}{\mathbf{S}}})\) is distributed according to \(\mathrm{DPP}(\mathbf{S}_{22})\).

Fourth, for \(Y \sim \mathrm{PfPP}(\mathbb{K}_{\textcolor{black}{\mathbf{S}}})\) and \(i\neq j\), the \(2\)-point correlation function is \[\begin{align} \mathbb{P}(\{i,j\} \subseteq Y) &= \mathbf{S}_{22}(i,i)\mathbf{S}_{22}(j,j) - |\mathbf{S}_{22}(i,j)|^2 + |\mathbf{S}_{21}(i,j)|^2\nonumber\\ &= \mathbb{P}(\{i\} \subseteq Y)\mathbb{P}(\{j\} \subseteq Y) - |\mathbf{S}_{22}(i,j)|^2 + |\mathbf{S}_{21}(i,j)|^2. \label{e:2-point-correlation-PfPP} \end{align}\tag{6}\] Compared with DPPs with Hermitian kernels, Equation 6 suggests that a Pfaffian point process as in Proposition 2 is less repulsive than the related determinantal process \(\mathrm{DPP}(\mathbf{S}_{22})\) – an intuition for this fact is given in 4.4 below. Relatedly, note that the \(1\)-point correlation functions of \(\mathrm{DPP}(\mathbf{S}_{22})\) and \(\mathrm{PfPP}(\mathbb{K}_{\textcolor{black}{\mathbf{S}}})\) for \(\mathbb{K}_{\textcolor{black}{\mathbf{S}}}\) given in 4 are the same. In particular, the expected cardinality of \(Y\sim \mathrm{PfPP}(\mathbb{K}_{\textcolor{black}{\mathbf{S}}})\) is simply \(\require{physics} \mathop{\mathrm{\mathbb{E}}}| Y | = \Tr(\mathbf{S}_{22})\).

We end this section with a result about sample parity, which will be further discussed later in 4.4. 1 below gives the expected parity of the number of samples of a PfPP. Since we are not aware of any similar statement in the literature, we also provide a short proof of 1.

Lemma 1 (Parity of PfPP samples). Let \(\mathbb{K}\) be the kernel of a Pfaffian point process on \([N]\) and let \(\mathbb{J}\) be a \(2N\times 2N\) block matrix such that \(\mathbb{J}(i,j) = \delta_{ij} \left(\begin{smallmatrix} 0 & 1\\ -1 & 0 \end{smallmatrix}\right).\) The expected parity of the cardinality of a sample of \(Y\sim \mathrm{PfPP}(\mathbb{K})\) is \(\mathop{\mathrm{\mathbb{E}}}_{Y}(-1)^{|Y|} = \mathrm{Pf}\left(\mathbb{J} - 2 \mathbb{K}\right).\)

Proof. We shall prove a slightly more general result, which relies on the identity \[\mathrm{Pf}(\mathbb{J} + \mathbb{M}) = \sum_{S\subseteq [N]}\mathrm{Pf}\left(\mathbf{M}_S\right),\] which holds for any skew-symmetric block matrix \(\mathbb{M}\); see [32]. Note that the term corresponding to \(\emptyset\) (and equal to \(1\)) is included in the sum.

Let \(\alpha \neq 1\). We have \[\begin{align} \mathop{\mathrm{\mathbb{E}}}_{Y}(\alpha^{|Y|}) = \sum_{S\subseteq [N]}(\alpha -1)^{|S|} \mathrm{Pf}(\mathbf{K}_S) = \sum_{S\subseteq [N]}\mathrm{Pf}\left((\alpha -1)\mathbf{K}_S\right) = \mathrm{Pf}\left(\mathbb{J} + (\alpha - 1) \mathbb{K}\right), \end{align}\] where the first equality can be derived from [26]. This the desired result if we take \(\alpha = -1\). ◻

Below, in 11, we will see that samples of a PfPP associated with a projective \(\mathbf{S}\) have a fixed parity.

3 From qubits to fermions↩︎

The content of this section is standard; see e.g., the reference textbook [41] for quantum computing basics and [29] for the Jordan-Wigner transform. We also refer to [3], which presents all the basic elements required in this section in the context of optical measurements and the resulting point processes.

3.1 Models in quantum physics↩︎

A quantum model is given by \((i)\) a Hilbert space \((\mathbb{H},\braket{\cdot}{\cdot})\) called the state space, and \((ii)\) a collection of self-adjoint operators \(\mathbb{H}\to \mathbb{H}\) called observables, of which one particular observable \(H:\mathbb{H}\rightarrow \mathbb{H}\) is singled out and called the Hamiltonian. Let \(\psi \neq 0\) be an element of \(\mathbb{H}\). All elements of the form \(z \psi\) for a complex \(z\neq 0\) represent the same quantum state, called a pure quantum state, as opposed to more general states to be defined later. To simplify expressions, it is conventional to only consider elements \(\psi\) of unit norm, and to denote a unit-norm pure state by the “ket” \(\ket{\psi}\), keeping in mind that, as long as \(|z|=1\), all vectors \(z\ket{\psi}\in\mathbb{H}\) represent the same state. The corresponding “bra" \(\bra{\psi}\) is the linear form \(\ket{x}\mapsto \braket{\psi}{x}\).

3.1.1 An observable and a state define a random variable↩︎

Henceforth, we assume that \(\mathbb{H}\) has finite dimension \(d\). Take an observable \(A\). By the spectral theorem, \(A\) can be diagonalized in an orthonormal basis, say with eigenpairs \((\lambda_i, u_i)\) with \(1\leq i \leq d\). For simplicity, we momentarily assume that all the eigenvalues of \(A\) have multiplicity \(1\). Together with a state \(\ket\psi\), the observable \(A = \sum_i \lambda_i u_i u_i ^*\) describes a random variable \(X_{A,\psi}\) on \(\mathrm{spec}(A) = \{\lambda_1, \dots, \lambda_d\}\), through \[\label{e:born95pure95state} \mathbb{P}(X_{A, \psi} = \lambda_i) = \vert\braket{\psi}{u_i}\vert^2.\tag{7}\]

When modeling statistical uncertainty on a state, like when describing the noisy output of an experimental device, physical states are not modelled as unit-norm vectors of \(\mathbb{H}\), but rather as positive trace-one operators. To see how, we first map \(\ket{\psi}\) to the rank-one projector \(\rho = \ketbra{\psi}\). Then the distribution 7 can be equivalently defined as \[\label{e:born} \mathbb{P}(X_{A, \rho} = \lambda_i) = \tr\left[ \rho 1_{\{\lambda_i\}}(A) \right],\tag{8}\] where for any \(f:\mathbb{R}\rightarrow \mathbb{R}\), we have \(f(A) = \sum_i f(\lambda_i) u_i u_i ^*.\) In particular, the expectation of \(X_{A, \ketbra{\psi}}\) is \(\require{physics} \expval{A}{\psi}\). Note that Definition 8 generalizes to operators \(A\) with eigenvalues with arbitrary multiplicity, and to states \(\rho\) beyond projectors. In particular, for \(\mu\) a probability measure on \(\mathbb{H}\), \[\label{e:mixed95state} \rho = \mathop{\mathrm{\mathbb{E}}}_{\psi\sim\mu} \ketbra{\psi}\tag{9}\] still defines a probability measure on the spectrum of \(A\) through 8 . The expectation of that distribution, also called the expectation value of operator \(A\), is \(\require{physics} \expval{A}_\rho \triangleq \tr \rho A\). In physics, the association 8 of a state-observable pair \((\rho, A)\) with the random variable \(X_{A,\rho}\) is known as Born’s rule.

Any \(\rho\) that is not a rank-one projector is called a mixed state, by opposition to rank-one projectors, which are pure states. Mixed states like 9 are commonly used to describe any uncertainty in the actual state of an experimental entity.3

3.1.2 Commuting observables and a state define a random vector↩︎

When all pairs of a set of observables \(A_1, \dots, A_p\) commute, then these observables can be diagonalized in the same orthonormal basis \((u_i)\), and 8 can be naturally generalized to describe a random vector \((X_{A_j,\rho})\) of dimension \(p\), with values in the Cartesian product \[\mathrm{spec}(A_1) \times \dots \times \mathrm{spec}(A_p).\] More precisely, the law of \((X_{A_j,\rho})\) is given by \[\label{e:vector95born} \mathbb{P} (X_{A_1,\rho} = x_1, \dots, X_{A_p,\rho} = x_p) = \tr\left[ \rho 1_{\{x_1\}}(A_1) \dots 1_{\{x_p\}}(A_p) \right],\tag{10}\] see e.g. [42] for an introduction. In this paper, we will associate a Pfaffian point process to a particular mixed state and a set of commuting observables, which can respectively be efficiently prepared and measured on a quantum computer. To define these objects, we first need to explain how physicists build Hamiltonians of fermionic systems.

3.2 The canonical anti-commutation relations↩︎

The Hamiltonian and its structure are often the key part in specifying a model, much like the factorization of a joint distribution in a probabilistic model. In the case of fermions, a family of physical particles that includes electrons, Hamiltonians are typically built as polynomials of fermionic creation-annihilation operators, i.e., operators that satisfy the so-called canonical anti-commutation relations (CAR).

Definition 3 (CAR). Let \(\mathbb{H}\) be a Hilbert space. The operators \(c_j:\mathbb{H}\rightarrow \mathbb{H}\), \(j=1, \dots, p\) and their adjoints are said to satisfy the canonical anti-commutation relations if \[\begin{align} \label{e:CAR}\lbrace c_i,c_j \rbrace = \lbrace c^*_i,c^*_j \rbrace = 0 \qquad \text{and} \qquad \lbrace c_i,c^{*}_j \rbrace = \delta_{ij} \mathbb{I}, \end{align}\qquad{(4)}\] where \(\{u,v\} \triangleq uv+vu\) is the anti-commutator of operators \(u\) and \(v\).

Assuming existence4 for a moment, and limiting ourselves to a finite dimensional Hilbert space \(\mathbb{H}\) of dimension \(d=2^N\), one can say many things on \(\mathbb{H}\) from the fact that there are \(N\) operators \(c_1, \dots, c_N\) satisfying ?? . On that topic, we recommend reading [29], from which we borrow the following lemma.

Lemma 2 (Fock basis; see e.g. [29]). Let \(\mathrm{dim} \mathbb{H} = 2^N\), \(N\in\mathbb{N}^*\), and assume that \(c_1, \dots, c_N\) are distinct operators on \(\mathbb{H}\) that satisfy ?? . First, there is a vector \(\ket{\emptyset}\in\mathbb{H}\), called the vacuum, which is a simultaneous eigenvector of all \(c_i^*c_i\), \(i=1, \dots, N\), always with eigenvalue \(0\). Second, for \(\mathbf{n}=(n_1, \dots, n_N)\in\{0,1\}^N\), consider \[\ket{\mathbf{n}} \triangleq \prod_{i=1}^N (c_i^*)^{n_i} \ket{\emptyset}.\] Then \(\mathcal{B}_{\mathrm{Fock}} = (\ket{\mathbf{n}})\) is an orthonormal basis of \(\mathbb{H}\). Third, for all \(1\leq i \leq N\), \[c_i^*c_i \ket{\mathbf{n}} = n_i \ket{\mathbf{n}}, \label{e:nonzero95terms}\qquad{(5)}\] and, for \(i\neq j\), \[c_i^*c_j \ket{\mathbf{n}} =\pm n_j (1-n_i) \ket{\widetilde{\mathbf{n}}}, \label{e:zero95terms}\qquad{(6)}\] where \(\widetilde{n}_i =1\), \(\widetilde{n}_j =0\), and \(\widetilde{n}_k = n_k\) for \(k\neq i,j\).

The basis built in the lemma in called the Fock basis. Its construction depends on the choice of the operators \(c_1, \dots, c_N\). When there is a risk of confusion, we shall thus further denote \(\ket{\mathbf{n}}\) and \(\ket{\emptyset}\) as \(\ket{\mathbf{n}_c}\) and \(\ket{\emptyset_c}\), respectively. Second, because applying \(c_i\) to a vector of the Fock basis has the effect of zeroing the \(i\)th component if it was \(1\), and mapping to zero otherwise, we call the \(c_i\)’s annihilation operators. Similarly, we call their adjoints creation operators.

3.3 Fermionic operators acting on qubits↩︎

Henceforth, we let \(\mathbb{H} \triangleq (\mathbb{C}^2)^{\otimes N}\). This is the state space describing \(N\) qubits, of dimension \(2^N\). A qubit corresponds to any physical system, the state of which is described by one out of two levels. We associate these two levels with a distinguished orthonormal basis \((\ket{0}, \ket{1})\) of \(\mathbb{C}^2\), commonly called the computational basis.

Consider the so-called Pauli operators on \(\mathbb{C}^{2}\), given in the computational basis as \[\mathbf{I}= \begin{pmatrix} 1 & 0\\ 0 & 1\end{pmatrix},\quad \sigma_x = \begin{pmatrix} 0 & 1\\ 1 & 0\end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -\mathrm{i}\\ \mathrm{i}& 0\end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0\\ 0 & -1\end{pmatrix}. \label{e:pauli95matrices}\tag{11}\] Note that \(\sigma_x^2 = \sigma_y^2 = \sigma_z^2= \mathbf{I}\), and that \(\sigma_x\sigma_y = \mathrm{i}\sigma_z\).

Definition 4 (The Jordan-Wigner (JW) transformation). Define the JW annihilation operators \(a_j\) on \(\mathbb{H}\), for \(j\in\{1, \dots N\}\), by their matrices in the computational basis \[\label{e:jordan95wigner} \underbrace{\sigma_z \otimes \dots \otimes \sigma_z}_{j-1\text{ times}} \otimes \left(\frac{\sigma_x+\mathrm{i}\sigma_y}{2}\right) \otimes \mathbf{I}\otimes \dots \otimes \mathbf{I}.\qquad{(7)}\] The creation operator \(a_j^*\), adjoint of \(a_j\), has matrix \[\underbrace{\sigma_z \otimes \dots \otimes \sigma_z}_{j-1\text{ times}} \otimes \left(\frac{\sigma_x-\mathrm{i}\sigma_y}{2}\right) \otimes \mathbf{I}\otimes \dots \otimes \mathbf{I}.\]

It is easy to check the following properties of the Jordan-Wigner operators.

Proposition 3. Let \(a_1, \dots, a_N\) be the JW operators from 4. They satisfy ?? . Moreover, the so-called number* operators \(N_j= a_j^*a_j\) have the simple matrix form \[\underbrace{\mathbf{I}\otimes \dots \otimes \mathbf{I}}_{j-1\text{ times}} \otimes \frac{\mathbf{I}- \sigma_z}{2} \otimes \mathbf{I}\dots\otimes \mathbf{I}.\] Finally, the Fock basis \(\ket{\mathbf{n}}\) obtained by setting \(c_j = a_j\) in 2 is actually the computational basis, i.e., \[\ket{\mathbf{n}} = \ket{n_1}\otimes \dots \otimes \ket{n_N}.\]*

There are alternative ways to define fermionic operators on \(\mathbb{H}\) using Pauli matrices, like the parity or Bravyi-Kitaev transformations [43]; or the Ball-Verstraete-Cirac transformation [44], [45]. In particular, and at the expense of simplicity, the Bravyi-Kitaev transformation yields operators with smaller weight than Jordan-Wigner: only \(\mathcal{O}(\log N)\) terms are allowed to differ from the identity in the equivalent of ?? .

4 From fermions to Pfaffian point processes↩︎

We first show in 4.1 how to build any discrete DPP with Hermitian kernel from a mixed state corresponding to a particle number preserving Hamiltonian and a set of commuting observables. This connection between DPPs and quasi-free states was observed by [46]. Then we do the same for a class of discrete PfPPs in 4.4 associated to Hamiltonians without particle number conservation.

4.1 Building a DPP from a quantum measurement↩︎

Let \(\mathbf{H}\in\mathbb{C}^{N\times N}\) be Hermitian, and consider the operator \[\begin{align} H &= \sum_{i,j=1}^N c_i^*\mathbf{H}_{ij} c_j,\label{eq:quad95Hamiltonian95vanilla} \end{align}\tag{12}\] which we think of as a Hamiltonian acting on \(\mathbb{H}\). This Hamiltonian preserves the particle number, i.e., \(H\) commutes with the number operator \(\sum_i c_i^*c_i\). Let \((\lambda_k, \mathbf{u}_{k})_{1\leq k \leq N}\) be the eigenpairs of \(\mathbf{H}\) where the eigenvalues satisfy \(\lambda_1 \leq \dots \leq \lambda_N\). The Hamiltonian 12 can thus be rewritten in “diagonalized" form \[\begin{align} H = \sum_{i,j=1}^N c_i^*\left( \sum_{k=1}^N \lambda_k \mathbf{u}_{k} \mathbf{u}_k^*\right)_{ij} c_j= \sum_{i,j,k=1}^N \lambda_k c_i^*\mathbf{u}_{ki} \mathbf{u}_{kj}^* c_j= \sum_{k=1}^N \lambda_k b_k^*b_k, \end{align}\] where we defined the operators \[\begin{align} b_k = \sum_{j=1}^N \mathbf{u}_{kj}^* c_j, \label{e:new95annihilation95operators} \end{align}\tag{13}\] which can be checked to satisfy ?? .

In what follows, we consider in 4.2 DPPs for which the correlation kernel has a spectrum in \([0,1)\), referred to as L-ensembles in the literature. Next, in 4.3, we discuss projection DPPs.

4.2 The case of L-ensembles↩︎

Below, 4 states that there is a natural DPP associated with a Hamiltonian like 12 , whose marginal kernel \(\mathbf{K}\) is obtained by applying a sigmoid to the spectrum of \(\mathbf{H}\); formally \(\mathbf{K} = \sigma(-\beta \mathbf{H})\) with \(\beta >0\) being a parameter called inverse temperature.

Proposition 4 (DPP kernel by taking the sigmoid). Let \(\beta>0\) and \(\mu \geq 0\), \(H\) and \((b_k)\) be respectively defined by 12 and 13 . Consider the mixed state \[\begin{align} \label{e:gaussian} \rho =\frac{1}{Z} \mathrm{e}^{ -\beta \left( H - \mu \sum_{j=1}^N b_j^*b_j \right) }, \end{align}\qquad{(8)}\] where the normalization constant \(Z\) ensures that \(\tr \rho = 1\). For \(i\in[N]\), the observable \(N_i=c_i^*c_i\) is a projector, so that the random variable \(X_{N_i,\rho}\) associated to \(N_i\) and the state ?? by 8 takes values in \(\{0,1\}\). Moreover, all \(N_i\)’s commute pairwise, and thus define a joint Boolean vector \((X_{N_i,\rho})_{i\in[N]}\) through 10 . Consider the point process \[S = \{i\in[N]: X_{N_i,\rho}=1\},\] corresponding to jointly observing all operators \(N_i\) in the state ?? , and reporting the indices corresponding to a \(1\). Then \(S\) is determinantal with correlation kernel \[\mathbf{K}= \mathbf{U}\, \mathrm{Diag}\left( \frac{\mathrm{e}^{-\beta (\lambda_k-\mu)}}{1 + \mathrm{e}^{-\beta (\lambda_k-\mu)}}\right)\, \mathbf{U}^*,\] where \(\mathbf{U}\) and \((\lambda_k)_k\) are obtained by the diagonalization \(\mathbf{H}= \mathbf{U}\, \mathrm{Diag}\left( \lambda_k\right) \mathbf{U}^*\).

Proof. All number operators \(N_i = c_i^*c_i\) commute pairwisely and have spectrum in \(\{0,1\}\); see 2. Consequently, their joint measurement indeed describes a random binary vector \(X=(X_{N_i, \rho})\). By Born’s rule 10 , the correlation function of the point process corresponding to the indices of the \(1\)s in \(X\) are given by \[\require{physics} \mathbb{P} \left ( \{i_1, \dots, i_k\}\subseteq X \right ) = \tr \left(\rho N_{i_1}\dots N_{i_k}\right) \triangleq \expval{N_{i_1}\dots N_{i_k}}_\rho.\] Because of the anti-commutation relations ?? , an explicit computation known as Wick’s theorem implies that for any \(i_1,\dots,i_k\in [N]\), \[\require{physics} \label{e:wick} \expval{N_{i_1}\dots N_{i_k}}_\rho = \det \left(\expval{c_{i_m}^*c_{i_n}}_\rho\right)_{m,n}.\tag{14}\] Wick’s theorem is a standard result in quantum physics; see e.g.[3] for a rewriting with our notations of one of the canonical references [47].

Now, Equation 14 implies that the point process \(X\), consisting of simultaneously measuring the occupation of all qubits using \((N_i)\), is determinantal with correlation kernel \[\require{physics} \begin{align} \mathbf{K}_{ij} &= \expval{c_{i}^*c_{j}}_\rho. \label{e:kernel95intermediate} \end{align}\tag{15}\] In order to provide an explicit computation of \(\mathbf{K}_{ij}\), we use 2 with the operators \(b_1, \dots, b_N\), to obtain a basis \((\ket{\mathbf{n}}) = (\ket{\mathbf{n}}_b)\) of eigenvectors of all operators of the form \(b_i^*b_i\), \(i=1, \dots, N\). We now proceed to computing the trace in 15 by summing over that basis. We write \[\require{physics} \begin{align} \mathbf{K}_{ij} &= \tr \left[\rho c_{i}^*c_{j}\right] \\ &= \tr \left[\rho c_{i}^*c_{j} \sum_{\mathbf{n}} \ketbra{\mathbf{n}}\right]\\ & = Z^{-1} \sum_{\mathbf{n}} \expval{\mathrm{e}^{-\beta \sum_{p} (\lambda_p-\mu) b_p^*b_p} c_i^*c_j}{\mathbf{n}}. \end{align}\] Now note that \(c_j = \sum_{\ell=1}^N \mathbf{u}_{j\ell}b_\ell\), so that \(c_i^*c_j = \sum_{k,\ell=1}^N \mathbf{u}^*_{ik} \mathbf{u}_{j\ell} b_k^*b_\ell\), and \[\require{physics} \begin{align} \mathbf{K}_{ij} &= Z^{-1}\sum_{k,\ell=1}^N \mathbf{u}^*_{ik} \mathbf{u}_{j\ell} \sum_{\mathbf{n}} \expval{\mathrm{e}^{-\beta \sum_{p} (\lambda_p-\mu) b_p^*b_p} b_k^*b_\ell}{\mathbf{n}}. \end{align}\] Because of ?? and the orthonormality of the basis \((\ket{\mathbf{n}})\), for \(k\neq \ell\), \[\require{physics} \expval{\mathrm{e}^{-\beta \sum_{p} (\lambda_p-\mu) b_p^*b_p} b_k^*b_\ell}{\mathbf{n}}=0.\] Therefore, \[\require{physics} \begin{align} \mathbf{K}_{ij} &= Z^{-1}\sum_{k=1}^N \mathbf{u}^*_{ik} \mathbf{u}_{jk} \sum_{\mathbf{n}} \expval{\mathrm{e}^{-\beta \sum_{p} (\lambda_p-\mu) b_p^*b_p} b_k^*b_k}{\mathbf{n}} \tag{16}\\ &= Z^{-1}\sum_{k=1}^N \mathbf{u}^*_{ik} \mathbf{u}_{jk} \sum_{\mathbf{n}} n_k \mathrm{e}^{-\beta \sum_p (\lambda_p-\mu)n_p}. \tag{17} \end{align}\] Now, we pause and compute the normalization constant \(Z\) of \(\rho\). Starting from \(\tr\rho=1\), we write \[\require{physics} \begin{align} Z = \tr \mathrm{e}^{-\beta \sum_{p} (\lambda_p-\mu) b_p^*b_p} = \sum_{\mathbf{n}} \expval{\mathrm{e}^{-\beta \sum_{p} (\lambda_p-\mu) b_p^*b_p}}{\mathbf{n}} = \sum_{\mathbf{n}} \mathrm{e}^{-\beta \sum_p (\lambda_p-\mu)n_p}. \end{align}\] Rewriting the sum as a product, we obtain \[\begin{align} Z &= \prod_{p=1}^N \sum_{n_p \in \{0,1\}} \mathrm{e}^{-\beta (\lambda_p-\mu)n_p} = \prod_{p=1}^N (1+ \mathrm{e}^{-\beta (\lambda_p-\mu)}). \label{e:constant95intermediate} \end{align}\tag{18}\] Now, explicitly writing the dependence of \(Z=Z(\boldsymbol{\lambda})\) to \(\boldsymbol{\lambda}= (\lambda_1, \dots, \lambda_N)\), Equations 17 and 18 together yield \[\begin{align} \mathbf{K}_{ij} &= \sum_{k=1}^N \mathbf{u}^*_{ik} \mathbf{u}_{jk} \frac{\sum_{\mathbf{n}} n_k \mathrm{e}^{-\beta \sum_p (\lambda_p-\mu)n_p}}{\sum_{\mathbf{n}} \mathrm{e}^{-\beta \sum_p (\lambda_p-\mu)n_p}}\\ &= \sum_{k=1}^N \mathbf{u}^*_{ik} \mathbf{u}_{jk} \left[-\frac{1}{\beta}\frac{\mathrm{d}}{\mathrm{d}\lambda_k} \log Z(\boldsymbol{\lambda})\right]\\ &= \sum_{k=1}^N \mathbf{u}^*_{ik} \mathbf{u}_{jk} \frac{\mathrm{e}^{-\beta (\lambda_k-\mu)}}{1 + \mathrm{e}^{-\beta (\lambda_k-\mu)}}, \label{e:kernel95final} \end{align}\tag{19}\] for any \(1\leq i,j \leq N\). Thus, we find \(\mathbf{K}= \overline{\mathbf{U}}\, \mathrm{Diag}\left( \frac{\mathrm{e}^{-\beta (\lambda_k-\mu)}}{1 + \mathrm{e}^{-\beta (\lambda_k-\mu)}}\right)\, \overline{\mathbf{U}}^*\). Since \(\overline{\mathbf{K}}\) and \(\mathbf{K}\) define the same DPP, we have the desired result. ◻

Corollary 1 (Hamiltonian by taking the logit). Let \(\mathbf{K}= \mathbf{V} \mathbf{D} \mathbf{V}^*\) be a Hermitian DPP kernel with \(0\prec \mathbf{D}\prec \mathbf{I}\). Then Proposition 4 yields a DPP with kernel \(\mathbf{K}\) provided we choose \(\mathbf{V} = \mathbf{U}\) and the eigenvalues \(\boldsymbol{\lambda}\) so that \[\beta(\lambda_i-\mu) = \log \frac{1-d_i}{d_i}.\] Furthermore, assuming \(d_1 \geq \dots \geq d_N \geq 0\), and \(d_{r} < \mu < d_{r+1}\), \(\mathbf{K}\) converges to the projection kernel onto the first \(r\) columns of \(\mathbf{V}\).

Proof. The first part is a consequence of 16 . The second statement is maybe easiest to see in terms of Frobenius norm, to wit \(\Vert \mathbf{A}\Vert^2 = \sum_{i=1}^N \sigma_i^2(\mathbf{A})\), where \(\sigma_i(\mathbf{A})\) are the singular values of \(\mathbf{A}\). If \(\mathbf{P}\) denotes the projector into the first \(r\) columns of \(\mathbf{V}\), then the absolute values of the eigenvalues of the Hermitian matrix \(\mathbf{K}-\mathbf{P}\) are its singular values, and all of them converge to \(0\). ◻

4.3 The case of projection DPPs↩︎

With the notation of 4.2, we note that \((\ket{\mathbf{n}}) = (\ket{\mathbf{n}}_b)\) is a basis of eigenvectors of \(\rho\), and that \[\rho\ket{\mathbf{n}} = \frac{\mathrm{e}^{-\beta \sum_p (\lambda_p - \mu) n_p}}{\sum_{\mathbf{n}'} \mathrm{e}^{-\beta \sum_p (\lambda_p - \mu)n'_p}} \ket{\mathbf{n}}.\] In particular, the only eigenvalue that does not vanish when \(\beta\rightarrow \infty\) is that of \(\ket{1,\dots, 1, 0, \dots, 0}\), where the \(1\)s occupy the first \(r\) components, and \(r\in\{1, \dots, N\}\) is such that with \(\lambda_{r} < \mu < \lambda_{r+1}\). Indeed, the ratio of the eigenvalue of \(\ket{1,\dots, 1, 0, \dots, 0}\) with that of any other eigenvector diverges to \(+\infty\), while all eigenvalues remain in \([0,1]\) and the trace is fixed to \(1\). Thus, denoting \(\ket{\psi} = \ket{1,\dots, 1, 0, \dots, 0}\), \[\rho \rightarrow \ketbra{\psi}\] in Frobenius norm as \(\beta\rightarrow \infty\). In particular, \[\tr N_{i_1} \dots N_{i_k} \rho \rightarrow \tr N_{i_1} \dots N_{i_k} \ketbra{\psi}\] as \(\beta\rightarrow \infty\). Combined with the second statement of 1, we know that the correlation functions of the point process corresponding to preparing \(\ketbra{\psi}\) and measuring the occupation of all qubits and recording where the \(1s\) occur is the projection DPP of kernel \(\mathbf{P} = \mathbf{V}_{:,[r]} \mathbf{V}^*_{:,[r]}\). Recall that \(\mathbf{V}_{:,[r]}\) is the matrix obtained by selecting the \(r\) first columns of \(\mathbf{V}\).

Remark 5 (Beyond Gaussian states). We have shown that the kernels of projection DPPs are obtained as limits of kernels associated with Gaussian states ?? . Actually, a DPP kernel can be associated with a density matrix of a quasi-free state generalizing the Gaussian state, for which Wick’s theorem also holds; see e.g.[26]. In particular, every quasi-free state is a convex combination of pure states. We give now a few details by following [48]. Let \(\mathcal{C} \subseteq [N]\) and let \(\ket{\psi_\mathcal{C}} = \prod_{i\in \mathcal{C}} c_i^*\ket{\emptyset}\). Let \(\mathbf{K}\) be a Hermitian matrix with eigenvalues \((\nu_p)\) in \([0,1]\). The density matrix associated with \(\mathbf{K}\) is the quasi-free state \[\rho = \sum_{\mathcal{C}\subseteq [N]} \alpha_\mathcal{C} \ketbra{\psi_\mathcal{C}} \text{ with } \alpha_\mathcal{C} = \prod_{p\in \mathcal{C}}\nu_p \prod_{q\in [N]\setminus\mathcal{C}}(1-\nu_q).\] Furthermore, in 8.7, we establish for the interested reader the determinantal formula for \(\require{physics} \expval{N_{i_1}\dots N_{i_k}}_\rho\) for a projective \(\rho\) without resorting to Wick’s theorem as in the proof of 4.

4.4 Building a PfPP from the BdG Hamiltonian↩︎

We shall see that Pfaffian point processes appear using slightly more sophisticated Hamiltonians. Unlike Hamiltonian 12 , the Hermitian operator \[\begin{align} H = \frac{1}{2}\sum_{i,j=1}^N \mathbf{M}_{ij} (c_i^*c_j - c_j c_i^*) + \frac{1}{2}\sum_{i,j=1}^N \left( \boldsymbol{\Delta}_{ij} c_i^*c_j^*+ \boldsymbol{\Delta}^*_{ij} c_i c_j\right), \label{e:H95general} \end{align}\tag{20}\] with complex matrices \(\mathbf{M} = \mathbf{M}^*\) and \(\boldsymbol{\Delta}\), does not commute with the total number operator \(N = \sum_i c_i^* c_i\). Physicists say that \(H\) does not “preserve" the total number of particles.

Note that, due to the anti-commutation relation \(c_i^*c_j^*= - c_j^*c_i^*\), we can simply redefine \(\boldsymbol{\Delta}\) so that \(\boldsymbol{\Delta} = - \boldsymbol{\Delta}^\top\). The Hamiltonian 20 becomes the so-called Bogoliubov-de Gennes (BdG) Hamiltonian \[\begin{align} H_{\mathrm{BdG}} = \frac{1}{2}\sum_{i,j=1}^N \mathbf{M}_{ij} (c_i^*c_j - c_j c_i^*) + \frac{1}{2}\sum_{i,j=1}^N \left( \boldsymbol{\Delta}_{ij} c_i^*c_j^*- \overline{\boldsymbol{\Delta}_{ij}} c_i c_j\right), \label{e:H95BdG} \end{align}\tag{21}\] which is a model for superconductors; see e.g.[49] for a modern overview.

Remark 6 (Fermion parity). The Hamiltonian \(H_{\mathrm{BdG}}\) commutes with the parity operator \((-1)^{\sum_{i=1}^N c_i^* c_i}\); see e.g.[50]. Therefore, the eigenfunctions of \(H_{\mathrm{BdG}}\) are also eigenfunctions of the parity operator.

We now investigate the point process associated to the occupation numbers of the Gaussian state \(Z^{-1} \exp(-\beta H_{\mathrm{BdG}}),\) for \(\beta >0\), as we did for ?? . It is convenient to stack the operators \(c_1, \dots, c_N\) and their adjoints in column vectors, and write \(\mathbf{c} = [c_1 \dots c_N ]^\top\) and \(\mathbf{c}^*= [c_1^*\dots c_N^*]^\top\). We can then rewrite the Hamiltonian more compactly as \[\begin{align} H_{\mathrm{BdG}} = \frac{1}{2}\Big(\begin{smallmatrix} \mathbf{c}^*\\ \mathbf{c} \end{smallmatrix}\Big)^*\mathbf{H}_{\mathrm{BdG}} \Big(\begin{smallmatrix} \mathbf{c}^*\\ \mathbf{c} \end{smallmatrix}\Big) \text{ with } \mathbf{H}_{\mathrm{BdG}} = \begin{pmatrix} -\overline{\mathbf{M}} & -\overline{\boldsymbol{\Delta}}\\ \boldsymbol{\Delta} & \mathbf{M} \end{pmatrix}.\label{eq:H95matrix} \end{align}\tag{22}\] By construction, the matrix \(\mathbf{H}_{\mathrm{BdG}}\) obeys the particle-hole symmetry, namely \(\mathbf{C}\overline{\mathbf{H}}_{\mathrm{BdG}}\mathbf{C}= -\mathbf{H}_{\mathrm{BdG}}\) where we recall the definition of the involution \(\mathbf{C}= \big(\begin{smallmatrix} \boldsymbol{0} &\mathbf{I}\\ \mathbf{I}& \boldsymbol{0} \end{smallmatrix}\big)\). The name particle-hole symmetry comes from the fact that \[\mathbf{C}\Big(\begin{smallmatrix} \mathbf{c}^*\\ \mathbf{c} \end{smallmatrix}\Big) = \Big(\begin{smallmatrix} \mathbf{c}\\ \mathbf{c}^* \end{smallmatrix}\Big)\] flips the role of creation and annihilation operators.

Before going further, we introduce a group of transformations preserving ?? that are used to diagonalize the Bogoliubov-de Gennes Hamiltonian; see [51].

Definition 5 (orthogonal complex matrix). A complex invertible \(2N\times 2N\) matrix \(\mathbf{W}\) satisfying \(\mathbf{W}^\top \mathbf{C}\mathbf{W} = \mathbf{C}\) is called an orthogonal complex matrix. The group of orthogonal complex matrices is denoted by \(O(\mathbf{C},\mathbb{C})\).

For any orthogonal complex matrix \(\mathbf{W}\) and any \(N\) operators \(c_1, \dots, c_N\) satisfying ?? , another set of creation-annihilation operators satisfying ?? is given by the so-called Bogoliubov transformation \[\begin{align} \Big(\begin{smallmatrix} \boldsymbol{b}^*\\ \boldsymbol{b} \end{smallmatrix}\Big) = \boldsymbol{W} \Big(\begin{smallmatrix} \mathbf{c}^*\\ \mathbf{c} \end{smallmatrix}\Big),\label{e:Bog95transform} \end{align}\tag{23}\] and by further requiring that \(\mathbf{W}\) is unitary, \(b_k^*\) is the adjoint of \(b_k\) for all \(k \in [N]\). Hence, in what follows, we only consider transformations \(\mathbf{W}\in O(\mathbf{C},\mathbb{C})\cap U(2N)\).

Lemma 3. A unitary orthogonal complex matrix is of the form \(\mathbf{W} = \big(\begin{smallmatrix} \mathbf{U} & \mathbf{V} \\ \overline{\mathbf{V}} & \overline{\mathbf{U}} \end{smallmatrix}\big),\) with \(\mathbf{U} \mathbf{U}^*+ \mathbf{V} \mathbf{V}^*= \mathbf{I}\) and \(\mathbf{U} \mathbf{V}^\top + \mathbf{V} \mathbf{U}^\top = \boldsymbol{0}\). Furthermore, \(\det \mathbf{W} \in \{-1,1\}\).

Next, following [22], we diagonalize 21 by a convenient change of variables, which consists in finding a suitable Bogoliubov transformation. The upshot is that we have the decomposition \[\mathbf{H}_{\mathrm{BdG}} = \mathbf{W}^* \begin{pmatrix} \mathrm{Diag}(-\epsilon_k) & 0\\ 0 & \mathrm{Diag}(\epsilon_k) \end{pmatrix} \mathbf{W},\label{e:eigen95decomp95HBDG}\tag{24}\] as shown in 4 below.

Lemma 4 (Diagonalization of BdG Hamiltonian). Let \(\boldsymbol{\Omega} = \frac{1}{\sqrt{2}} \Big(\begin{smallmatrix} \mathbf{I}& \mathbf{I}\\ \mathrm{i}\mathbf{I}& -\mathrm{i}\mathbf{I} \end{smallmatrix}\Big)\) and define \(\mathbf{A} = -\mathrm{i}\overline{\boldsymbol{\Omega}} \Big(\begin{smallmatrix} \boldsymbol{\Delta} & \mathbf{M}\\ -\overline{\mathbf{M}} & -\overline{\boldsymbol{\Delta}} \end{smallmatrix}\Big) \boldsymbol{\Omega}^*\). The following statements hold.

  1. \(H_{\mathrm{BdG}} = \frac{1}{2} \mathrm{i}\boldsymbol{f}^\top \mathbf{A} \boldsymbol{f}\) where \(\boldsymbol{f} = \boldsymbol{\Omega} \Big(\begin{smallmatrix} \mathbf{c}^*\\ \mathbf{c} \end{smallmatrix}\Big)\) and \(\mathbf{A}\) is real skew-symmetric.

  2. There exists a real orthogonal matrix \(\boldsymbol{R}\) and a real vector \(\boldsymbol{\epsilon} = [\epsilon_1, \dots, \epsilon_N]^\top\) such that \(0\leq \epsilon_1\leq \dots\leq \epsilon_N\) and \(\boldsymbol{R}\mathbf{A}\boldsymbol{R}^\top = \Big(\begin{smallmatrix} 0 & \mathrm{Diag}(\epsilon_k)\\ -\mathrm{Diag}(\epsilon_k) & 0 \end{smallmatrix}\Big).\)

  3. Another set of creation-annihilation operators satisfying ?? is given by \(\Big(\begin{smallmatrix} \boldsymbol{b}^*\\ \boldsymbol{b} \end{smallmatrix}\Big) = \boldsymbol{W} \Big(\begin{smallmatrix} \mathbf{c}^*\\ \mathbf{c} \end{smallmatrix}\Big)\), where \(\boldsymbol{W} = \boldsymbol{\Omega}^*\boldsymbol{R} \boldsymbol{\Omega} \in O(\mathbf{C},\mathbb{C}) \cap U(2N)\).

  4. We have the diagonalization \(H_{\mathrm{BdG}} = \frac{1}{2} \Big(\begin{smallmatrix} \mathbf{b}^*\\ \mathbf{b} \end{smallmatrix}\Big)^\top \Big(\begin{smallmatrix} \mathbf{0} & \mathrm{Diag}(\epsilon_k)\\ -\mathrm{Diag}(\epsilon_k) & \mathbf{0} \end{smallmatrix}\Big) \Big(\begin{smallmatrix} \mathbf{b}^*\\ \mathbf{b} \end{smallmatrix}\Big).\)

We refer to [22] for a proof sketch.

Now, we leverage these results to compute the expectation value of bilinears under \(\rho_{\mathrm{BdG}}\). 5 states a result analogous to 4, and its proof, given in 8.2, relies on similar techniques such as Wick’s theorem.

Lemma 5. Let \(\boldsymbol{\epsilon} = [\epsilon_1, \dots, \epsilon_N]^\top\) be the eigenvalues of \(H_{\mathrm{BdG}}\) such that \(0\leq \epsilon_1\leq \dots\leq \epsilon_N\) as given by 4 and let \(\rho = \frac{1}{Z} \exp( -\beta H_{\mathrm{BdG}}).\) We have \[\begin{align} \mathbf{S} \triangleq \begin{pmatrix} (\left\langle c_i c^*_j \right\rangle_{\rho})_{i,j} &(\left\langle c_i c_j\right\rangle_{\rho})_{i,j} \\ (\left\langle c^*_i c^*_j\right\rangle_{\rho})_{i,j} & (\left\langle c^*_i c_j\right\rangle_{\rho})_{i,j} \end{pmatrix} = \overline{\mathbf{W}}^{*} \begin{pmatrix} \mathrm{Diag}(\sigma(\beta \epsilon_k))& \mathbf{0}\\ \mathbf{0} & \mathrm{Diag}(\sigma(-\beta \epsilon_k)) \end{pmatrix} \overline{\mathbf{W}}, \end{align}\] where \(\sigma(x)\) is the sigmoid function; see Section 1. Furthermore, we have that \(\mathbf{S}= \sigma(-\beta \overline{\mathbf{H}_{\mathrm{BdG}}})\) and \(\mathbf{S}\) satisfies \(0\preceq \mathbf{S} \preceq \mathbf{I}\), as well as \(\mathbf{C}\overline{\mathbf{S}}\mathbf{C} = \mathbf{I}- \mathbf{S}\).

Note that the process \(\mathrm{PfPP}(\mathbb{K}_{\mathbf{S}})\) with the \(\mathbf{S}\) matrix given in 5 has the same law as \(\mathrm{PfPP}(\mathbb{K}_{\overline{\mathbf{S}}})\) in the light of the behaviour of this type of Pfaffian kernels under complex conjugation; see 5 .

For convenience, we introduce the following notations, for \(1\leq i,j\leq N\), \[\begin{pmatrix} \left\langle c_i c^*_j\right\rangle_{\rho} & \left\langle c_i c_j\right\rangle_{\rho} \\ \left\langle c^*_i c^*_j\right\rangle_{\rho}& \left\langle c^*_i c_j\right\rangle_{\rho} \end{pmatrix} \triangleq \begin{pmatrix} \delta_{ij}-\mathbf{K}^T_{ij} &-\overline{\mathbf{P}_{ij}} \\ \mathbf{P}_{ij} &\mathbf{K}_{ij} \end{pmatrix},\] where \(\mathbf{0} \preceq \mathbf{K}\preceq \mathbf{I}\) is Hermitian and \(\mathbf{P}\) is skew-symmetric. By definition, we have \[\mathbf{S} = \begin{pmatrix} \mathbf{I}-\overline{\mathbf{K}}& \mathbf{P}^*\\ \mathbf{P} & \mathbf{K} \end{pmatrix}.\] The particle-hole transformation amounts to replacing in \(\mathbf{S}\) each \(c_i\) by \(c_i^*\) and vice-versa. As a consequence of ?? , \(\mathbf{S}\) satisfies \(\mathbf{C}\overline{\mathbf{S}}\mathbf{C} = \mathbf{I}- \mathbf{S}\); see 2.2.

can be seen as a generalization to mixed states of the analysis of [52], and restates the results of [26].

Proposition 7. Let \(N_i = c_i^*c_i\) for \(1\leq i\leq N\) and let \(\rho = Z^{-1} \exp( -\beta H_{\mathrm{BdG}})\). For any \(i_1,\dots,i_k\in [N]\), we have \[\require{physics} \label{eq:correlation95Pfaffian} \expval{N_{i_1}\dots N_{i_k}}_{\rho} = \mathrm{Pf}\left(\mathbb{K}(i_m,i_n)\right)_{1\leq m,n \leq k},\qquad{(9)}\] where each block of the above \(2k \times 2k\) matrix is given in terms of the \(2\times 2\)-matrix-valued kernel \[\begin{align} \mathbb{K}(i,j) = \begin{pmatrix} \mathbf{P}_{i j} & \mathbf{K}_{i j}\\ - \mathbf{K}_{j i} & -\overline{\mathbf{P}_{i j}} \end{pmatrix}. \label{eq:Pfaffian95kernel} \end{align}\qquad{(10)}\] The latter satisfies \(\mathbb{K}(i,j)^\top = -\mathbb{K}(j,i)\) for \(1\leq i,j \leq N\).

The proof of this result, given in 8.3, is also based on Wick’s theorem.

More can also be said about sample parity.

Proposition 8 (Sample parity and Majorana quadratic form). In the setting of 5, for \(Y \sim \mathrm{PfPP}(\mathbb{K})\), we have that \[\mathop{\mathrm{\mathbb{E}}}_Y (-1)^{|Y|} = \det(\mathbf{W})\prod_{k = 1}^N \tanh(\beta \epsilon_k/2).\label{e:expected95parity95tanh}\qquad{(11)}\] Assuming that \(\epsilon_k >0\) for all \(k\), an alternative expression for the expectation of the parity is obtained by using the identity \(\det \mathbf{W} = \mathop{\mathrm{\mathrm{sign}}}\mathrm{Pf}(\mathbf{A}_{M}).\) Here, the skew-symmetric matrix \(\mathbf{A}_{M}\) is the quadratic form of the Majorana representation of the Hamiltonian, namely \[H_{\mathrm{BdG}} = \frac{\mathrm{i}}{2} \boldsymbol{\gamma}^\top \mathbf{A}_{M} \boldsymbol{\gamma},\] where \(\gamma_{2k-1} = \frac{1}{\sqrt{2}}(c_k^*+c_k)\) and \(\gamma_{2k} = \frac{\mathrm{i}}{\sqrt{2}}(c^*_k - c_k)\) for \(k\in [N]\).

The proof of 8 given in 8.4 relies on the expected parity formula of 1. Incidentally, note that this result implies Kitaev’s formula [53] for the parity of a one-dimensional chain of fermions in a limit case that we discuss in 11; see also [50] for a detailed study of Pfaffian formulae for fermion parity.

5 Quantum circuits to sample DPPs and PfPPs↩︎

Armed with the connections between PfPPs and fermions in Section 4, it remains to connect fermions and quantum circuits. Quantum circuits are briefly introduced in Section 5.1 for self-containedness. In Section 5.2, we describe the quantum circuit of [20], later modified by [22], that corresponds to a projection DPP given the diagonalized form of its kernel. In Section 5.2.2, we depart from [22] and highlight the connections of their construction with a classical parallel QR algorithm in numerical algebra [23]. Consequently, we propose to take inspiration from more recent parallel QR algorithms, such as [24], to construct circuits with different constraints on the communication between qubits. In particular, we recover circuits with depth the shortest depth reported in [25], but with QR-style arguments rather than sophisticated data loaders. Moreover, and maybe more importantly for DPP sampling, while sampling from a DPP with non-diagonalized kernel remains limited by the initial (classical) diagonalization step, we argue that if one chooses the right avatar in the available distributed QR algorithms, we can even give a hybrid pipeline of a classical parallel and a quantum algorithm to sample some projection DPPs with non-diagonalized kernel, with a linear speedup compared to the vanilla classical DPP sampler of [16]. The covered DPPs include practically relevant cases such as the uniform spanning trees and column subsets of Examples 1 and 2. In Sections 5.3 and 5.4, we give two standard arguments, respectively due to [16] and [54], to reduce the treatment of (non-projection) Hermitian DPPs to projection DPPs. This concludes our treatment of DPPs. In Section 5.5, we go back to connecting point processes to the work of [22], showing how one can use their circuit corresponding to the BdG Hamiltonian to sample a Pfaffian PP.

5.1 Quantum circuits↩︎

We refer to [41] for a description of quantum circuits and the basic building blocks. In short, in the quantum circuit model of quantum computation, one describes a computation by the initialization of a set of \(N\) qubits, a sequence of unitary operators among a small set of physically-implementable operators called gates, and a physically-implementable observable called measurement. Not all gates can be implemented on every quantum computer, but software development kits like Qiskit [30] allow the user to define a quantum circuit using a large enough set of gates. The latter usually include any tensor product of identity matrices and Pauli matrices 11 (the so-called \(X\), \(Y\), and \(Z\) gates, respectively corresponding to \(\sigma_x\), \(\sigma_y\), and \(\sigma_z\) in 11 ) and a few two-qubit gates such as the ubiquitous CNOT. Qiskit then “transpiles" the resulting circuit into a sequence of actually-implementable gates for a given machine. For instance, with the current technology, not all qubits can be jointly operated on a given machine, e.g., one may not be able to apply a gate that acts on two qubits that are physically too far from each other in the actual quantum machine, or only be able to do so with a significant chance of error. Assuming the transpiling process preserves the dimensions of the circuit, one measures the complexity of a quantum circuit by quantities such as its total number of gates and its depth, i.e., the largest number of elementary gates applied to any single qubit.

Finally, in order to judge the possibility of sampling DPPs today, and be able to estimate what we can do in the future as hardware and software improve, it is important to keep in mind the main sources of error and their order of magnitude in current quantum hardware; see Appendix 10.

5.2 The case of a projection DPP↩︎

Consider again \((a_j)\) to be the Jordan-Wigner operators, which satisfy ?? , and \((\ket{\mathbf{n}})\) to be the corresponding Fock basis. Let \(r\in\mathbb{N}^*\), \(\mathbf{Q}\in\mathbb{C}^{r\times N}\) have orthonormal rows, and \(b_j^*= \sum_{\ell=1}^N q_{j\ell} a_\ell^*\) for \(1\leq j\leq r\). Let further \[\label{e:slater95determinant} \ket{\psi} = b_1^*\dots b_r^*\ket{\emptyset}.\tag{25}\] From Section 4.3, we know that simultaneously measuring \(N_i = a_i^*a_i\), \(i=1, \dots, N\) on \(\rho = \ketbra{\psi}\) samples the projection DPP with kernel \(\mathbf{K}= \mathbf{Q}^*\mathbf{Q}\). Measuring \(N_i\) is easy, because the Fock basis of the JW operators is the computational basis, see 3, and measurement in the computational basis is a basic operation on any quantum computer. For the same reason, implementing \(a_1^* \dots a_r^*\ket\emptyset\) simply amounts to initializing the first \(r\) qubits to \(\ket{1}\), and the rest to \(\ket{0}\). It remains to be able to prepare the state 25 , where the \(b_i\)’s intervene, not the \(a_i\)’s. This is done by a procedure akin to the QR algorithm by successive Givens rotations; a standard reference on QR algorithms and matrix computations in general is [55].

5.2.1 QR by Givens rotations↩︎

Proposition 9 ([20][22]). There is a unitary operator \(\mathcal{U}(\mathbf{Q})\) on \(\mathbb{H}\) such that \[b_1^*\dots b_r^*\ket{\emptyset} = \mathcal{U}(\mathbf{Q}) a_1^* \dots a_r^*\ket\emptyset,\] and \(\mathcal{U}(\mathbf{Q})\) is a product of unitaries corresponding to elementary quantum gates.

Proof. The Givens rotation with parameters \(\theta,\phi\) and indices \(\ell^1, \ell^2\in[N]\) is the unitary matrix \[\mathbf{G}= \mathbf{G}_{\ell^1,\ell^2}(\theta,\phi) = \mathbf{P} \begin{pmatrix} \boldsymbol{\gamma} & \mathbf{0}\\ \mathbf{0} & \mathbb{I} \end{pmatrix} \mathbf{P}^{-1} \text{ with } \boldsymbol{\gamma} = \begin{pmatrix} \cos\theta & \mathrm{e}^{-\mathrm{i}\phi} \sin\theta \\ -\sin\theta \mathrm{e}^{\mathrm{i}\phi}& \cos\theta \end{pmatrix}, \label{e:givens}\tag{26}\] where \(\mathbf{P}\) is the matrix of the permutation \((1\;\ell^1)(2\;\ell^2)\), allowing to select the vectors on which the rotation is applied.5 Choosing \(\theta, \phi\) relevantly, one can make sure that \(\mathbf{Q}\mathbf{G}^*\) has a zero in position \((\ell^1, \ell^2)\), while all columns other than \((\ell_1, \ell_2)\) are left unchanged. Iteratively choosing \((\ell^1, \ell^2)\), we can sequentially introduce zeros in \(\mathbf{Q}\) by multiplying it by \(n_{G}\) Givens rotations \(\mathbf{G}_1, \dots, \mathbf{G}_{n_G}\), never changing an entry that we previously set to zero, until \[\mathbf{Q}\mathbf{G}_1^* \dots \mathbf{G}_{n_G}^* = \begin{pmatrix} \mathbf{\Lambda}\vert \mathbf{0} \end{pmatrix}, \label{e:Givens95chain}\tag{27}\] where \(\mathbf{\Lambda}\) is an \(r\times r\) diagonal unitary matrix, and the right block is the \(r\times (N-r)\) zero matrix.

Now, to each Givens rotation \(\mathbf{G} = \mathbf{G}_{\ell^1,\ell^2}(\theta,\phi)\), we associate a unitary operator \(\mathcal{G} = \mathcal{G}_{\ell^1, \ell^2}(\theta, \phi)\) on \(\mathbb{H}\). We call \(\mathcal{G}\) a Givens operator, as it realizes the Givens rotation \(\mathbf{G}\) by a conjugation of creation operators, i.e. \[\begin{align} \mathcal{G} a_{\ell^1}^*\mathcal{G}^*&= \cos \theta a_{\ell^1}^*+ \mathrm{e}^{-\mathrm{i}\phi}\sin \theta a_{\ell^2}^*,\\ \mathcal{G} a_{\ell^2}^*\mathcal{G}^*&= -\mathrm{e}^{\mathrm{i}\phi}\sin \theta a_{\ell^1}^*+ \cos \theta a_{\ell^2}^*,\\ \mathcal{G} a_{i}^*\mathcal{G}^*&= a_{i}^*, \quad \forall i \notin\{\ell^1, \ell^2\}. \end{align}\] We refer to Appendix 8.1 for an explicit construction. For future convenience, we note that this action can be compactly summarized if one stacks up the creation operators in a vector \(\mathbf{a}^*= (a_1^*\cdots a_N^*)^\top\), and writes \[\begin{align} \mathcal{G} \mathbf{a}^*\mathcal{G}^*= \mathbf{G}\cdot \mathbf{a}^*,\label{e:givens95operator95definition} \end{align}\tag{28}\] with \(\cdot\) defined by matrix-vector multiplication.

Finally, consider the unitary \[\begin{align} \mathcal{U}(\mathbf{Q}) \triangleq \mathcal{G}_{n_G} \dots \mathcal{G}_1, \label{e:Givens95factorization95on95H} \end{align}\tag{29}\] where \(\mathcal{G}_i\) is the Givens operator corresponding to \(\mathbf{G}_i\) in 27 . Then, by construction and up to a phase factor, for all \(1\leq k \leq r\), \[\mathcal{U}(\mathbf{Q}) a_k^*\mathcal{U}(\mathbf{Q})^* = b_k^*, \text{ and } \mathcal{U}(\mathbf{Q}) \ket{\emptyset} = \ket{\emptyset}.\] In particular, \[\begin{align} \ket{\psi} = b_1^*\dots b_r^*\ket{\emptyset} &= \mathcal{U}(\mathbf{Q}) a_1^*\mathcal{U}(\mathbf{Q})^* \dots \mathcal{U}(\mathbf{Q}) a_r^*\mathcal{U}(\mathbf{Q})^* \ket{\emptyset}\\ &= \mathcal{U}(\mathbf{Q}) a_1^*\dots a_r^*\ket{\emptyset}, \end{align}\] again up to a phase factor, which is irrelevant in the resulting state \(\ketbra{\psi}\). ◻

The operator \(\mathcal{U}(\mathbf{Q})\) in Proposition 9 is indeed a product of elementary two-qubit gates, because any Givens operator \(\mathcal{G}_{\ell^1,\ell^2}(\theta,\phi)\) can be implemented as such, as first put forward by [20]. We discuss gate details in Appendix 9.

To see how many gates are required for a given DPP kernel, we need to discuss a final degree of freedom: the Givens chain of rotations 26 is not unique. This is where constraints on which qubits can be jointly acted upon enter the picture.

5.2.2 Parallel QR algorithms and qubit operation constraints↩︎

The product of Givens rotations in 26 can give rise to a quantum circuit of short depth if there are subproducts of rotations that can be performed on disjoint pairs of qubits. Independently of quantum computation, this is exactly the same kind of constraint that numerical algebraists have been studying in a long string of works on parallel QR factorization;6 see [24] and references therein.

As a first example, after a preprocessing phase, the preparation of 25 in [22] implicitly implements a parallel QR algorithm known as Sameh-Kuck [23]. More precisely, in a preprocessing phase, [22] zero out \(r(r-1)/2\) entries in the upper-right corner of \(\mathbf{Q}\) by pre-multiplying \(\mathbf{Q}\) by a product of (complex) Givens rotations. Let us write this product of unitary matrices by \(\mathbf{V}\). This preprocessing requires at most7 \(r(r-1)/2\) Givens rotations, which are implemented thanks to a classical computer. To fix ideas, when \(r = 3\) and \(N = 6\), this results in a matrix of the following form \[\mathbf{V}\mathbf{Q}= \begin{pmatrix} * & * & * & * & 0 & 0\\ * & * & * &* & * & 0\\ * & * & * & * & * & * \end{pmatrix}. \label{e:jiang39s95example}\tag{30}\] Note that replacing \(\mathbf{Q}\) by \(\mathbf{V}\mathbf{Q}\) does not change the kernel of the resulting DPP since \(\mathbf{V}^*\mathbf{V}= \mathbf{I}\). Now, [22] apply QR to \(\mathbf{V}\mathbf{Q}\) as in the proof of Proposition 9, applying Givens rotations on disjoint pairs of neighbouring columns, as they become available. Rather than giving a formal description of the algorithm, available in [23], we follow [22] and depict its application on Example 30 . We use the convenient notation of [22], i.e.bold characters to show the most recently actively updated entries of the matrix, and underlined characters to show entries that automatically result from the rows of \(\mathbf{Q}\) being orthonormal. The successive parallel rounds of the algorithm are then \[\begin{align} \mathbf{V}\mathbf{Q} & = \begin{pmatrix} * & * & * & * & 0 & 0\\ * & * & * &* & * & 0\\ * & * & * & * & * & * \end{pmatrix} \rightarrow \begin{pmatrix} * & * & * & \mathbf{0}& 0 & 0\\ * & * & * &* & * & 0\\ * & * & * & * & * & * \end{pmatrix}\nonumber\\ &\rightarrow \begin{pmatrix} * & * & \mathbf{0}& 0 & 0 & 0\\ * & * & * &* & \mathbf{0}& 0\\ * & * & * & * & * & * \end{pmatrix} \rightarrow \begin{pmatrix} \underline{\lambda_1} & \mathbf{0}& 0 & 0 & 0 & 0\\ \underline{0} & * & * &\mathbf{0}& 0 & 0\\ \underline{0} & * & * & * & * & \mathbf{0} \end{pmatrix}\nonumber\\ &\rightarrow \begin{pmatrix} \lambda_1 & 0 & 0 & 0 & 0 & 0\\ 0 & \underline{\lambda_2} & \mathbf{0}& 0 & 0 & 0\\ 0 & \underline{0} & * & * & \mathbf{0}& 0 \end{pmatrix} \rightarrow \begin{pmatrix} \lambda_1 & 0 & 0 & 0 & 0 & 0\\ 0 & \lambda_2 & 0 & 0 & 0 & 0\\ 0 & 0 & \underline{\lambda_3} & \mathbf{0}& 0 & 0 \end{pmatrix}.\label{e:QR95nnb95Jiang} \end{align}\tag{31}\] The factors \(\lambda_i\) are of unit modulus, as in the construction behind Proposition 9. The upshot is that only a lower triangular matrix remains whose only non-zero entries are on the diagonal, while the other entries of the lower triangle automatically vanish due to row orthonormality.

The resulting circuit, an example of which is given in Figure 1, requires \(\mathcal{O}(Nr)\) gates, and has depth \(\mathcal{O}(N)\). The \(\mathcal{O}(r^2)\) preprocessing, which is optional and could also be performed in the quantum circuit, allows getting rid of a handful of (necessarily faulty) quantum gates at a small classical cost. Finally, constraining Givens operators to act on neighbouring qubits, or equivalently, Givens rotations to act on neighbouring columns, is practically relevant if the actual quantum computer at disposal favours two-qubit gates acting on neighbouring qubits.

As a second extreme example, assume that we have no constraint on which pairs of qubits can be jointly operated. We can then perform parallel QR on \(\mathbf{Q}\) using only \(\mathcal{O}(r\log N)\) parallel rounds, simply by acting on all available disjoint pairs in a single row until all but one entry are zeroed, keeping in mind that the leftmost \(r\) entries of each row will be automatically updated because of orthonormality constraints. On an example with \(N=8\) and \(r=3\), this would yield \[\begin{align} \mathbf{Q} &\rightarrow \begin{pmatrix} * & \mathbf{0}& *& \mathbf{0}& * & \mathbf{0}& * & \mathbf{0}\\ * & * & * & * & * & * & * & *\\ * & * & * & * & * & * & * & *\\ \end{pmatrix} \rightarrow \begin{pmatrix} * & 0 & \mathbf{0}& 0 & * & 0 & \mathbf{0}& 0\\ * & * & * & * & * & * & * & *\\ * & * & * & * & * & * & * & *\\ \end{pmatrix}\\ &\rightarrow \begin{pmatrix} \underline{\lambda_1} & 0 & 0 & 0 & \mathbf{0}& 0 & 0 & 0\\ \underline{0} & * & * & * & * & * & * & *\\ \underline{0} & * & * & * & * & * & * & *\\ \end{pmatrix} \rightarrow \begin{pmatrix} \underline{\lambda_1} & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & * & * & \mathbf{0}& * & \mathbf{0}& * & \mathbf{0}\\ 0 & * & * & * & * & * & * & *\\ \end{pmatrix}\\ &\rightarrow \begin{pmatrix} {\lambda}_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & * & \mathbf{0}& 0& * & 0 & \mathbf{0}& 0\\ 0 & * & * & * & * & * & * & *\\ \end{pmatrix} \rightarrow \begin{pmatrix} {\lambda}_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & \underline{\lambda_2} & 0 & 0& \mathbf{0}& 0 & 0 & 0\\ 0 & \underline{0} & * & * & * & * & * & *\\ \end{pmatrix}\\ &\rightarrow \begin{pmatrix} {\lambda}_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & \lambda_2 & 0 & 0& 0 & 0 & 0 & 0\\ 0 & 0 & * & \mathbf{0}& * & \mathbf{0}& * & \mathbf{0}\\ \end{pmatrix} \rightarrow \begin{pmatrix} {\lambda}_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & \boldsymbol{\lambda}_2 & 0 & 0& 0 & 0 & 0 & 0\\ 0 & 0 & * & 0 & * & 0 & \mathbf{0}& 0\\ \end{pmatrix}\\ &\rightarrow \begin{pmatrix} {\lambda}_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & \lambda_2 & 0 & 0& 0 & 0 & 0 & 0\\ 0 & 0 & \underline{\lambda_3} & 0 & \mathbf{0}& 0 & 0 & 0\\ \end{pmatrix}. \end{align}\] The resulting quantum circuit has \(\mathcal{O}(Nr)\) gates as the one of [22], but a depth of only \(\mathcal{O}(r\log_2 N)\). This depth is similar to the shortest depth obtained by [25] with a similar tree-like structure called “parallel Clifford loaders".

Just like in parallel QR, we argue that the choice of the chain 26 of rotations should depend on the particular hardware constraints, like which qubits we allow to be jointly operated. Parallel QR algorithms allow quite arbitrary dependency structures, see e.g. [24]. While initially motivated by communication- or storage constraints in parallel classical computing, these dependency structures are actually tailored to designing quantum circuits to prepare states like 25 depending on a given quantum architecture. Taking the analogy with parallel QR even one step further, we now show that the initial bottleneck of diagonalizing the kernel of a DPP can be combined with the design of the quantum circuit in a single run of a parallel QR algorithm.

Figure 1: A circuit sampling a DPP with projection kernel of rank \(r = 3\), with \(N = 5\) items. On the left-hand side, Pauli \(X\) gates are used to create fermionic modes on the three first qubits. Note also the parallel QR Givens rotations on neibouring qubits indicated by parametrized \(XX+YY\) gates. See 9 for gate details. On the right-hand side, measurements of occupation numbers are denoted by black squares.

5.2.3 A hybrid parallel/quantum algorithm for projection DPPs↩︎

The link of Givens-based quantum circuits with parallel QR algorithms suggests pipelines to sample from projection DPPs, even when the kernel is not yet in diagonalized form.

Proposition 10. Assume that we have access only to \(\mathbf{A}\in \mathbb{R}^{d\times N}\), and that we want to sample from DPP(\(\mathbf{K}\)), where \(\mathbf{K}= \mathbf{V}_{:,[r]} \mathbf{V}_{:,[r]}^*\) and \(\mathbf{V}\) is defined by the singular value decomposition \(\mathbf{A}= \mathbf{U}\mathbf{\Sigma}\mathbf{V}^*\). Then, given \(P\) classical processors, for a run time in \(\mathcal{O}(Nd^2/P)\) up to logarithmic factors in \(P\), we can design a quantum circuit with depth \(\mathcal{O}(N\log P/P)\) and \(\mathcal{O}(Nd)\) gates, such that simultaneous measurements in the computational basis at the output of the circuit yield a sample of DPP(\(\mathbf{K}\)).

Three comments are in order. First, specifying the kernel \(\mathbf{K}\) as in Proposition 10, where \(\mathbf{K}\) is the projection kernel onto the first \(r\) principal components of a full-rank rectangular matrix \(\mathbf{A}\), is common in practice. In particular, it covers the column subset selection of Example 2 and the uniform spanning forests of Example 1. Second, for simplicity we have omitted the additional \(\mathcal{O}(r^2)\) preprocessing introduced by [22]. Third, at least if we neglect physical sources of noise in the quantum circuit (see Appendix 10), the bottleneck in the hybrid sampler remains the initial classical cost \(\mathcal{O}(Nd^2/P)\). Compared with the vanilla classical approach, we have gained a linear speedup thanks to parallelization. Note that it is not clear how such a linear speedup can be gained in the vanilla classical algorithm, but that it can be gained in non-quantum randomized variants [27], [28]. Dequantizing Proposition 10, to obtain a non-quantum algorithm with a similar cost, is actually an interesting question, which we leave to future work.

Proof. To prove Proposition 10, we first use a Givens-based parallel QR algorithm to compute \(\mathbf{A}^*= \mathbf{Q}\mathbf{R}\); see [24] and references therein for a recent entry. One can incorporate here communication constraints that will turn into qubit-communication constraints in the final algorithm. For simplicity, we assume \(N\gg d\) and use the parallel tall-skinny QR of [24], with \(P\leq N/d\) processors. Without going into details, for \(P\) a power of two, the idea is to partition \(\mathbf{A}^*\) into \(P\) equal \(N/P\times d\) blocks, perform QR for each block using a user-chosen QR algorithm, and for \(\log P\) stages, group the resulting \(R\) matrices in pairs and perform QR. The run time is \(\mathcal{O}(Nd^2/P)\), up to logarithmic factors, which is a linear speedup compared to classical, non-parallel QR. Now, perform the singular value decomposition of the \(d\times d\) matrix \(\mathbf{R}= \tilde{\mathbf{U}}\tilde{\mathbf{\Sigma}}\tilde{\mathbf{V}}^*\); this costs \(\mathcal{O}(d^3)\). Then \[\mathbf{A}^*\mathbf{A}= \mathbf{Q}\mathbf{R}\mathbf{R}^*\mathbf{Q}^*= (\mathbf{Q}\tilde{\mathbf{U}}) \tilde{\mathbf{\Sigma}}^2 (\mathbf{Q}\tilde{\mathbf{U}})^*,\] so that the first \(r\) principal components of \(\mathbf{A}\) are \(\mathbf{Q}\tilde{\mathbf{U}}_{:,[r]}\). Essentially, we have used the QR algorithm to compute the principal directions: this detour is valuable because we simultaneously \((i)\) benefit from parallelization and \((ii)\) obtain \(\mathbf{Q}\) as a product of Givens rotations by construction. If we now decompose \(\tilde{\mathbf{U}}_{:,[r]}\) as a product of Givens rotations, say using the Sameh-Kuck algorithm like in [22], we paid a (classical) preprocessing cost of \(\mathcal{O}(Nd^2/P + d^3)\), up to logarithmic factors in \(P\), to obtain all the information needed to create a quantum circuit with Givens gates that samples DPP(\(\mathbf{K}\)) starting from the knowledge of \(\mathbf{A}\). The depth depends on the blockwise QR algorithm used in the initial parallel tall-skinny QR step. If we use Givens rotations à la Sameh-Kuck to enforce only nearest-neighbour qubit interactions, we obtain a depth of \(\mathcal{O}(N\log P/P)\), and a number of gates of \(\mathcal{O}(Nd)\). The limiting factor of the overall pipeline remains the QR factorization of \(\mathbf{A}\), in \(\mathcal{O}(Nd^2/P)\) flops, but we have gained a linear speedup. ◻

5.3 Reducing general DPPs to mixture DPPs↩︎

In Section 5.2, we gave quantum circuits to sample projection DPPs. If the kernel \(\mathbf{K}\) is not that of a projection, a standard argument by [16] allows writing the corresponding DPP as a statistical mixture of projection DPPs, thus yielding a quantum sampler for DPP(\(\mathbf{K}\)) with a further classical preprocessing step, the latter implementing the choice of the mixture component.

More precisely, assume \(\mathbf{K}\) is available in diagonalized form \[\begin{align} \mathbf{K}= \mathbf{U}\, \mathrm{Diag}\left( \nu_k\right)\, \mathbf{U}^*, \label{eq:correlation95kernel95DPP95sampling} \end{align}\tag{32}\] with \(\nu_1, \dots, \nu_N \in [0,1]\). Then sampling DPP(\(\mathbf{K})\) can be done in two steps. First, sample \(N\) independent Bernoulli random variables with respective parameters \(\nu_1, \dots, \nu_N\) on a classical computer. Let \(\mathcal{C}\subseteq [N]\) be the set of indices of successful Bernoulli trials. Second, measure all the occupation numbers \(N_i\) for \(1\leq i \leq N\) in the quantum circuit of 5.2; in other words, sample from the projection DPP of correlation kernel \(\mathbf{Q}^* \mathbf{Q}\) with \(\mathbf{Q} = (\mathbf{U}_{:\mathcal{C}})^*\), where \(\mathbf{U}_{:\mathcal{C}}\) is the set of columns indexed by \(\mathcal{C}\).

To conclude this section, we note how one can arrive at this mixture construction by inspecting the quantum state “above” DPP(\(\mathbf{K}\)). Consider the factorization of the state ?? obtained thanks to the diagonalization of \(H\), i.e., \[\begin{align} \rho = \prod_{k = 1}^N \frac{\exp(-\beta (\lambda_k-\mu) b_k^*b_k)}{1 + \exp(-\beta (\lambda_k-\mu))},\label{eq:rho95DPP95as95product} \end{align}\tag{33}\] where we used the formula 18 for the normalization constant. Note that the eigenvalues of the correlation kernel 32 are obtained by taking the sigmoid of the Hamiltonian eigenvalues, i.e., \(\nu_k = \sigma(-\beta(\lambda_k-\mu))\), in the light of 4. By a simple inspection, the factor \(k \in [N]\) in the product 33 , is given by \[\frac{\exp(-\beta (\lambda_k-\mu) b_k^*b_k)}{1 + \exp(-\beta (\lambda_k-\mu))} = (1-\nu_k)\times \pi_{\mathop{\mathrm{\mathrm{Ker}}}b_k} + \nu_k \times \pi_{\mathop{\mathrm{\mathrm{Ker}}}b_k^*},\] where \(\pi_{\mathop{\mathrm{\mathrm{Ker}}}b_k}\) (resp. \(\pi_{\mathop{\mathrm{\mathrm{Ker}}}b_k^*}\)) is the orthogonal projector onto the null space of \(b_k\) (resp. \(b_k^*\)). Inspecting Born’s rule reveals that \(\tr \rho N_{i_1}\dots N_{i_\ell}\) is equal to the correlation function of a statistical mixture of projection DPPs, each coming with a weight of the form \[\prod_{i\in I} \nu_i \prod_{j\notin I} (1-\nu_j)\] for some \(I\subset[N]\). These weights correspond to the independent Bernoulli trials introduced by [16] and discussed immediately after 32 .

5.4 Dilating a general DPP to a projection DPP↩︎

In principle, there exists another strategy to sample from a general DPP by only using a quantum circuit such as described in 5.2 – thus bypassing the first step of the sampling algorithm consisting of the classical sampling of Bernoulli random variables. Any DPP on the ground set \(\{1,\dots,N\}\) can be dilated to a projection DPP on \(\{1, \dots, N, N+1, \dots, 2N\}\) by the transformation \[\mathbf{K} \mapsto \mathbf{K}_{\rm dil} = \begin{pmatrix} \mathbf{K}& (\mathbf{K}(\mathbf{I}-\mathbf{K}))^{1/2}\\ (\mathbf{K}(\mathbf{I}-\mathbf{K}))^{1/2} & \mathbf{I}- \mathbf{K} \end{pmatrix},\] see [54]. The points of a sample of DPP(\(\mathbf{K}_{\rm dil}\)) which belong to \(\{1,\dots,N\}\) yield a sample of DPP(\(\mathbf{K}\)).

5.5 Sampling the PfPP corresponding to the BdG Hamiltonian↩︎

We now turn to PfPPs. As above, we can write the mixed state of the Bogoliubov-de Gennes Hamiltonian at inverse temperature \(\beta > 0\) as \[\frac{\exp(-\beta H_{\mathrm{BdG}})}{Z} = \prod_{k=1}^N \frac{\exp(-\beta \epsilon_k b_k^*b_k)}{1 + \exp(-\beta \epsilon_k)}, \label{e:mixture95pfaffian}\tag{34}\] where \(\epsilon_k\) is the energy of the eigenmode \(k\). Hence, in complete analogy with the case of DPPs in 5.3, the product in 34 entails a statistical mixture formula. Therefore, sampling the corresponding Pfaffian point process amounts to proceed as follows:

  1. sample \(N\) independent Bernoullis with success probability \(\sigma(-\beta \epsilon_k)\) on a classical computer, to obtain the set of successful indices \(\mathcal{C}\subseteq \{1, \dots, N\}\),

  2. jointly measure the occupation numbers \(N_i = a_i^*a_i\) for all \(1\leq i\leq N\) in the pure state \(\prod_{i\in \mathcal{C}} b_{i}^*\ket{\emptyset_b}\) with a quantum circuit,

where \(\ket{\emptyset_b}\) is the joint Fock vacuum of the \(b_k\)’s. Note that the preparation of \(\prod_{i\in \mathcal{C}} b_{i}^*\ket{\emptyset_b}\) is discussed in 5.5.1.

The output of this sampling algorithm is the set of indices for which the measure has given an occupation number equal to \(1\) rather than equal to \(0\). Again, this can be done easily thanks to the representation of the occupation numbers \(N_i\)’s in 3.

Remark 11 (Projective \(\mathbf{S}\)). In the light of 7, the Pfaffian point process associated with the pure state \(\prod_{i\in \mathcal{C}}b_i^*\ket{\emptyset_b}\) is determined by the orthogonal projection matrix \[\mathbf{S}(\mathcal{C}) = \overline{\mathbf{W}}^{*} \begin{pmatrix} \mathbf{I}_{\bar{\mathcal{C}}}& \mathbf{0}\\ \mathbf{0} & \mathbf{I}_{\mathcal{C}} \end{pmatrix} \overline{\mathbf{W}},\label{e:S40C41}\qquad{(12)}\] where \(\mathbf{I}_{\mathcal{C}}\) is the diagonal matrix with entry \((i,i)\) equal to \(1\) if \(i\in \mathcal{C}\) and zero otherwise, whereas \(\mathbf{I}_{\bar{\mathcal{C}}} = \mathbf{I} - \mathbf{I}_{\mathcal{C}}\). Notice that \(\mathbf{C}\overline{\mathbf{S}}\mathbf{C} = \mathbf{I}- \mathbf{S}\) holds.

We now shortly discuss a simple consequence of 8 in the case of the zero temperature limit a the PfPP, denoted by \(Y_\beta\) and associated with the density matrix \(\rho = Z^{-1} \exp( -\beta H_{\mathrm{BdG}})\), to highlight the role of the Majorana quadratic form of \(H_{\mathrm{BdG}}\), namely \(\mathbf{A}\), see 4. This PfPP is associated with the matrix \[\begin{align} \mathbf{S} = \overline{\mathbf{W}}^{*} \begin{pmatrix} \mathrm{Diag}(\sigma(\beta \epsilon_k))& \mathbf{0}\\ \mathbf{0} & \mathrm{Diag}(\sigma(-\beta \epsilon_k)) \end{pmatrix} \overline{\mathbf{W}}. \end{align}\] For simplicity, assume now that the eigenvalues of \(H_{\mathrm{BdG}}\) are such that: \(\epsilon_k < 0\) for all \(k\in \mathcal{C}\) and \(\epsilon_k > 0\) otherwise8; and take the limit of \(\rho\) for \(\beta\to +\infty\).

The correlation functions of \(Y_\beta\) tend to the correlation functions of the process \(Y_\infty\) associated to the orthogonal projection \(\mathbf{S}(\mathcal{C})\) given in ?? . Meanwhile, the expectation of the parity w.r.t.\(Y_\beta\) tends to \[\mathop{\mathrm{\mathbb{E}}}_{Y_\infty} (-1)^{|Y_\infty|} = (-1)^{|\mathcal{C}|} \mathop{\mathrm{\mathrm{sign}}}\mathrm{Pf}(\mathbf{A}) = (-1)^{|\mathcal{C}|} \det \mathbf{W},\] in the light of 4. Since \(\det \mathbf{W} \in \{-1,1\}\), the right-hand side of this equality is equal to either \(1\) or \(-1\). But \((-1)^{|Y_\infty|}\) can only take values in \(\{-1,1\}\), so that the parity of the cardinality of \(Y_\infty\) is almost surely fixed.

In particular, when \(\mathcal{C} = \emptyset\), the Pfaffian process \(Y_\infty\) is determined by the lowest energy eigenvector of \(H_{\mathrm{BdG}}\), namely \(\ket{\emptyset_b}\), and its expected parity is \(\mathop{\mathrm{\mathrm{sign}}}\mathrm{Pf}(\mathbf{A})\), which corresponds to the formula derived by [53] in the setting of a topological superconductor in one dimension. Note the direct connection with 6 about the fermionic parity of eigenvectors of \(H_{\mathrm{BdG}}\). Thus, in this case, the parity is odd if and only if \(\boldsymbol{W} = \boldsymbol{\Omega}^*\boldsymbol{R} \boldsymbol{\Omega}\) is such that the (real) orthogonal matrix \(\boldsymbol{R}\) is in the connected component of \(O(2N)\) not containing the identity element.

5.5.1 QR decomposition with double Givens rotations↩︎

We now describe how to adapt the discussion of 5.2 to the case where the particle number is not conserved. To simplify notation, we suppose that \(\mathcal{C} = \{1, \dots, r\}\), and we prepare the state \[\begin{align} b_1^*\dots b_r^*\ket{\emptyset_b}, \label{e:fermionic95gaussian95state95W} \end{align}\tag{35}\] where the creation-annihilation operators \(b_k\)’s are obtained from the \(a_k\)’s thanks to a Bogoliubov transformation9 23 with a unitary orthogonal complex matrix \(\mathbf{W}\) given in 4. Due to particle number non-conservation, the Fock vacuum of the \(b_i\)’s, denoted by \(\ket{\emptyset_b}\) and given below, is not annihilated by the \(a_i\)’s. Also, recall that the Bogoliubov transformation is given by a matrix of the form \[\label{e:w95matrix} \mathbf{W} =\begin{pmatrix} \overline{\mathbf{W}}_1 & \overline{\mathbf{W}}_2 \\ \mathbf{W}_2 & \mathbf{W}_1 \end{pmatrix},\tag{36}\] where the blocks are such that this transformation is unitary; see 5. Explicitly, as explained in [22], the full transformation is determined by the lower blocks of \(\mathbf{W}\), which encode the expression of \(\mathbf{b}\) in terms of \(\mathbf{a}^*\) and \(\mathbf{a}\). Now, we give the following result that we formalize and adapt from [22] by including cases with fewer than \(N\) particle-hole transformations which were not explicitly considered by [22].

Lemma 6. There exists a \(2N \times 2N\) orthogonal complex matrix \(\mathbf{O}\) and an \(N\times N\) unitary matrix \(\mathbf{V}\) such that \[\mathbf{V}\begin{pmatrix} \mathbf{W}_2 \vert \mathbf{W}_1 \end{pmatrix} \mathbf{O}^*= \begin{pmatrix} \mathbf{0} \vert \mathbf{D} \end{pmatrix},\] where \(\mathbf{D} = \mathrm{Diag}(z_k)\) is a diagonal matrix with \(|z_k| = 1\) for \(1\leq k \leq N\). Furthermore, the matrix \(\mathbf{O}\) is here the following product of Givens rotations and* particle-hole transformations \[\begin{align} \mathbf{O} = \mathbf{B}^{\textcolor{black}{\mathtt{m}_{N}}}\boldsymbol{M}_{N-1}\mathbf{B}^{\textcolor{black}{\mathtt{m}_{N-1}}} \dots \mathbf{B}^{\textcolor{black}{\mathtt{m}_{2}}}\boldsymbol{M}_1\mathbf{B}^{\textcolor{black}{\mathtt{m}_{1}}},\label{eq:unitary95U95slater95ph} \end{align}\tag{37}\] the terms of which we now explain. The matrix \(\boldsymbol{M}_i\) is a product of at most \(N-i\) double Givens rotation matrices. The double Givens rotation \(\boldsymbol{\Gamma}\) associated with \((\theta, \phi)\) and vector indices \(1\leq \ell^1 < \ell^2\leq N\) is defined as \[\boldsymbol{\Gamma} = \begin{pmatrix} \mathbf{G}_{\ell^1,\ell^2}(\theta,\phi) & \mathbf{0}\\ \mathbf{0} & \overline{\mathbf{G}}_{\ell^1,\ell^2}(\theta,\phi) \end{pmatrix} \text{ with \mathbf{G} defined in \ref{e:givens}}.\] The matrix \(\mathbf{B}\) in 37 is a permutation matrix exchanging the last vector of the first block with the last vector of the second block. Formally, it is given by the so-called particle-hole matrix \[\mathbf{B} = \begin{pmatrix} \mathbf{I}- \mathbf{e}_N \mathbf{e}_N^\top & \mathbf{e}_N \mathbf{e}_N^\top\\ \mathbf{e}_N \mathbf{e}_N^\top & \mathbf{I}- \mathbf{e}_N \mathbf{e}_N^\top \end{pmatrix}.\] The Booleans \((\mathtt{m}_1, \dots, \mathtt{m}_N) \in \{0,1\}^N\) simply indicate whether the corresponding particle-hole matrices appear or not in the decomposition 37 . Finally, note that because we use the Jordan-Wigner transform, \(\mathbf{B}\) can be implemented by a Pauli \(X\) gate.*

Let us give a few details that sketch the proof of 6. On the one hand, the matrix \(\mathbf{V}\) is a product of single Givens rotations applied to the rows of \((\begin{smallmatrix} \mathbf{W}_2 & \mathbf{W}_1 \end{smallmatrix}),\) to yield an upper triangle of zeros in the left block. For instance, when \(N=3\), and assuming for simplicity that \(\mathbf{W}_2\) is full-rank,10we obtain \[\begin{align} \mathbf{V}\begin{pmatrix} \mathbf{W}_2 \vert \mathbf{W}_1 \end{pmatrix} = \left(\begin{array}{ccc|ccc} 0 & 0 & \square & * & * & \underline{0}\\ 0 & \square & * &* & * & * \\ \square & * & * & * & * & * \end{array}\right),\label{e:Givens95first95step} \end{align}\tag{38}\] where we have highlighted the diagonal of the first block as squares. The full-rank assumption implies that each square in 38 corresponds to a nonzero element. This in turn implies that there is a zero at the top right corner of the second block, as we now explain. Denote by \(\mathbf{r}_{1,i}\) and \(\mathbf{r}_{2,i}\), the \(i\)th row vector of the first and second block, respectively. Thanks to 3, we have the following inner product identities between rows \[\begin{align} \mathbf{r}_{1,i}\mathbf{r}_{2,j}^\top = -\mathbf{r}_{2,i}\mathbf{r}_{1,j}^\top, \tag{39}\\ \mathbf{r}_{1,i}\mathbf{r}_{1,j}^*= \delta_{ij} - \mathbf{r}_{2,i}\mathbf{r}_{2,j}^*.\tag{40} \end{align}\] Property 39 explains why we necessarily already have a zero in the right-hand block of 38 .

Starting from 38 , we now explain how to build the matrix \(\mathbf{O}\) in 37 . More precisely, the next step consists in filling the remainder of the left block of 38 with zeros, one row at a time. We start with a right-multiplication with the particle-hole transformation \(\mathbf{B}\) to zero the top right element. Then right-multiplication with double Givens matrices are used to zero the second row of the left block, from left to right, until we reach the last element of the second row. If that last element is not zero, orthogonality of the rows implies that the last element of the row of the second block is zero, so that a right-multiplication with the particle-hole transformation \(\mathbf{B}\) finishes our procedure of zeroing the second row of the left block. We repeat the operation row after row, using \(\mathbf{B}\) to zero the right-most element of each row if it is not already zero.

To make things more visual, we show the right multiplication by \(\mathbf{O}^*\) on our example 38 , starting with a particle-hole transformation on 38 . For simplicity of the display, we assume that particle-hole transformations are necessary for each row. This gives \[\begin{align} &\mathbf{V}\begin{pmatrix} \mathbf{W}_2 \vert \mathbf{W}_1 \end{pmatrix}\\ &\xrightarrow{\times \mathbf{B}} \left(\begin{array}{ccc|ccc} 0 & 0 & \mathbf{0} & * & * & *\\ 0 & * & * &* & * & * \\ * & * & * & * & * & * \end{array}\right) \xrightarrow{\times\Gamma^*} \left(\begin{array}{ccc|ccc} 0 & 0 & 0 & * & * & \underline{0}\\ 0 & \mathbf{0} & * &* & * & \underline{0} \\ * & * & * & * & * & * \end{array}\right) \text{ by \ref{e:ortho95diff95rows}}\\ &\xrightarrow{\times\Gamma^*} \left(\begin{array}{ccc|ccc} 0 & 0 & 0 & \underline{\alpha}_{(2)} & \underline{0}_{(1)} & 0\\ 0 & 0 & * &* & * & \underline{0}_{(3)} \\ \mathbf{0} & * & * & * & * & * \end{array}\right) \text{ by \ref{e:ortho95diff95rows} at (1), \ref{e:ortho95same95rows} at (2), \ref{e:ortho95diff95rows} at (3)}\\ &\xrightarrow{\times\mathbf{B}} \left(\begin{array}{ccc|ccc} 0 & 0 & 0 & \alpha & 0 & 0\\ 0 & 0 & \mathbf{0} &\underline{0} & * & \boldsymbol{*} \\ 0 & * & * & \underline{0} & * & * \end{array}\right) \text{ by \ref{e:ortho95same95rows}}\\ &\xrightarrow{\times\Gamma^*}\left(\begin{array}{ccc|ccc} 0 & 0 & 0 & \alpha & 0 & 0\\ 0 & 0 & 0 &0 & \underline{\beta}_{(3)} & 0 \\ 0 & \mathbf{0} & \underline{\gamma}_{(4)} & 0 & \underline{0}_{(2)} & \underline{0}_{(1)} \end{array}\right) \text{ by \ref{e:ortho95diff95rows} at (1), by \ref{e:ortho95same95rows} at (2), (3), (4)}\\ &\xrightarrow{\times\mathbf{B}}\left(\begin{array}{ccc|ccc} 0 & 0 & 0 & \alpha & 0 & 0\\ 0 & 0 & 0 &0 & \beta & 0 \\ 0 & 0 & 0 & 0 & 0 & \gamma \end{array}\right) = \mathbf{V}\begin{pmatrix} \mathbf{W}_2 \vert \mathbf{W}_1 \end{pmatrix}\mathbf{O}^*, \end{align}\] with unit modulus complex numbers \(\alpha,\beta,\gamma\). To help the reader understand which of properties 39 and 40 applied, an index was added to matrix elements which are automatically set to zero.

Proposition 12 (Formalization of a result of [22]). The number of double Givens gates as defined in 6 to achieve the factorization 37 is at most \(N(N-1)/2\), whereas the number of particle-hole transformations is at most \(N\). Furthermore, if \(\det\mathbf{W} = 1\), the number particle-hole transformations is even, else if \(\det \mathbf{W} = -1\) it is odd.

6 is now leveraged to yield the factorization of \(\mathbf{W}\) in 36 .

Lemma 7. The matrix \(\mathbf{O}\) in 37 is a unitary orthogonal complex matrix, and we have the factorization \[\begin{pmatrix} \mathbf{D}\overline{\mathbf{V}} & \mathbf{0}\\ \mathbf{0} & \overline{\mathbf{D}}\mathbf{V} \end{pmatrix} \mathbf{W} \mathbf{O}^*= \begin{pmatrix} \mathbf{I}& \mathbf{0}\\ \mathbf{0} & \mathbf{I} \end{pmatrix}.\]

Proof. The discussion above yields the expression of the lower block. Since \(\mathbf{O}\) and \(\mathbf{W}\) are orthogonal complex matrices and unitary as well, the upper block is determined by the lower block in the light of 5. ◻

Furthermore, the \(\mathbf{W}\) matrix of 7 yields a Bogoliubov transformation of the following factorized form \[\begin{pmatrix} \boldsymbol{b}^*\\ \boldsymbol{b} \end{pmatrix} = \begin{pmatrix} \mathbf{V}^\top\overline{\mathbf{D}} & \mathbf{0}\\ \mathbf{0} & \mathbf{V}^*\mathbf{D} \end{pmatrix} \mathbf{O} \begin{pmatrix} \mathbf{a}^*\\ \mathbf{a} \end{pmatrix}\] This factorization into two factors gives us a recipe to prepare the state \(b_1^*\dots b_r^*\ket{\emptyset_b}\) by composing two transformations given in the following remarks.

Remark 13 (particle number conserving transformation). Define \(\mathbf{Q}= (\mathbf{V}^\top\overline{\mathbf{D}})_{[r]:}\) as the matrix made of the \(r\) first rows of \(\mathbf{V}^\top\overline{\mathbf{D}}\). We can directly use the factorization given in 5.2 so that, for \(k\in [r]\), we have \[\textcolor{black}{\mathcal{U}(\mathbf{Q}) a_k^*\mathcal{U}(\mathbf{Q})^*= \sum_{j=1}^{N}\mathbf{Q}_{kj} a^{*}_j }\] where \(\mathcal{U}(\mathbf{Q})\) realizes the product of Givens transformations factorizing \(\mathbf{Q}\).

Remark 14 (particle number non-conserving transformation). There is a unitary operator \(\mathcal{U}_{\mathrm{ph}}(\mathbf{O}) : \mathbb{H}\to \mathbb{H}\) representing the multiplication by \(\mathbf{O}\) as follows \[\begin{pmatrix} \mathcal{U}_{\mathrm{ph}}\mathbf{a}^*\mathcal{U}_{\mathrm{ph}}^*\\ \mathcal{U}_{\mathrm{ph}}\mathbf{a}\;\mathcal{U}_{\mathrm{ph}}^* \end{pmatrix} = \mathbf{O} \begin{pmatrix} \mathbf{a}^{*} \\ \mathbf{a} \end{pmatrix},\] where the action of \(\mathcal{W}\) is entrywise on the vectors \(\mathbf{a}^{*}\) and \(\mathbf{a}\). By construction, we have \[\begin{align} \textcolor{black}{\mathcal{U}_{\mathrm{ph}} = \mathcal{B}^{\mathtt{m}_1}\mathcal{M}_{1}\mathcal{B}^{\mathtt{m}_2}\mathcal{M}_{2}\mathcal{B}^{\mathtt{m}_3}\dots \mathcal{B}^{\mathtt{m}_{N-1}}\mathcal{M}_{N-1}\mathcal{B}^{\mathtt{m}_N},}\label{eq:W95Givens} \end{align}\qquad{(13)}\] for some \((\mathtt{m}_1, \dots, \mathtt{m}_N) \in \{0,1\}^N\), and where \(\mathcal{B}\) is such that \(\mathcal{B}a_N^*\mathcal{B}^*= a_N\) while leaving all other creation-annihilation operators unchanged, while \(\mathcal{M}_i\) is a composition of at most \(N-i\) Givens operators. Each Givens operator \(\Gamma: \mathbb{H}\to \mathbb{H}\) simply represents multiplication by the Givens matrix \(\boldsymbol{\Gamma}\) as follows \[\begin{pmatrix} \Gamma\mathbf{a}^*\Gamma^*\\ \Gamma\mathbf{a} \Gamma^* \end{pmatrix} = \boldsymbol{\Gamma}\cdot \begin{pmatrix} \mathbf{a}^*\\ \mathbf{a} \end{pmatrix}.\] Again, the Givens operators in ?? appear in reverse order compared with Givens matrices in 37 .

5.5.2 Quantum gates↩︎

We now combine the two remarks above. Let \(\mathbf{V}\) and \(\mathbf{O}\) be given by 6. For short, denote \(\mathbf{Q}= (\mathbf{V}^\top\overline{\mathbf{D}})_{[r]:}\). The factorization of the fermionic Gaussian state 35 as a composition of double Givens gates and \(X\) gates reads \[b_1^*\dots b_r^*\ket{\emptyset_b} = \mathcal{W}\cdot a_1^{*}\dots a_r^{*}\ket{\emptyset_{a}}, \text{ with } \mathcal{W} = \textcolor{black}{\mathcal{U}_{\mathrm{ph}}(\mathbf{O})\cdot \mathcal{U}(\mathbf{Q})}.\] This result is readily obtained by inserting several times the identity \(\mathcal{W}^*\mathcal{W}\) in the above equation and by noting that \(\ket{\emptyset_b} = \mathcal{W}\ket{\emptyset_{a}}\).

6 Numerical experiments↩︎

In this section, we demonstrate the circuits discussed in 5 on a Qiskit simulator [30], and on a few 5-qubit IBMQ machines [31]. Python code to reproduce the experiments in available on GitHub.11

6.1 Projection DPPs↩︎

As discussed in 5.4, any DPP can be sampled by constructing a projection DPP on a twice larger ground set; we thus focus here on the simulation of projection DPPs. We take \(N=5\) and we consider a projection DPP on \([N] = \{1, 2, 3, 4, 5\}\). The marginal kernel is \(\mathbf{K}=\mathbf{Q}_{:,[k]}\mathbf{Q}_{:,[k]}^\top\) with \(k=3\), where \(\mathbf{Q}\) is a matrix with orthonormal columns obtained by the Householder QR decomposition of a realization of an \(N\times N\) random matrix full of independent and identically distributed real Gaussians. Almost surely, \(\mathbf{K}\) is thus a rank-\(3\) projector, and all the following statements are conditionally on the kernel. In particular, the corresponding DPP generates samples of \(3\) points with probability one.

Probabilities of subsets are depicted in the blue histogram of 4 (d). For completeness, we display \(\mathbf{K}\) in Figure 2 (a) where labels of entries12 range from \(0\) to \(4\) to match Qiskit convention. Note that it is not obvious from this figure how to guess the rank of the kernel or whether it is projective. Nonetheless, we can gain some other insights about the DPP by inspection of the color map. We see for example that item \(\{1\}\) has a large inclusion probability \(\mathbf{K}_{1,1}\); see the entry with label \((0,0)\) in dark red in Figure 2 (a). Furthermore, by considering the entries with labels \((0,1)\) and \((0,2)\), we see that \(|\mathbf{K}_{1,2}|\) and \(|\mathbf{K}_{1,3}|\) have small values compared e.g.with \(|\mathbf{K}_{1,4}|\); see the darker red entry with label \((0,3)\). This intuitively indicates that the item with label \(0\) and the item with label \(3\) have a large “similarity” w.r.t.this kernel and are not likely to be jointly sampled. Accordingly, we observe in the histogram in 4 (d) that the set with labels \(\{0,1,2\}\) has a larger probability to be sampled than the set with labels \(\{0,1,3\}\).

Following the circuit construction of [22] with 2-qubit gates acting only on neighbouring qubits on a line, we obtain the circuit shown in Figure 1, where each gate labeled as “XX+YY" is a Givens gate, i.e., a sequence of CNOT and \(Z\) gates as discussed in 9. The circuit of Figure 1 starts on the left side by three \(X\) gates which excite three fermions on the Fock vacuum, namely \(a_1^* a_2^* a_3^* \ket{0}\). Next, Givens rotations are applied to this state to yield \(b_1^* b_2^* b_3^* \ket{0}\) as a result of 9. The parameters of the corresponding gates are obtained by the QR decomposition of 5.2.2 as implemented in Qiskit.

Now, before it can be run on a particular machine, the circuit needs to be transpiled, i.e.written as an equivalent sequence of gates that correspond to what can be physically implemented on the machine. We show in Figure 3 the calibration data for three \(5\)-qubit IBMQ machines: lima, quito, and manila. This calibration takes the form of a graph, where nodes represent qubits, and edges represent the possibility of applying a CNOT gate, the only two-qubit gate in our original circuit in 1. While manila can implement CNOT gates between neighbouring qubits on a line, as implicitly assumed in the construction of [22], the two other machines have a T-structured graph that will force the transpiled circuit to look quite different from 1. Indeed, we show the transpiled circuits in the first three panels of 4. Note how manila uses CNOT gates between neighbouring qubits, as in the original circuit, resulting in overall fewer CNOT gates than on quito and lima. The latter two machines cannot afford CNOT gates jointly acting on qubits 2 and 3, for instance, and end up compensating for this by using more CNOT gates in total. Since CNOT gates are among the most error-prone manipulations, minimizing the number of CNOT gates is an important feature. Intuitively, had we known in advance that we would run the circuit on a machine with a particular graph, we should have designed the circuit in 1 differently, by rather running parallel QR with Givens rotations only along edges of the machine’s graph.

Before we observe the results, we note that the three machines come with different characteristics. For instance, manila has overall the lowest readout errors, and lima the largest. It is common to summarize the characteristics of a machine in a single number \(2^{d_\text{max}}\), called quantum volume, where \(d_\text{max}\) is –loosely speaking– the depth of a square circuit that we can expect to run reliably. The volume reported by IBM is obtained numerically, using a procedure known as randomized benchmarking; see Appendix 10. The machines manila, quito, and lima respectively have reported volumes 32, 16, and 8. Thus, even on manila, the transpiled circuit is much larger than the “guaranteed" \(5\times 5\) square circuit, and we should expect noise in our measurements, as we shall now see.

a

DPP kernel

b

TV distance under the null

Figure 2: Illustration of the DPP kernel matrix associated with the circuit in 1 (left-hand side) and histogram of the total variation distance between the distribution of the simulated DPP and the exactly known distribution (right-hand side)..

a

lima

b

quito

c

manila

Figure 3: Calibration data for three IBMQ machines: lima, quito, and manila. Dark blue is low, light violet is high, but colors are not comparable across subfigures, although the orders of magnitude are similar. Node colors indicate readout errors (all of the order of \(10^{-2}\)), edge colors indicate CNOT error, of the same \(10^{-2}\) order of magnitude for lima and quito, but going down to \(5\times 10^{-3}\) for manila. All machines have 5 qubits, but their volumes are respectively 8, 16, and 32..

a

lima

b

quito

c

manila

d

Results

Figure 4: Transpiled circuits corresponding to the input in Figure 1 for three IBMQ machines: lima, quito, and manila. 4 (d) shows the corresponding empirical distribution..

4 (d) shows the empirical distributions corresponding to independently preparing the input and measuring the output of the transpiled circuits in 4, \(20,000\) times each. In blue and orange, we show for reference the probability under DPP(\(\mathbf{K}\)) of the corresponding subset, as well as the empirical frequencies coming from sampling the output of the circuit using a simulator, which amounts to independently drawing \(20,000\) samples of the DPP on a classical computer.

The empirical measure of the classical samples in orange is close to the underlying DPP, as testified by the total variation (TV) distance13 between the two distributions, which is below \(10^{-2}\). Actually, if we repeatedly and independently draw sets of \(20,000\) samples, we obtain the empirical distribution of the TV distance shown in Figure 2 (b). In contrast, the TV distance between the empirical measure obtained from \(20,000\) runs on manila of the corresponding transpiled circuit is \(0.27\), while it is \(0.39\) and \(0.40\) for the T-structured machines lima and quito, respectively. Looking at 2 (b) again, a test based on the estimated TV distance would easily reject even the hypothesis that at least one of the quantum circuits samples from the correct distribution. Moreover, we confirm (not shown) that the difference in TV between manila and its competitors is significant, which confirms the intuition coming from its larger volume and smaller number of CNOT gates, due to a QR decomposition adapted to its qubit communication graph. Finally, although a test would reject that the quantum circuits sample from the correct DPP, the resulting distributions are still close to the target DPP, especially for manila, as confirmed by their respective TV. Interestingly, the quantum circuits actually yield point processes that are supported on (almost) all subsets of \(\{1, \dots, 5\}\), while the target DPP, being of projection, only charges subsets of cardinality 3. The noise does respect the structure of the DPP, somehow: two-item subsets that (wrongly) appear in the support of the empirical measures correspond to items that are marginally favoured by the DPP, as can be seen on the diagonal of the kernel in 2 (a). Intuitively, the appearance of a subset of cardinality \(2\) is partly due to readout error on one of the qubits supposed to output \(1\). This intuition is confirmed by the calibration data: for lima, for instance, 3 (a) shows that the qubit labeled ‘3’ has a large readout error; simultaneously, there is a deficit of appearance of the triplet with labels \(\{0,3,4\}\) in its empirical distribution in Figure 4 (d), while \(\{0,4\}\) wrongly appears.

Optimal QR for a T-structured communication graph.

The QR-inspired fermionic circuits implemented in Qiskit follow [22], and thus use Givens rotations between neighbouring columns. While this suits the calibration data of manila, which has a linear qubit-communication graph, lima and quito rather have a communication graph shaped as a \(T\). As a result, transpilation is less straighforward and yields a bigger circuit than for manila. In particular, the transpiled quito circuit in 4 (b) has 15 CNOT gates, while both the original circuit in 1 and the transpiled manila circuit have only \(12\) CNOT gates.

As we discussed in 5, a shorter (and less error-prone) transpiled circuit would intuitively result from a QR decomposition that respects the qubit-communication graph. For concreteness, we give a QR decomposition that is better suited to the communication graph of quito, with the mapping to the columns of \(\mathbf{Q}\) given in [f:col95to95qubit95T95structured]. Assuming no preprocessing, we find \[\begin{align} \mathbf{Q} &\rightarrow \begin{pmatrix} * & * & \mathbf{0}& * & \mathbf{0}\\ * & * & * & * & * \\ * & * & * & * & * \\ \end{pmatrix} \rightarrow \begin{pmatrix} * & * & 0& \mathbf{0}& 0 \\ * & * & * & * & * \\ * & * & * & * & * \\ \end{pmatrix}\nonumber\\ &\rightarrow \begin{pmatrix} \underline{\lambda_1} & \mathbf{0}& 0& 0 & 0 \\ \underline{0} & * & * & * & \mathbf{0}\\ \underline{0} & * & * & * & * \\ \end{pmatrix} \rightarrow \begin{pmatrix} \lambda_1 & 0 & 0& 0 & 0 \\ 0 & * & * & \mathbf{0}& 0 \\ 0 & * & * & * & * \\ \end{pmatrix}\nonumber\\ &\rightarrow \begin{pmatrix} \lambda_1 & 0 & 0& 0 & 0 \\ 0 & \mathbf{0}& \underline{\lambda_2} & 0 & 0 \\ 0 & * & \underline{0} & * & \mathbf{0}\\ \end{pmatrix} \rightarrow \begin{pmatrix} \lambda_1 & 0 & 0& 0 & 0 \\ 0 & 0 & \lambda_2 & 0 & 0 \\ 0 & \underline{\lambda_3} & 0 & \mathbf{0}& 0 \\ \end{pmatrix}.\label{e:QR95for95T95graph} \end{align}\tag{41}\] While not necessary optimal in any sense, our guiding principle for the decomposition 41 is to fill the matrix with zeros such that, for each row, the final complex phase appears at a node which, if removed from the graph, leaves the resulting graph connected. Note that to fall back onto a “diagonal" matrix, although this last step is unnecessary for DPP sampling, a final permutation between the second and third columns can be realized by an extra Givens gate with \(\theta=0\) and \(\phi = \pi/2\) (i.e., a so-called ISwap gate) between qubit \(1\) and qubit \(2\). Overall, the sequence of Givens rotations corresponding to 41  can be transpiled on quito or lima with only the expected 2-per-rotation CNOT gates, since all rotations are applied to neighbours in the graph. We leave the characterization and benchmarking of the optimal QR decomposition for a given qubit communication graph for future work.

Figure 5: Allowed CNOT gates between qubits in the T-structured communication graph of lima or quito, for a map sending column \(1\) \(\mapsto\) qubit \(0\), column \(2\) \(\mapsto\) qubit \(1\), etc.

6.2 Pfaffian PPs↩︎

In this section, we illustrate the second step of the algorithm of 5.5 to sample PfPPs of the type described by [26] and associated with a pure state of the form \(\prod_{i\in \mathcal{C}}b_i^*\ket{\emptyset_b}\), namely, for which the matrix \(\mathbf{S}\) is projective as explained in 11.

To construct \(\mathbf{S}\), we consider the quadratic form \(\mathbf{H}_{\mathrm{BdG}}\) given by the block matrix in 21 with \(N = 5\) with the Hermitian and skew-symmetric part given respectively by \[\mathbf{M} = \begin{pmatrix} 1 & 0.5& 0.2& 0.2& 0.2\\ 0.5& 1& 0.5& 0.2& 0.2\\ 0.2& 0.5& 1& 0.5& 0.2\\ 0.2& 0.2& 0.5& 1& 0.5\\ 0.2& 0.2& 0.2& 0.5& 1 \end{pmatrix} \text{ and } \boldsymbol{\Delta} = \begin{pmatrix} 0& 1& 0& 0& 0\\ -1& 0& 1& 0& 0\\ 0& -1& 0& 1& 0\\ 0& 0& -1& 0& 1\\ 0& 0& 0& -1& 0 \end{pmatrix}.\] Let \(0\leq \epsilon_1\leq \dots\leq \epsilon_N\) be eigenvalues of \(H_{\mathrm{BdG}}\) determined by the eigendecomposition 24 and let \(\mathbf{W}\) be the orthogonal complex matrix associated with the diagonalization. Next, we select the subset \(\mathcal{C}\) of the \(3\) smallest positive eigenvalues of \(\mathbf{H}_{\mathrm{BdG}}\). Let \(\mathbb{K}\) be the kernel associated with the projection \[\mathbf{S} = \overline{\mathbf{W}}^{*} \begin{pmatrix} \mathbf{1}_{\bar{\mathcal{C}}}& \mathbf{0}\\ \mathbf{0} & \mathbf{1}_{\mathcal{C}} \end{pmatrix} \overline{\mathbf{W}} \text{ with } \mathcal{C}= \{1,2,3\},\label{eq:S95for95exp}\tag{42}\] by virtue of 2. The Hamiltonian quadratic form and the associated Pfaffian kernel are displayed in 6, whereas the corresponding circuit is displayed in 7. The probability mass function of \(Y\sim \mathrm{PfPP}(\mathbb{K})\) has the following simple expression: \[\begin{align} \mathbb{P}(Y = S)= (-1)^{|\overline{S}|} \mathrm{Pf}(\mathbb{K} - \mathbb{J}_{\overline{S}}),\label{e:pf95L95ensemble} \end{align}\tag{43}\] where \((\mathbb{J}_{\overline{S}})_{ij} = \delta_{ij}1(i\in\overline{S})\left(\begin{smallmatrix} 0 & 1\\ -1 & 0 \end{smallmatrix}\right)\), with \(\overline{S}\) being the complement of \(S\) in \([N]\); we refer to 8.5 for a short proof.

a

\(\mathbf{M}\) and \(\boldsymbol{\Delta}\).

b

\(\mathbf{K}\) and \(\mathbf{P}\).

Figure 6: Visualization of the Hamiltonian \(\mathbf{H}_{\mathrm{BdG}}\) and the components defining the Pfaffian kernel \(\mathbb{K}\) given in ?? as a function of \(\mathbf{K}\) and \(\mathbf{P}\)...

Figure 7: Circuit corresponding to \(\mathrm{PfPP}(\mathbb{K})\) associated to the projection matrix 42 . The parity of the number of \(X\) gates determines the parity of the number of points in samples of \(\mathrm{PfPP}(\mathbb{K})\).

For these numerical experiments, we only compare (classical) simulations of the quantum circuits to the true point process. As in 6.1, \(20,000\) samples are drawn independently from \(Y\sim\mathrm{PfPP}(\mathbb{K})\) thanks to the Qiskit simulator. Next, the empirical frequencies of subsets of \([N]\) are computed and compared with the probabilities 43 .

In 8, we observe that the empirical probabilities match the expected case, with a TV distance of \(0.009\). Furthermore, it is manifest from 6 (b) that there is a weak repulsion between items \(2\) and item \(4\) – see the entry \((1,3)\) in the grayscale matrix – and that each of these two elements has a large marginal probability. Hence, they can be expected to be sampled together. This is confirmed in 8 where the subset \(\{2,4\}\) (corresponding to the label \((1,3)\) in the histogram) corresponds to a large mass under the empirical measure. Also, note that all subsets naturally have the same parity.

Figure 8: Histogram of the empirical frequencies of the subsets sampled by \(\mathrm{PfPP}(\mathbb{K})\) associated to the projection matrix 42 versus subset probabilities as given by the Pfaffian formula 43 .

For completeness, the histogram of the total variation distance between the ground truth distribution and the estimated distribution, computed over \(1000\) runs, is displayed in 9.

Figure 9: Empirical distribution of the total variation distance between the true probability mass function of t\(\mathrm{PfPP}(\mathbb{K})\) associated to the projection matrix 42 and its simulation.

7 Discussion↩︎

Inspired by the pioneering work of [1] on point processes in quantum physics, we have studied quantum algorithms for DPP sampling. We did so by reducing DPP sampling to a fermion sampling problem, and then leveraging and modifying recent algorithms for fermion sampling [20], [22]. While many of the steps are either common lore in one or the other field, or recently published material, we believe that there is value in a self-contained survey of how to reduce a finite DPP to a fermion sampling problem, all the way from the mathematical object to the implementation on the quantum machine. We hope that this paper can help spark further cross-disciplinary work. Moreover, writing down all the steps from a DPP as specified in machine learning to its quantum implementation has allowed us to make contributions on top of the survey, like the extension of the argument to a class of Pfaffian PPs, and the first steps in adapting the QR-decomposition behind fermion sampling algorithms to qubit communication constraints. This opens several research avenues.

First, as mentioned in the introduction, projective DPPs can also be sampled thanks to the Clifford loaders defined in [25], introduced independently of the physics literature that we cover in this paper. Yet the structure of the arguments is related, and it would be enlightening to explicitly compare, on the one hand, the parallel implementation we give in 5.2.2 with a circuit depth logarithmic in \(N\) and, on the other hand, data loaders of [25] with a similar depth.

Second and in the same spirit, an interesting extension of this work would be to develop an algorithm optimally matching any qubit communication graph to a QR decomposition scheme, where by optimal we mean minimizing e.g.the total variation distance between the output of the circuit and the original point process. This would generalize the case of the \(T\) graph described in 6.1, and lead to transpilers with fewer noisy gates. A potential strategy would be to follow the approach of [56], who optimize a sparsity-inducing objective to approximate a unitary matrix as a product of Givens rotations.

Third, a natural improvement of the proposed classical preprocessing followed by circuit construction would be the inclusion of the kernel diagonalization in the quantum circuit, using for instance the recent developments about quantum SVD [57]. The combination of this quantum preprocessing and a QR-based circuit would constitute a turn-key sampling pipeline.

A fourth and maybe more speculative research perspective would be to leverage our knowledge of all the correlation functions of point processes such as DPPs, PfPPs, and permanental PPs [58] to develop a statistical test of quantum decoherence in a given machine. In particular, our aim would be to design statistics of PPs which are sensitive to the different kinds of errors affecting a quantum computer.

Finally, from a mathematical perspective, we think it is worth exploring in more depth the structure of PfPPs. While potentially offering more modeling power in machine learning applications, they have received little attention, likely due to their high sampling cost on a classical computer. Since the sampling overhead on a quantum computer is minor, they are likely to become a valuable addition to the machine learner’s toolbox. For starters, we are unaware of a formula for the probability mass function 43 taking the form of a pfaffian of a likelihood matrix. Such a construction would generalize the extended L-ensemble construction in [59].

Declarations↩︎

Competing interests↩︎

The authors declare to have no competing interests related to this work.

Availability of data and materials↩︎

The source code used in the paper is publicly available on the GitHub page https://github.com/For-a-few-DPPs-more/quantum-sampling-DPPs.

Acknowledgements↩︎

We acknowledge support from ERC grant Blackjack (ERC-2019-STG-851866) and ANR AI chair Baccarat (ANR-20-CHIA-0002). Furthermore, we acknowledge the use of IBM Quantum services for this work. The views expressed are those of the authors, and do not reflect the official policy or position of IBM or the IBM Quantum team.

8 Mathematical details↩︎

8.1 More about Givens operators↩︎

We briefly explain here how the unitary operator \(\mathcal{G}\) representing a Givens rotation in 28 can be constructed. A unitary operator \(\mathcal{V}: \mathbb{H} \to \mathbb{H}\) such that \[\begin{align} \begin{pmatrix} \mathcal{V} a_{1}^*\mathcal{V}^*\\ \mathcal{V} a_{2}^*\mathcal{V}^* \end{pmatrix} = \begin{pmatrix} \cos \theta & \sin \theta\\ -\sin \theta & \cos \theta \end{pmatrix} \begin{pmatrix} a_{1}^*\\ a_{2}^* \end{pmatrix} \end{align}\] is given by \(\mathcal{V} = \exp \left(\theta (a_{2}^*a_{1} - a_{1}^*a_{2})\right),\) where we used the BCH formula [60]. Thanks to this observation, we readily have that \[\mathcal{G} = \mathcal{T}\mathcal{V} \mathcal{T}^*,\] with \(\mathcal{T} = \exp(\mathrm{i}\phi (a_{1}^*a_{1} - a_{2}^*a_{2})/2)\), realizes the Givens transformation \[\begin{align} \begin{pmatrix} \mathcal{G} a_{1}^*\mathcal{G}^*\\ \mathcal{G} a_{2}^*\mathcal{G}^* \end{pmatrix} = \begin{pmatrix} \cos \theta & e^{-\mathrm{i}\phi}\sin \theta\\ -e^{\mathrm{i}\phi}\sin \theta & \cos \theta \end{pmatrix} \begin{pmatrix} a_{1}^*\\ a_{2}^* \end{pmatrix}. \end{align}\]

8.2 Proof of 5↩︎

We begin by a remark about the diagonalization of \(H_{\mathrm{BdG}}\). Let \(\mathbf{C}= \Big(\begin{smallmatrix} \mathbf{0} & \mathbf{I}\\ \mathbf{I}& \mathbf{0} \end{smallmatrix}\Big)\). As a consequence of 4, we can write \[\begin{align} H_{\mathrm{BdG}} = \frac{1}{2}\begin{pmatrix} \mathbf{c}^*\\ \mathbf{c} \end{pmatrix}^* \begin{pmatrix} -\overline{\mathbf{M}}& -\overline{\boldsymbol{\Delta}}\\ \boldsymbol{\Delta} & \mathbf{M} \end{pmatrix} \begin{pmatrix} \mathbf{c}^*\\ \mathbf{c} \end{pmatrix}. \end{align}\] The diagonalization of \(H_{\mathrm{BdG}}\) reads \[H_{\mathrm{BdG}} = \frac{1}{2} \begin{pmatrix} \mathbf{b}^*\\ \mathbf{b} \end{pmatrix}^* \begin{pmatrix} -\mathrm{Diag}(\epsilon_k) & \mathbf{0}\\ \mathbf{0} & \mathrm{Diag}(\epsilon_k) \end{pmatrix} \begin{pmatrix} \mathbf{b}^*\\ \mathbf{b} \end{pmatrix} .\] Note that the expectation value of the bilinears are \[\begin{align} \begin{pmatrix} (\expval**{c_i c_j^*}_{\rho} )_{i,j} &(\expval**{c_i c_j}_{\rho} )_{i,j} \\ (\expval**{c_i^*c_j^*}_{\rho} )_{i,j}&(\expval**{c_i^*c_j}_{\rho} )_{i,j} \end{pmatrix} &= \mathbf{W}^\top \begin{pmatrix} (\expval**{b_i b_j^*} )_{i,j} & (\expval**{b_i b_j} )_{i,j}\\ (\expval**{b_i^*b_j^*} )_{i,j} & (\expval**{b_i^*b_j} )_{i,j} \end{pmatrix} \overline{\mathbf{W}}\\ &= \mathbf{W}^\top \begin{pmatrix} ((1-d_i) \delta_{ij} )_{i,j} & (0)_{i,j}\\ (0)_{i,j} & (d_i\delta_{ij} )_{i,j} \end{pmatrix} \overline{\mathbf{W}}\\ &= \mathbf{W}^\top \begin{pmatrix} \mathbf{I}- \mathrm{Diag}(d_k) & \mathbf{0}\\ \mathbf{0} & \mathrm{Diag}(d_k) \end{pmatrix} \overline{\mathbf{W}}, \end{align}\] with \(d_k = \exp(-\beta \epsilon_k)/(1+\exp(-\beta \epsilon_k))\) for all \(1\leq k \leq N\). The proof is completed if we recall that the sigmoid \(\sigma\) satisfies \(\sigma(x) = 1 -\sigma(-x)\).

8.3 Proof of 7↩︎

Assume that \(i_1, \dots, i_k\) are pairwisely distinct. The object of interest is \[\require{physics} \expval{N_{i_1}\dots N_{i_k}}_{\rho} = \expval{c^*_{i_1}c_{i_1}\dots c^*_{i_k}c_{i_k}}_{\rho}.\] We now use a direct consequence of Wick’s theorem for expectations in a Gaussian state, see [3]: let \(m\) even and let \(\beta_1, \dots, \beta_m\) be linear combinations of the \(c_i\)’s and \(c_i^*\)’s for \(1\leq i \leq N\), then \[\require{physics} \begin{align} \expval{\beta_1\dots\beta_m}_{\rho} = \sum_{\sigma \text{ contraction}} \mathrm{sgn}(\sigma) \expval{\beta_{\sigma(1)}\beta_{\sigma(2)}}_{\rho} \dots \expval{\beta_{\sigma(m-1)}\beta_{\sigma(m)}}_{\rho}. \label{eq:wick95contraction} \end{align}\tag{44}\] Take \(m=2k\) and use the following definition: \(\beta_{2\ell -1} = c^*_{i_{\ell}}\) and \(\beta_{2\ell}= c_{i_{\ell}}\) for \(1\leq \ell \leq k\). Thus, we now show that 44 is the Pfaffian of a skewsymmetric matrix made of the \(2\times 2\) blocks.

Let us construct this skewsymmetric matrix. In details, for the pair \((\ell,\ell^\prime)\) with \(1\leq \ell < \ell^\prime \leq k\), we denote the \(2\times 2\) block by \(\mathbb{K}_{\ell\ell^\prime}\). In order to make any block matrix \(\left(\mathbb{K}_{i_mi_n}\right)_{1\leq m,n \leq k}\) skewsymmetric, we need to have \[\mathbb{K}_{\ell\ell^\prime}^\top = - \mathbb{K}_{\ell^\prime\ell}.\label{eq:block95skew}\tag{45}\] The form of the \((\ell,\ell^\prime)\) block with \(1\leq \ell < \ell^\prime \leq k\) is \[\mathbb{K}_{\ell\ell^\prime} = \begin{pmatrix} \expval**{\beta_{2\ell -1} \beta_{2\ell^\prime -1} }_{\rho} & \expval**{\beta_{2\ell -1} \beta_{2\ell^\prime} }_{\rho}\\ \star & \expval**{\beta_{2\ell} \beta_{2\ell^\prime} }_{\rho} \end{pmatrix} = \begin{pmatrix} \expval**{c_{i_\ell}^*c_{i_{\ell^\prime}}^*}_{\rho} & \expval**{c_{i_{\ell}}^*c_{i_{\ell^\prime}} }_{\rho}\\ \star & \expval**{c_{i_{\ell}} c_{i_{\ell^\prime}}}_{\rho} \end{pmatrix}.\]Note that \(\expval**{\beta_{2\ell -1} \beta_{2\ell^\prime} }_{\rho}\) does not appear in 44 since \(\ell < \ell^\prime\). Now, the \(\star\) is completed by \(- \expval**{c_{i_{\ell^\prime}}^*c_{i_{\ell}} }_{\rho}\) to ensure the property 45 which guarantees skewsymmetry of the block kernel matrix. Recalling ?? , the Pfaffian kernel is defined as the skewsymmetric matrix whose \((\ell,\ell^\prime)\)-block is \[\mathbb{K}_{\ell\ell^\prime} = \begin{pmatrix} \expval**{c_{i_\ell}^*c_{i_{\ell^\prime}}^*}_{\rho} & \expval**{c_{i_{\ell}}^*c_{i_{\ell^\prime}} }_{\rho}\\ \expval**{c_{i_{\ell}} c_{i_{\ell^\prime}}^*}_{\rho} - \delta_{i_{\ell}i_{\ell^\prime}} & \expval**{c_{i_{\ell}} c_{i_{\ell^\prime}}}_{\rho} \end{pmatrix}.\] Now, we define the kernel \(\mathbb{K}(i_{\ell}, i_{\ell^\prime}) = \mathbb{K}_{\ell\ell^\prime}\). Then, the expression 44 matches the definition of the Pfaffian, i.e., \[\require{physics} \expval{\beta_1\dots\beta_m}_{\rho} = \mathrm{Pf}\left(\mathbb{K}(i,j)\right)_{1\leq i,j\leq m}.\] By using 5, we obtain the expression ?? and ?? follows.

8.4 Proof of 8↩︎

Key identities.

The two main ingredients for this proof are \[\begin{align} &\mathrm{Pf}(\mathbf{B}^\top \mathbf{A}\mathbf{B}) = \mathrm{Pf}(\mathbf{A}) \det(\mathbf{B}),\tag{46}\\ &\mathrm{Pf}\Big(\begin{smallmatrix} \mathbf{0} & \mathbf{M}\\ -\mathbf{M} & \mathbf{0} \end{smallmatrix}\Big)= (-1)^{N(N-1)/2}\det \mathbf{M},\tag{47} \end{align}\] where \(\mathbf{M}\) is an \(N\times N\) matrix and \(\mathbf{A}\) is \(2N\times 2N\) skewsymmetric.

Main part of the proof.

We begin by using 1. The expected parity of a sample of \(Y\sim \mathrm{PfPP}(\mathbb{K})\) is \(\mathop{\mathrm{\mathbb{E}}}_{Y}(-1)^{|Y|} = \mathrm{Pf}\left(\mathbb{J} - 2 \mathbb{K}\right).\) Next, by applying a suitable permutation matrix \(\mathbf{B}\) of signature \((-1)^{N(N-1)/2}\) on rows and columns of \(\mathbb{J} - 2 \mathbb{K}\), we find \[\begin{align} \mathrm{Pf}(\mathbb{J} - 2 \mathbb{K}) &= (-1)^{N(N-1)/2} \mathrm{Pf}\left(\begin{pmatrix} \mathbf{0} & \mathbf{I}\\ -\mathbf{I} & \mathbf{0} \end{pmatrix} - 2 \begin{pmatrix} \mathbf{P} & \mathbf{K}\\ -\mathbf{K}^\top & -\overline{\mathbf{P}} \end{pmatrix} \right)\nonumber\\ &=(-1)^{N(N-1)/2} (-2)^N \mathrm{Pf}\left(\begin{pmatrix} \mathbf{0} & -\mathbf{I}/2\\ \mathbf{I}/2 & \mathbf{0} \end{pmatrix} + \begin{pmatrix} \mathbf{P} & \mathbf{K}\\ -\mathbf{K}^\top & -\overline{\mathbf{P}} \end{pmatrix} \right),\label{e:sym95to95be95added} \end{align}\tag{48}\] where we used 46 to obtain the first equality. At this point, we simply can add the symmetric matrix \(\frac{1}{2}\left(\begin{smallmatrix} \mathbf{0} & \mathbf{I}\\ \mathbf{I} & \mathbf{0} \end{smallmatrix}\right)\) to the argument of the Pfaffian at the right-hand side of 48 . This is valid since the pfaffian of a matrix is the Pfaffian of its skew-symmetric part; see 2.2. We then obtain \[\begin{align} \mathrm{Pf}(\mathbb{J} - 2 \mathbb{K}) &=(-1)^{N(N-1)/2} (-2)^N \mathrm{Pf}\left(\begin{pmatrix} \mathbf{0} & \mathbf{0}\\ \mathbf{I} & \mathbf{0} \end{pmatrix} + \begin{pmatrix} \mathbf{P} & \mathbf{K}\\ -\mathbf{K}^\top & -\overline{\mathbf{P}} \end{pmatrix} \right)\nonumber\\ &=(-1)^{N(N-1)/2} (-2)^N \mathrm{Pf}\left(\mathbf{C}\mathbf{S} \right).\label{e:pf2-K} \end{align}\tag{49}\] Recall that \(\mathbf{C} = \left( \begin{smallmatrix} \mathbf{0} & \mathbf{I}\\ \mathbf{I} & \mathbf{0} \end{smallmatrix}\right)\) and \(\mathbf{S} = \left(\begin{smallmatrix} \mathbf{I}-\mathbf{K}^\top & -\overline{\mathbf{P}}\\ \mathbf{P} & \mathbf{K} \end{smallmatrix}\right)\). Thanks to 5 and the definition of orthogonal complex matrices, we have \[\begin{align} \mathbf{C}\mathbf{S} = \overline{\mathbf{W}}^{\top} \begin{pmatrix} \mathbf{0} & \mathbf{D}_{-}\\ \mathbf{D}_{+} &\mathbf{0} \end{pmatrix} \overline{\mathbf{W}}, \end{align}\] with \(\mathbf{D}_{\pm} = \mathrm{Diag}(\sigma(\pm\beta \epsilon_k))\). Using that \(\mathrm{Pf}\left(\mathbf{C}\mathbf{S} \right) = \mathrm{Pf}\left(\frac{1}{2}\left(\mathbf{C}\mathbf{S} - \mathbf{S}^\top\mathbf{C}^\top \right)\right)\) and once again 46 , we find \[\begin{align} \mathrm{Pf}\left(\mathbf{C}\mathbf{S} \right) &= \det(\overline{\mathbf{W}})\mathrm{Pf}\begin{pmatrix} \mathbf{0} & \frac{1}{2}(\mathbf{D}_{-} - \mathbf{D}_{+})\\ \frac{1}{2}(\mathbf{D}_{+} - \mathbf{D}_{-}) &\mathbf{0} \end{pmatrix} \\ &= \det(\overline{\mathbf{W}}) (-1)^{N(N-1)/2} \left(\frac{-1}{2}\right)^N\det(\mathbf{D}_{+} - \mathbf{D}_{-}). \end{align}\] Now, we substitue this identity into 49 and we conclude the first part of the proof by remarking that \[\sigma(\beta \epsilon_k) - \sigma(-\beta \epsilon_k) = \frac{1 - e^{-\beta\epsilon_k}}{1 + e^{-\beta\epsilon_k}} = \tanh(\beta\epsilon_k/2).\] Now we show that \(\det \mathbf{W} = \mathop{\mathrm{\mathrm{sign}}}\mathrm{Pf}(\mathbf{A}_M)\). Recalling the results of 4, we have \[H_{\mathrm{BdG}} =\frac{\mathrm{i}}{2} \mathbf{f}^\top \mathbf{A} \mathbf{f} = \frac{\mathrm{i}}{2} \boldsymbol{\gamma}^\top \mathbf{A}_{M} \boldsymbol{\gamma},\] where \(\mathbf{A}\) and \(\mathbf{A}_M\) are related by a conjugation with a permutation matrix of signature \((-1)^{N(N-1)/2}\), and therefore, thanks to 46 \[\mathrm{Pf}(\mathbf{A}) = (-1)^{N(N-1)/2} \mathrm{Pf}(\mathbf{A}_M).\] Still as a consequence of 4, it holds \[\mathbf{R}\mathbf{A}\mathbf{R}^\top = \Big(\begin{smallmatrix} 0 & \mathrm{Diag}(\epsilon_k)\\ -\mathrm{Diag}(\epsilon_k) & 0 \end{smallmatrix}\Big).\] with \(\mathbf{R}\) real orthogonal as well as \(\mathbf{W} = \boldsymbol{\Omega}^*\mathbf{R} \boldsymbol{\Omega}\) with \(\Omega\) unitary. Thus, \(\det(\mathbf{W}) = \det(\mathbf{R}) \in \{-1,1\}\), and \[\mathrm{Pf}(\mathbf{A}) = \det(\mathbf{R})\mathrm{Pf}\Big(\begin{smallmatrix} 0 & \mathrm{Diag}(\epsilon_k)\\ -\mathrm{Diag}(\epsilon_k) & 0 \end{smallmatrix}\Big) = \det(\mathbf{W}) (-1)^{N(N-1)/2} \prod_{k}\epsilon_k,\] where we used 47 . Hence, by accounting for the permutation signature, \[\mathrm{Pf}(\mathbf{A}_M) = \det(\mathbf{W}) \prod_{k}\epsilon_k.\] Since we assumed that \(\epsilon_k\)’s are all strictly positive, we obtain \(\mathop{\mathrm{\mathrm{sign}}}\mathrm{Pf}\mathbf{A}_M = \det\mathbf{W}\).

8.5 Proof of 43↩︎

This proof uses the so-called inclusion-exclusion principle \[\mathbb{P}(Y = S) = \sum_{J: S\subseteq J} (-1)^{|J\setminus S|} \mathbb{P}(S \subseteq Y),\] see e.g.[39], and the identity \[\sum_{J: S\subseteq J} \mathrm{Pf}(\mathbb{A}_{J}) = \mathrm{Pf}(\mathbb{J}_{\bar{S}} + \mathbb{A}),\] for any kernel \(\mathbb{A}\) such that \(\mathbb{A}(i,j) = - \mathbb{A}(j,i)\); see [61]. By combining these two formulae, we find \[\begin{align} \mathbb{P}(Y = S) &= \sum_{J: S\subseteq J} (-1)^{|J\setminus S|} \mathrm{Pf}(\mathbb{K}_J)\\ &= (-1)^{|S|} \sum_{J: S\subseteq J} \mathrm{Pf}(-\mathbb{K}_J)\\ &= (-1)^{|S|}\mathrm{Pf}(\mathbb{J}_{\bar{S}} - \mathbb{K})\\ &= (-1)^{N-|S|}\mathrm{Pf}(\mathbb{K}-\mathbb{J}_{\bar{S}}), \end{align}\] which is the desired result since \(|\bar{S}| = N - |S|\).

8.6 Example of Givens and particle-hole decomposition in a rank-deficient case↩︎

We provide a complementary example for the discussion of 5.5.1, for which we assumed that \(\mathbf{W}_2\) was full-rank. In the case \(N=3\), we consider a specific orthogonal complex matrix \[\mathbf{W} = \begin{pmatrix} \overline{\mathbf{W}}_1& \overline{\mathbf{W}}_2\\ \mathbf{W}_2 & \mathbf{W}_1\; \end{pmatrix} = \boldsymbol{\Omega}^* \mathbf{R} \boldsymbol{\Omega} \text{ with } \mathbf{R} = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0\\ 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 0 & 1\\ \end{pmatrix},\] which is of the form considered in 4. By construction, we have \(\det \mathbf{W} = -1\). Thanks to a short calculation, we see that \(\mathbf{W}_1\) is the projector onto the orthogonal of the vector \(\left(\begin{smallmatrix}1 & 1 & 0\end{smallmatrix}\right)^\top\), whereas \(\mathbf{W}_2 = - \mathbf{I}+ \mathbf{W}_1\). The explicit expression \[\begin{align} \begin{pmatrix} \mathbf{W}_2 \vert \mathbf{W}_1 \end{pmatrix} = \left(\begin{array}{ccc|ccc} -\frac{1}{2} & \frac{1}{2} & 0& \frac{1}{2} &\frac{1}{2} & 0\\ \frac{1}{2} & -\frac{1}{2} & 0& \frac{1}{2} &\frac{1}{2} & 0\\ 0 & 0 & 0& 0 & 0 & 1\\ \end{array} \right), \end{align}\] makes manifest that \(\mathbf{W}_2\) is rank-deficient. To factorize \(\mathbf{W}\), we first apply a similar preprocessing to 38 , i.e., a multiplication with a product of real Givens matrices \(\mathbf{V}\) on the left, to put a triangle of zeros at the top left of the left block. Next, we multiply on the right by double Givens rotations and particle-hole transformations (see 50 and 51 ), i.e., \[\begin{align} &\begin{pmatrix} \mathbf{W}_2 \vert \mathbf{W}_1 \end{pmatrix}\nonumber\\ &\xrightarrow{\mathbf{G}\times} \left(\begin{array}{ccc|ccc} 0 & 0 & 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} &0\\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} &0 & 0 & 0 &0\\ 0 & 0 & 0 & 0 & 0 &1 \end{array} \right) \xrightarrow{\mathbf{G}\times} \left(\begin{array}{ccc|ccc} 0 & 0 & 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} &0\\ 0 & 0 & 0 & 0 & 0 &-1\\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 & 0 & 0 &0 \end{array} \right)\nonumber \\ &\xrightarrow{\times\Gamma^*}\left(\begin{array}{ccc|ccc} 0 & 0 & 0 & 1 & 0 &0\\ 0 & 0 & 0 & 0 & 0 &-1\\ 0 & -1 & 0 & 0 & 0 &0 \end{array} \right) \xrightarrow{\times\Gamma^*} \left(\begin{array}{ccc|ccc} 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 &0 & 0 & 1 & 0\\ 0 & 0 & -1 & 0 & 0 & 0 \end{array}\right)\tag{50}\\ &\xrightarrow{\times \mathbf{B}} \left(\begin{array}{ccc|ccc} 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 &0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 0 & -1 \end{array}\right).\tag{51} \end{align}\] This yields \[\mathbf{V} \begin{pmatrix} \mathbf{W}_2 \vert \mathbf{W}_1 \end{pmatrix} \mathbf{O}^* = \begin{pmatrix} \mathbf{0} \vert \mathbf{D} \end{pmatrix} \text{ with } \mathbf{O} = \mathbf{B}\mathbf{M},\] where \(\mathbf{M}\) is a product of two double Givens rotations in 50 . Therefore, we here have an example for which \(\mathbf{O}\) contains only one particle-hole transformation.

8.7 Derivation of the determinantal formula for occupation number correlations without Wick’s theorem↩︎

This section derives directly the results of 4.3 about projection DPPs without using 4. While redundant in the paper, we believe this appendix might be useful to readers who want to focus on projection DPPs and skip the need for Wick’s theorem.

Recall that the \(i\)th number operator reads \(N_i = c_i^*c_i\). Let \(\rho\) be the orthogonal projector onto the line generated by \(\ket{\psi} = b_1^*\dots b_r^*\ket{\emptyset}\) where the fermionic operators \(b^*_k,b_k\) are defined in 13 , to wit \(b_k = \sum_{j=1}^N \mathbf{u}_{kj}^* c_j\) and \(b_k^* = \sum_{j=1}^N \mathbf{u}_{kj} c_j^*\). Here, as in 4, \(\mathbf{u}_{k}\) is the \(k\)th column of the unitary matrix \(\mathbf{U}\). In this section, we derive the expression \[\require{physics} \expval{N_{i_1}\dots N_{i_k}}_\rho = \det (\mathbf{K}_{\{i_1, \dots, i_k\}}),\] where \(\mathbf{K}= \mathbf{U}_{[r],:} \mathbf{U}_{[r],:}^*\).

By expressing the trace with respect to the Fock basis \((\ket{\mathbf{n}})\) of the operators \(c_i\); see 2, we find \[\require{physics} \begin{align} T \triangleq \tr \left(\rho N_{i_1}\dots N_{i_k}\right) &= \sum_{\mathbf{n}} \expval{N_{i_1}\dots N_{i_k} \rho}{\mathbf{n}}\nonumber\\ &= \sum_{\mathbf{n}} n_{i_1}\dots n_{i_k}\expval{\rho}{\mathbf{n}}\nonumber\\ &= \sum_{\mathbf{n}} n_{i_1}\dots n_{i_k}|\braket{\mathbf{n}}{\psi}|^2.\label{e:sum95over95n95exp} \end{align}\tag{52}\] We know recall that \(\ket{\psi} = b_1^*\dots b_r^*\ket{\emptyset}\) and \(b_k^* = \sum_{j=1}^N \mathbf{u}_{kj} c_j^*,\) so that \[\require{physics} \begin{align} \braket{\mathbf{n}}{\psi} &= \expval{(c_N)^{n_N}\dots (c_1)^{n_1}b_1^*\dots b_r^*}{\emptyset}\\ &= \sum_{j_1, \dots, j_r} \mathbf{u}_{1 j_1} \dots \mathbf{u}_{r j_r} \expval{(c_N)^{n_N}\dots (c_1)^{n_1}c_{j_1}^*\dots c_{j_r}^*}{\emptyset}. \end{align}\] Note that the sequence of Booleans \(n_1, \dots, n_N\) has to sum to \(r\) for the sum above not to vanish. Accounting for the constraint \(n_{i_1} \dots n_{i_k} \neq 0\) in 52 , this yields \[\require{physics} \begin{align} T = \sum_{\substack{1\leq m_1 \leq \dots \leq m_r \leq N \\ \{i_1, \dots, i_k\} \subseteq \{m_1, \dots,m_r\}} }\left| \sum_{j_1, \dots, j_r} \mathbf{u}_{1 j_1} \dots \mathbf{u}_{r j_r} \expval{c_{m_r}\dots c_{m_1} c_{j_1}^*\dots c_{j_r}^*}{\emptyset} \right|^2. \end{align}\] At this point, we note that \[\require{physics} \expval{c_{m_r}\dots c_{m_1} c_{j_1}^*\dots c_{j_r}^*}{\emptyset} \neq 0\] if and only if there is a permutation \(\sigma\) such that \(j_1 = \sigma(m_1), \dots, j_r = \sigma(m_r)\). Also, it is easy to check that \(\require{physics} \expval{c_{m_r}\dots c_{m_1} c_{m_1}^*\dots c_{m_r}^*}{\emptyset} = 1\) and therefore, by using ?? , we have \[\require{physics} \expval{c_{m_r}\dots c_{m_1} c_{\sigma(m_1)}^*\dots c_{\sigma(m_r)}^*}{\emptyset} = \mathop{\mathrm{\mathrm{sign}}}\sigma.\] Consequently, \[\require{physics} \begin{align} \sum_{j_1, \dots, j_r} \mathbf{u}_{1 j_1} \dots \mathbf{u}_{r j_r} \expval{c_{m_r}\dots c_{m_1} c_{j_1}^*\dots c_{j_r}^*}{\emptyset} &= \sum_{\sigma} \mathbf{u}_{1 \sigma(m_1)} \dots \mathbf{u}_{r \sigma(m_r)}\mathop{\mathrm{\mathrm{sign}}}\sigma\\ & = \det(\mathbf{U}_{[r],\{m_1,\dots,m_r\}}), \end{align}\] where we used the notation \([r] = \{1, \dots, r\}\). The expression of \(T\) becomes \[\begin{align} T &= \sum_{\substack{1\leq m_1 \leq \dots \leq m_r \leq N \\ \{i_1, \dots, i_k\} \subseteq \{m_1, \dots,m_r\}} } \det \mathbf{U}_{[r],\{m_1,\dots,m_r\}} \det \mathbf{U}^*_{\{m_1,\dots,m_r\},[r]} \\ &= \sum_{\substack{\mathcal{C}: \{i_1, \dots, i_k\}\subseteq \mathcal{C}\\ |\mathcal{C}| = r}} \det \mathbf{U}_{[r],\mathcal{C}} \det \mathbf{U}^*_{\mathcal{C},[r]} \\ &= \sum_{\substack{\mathcal{C}: \{i_1, \dots, i_k\}\subseteq \mathcal{C}\\ |\mathcal{C}| = r}} \det (\mathbf{U}^*_{\mathcal{C},[r]}\mathbf{U}_{[r],\mathcal{C}}). \end{align}\] For the last step of this derivation, we will use a well-known formula for projective DPPs. More precisely, we first let \(\mathbf{Q}= \mathbf{U}_{[r],:}\). Since \(\mathbf{Q}\) is a matrix with orthonormal rows, the kernel \(\mathbf{K} = \mathbf{Q}^* \mathbf{Q}\) is an orthogonal projector of rank \(r\). For \(Y\sim \mathrm{DPP}(\mathbf{K})\) and \(\mathcal{C}\) such that \(|\mathcal{C}| = r\), it holds that \(\mathbb{P}(Y = \mathcal{C}) = \det (\mathbf{K}_{\mathcal{C}})\). Furthermore, by definition, we also have \(\mathbb{P}(S \subseteq Y) = \det (\mathbf{K}_{S})\) for any subset \(S\). Now, we conclude by \[\begin{align} T &= \sum_{\substack{\mathcal{C}: \{i_1, \dots, i_k\}\subseteq \mathcal{C}\\ |\mathcal{C}| = r}} \det (\mathbf{K}_{\mathcal{C}}) \\ &= \sum_{\substack{\mathcal{C}: \{i_1, \dots, i_k\}\subseteq \mathcal{C}\\ |\mathcal{C}| = r}} \mathbb{P}(Y = \mathcal{C})\\ &= \mathbb{P}(\{i_1, \dots, i_k\}\subseteq Y) = \det (\mathbf{K}_{\{i_1, \dots, i_k\}}), \end{align}\] which is the desired result.

9 Gate details↩︎

We give here details about the implementation of a Givens rotation using elementary gates. Note first that zeroing out a matrix entry \(y\in\mathbb{C}\setminus\{0\}\) can be done with a Givens rotation matrix as follows \[\begin{pmatrix} \cos \theta & \mathrm{e}^{-\mathrm{i}\phi} \sin \theta\\ -\mathrm{e}^{\mathrm{i}\phi} \sin \theta & \cos \theta \end{pmatrix} \begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} r\\ 0 \end{pmatrix},\] with \(\exp\mathrm{i}\phi = x^* y/|xy|\), \(\cos\theta = |x|/(|x|^2 + |y|^2)^{1/2}\) and \(\sin\theta = |y|/(|x|^2 + |y|^2)^{1/2}\).

Let us list a few elementary gates to implement the corresponding operation on a pair of qubits. For ease of comparison with the online documentation of Qiskit, we denote in this section the Pauli matrices 11 by \(\mathbf{X}\), \(\mathbf{Y}\), and \(\mathbf{Z}\).

Controlled NOT gate.

The CNOT gate is a two-qubit gate, i.e., a linear operator on \(\mathbb{C}^2\otimes \mathbb{C}^2\). In the computational basis \((\ket{00}, \ket{01}, \ket{10}, \ket{11})\), it is described by the matrix \[\begin{align} \begin{array}{c} \Qcircuit @C=1em @R=.7em { & \targ & \qw\\ & \ctrl{-1} & \qw } \end{array} \equiv \begin{pmatrix} 1 &0 &0 &0\\ 0 &0 &0 &1\\ 0 &0 &1 &0\\ 0 &1 &0 &0 \end{pmatrix}, \end{align}\] where the left-hand side is the graphical depiction of the operator in circuits. The CNOT gates thus permutes \(\ket{01}\) and \(\ket{11}\) while leaving \(\ket{00}\) and \(\ket{10}\) unchanged. In other words, it flips labels \(0\) and \(1\) in the first factor of the tensor product, provided that the second factor (the “control" qubit) is \(\ket 1\).

Controlled unitary gate.

Similarly, we can define the following gate, controlled this time by the first qubit, \[\begin{align} \begin{array}{c} \Qcircuit @C=1em @R=.7em { & \qw & \ctrl{1} & \qw \\ & \qw & \gate{e^{\mathrm{i}\theta \mathbf{Y}}} & \qw } \end{array} \equiv \begin{pmatrix} 1 &0 &0 &0\\ 0 &1 &0 &0\\ 0 &0 &c \theta & - s \theta\\ 0 &0 &s \theta & c \theta \end{pmatrix}, \end{align}\] where we denoted for simplicity \(c\theta = \cos \theta\) and \(s\theta = \sin \theta\). Explicitly, this gate rotates \(\ket{0}\) and \(\ket{1}\) in the second factor provided the first factor is \(\ket{1}\), namely it rotates \(\ket{10}\) and \(\ket{11}\) and leaves untouched \(\ket{00}\) and \(\ket{01}\).

9.1 Givens rotation using a controlled unitary gate.↩︎

This implementation is inspired by [22], although our definition of the Givens rotation slightly differs in order to match the definition 26 . The \(2\)-qubit gate representing a Givens rotation is implemented as \[\begin{align} \begin{array}{c} \Qcircuit @C=1em @R=.7em { & \qw & \targ & \ctrl{1} & \targ & \qw & \qw \\ & \gate{e^{\mathrm{i}\phi \mathbf{Z}/2}} & \ctrl{-1} & \gate{e^{\mathrm{i}\theta \mathbf{Y}}} & \ctrl{-1} & \gate{e^{-\mathrm{i}\phi \mathbf{Z}/2}}& \qw } \end{array}\equiv\begin{pmatrix} 1 &0 &0 &0\\ 0 &c \theta & e^{-\mathrm{i}\phi}s \theta & 0\\ 0 &- e^{\mathrm{i}\phi}s \theta & c \theta & 0\\ 0 &0 &0 &1 \end{pmatrix}. \end{align}\] We now give calculation details. First, we compute the inner bock of three gates as follows \[\begin{align} &\begin{array}{c} \Qcircuit @C=1em @R=.7em { & \qw & \targ & \ctrl{1} & \targ & \qw \\ & \qw & \ctrl{-1} & \gate{e^{\mathrm{i}\theta \mathbf{Y}}} & \ctrl{-1} & \qw } \end{array} \equiv \begin{pmatrix} 1 &0 &0 &0\\ 0 &0 &0 &1\\ 0 &0 &1 &0\\ 0 &1 &0 &0 \end{pmatrix} \begin{pmatrix} 1 &0 &0 &0\\ 0 &1 &0 &0\\ 0 &0 &c \theta & - s \theta\\ 0 &0 &s \theta & c \theta \end{pmatrix} \begin{pmatrix} 1 &0 &0 &0\\ 0 &0 &0 &1\\ 0 &0 &1 &0\\ 0 &1 &0 &0 \end{pmatrix} \\ &= \begin{pmatrix} 1 &0 &0 &0\\ 0 &c \theta & s \theta & 0\\ 0 &-s \theta & c \theta & 0\\ 0 &0 &0 & 1 \end{pmatrix}. \end{align}\] Now, the result follows by noting that \[\begin{array}{c} \Qcircuit @C=1em @R=.7em { & \qw & \qw \\ & \gate{e^{\mathrm{i}\phi \mathbf{Z}/2}} & \qw } \end{array} \equiv \begin{pmatrix} e^{\mathrm{i}\phi/2} &0 &0 &0\\ 0 &e^{-\mathrm{i}\phi/2} &0 &0\\ 0 &0 &e^{\mathrm{i}\phi/2} & 0\\ 0 &0 &0 & e^{-\mathrm{i}\phi/2} \end{pmatrix}.\]

9.2 Qiskit 0.42.1 implementation without controlled unitary gate.↩︎

The Givens rotation is implemented in Qiskit, as in 1, thanks to a gate \[R_{XX+YY}(2\theta,\phi -\pi/2) = \begin{pmatrix} 1 &0 &0 &0\\ 0 &c \theta & e^{-\mathrm{i}\phi}s \theta & 0\\ 0 &-e^{\mathrm{i}\phi}s \theta & c \theta & 0\\ 0 &0 &0 &1 \end{pmatrix}\] where the 2-qubit parameterized \(XX+YY\) interaction is defined as \[\begin{align} &R_{XX+YY}(\theta,\beta)\\ &= \left(\exp(-\mathrm{i}\beta \mathbf{Z}/2) \otimes \mathbb{I}\right) \exp\left(-\mathrm{i}\frac{\theta}{2}\frac{\mathbf{X}\otimes \mathbf{X} + \mathbf{Y} \otimes \mathbf{Y}}{2} \right)\left(\exp(\mathrm{i}\beta \mathbf{Z}/2) \otimes \mathbb{I}\right)\\ &=\begin{pmatrix} 1 &0 &0 &0\\ 0 &c \theta/2 & -\mathrm{i}e^{-\mathrm{i}\beta}s \theta/2 & 0\\ 0 &-\mathrm{i}e^{\mathrm{i}\beta}s \theta/2 & c \theta/2 & 0\\ 0 &0 &0 &1 \end{pmatrix}. \end{align}\] This gate is implemented in Qiskit 0.42.1 as a composition of elementary gates containing \(12\) single-qubit gates and \(2\) CNOT gates. Also, note that some details in their gate definitions may vary from the one used in this section.

10 Sources of error in quantum computers↩︎

Different quantities are usually specified by the constructor to assess the efficiency of a quantum machine. First, the error rate (or readout error) is an estimated frequency of getting an undesired measurement (a “bit flip") when measuring a state drawn independently from a fixed, user-defined prior distribution. Second, the de-excitation (or relaxation) of a qubit prepared in its excited state, and thus supposed to represent \(\ket{1}\), is a natural physical process. It is usually understood to be caused by the coupling of the qubit to electromagnetic radiation or, more abstractly, its environment. The typical timescale of the relaxation processes is called the relaxation time, and usually denoted by \(T_1\). Moreover, the presence of an environment is not only the source of relaxation (bit-flip error), but also an additional source of phase errors. This type of error destroys the quantum coherence properties in and between the qubits, i.e., it modifies the correlation structure in Boolean vectors built using Born’s rule 8 , possibly to the point of making the qubits behave as simple classical bits. This decoherence process is usually characterized by a typical timescale denoted \(T_2\). The output of any circuit with depth longer than \(T_1\) or \(T_2\) is likely to be strongly contaminated with the corresponding noises.

To give concrete numbers, the IBM 127 qubits Washington platform14 has a median \(T_1\) (the median is over all qubits) of \(T_1 = \SI{95,22}{\us}\), and the median \(T_2\) is \(T_2 = \SI{92,17}{\us}\). This means that, in practice, after about \(\SI{100}{\us}\), more than half of the qubits have either decohered (become classical bits) or have relaxed to their ground state. Moreover, the median readout error is \(p_{\text{err, ro}} = \SI{1.350e-2}{}\), and the median CNOT gate error is \(p_{\text{err, CNOT}} = \SI{1.287e-2 }{}\). Ideally, characterizing the noise of all the gates would require a quantum process tomography as well as quantum state tomography, to characterize the state of all qubits. In practice, this is out of reach for large systems. To circumvent this issue, [62] propose to partially characterize the noise with a scalable and robust algorithm called randomized benchmarking. This is how the cited numbers for IBM machines have been benchmarked.15

The long-term goal of the quantum computer race is to build a fault-tolerant quantum computer, based on quantum error corrections codes. These are techniques that build up on redundancies of the logical qubits to be robust to noise-induced errors according; see the so-called threshold theorem [63], [64]. As in the classical case, the required redundancy depends of the actual values of the error rates, which are to this day still too high to have a realistic implementation. In the meantime, other techniques are being developed to alleviate the influence or errors, either by directly eliminating errors on the hardware itself, like with so-called spin echos or dynamic decoupling techniques [65], or by statistical postprocessing, with so-called error mitigation techniques [66].

References↩︎

[1]
O. Macchi, “Processus ponctuels et coincidences – contributions à l’étude théorique des processus ponctuels, avec applications à l’optique statistique et aux communications optiques,” PhD thesis, Université Paris-Sud, 1972.
[2]
O. Macchi, Point processes and coincidences – contributions to the theory, with applications to statistical optics and optical communication, augmented with a scholion by suren poghosyan and hans zessin. Walter Warmuth Verlag, 2017.
[3]
R. Bardenet et al., “From point processes to quantum optics and back,” arXiv preprint arXiv:2210.05522, 2022, [Online]. Available: https://arxiv.org/abs/2210.05522.
[4]
K. Johansson, Random matrices and determinantal processes,” ArXiv Mathematical Physics e-prints, 2005.
[5]
F. Lavancier, J. Møller, and E. Rubak, “Determinantal point process models and statistical inference,” Journal of the Royal Statistical Society, Series B, 2015, [Online]. Available: https://www.jstor.org/stable/24775312.
[6]
R. Bardenet and A. Hardy, “Monte Carlo with determinantal point processes,” Annals of Applied Probability, 2020, [Online]. Available: https://doi.org/10.1214/19-AAP1504.
[7]
A. Kulesza and B. Taskar, “Determinantal point processes for machine learning,” Foundations and Trends in Machine Learning, 2012, [Online]. Available: http://dx.doi.org/10.1561/2200000044.
[8]
M. Derezinski and M. W. Mahoney, “Determinantal point processes in randomized numerical linear algebra,” Notices of the American Mathematical Society, vol. 68, no. 1, pp. 34–45, 2021, [Online]. Available: https://arxiv.org/abs/2005.03185.
[9]
R. Pemantle, Choosing a Spanning Tree for the Integer Lattice Uniformly,” The Annals of Probability, vol. 19, no. 4, pp. 1559–1574, 1991, [Online]. Available: http://www.jstor.org/stable/2244527.
[10]
R. Kyng and Z. Song, A Matrix Chernoff Bound for Strongly Rayleigh Distributions and Spectral Sparsifiers from a few Random Spanning Trees,” in 2018 IEEE 59th annual symposium on foundations of computer science (FOCS), 2018, pp. 373–384, [Online]. Available: https://arxiv.org/abs/1810.08345.
[11]
A. Kulesza and B. Taskar, “Learning determinantal point processes,” in Proceedings of the twenty-seventh conference on uncertainty in artificial intelligence, 2011, pp. 419–427, [Online]. Available: https://arxiv.org/abs/1202.3738.
[12]
N. Tremblay, S. Barthelmé, and P.-O. Amblard, “Determinantal point processes for coresets,” Journal of Machine Learning Research, vol. 20, no. 168, pp. 1–70, 2019, [Online]. Available: http://jmlr.org/papers/v20/18-167.html.
[13]
M. Derezinski, R. Khanna, and M. W. Mahoney, “Improved guarantees and a multiple-descent curve for column subset selection and the nystrom method,” in Advances in neural information processing systems, 2020, vol. 33, pp. 4953–4964, [Online]. Available: https://tinyurl.com/nhj5wexa.
[14]
M. Fanuel, J. Schreurs, and J. A. K. Suykens, “Diversity sampling is an implicit regularization for kernel methods,” SIAM Journal on Mathematics of Data Science, vol. 3, no. 1, pp. 280–297, 2021, [Online]. Available: https://doi.org/10.1137/20M1320031.
[15]
A. Belhadji, R. Bardenet, and P. Chainais, “A determinantal point process for column subset selection,” Journal of Machine Learning Research (JMLR), 2020, [Online]. Available: http://jmlr.org/papers/v21/19-080.html.
[16]
J. B. Hough, M. Krishnapur, Y. Peres, and B. Virág, “Determinantal processes and independence,” Probability surveys, 2006, [Online]. Available: https://arxiv.org/abs/math/0503110.
[17]
D. B. Wilson, Generating Random Spanning Trees More Quickly than the Cover Time,” in Proceedings of the twenty-eighth annual ACM symposium on theory of computing, 1996, pp. 296–303, [Online]. Available: https://doi.org/10.1145/237814.237880.
[18]
D. S. Dean, P. Le Doussal, S. N. Majumdar, and G. Schehr, “Noninteracting fermions in a trap and random matrix theory,” Journal of Physics A: Mathematical and Theoretical, vol. 52, no. 14, p. 144006, 2019.
[19]
G. Ortiz, J. E. Gubernatis, E. Knill, and R. Laflamme, “Quantum algorithms for fermionic simulations,” Physical Review A, vol. 64, no. 2, p. 022319, 2001, [Online]. Available: https://doi.org/10.1103/PhysRevA.64.022319.
[20]
D. Wecker, M. B. Hastings, N. Wiebe, B. K. Clark, C. Nayak, and M. Troyer, “Solving strongly correlated electron models on a quantum computer,” Physical Review A, vol. 92, no. 6, p. 062318, 2015, [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevA.92.062318.
[21]
I. D. Kivlichan et al., “Quantum simulation of electronic structure with linear depth and connectivity,” Phys. Rev. Lett., vol. 120, p. 110501, 2018, [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett.120.110501.
[22]
Z. Jiang, K. J. Sung, K. Kechedzhi, V. N. Smelyanskiy, and S. Boixo, “Quantum algorithms to simulate many-body physics of correlated fermions,” Physical Review Applied, vol. 9, no. 4, p. 044036, 2018, [Online]. Available: https://arxiv.org/abs/1711.05395.
[23]
A. H. Sameh and D. J. Kuck, “On stable parallel linear system solvers,” Journal of the ACM (JACM), vol. 25, no. 1, pp. 81–91, 1978, [Online]. Available: https://doi.org/10.1145/322047.322054.
[24]
J. Demmel, L. Grigori, M. Hoemmen, and J. Langou, “Communication-optimal parallel and sequential QR and LU factorizations,” SIAM Journal on Scientific Computing, vol. 34, no. 1, pp. A206–A239, 2012, [Online]. Available: https://arxiv.org/abs/0806.2159.
[25]
I. Kerenidis and A. Prakash, “Quantum machine learning with subspace states,” arXiv preprint arXiv:2202.00054, 2022, [Online]. Available: https://arxiv.org/abs/2202.00054.
[26]
S. Koshida, “Pfaffian point processes from free fermion algebras: Perfectness and conditional measures,” SIGMA. Symmetry, Integrability and Geometry: Methods and Applications, vol. 17, p. 008, 2021, [Online]. Available: https://doi.org/10.3842/SIGMA.2021.008.
[27]
M. Dereziński, K. L. Clarkson, M. W. Mahoney, and M. K. Warmuth, “Minimax experimental design: Bridging the gap between statistical and worst-case approaches to least squares regression,” in Conference on learning theory, 2019, pp. 1050–1069.
[28]
S. Barthelmé, N. Tremblay, and P.-O. Amblard, “A faster sampler for discrete determinantal point processes,” in Proceedings of the 26th international conference on artificial intelligence and statistics, 2023, vol. 206, pp. 5582–5592, [Online]. Available: https://proceedings.mlr.press/v206/barthelme23a.html.
[29]
M. A. Nielsen, “The fermionic canonical commutation relations and the Jordan-Wigner transform,” The University of Queensland, 2005. [Online]. Available: https://futureofmatter.com/assets/fermions_and_jordan_wigner.pdf.
[30]
Qiskit contributors, “Qiskit: An open-source framework for quantum computing.” 2023, doi: 10.5281/zenodo.2573505.
[31]
IBM Quantum, 2021, [Online]. Available: https://quantum-computing.ibm.com/.
[32]
E. M. Rains, “Correlation functions for symmetrized increasing subsequences,” arXiv preprint math/0006097, 2000, [Online]. Available: https://arxiv.org/abs/math/0006097.
[33]
A. Soshnikov, “Janossy densities. II. Pfaffian ensembles,” Journal of statistical physics, vol. 113, pp. 611–622, 2003, [Online]. Available: https://doi.org/10.1023/A:1026077020147.
[34]
A. Soshnikov, “Determinantal random point fields,” Russian Mathematical Surveys, vol. 55, pp. 923–975, 2000, [Online]. Available: https://dx.doi.org/10.1070/RM2000v055n05ABEH000321.
[35]
M. Derezinski, F. Liang, and M. Mahoney, “Bayesian experimental design using regularized determinantal point processes,” in Proceedings of the twenty third international conference on artificial intelligence and statistics, 2020, vol. 108, pp. 3197–3207, [Online]. Available: https://proceedings.mlr.press/v108/derezinski20a.html.
[36]
A. Poinas and R. Bardenet, “On proportional volume sampling for experimental design in general spaces,” Statistics and Computing, vol. 33, no. 1, p. 29, 2023, [Online]. Available: https://doi.org/10.1007/s11222-022-10115-0.
[37]
V. Kargin, “On pfaffian random point fields,” Journal of Statistical Physics, vol. 154, no. 3, pp. 681–704, 2014, [Online]. Available: https://doi.org/10.1007/s10955-013-0900-z.
[38]
A. Kassel, “Learning about critical phenomena from scribbles and sandpiles,” ESAIM: Proc., vol. 51, pp. 60–73, 2015, [Online]. Available: https://doi.org/10.1051/proc/201551004.
[39]
A. Kassel and T. Lévy, “Determinantal probability measures on grassmannians,” Annales de l’Institut Henri Poincaré D, 2022, [Online]. Available: https://doi.org/10.4171/aihpd/152.
[40]
A. I. Bufetov, F. Deelan Cunden, and Y. Qiu, Conditional measures for Pfaffian point processes: Conditioning on a bounded domain,” Annales de l’Institut Henri Poincaré, Probabilités et Statistiques, vol. 57, no. 2, pp. 856–871, 2021, doi: 10.1214/20-AIHP1099.
[41]
M. A. Nielsen and I. L. Chuang, Quantum computation and quantum information, 10th anniversary edition. Cambridge University Press, 2010.
[42]
L. Bouten, R. Van Handel, and M. R. James, “An introduction to quantum filtering,” SIAM Journal on Control and Optimization, vol. 46, no. 6, pp. 2199–2241, 2007, [Online]. Available: https://arxiv.org/abs/math/0601741.
[43]
S. B. Bravyi and A. Y. Kitaev, “Fermionic quantum computation,” Annals of Physics, vol. 298, no. 1, pp. 210–226, 2002, [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0003491602962548.
[44]
R. C. Ball, “Fermions without fermion fields,” Phys. Rev. Lett., vol. 95, p. 176407, 2005, [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett.95.176407.
[45]
F. Verstraete and J. I. Cirac, “Mapping local hamiltonians of fermions to local hamiltonians of spins,” Journal of Statistical Mechanics: Theory and Experiment, vol. 2005, no. 9, p. P09012, 2005, [Online]. Available: https://dx.doi.org/10.1088/1742-5468/2005/09/P09012.
[46]
G. Olshanski, “Determinantal point processes and fermion quasifree states,” Communications in Mathematical Physics, vol. 378, no. 1, pp. 507–555, 2020, [Online]. Available: https://doi.org/10.1007/s00220-020-03716-1.
[47]
C. Cohen-Tannoudji, B. Diu, and F. Laloë, Quantum mechanics; 1st ed. New York, NY: Wiley, 1977.
[48]
B. Dierckx, M. Fannes, and M. Pogorzelska, Fermionic quasifree states and maps in information theory,” Journal of Mathematical Physics, vol. 49, no. 3, 2008, [Online]. Available: https://doi.org/10.1063/1.2841326.
[49]
A. P. Schnyder, S. Ryu, A. Furusaki, and A. W. W. Ludwig, “Classification of topological insulators and superconductors in three spatial dimensions,” Phys. Rev. B, vol. 78, p. 195125, 2008, [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevB.78.195125.
[50]
A. Grabsch, Y. Cheipesh, and C. W. J. Beenakker, “Pfaffian formula for fermion parity fluctuations in a superconductor and application to majorana fusion detection,” Annalen der Physik, vol. 531, no. 10, p. 1900129, 2019.
[51]
[52]
B. M. Terhal and D. P. DiVincenzo, “Classical simulation of noninteracting-fermion quantum circuits,” Phys. Rev. A, vol. 65, p. 032325, 2002, [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevA.65.032325.
[53]
A. Y. Kitaev, “Unpaired majorana fermions in quantum wires,” Physics-Uspekhi, vol. 44, no. 10S, p. 131, 2001, [Online]. Available: https://dx.doi.org/10.1070/1063-7869/44/10S/S29.
[54]
R. Lyons, “Determinantal probability measures,” Publications Mathematiques de l’IHES, vol. 98, pp. 167–212, 2003, [Online]. Available: https://doi.org/10.1007/s10240-003-0016-0.
[55]
G. H. Golub and C. F. Van Loan, Matrix computations. JHU Press, 2012.
[56]
T. Frerix and J. Bruna, “Approximating orthogonal matrices with effective givens factorization,” in Proceedings of the 36th international conference on machine learning, 2019, vol. 97, pp. 1993–2001, [Online]. Available: https://proceedings.mlr.press/v97/frerix19a.html.
[57]
P. Rebentrost, A. Steffens, I. Marvian, and S. Lloyd, “Quantum singular-value decomposition of nonsparse low-rank matrices,” Phys. Rev. A, vol. 97, p. 012327, 2018, [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevA.97.012327.
[58]
S. Jahangiri, J. M. Arrazola, N. Quesada, and N. Killoran, “Point processes with gaussian boson sampling,” Phys. Rev. E, vol. 101, p. 022134, 2020, [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevE.101.022134.
[59]
N. Tremblay, S. Barthelmé, K. Usevich, and P.-O. Amblard, “Extended l-ensembles: A new representation for determinantal point processes,” The Annals of Applied Probability, vol. 33, no. 1, pp. 613–640, 2023.
[60]
B. C. Hall, Lie Groups, Lie Algebras, and Representations: An Elementary Introduction, vol. 222. New York: Springer, 2010.
[61]
A. Borodin and E. M. Rains, “Eynard–mehta theorem, schur process, and their pfaffian analogs,” Journal of statistical physics, vol. 121, no. 3, pp. 291–317, 2005.
[62]
A. Elben et al., “The randomized measurement toolbox,” Nature Reviews Physics, pp. 1–16, 2022, [Online]. Available: https://arxiv.org/abs/2203.11374.
[63]
A. Yu. Kitaev, “Fault-tolerant quantum computation by anyons,” Annals of physics, vol. 303, no. 1, pp. 2–30, 2003, [Online]. Available: https://doi.org/10.1016/S0003-4916(02)00018-0.
[64]
E. Knill, R. Laflamme, and W. H. Zurek, “Resilient quantum computation: Error models and thresholds,” Proceedings of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences, vol. 454, no. 1969, pp. 365–384, 1998, [Online]. Available: https://www.jstor.org/stable/i203157.
[65]
J. Preskill, “Lecture notes for physics 229: Quantum information and computation,” California Institute of Technology, vol. 16, no. 1, pp. 1–8, 1998, [Online]. Available: http://theory.caltech.edu/~preskill/ph229/.
[66]
Z. Cai et al., “Quantum error mitigation,” arXiv preprint arXiv:2210.00921, 2022, [Online]. Available: https://arxiv.org/abs/2210.00921.

  1. Corresponding author: remi.bardenet@univ-lille.fr↩︎

  2. Note our different notation compared to [15], who use \(N\) for the number of rows.↩︎

  3. Statisticians would say epistemic uncertainty, i.e., imprecise knowledge of the state.↩︎

  4. [def:JWT] gives a realization of ?? ; see 3.↩︎

  5. The parametrization of Givens rotations slightly differs from [22].↩︎

  6. Note that, unlike the common QR decomposition, the matrix \(R\) obtained here has more structure: it contains only one non-zero element per row. To some extent, the decomposition we discuss is close to a singular value decomposition; see the end of 5.2.3 for a short discussion.↩︎

  7. There might be zeros that appear out of collinearity, and thus do not need to be forced.↩︎

  8. An alternative strategy uses a chemical potential as in 4.3.↩︎

  9. Recall that the \(a_k\)’s are the Jordan-Wigner operators representing the \(c_k\)’s in 23 .↩︎

  10. An example illustrating a case where \(\mathbf{W}_2\) is rank-deficient is given in 8.6.↩︎

  11. https://github.com/For-a-few-DPPs-more/quantum-sampling-DPPs↩︎

  12. In python, the first entry of an array has label zero.↩︎

  13. Recall that the TV is the maximum absolute difference between the probabilities assigned to a subset, where the maximum is taken across subsets.↩︎

  14. See https://quantum-computing.ibm.com/services/resources?services=systems↩︎

  15. https://qiskit.org/textbook/ch-quantum-hardware/randomized-benchmarking.html↩︎