September 26, 2025
Random states find diverse applications in modern quantum science including quantum circuit complexity [1], quantum device benchmarking [2], and entanglement estimation [3]. Different assumptions on the structure of the Hilbert space lead to different random state models, the most prominent ones being the Hilbert-Schmidt ensemble [3], [4], the Bures-Hall ensemble [5], [6], and the fermionic Gaussian ensemble [7], [8]. Exact statistical characterization of entropic metrics, such as von Neumann entropy and entanglement capacity, over these ensembles has been intensively studied for a single random state from one ensemble [3]–[17]. Much less explored are distinguishability metrics of two random states from the same or two distinct ensembles. In such scenarios, the most fundamental metric is the relative entropy [18], [19].
A density matrix \(\rho\) is a positive semidefinite matrix in a Hilbert space satisfying \[\label{eq:dm} \,\mathrm{tr}\rho=1.\tag{1}\] The relative entropy between two density matrices \(\rho\) and \(\sigma\) of the same dimension is defined as [18], [19] \[\label{eq:red} D(\rho||\sigma)=\,\mathrm{tr}\left(\rho\left(\ln\rho-\ln\sigma\right)\right).\tag{2}\] In classical information theory, relative entropy is known as Kullback-Leibler divergence [20] that measures the difference between two probability density functions. As a distinguishability metric of density matrices of quantum states, the relative entropy (2 ) satisfies various information-theoretic properties [19]. These include nonnegativity \[D(\rho||\sigma)\geq0\] with equality \(D(\rho||\sigma)=0\) for identical density matrices \(\rho=\sigma\), and monotonicity \[D(\rho||\sigma)\geq D\left(\Psi(\rho)||\Psi(\sigma)\right)\] with \(\Psi\) denoting any completely positive trace-preserving map.
Clearly, the relative entropy (2 ) becomes a random variable when considering random states. A natural first task is to understand the typical behavior of relative entropy by computing its average value. As discussed in [18], closed-form formulas for the relative entropy of random states are useful in several areas of quantum information theory including quantum hypothesis testing [21] and the eigenstate thermalization hypothesis [22]. Despite its importance, existing results in this direction are rather limited. In [18], using some large-dimensional estimates and the replica trick, an approximate expression for the average relative entropy of the Hilbert-Schmidt ensemble (13 ) was obtained. In this work, we derive exact yet explicit formulas for the average relative entropy of the Hilbert-Schmidt ensemble (13 ) and of another generic state model, the Bures-Hall ensemble (18 ). The more interesting case of relative entropy between the two ensembles is also considered, where we derive the corresponding exact average formulas. From the definition (2 ), the average relative entropy of random states is given by \[\label{eq:rem} \,\mathbb{E}\!\left[D(\rho||\sigma)\right]=\,\mathbb{E}\!\left[\,\mathrm{tr}(\rho\ln\rho)\right]-\,\mathbb{E}\!\left[\,\mathrm{tr}(\rho\ln\sigma)\right].\tag{3}\] The first term \(\,\mathbb{E}\!\left[\,\mathrm{tr}(\rho\ln\rho)\right]\) is, up to a negative sign, the average entanglement entropy of the reduced density matrix \(\rho\) of a bipartite system.
In the literature, an exact formula of average entanglement entropy for the Hilbert-Schmidt (HS) ensemble (13 ) is obtained in [3], [4], [9], [11] as \[\label{eq:HSm} \,\mathbb{E}\!\left[-\,\mathrm{tr}(\rho_{\rm{HS}}\ln\rho_{\rm{HS}})\right]=\psi_{0}(mn+1)-\psi_{0}(n)-\frac{m+1}{2n}\tag{4}\] and that for the Bures-Hall (BH) ensemble (18 ) is obtained in [5], [15] as \[\label{eq:BHm} \,\mathbb{E}\!\left[-\,\mathrm{tr}(\rho_{\rm{BH}}\ln\rho_{\rm{BH}})\right]=\psi_{0}\!\left(mn-\frac{m^2}{2}+1\right)-\psi_{0}\!\left(n+\frac{1}{2}\right),\tag{5}\] where the dimension of density matrices \(\rho_{\rm{HS}}\) and \(\rho_{\rm{BH}}\) is \(m\) with parameter \(n\), cf. (15 ). The function \(\psi_{0}\) denotes the digamma function [23] \[\label{eq:dg} \psi_{0}(x)=\frac{\,\mathrm{d}}{\,\mathrm{d}x}\ln\Gamma(x),\tag{6}\] where, for a positive integer \(l\), one has \[\label{eq:dgs} \begin{eqnarray} &&\psi_{0}(l)=-\gamma+\sum_{k=1}^{l-1}\frac{1}{k} \\ &&\psi_{0}\!\left(l+\frac{1}{2}\right)=-\gamma-2\ln2+2\sum_{k=0}^{l-1}\frac{1}{2k+1} \end{eqnarray}\tag{7}\] with \(\gamma\approx0.5772\) being Euler’s constant. The result (4 ) is the celebrated Page’s mean entropy formula [3].
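As a numerical sanity check, Page's formula (4 ) is easy to verify by direct simulation of the fixed-trace Wishart construction (16 ); the sketch below (with illustrative dimensions and sample count) assumes NumPy and SciPy are available.

```python
# Monte Carlo check of Page's mean entropy formula (4); the dimensions
# m, n and the number of trials are illustrative choices.
import numpy as np
from scipy.special import digamma

rng = np.random.default_rng(0)
m, n, trials = 3, 4, 20000

entropies = []
for _ in range(trials):
    # m x n complex Gaussian matrix Y; rho = YY^+ / tr(YY^+), cf. (16)
    Y = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
    W = Y @ Y.conj().T
    rho = W / np.trace(W).real
    lam = np.linalg.eigvalsh(rho)
    entropies.append(-np.sum(lam * np.log(lam)))

page = digamma(m * n + 1) - digamma(n) - (m + 1) / (2 * n)
print(np.mean(entropies), page)  # the two values agree to a few decimals
```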
The computation of average relative entropy now boils down to computing \[\label{eq:rem2} \,\mathbb{E}\!\left[\,\mathrm{tr}(\rho\ln\sigma)\right]\tag{8}\] in (3 ) that involves two independent random density matrices. The expected value (8 ) quantifies the average lack of information of \(\rho\) under \(\sigma\), where \(\sigma\) is typically a simpler model associated with a more general model \(\rho\). For example, if \(\rho\) is some full probabilistic model, then \(\sigma\) can be a reduced-order approximate model, which is less informative but is often useful to accelerate computations [20]. Consequently, in addition to the two cases \(\,\mathbb{E}\!\left[\,\mathrm{tr}(\rho_{\rm{HS}}\ln\sigma_{\rm{HS}})\right]\) and \(\,\mathbb{E}\!\left[\,\mathrm{tr}(\rho_{\rm{BH}}\ln\sigma_{\rm{BH}})\right]\), we also focus on the case \(\,\mathbb{E}\!\left[\,\mathrm{tr}(\rho_{\rm{BH}}\ln\sigma_{\rm{HS}})\right]\) instead of the option \(\,\mathbb{E}\!\left[\,\mathrm{tr}(\rho_{\rm{HS}}\ln\sigma_{\rm{BH}})\right]\). This is because the Bures-Hall ensemble is a natural generalization of the Hilbert-Schmidt ensemble, as can be seen by comparing their density matrix constructions (16 ) and (20 ). Exact formulas of (8 ) in the considered three cases are the main results of this work. The results are summarized in the three propositions below, where the proofs are found in Section 2.
Proposition 1. For density matrices \(\rho_{\rm{HS}}\) and \(\sigma_{\rm{HS}}\) of Hilbert-Schmidt ensemble (13 ) of dimension \(m\) with parameters \(n_1=m+\alpha_1\) and \(n_2=m+\alpha_2\), respectively, the average relative entropy (3 ) is given by \[\begin{align} \label{eq:r1} \,\mathbb{E}\!\left[D(\rho_{\rm{HS}}||\sigma_{\rm{HS}})\right]&=&\psi_{0}\!\left(mn_{2}\right)-\psi_{0}\!\left(mn_{1}+1\right)+\psi_{0}\!\left(n_{1}\right)-\frac{1}{m}\left(n_{2}\psi_{0}\!\left(n_{2}\right)-\alpha_{2}\psi_{0}\!\left(\alpha_{2}\right)\right) \nonumber \\ &&+~\!\frac{m+2n_{1}+1}{2n_{1}}. \end{align}\qquad{(1)}\]
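Proposition 1 can likewise be checked by direct simulation: sample two independent Hilbert-Schmidt states via the construction (16 ) and average \(D(\rho||\sigma)\). The dimensions below are illustrative choices.

```python
# Monte Carlo sketch of Proposition 1: average relative entropy of two
# independent fixed-trace Wishart (Hilbert-Schmidt) states.
import numpy as np
from scipy.special import digamma

rng = np.random.default_rng(1)
m, n1, n2, trials = 3, 4, 5, 20000
a2 = n2 - m  # alpha_2

def hs_state(n):
    # rho = YY^+ / tr(YY^+), cf. (16)
    Y = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
    W = Y @ Y.conj().T
    return W / np.trace(W).real

vals = []
for _ in range(trials):
    rho, sigma = hs_state(n1), hs_state(n2)
    lr = np.linalg.eigvalsh(rho)
    ls, vs = np.linalg.eigh(sigma)
    log_sigma = (vs * np.log(ls)) @ vs.conj().T   # matrix logarithm of sigma
    vals.append(np.sum(lr * np.log(lr)) - np.trace(rho @ log_sigma).real)

exact = (digamma(m * n2) - digamma(m * n1 + 1) + digamma(n1)
         - (n2 * digamma(n2) - a2 * digamma(a2)) / m
         + (m + 2 * n1 + 1) / (2 * n1))
print(np.mean(vals), exact)  # the two values agree to a few decimals
```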
It is worth noting a limiting formula of Proposition 1; the result is presented as a corollary below.
Corollary 1. For density matrices \(\rho_{\rm{HS}}\) and \(\sigma_{\rm{HS}}\) of dimension \(m\) with parameters \(n_{1}\) and \(n_{2}\), respectively, in the regime \[\label{eq:lim} m\to\infty,~~~~n_{i}\to\infty,~~~~\frac{n_{i}}{m}=c_{i}\geq1,~~~~i=1,2\qquad{(2)}\] with \(c_{1}\) and \(c_{2}\) being fixed constants, the limiting formula of the average relative entropy of Proposition 1 is \[\label{eq:HSlim} \,\mathbb{E}\!\left[D(\rho_{\rm{HS}}||\sigma_{\rm{HS}})\right]=\left(c_{2}-1\right)\ln\left(1-\frac{1}{c_{2}}\right)+\frac{1}{2c_{1}}+1+\mathcal{O}\left(\frac{1}{m}\right).\qquad{(3)}\]
Corollary 1 is readily obtained from Proposition 1 by applying, in the regime of Corollary 1, the asymptotic expansion of the digamma function [23] \[\label{eq:dga} \psi_{0}(x)\sim\ln x-\frac{1}{2x}-\sum_{k=1}^{\infty}\frac{B_{2k}}{2k~\!x^{2k}},~~~~~~x\to\infty\tag{9}\] with \(B_{2k}\) being the Bernoulli numbers.
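The rate of convergence can be illustrated by evaluating the exact formula of Proposition 1 against its limiting value for growing \(m\) at fixed ratios; the ratios \(c_{1}=2\) and \(c_{2}=3\) below are illustrative choices.

```python
# Deterministic sketch: the exact average relative entropy of Proposition 1
# approaches the limiting value of Corollary 1 as m grows at fixed ratios.
import numpy as np
from scipy.special import digamma

def exact_hs(m, n1, n2):
    # exact formula of Proposition 1 with alpha_2 = n2 - m
    a2 = n2 - m
    return (digamma(m * n2) - digamma(m * n1 + 1) + digamma(n1)
            - (n2 * digamma(n2) - a2 * digamma(a2)) / m
            + (m + 2 * n1 + 1) / (2 * n1))

c1, c2 = 2, 3  # illustrative fixed ratios n1/m and n2/m
limit = (c2 - 1) * np.log(1 - 1 / c2) + 1 / (2 * c1) + 1
for m in (4, 16, 64, 256):
    print(m, exact_hs(m, c1 * m, c2 * m) - limit)  # gap shrinks with m
```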
The limiting formula of Corollary 1 in the special case \(c_{1}=c_{2}\), i.e., states of equal parameters \(n_{1}=n_{2}\), has appeared recently as the main finding in [18]. Despite being derived using the replica trick along with some approximations, the result of [18] precisely captures the asymptotic behavior of average relative entropy of the Hilbert-Schmidt ensemble to the leading order.
The next proposition pertains to average relative entropy of Bures-Hall ensemble.
Proposition 2. For density matrices \(\rho_{\rm{BH}}\) and \(\sigma_{\rm{BH}}\) of Bures-Hall ensemble (18 ) of dimension \(m\) with parameters \(n_1=m+\alpha_1\) and \(n_2=m+\alpha_2\), respectively, the average relative entropy (3 ) is given by \[\begin{align} \label{eq:r2} \,\mathbb{E}\!\left[D(\rho_{\rm{BH}}||\sigma_{\rm{BH}})\right]&=&\psi_{0}\!\left(mn_{2}-\frac{m^{2}}{2}\right)-\psi_{0}\!\left(mn_{1}-\frac{m^{2}}{2}+1\right)+\psi_{0}\!\left(n_{1}+\frac{1}{2}\right)\nonumber\\ &&+~\!\frac{n_{2}}{m}\psi_{0}\!\left(n_{2}\right)-\frac{2n_{2}-m}{m}\left(\psi_{0}\!\left(n_{2}-\frac{m}{2}\right)+\psi_{0}\!\left(n_{2}-\frac{m}{2}+\frac{1}{2}\right)\right)\nonumber\\ &&+~\!\frac{\alpha_{2}}{m}\left(\psi_{0}\!\left(\alpha_{2}\right)+2\psi_{0}\!\left(\alpha_{2}+\frac{1}{2}\right)\right)+1. \end{align}\qquad{(4)}\]
The limiting formula of Proposition 2 in the regime of Corollary 1 is similarly obtained by using the asymptotic expansion (9 ) as \[\begin{align} \label{eq:BHlim} \,\mathbb{E}\!\left[D(\rho_{\rm{BH}}||\sigma_{\rm{BH}})\right]&=&\ln c_{1}+c_{2}\ln c_{2}-\ln\left(c_{1}-\frac{1}{2}\right)-\left(4c_{2}-3\right)\ln\left(c_{2}-\frac{1}{2}\right)\nonumber\\ &&+~\!3\left(c_{2}-1\right)\ln\left(c_{2}-1\right)+1+\mathcal{O}\left(\frac{1}{m}\right). \end{align}\tag{10}\]
Besides states from the same ensemble, the relative entropy \(D(\rho||\sigma)\) also quantifies the lack of information between a state \(\sigma\) from one ensemble and a state \(\rho\) from a distinct, more general ensemble. The proposition below quantifies the average of \(D(\rho_{\rm{BH}}||\sigma_{\rm{HS}})\).
Proposition 3. For density matrices \(\rho_{\rm{BH}}\) of Bures-Hall ensemble (18 ) and \(\sigma_{\rm{HS}}\) of Hilbert-Schmidt ensemble (13 ) of dimension \(m\) with parameters \(n_1=m+\alpha_1\) and \(n_2=m+\alpha_2\), respectively, the average relative entropy (3 ) is given by \[\begin{align} \label{eq:r3} \,\mathbb{E}\!\left[D(\rho_{\rm{BH}}||\sigma_{\rm{HS}})\right]&=&\psi_{0}\left(mn_{2}\right)-\psi_{0}\left(mn_{1}-\frac{m^{2}}{2}+1\right)+\psi_{0}\left(n_{1}+\frac{1}{2}\right)\nonumber\\ &&-~\!\frac{1}{m}\left(n_{2}\psi_{0}\left(n_{2}\right)-\alpha_{2}\psi_{0}\left(\alpha_{2}\right)\right)+1. \end{align}\qquad{(5)}\]
Note that an exact formula of \(\,\mathbb{E}\!\left[D(\rho_{\rm{HS}}||\sigma_{\rm{BH}})\right]\) can be similarly derived, which, despite being less relevant from an information-theoretic perspective, is presented in (41 ) for completeness. We also note that a direct use of (9 ) leads to the asymptotic formula of Proposition 3 in the regime of Corollary 1 as \[\begin{align} \label{eq:HS2BHlim} \,\mathbb{E}\!\left[D(\rho_{\rm{BH}}||\sigma_{\rm{HS}})\right]&=&\ln c_{1}-\left(c_{2}-1\right)\ln c_{2}-\ln\left(c_{1}-\frac{1}{2}\right)\nonumber\\ &&+\left(c_{2}-1\right)\ln\left(c_{2}-1\right)+1+\mathcal{O}\left(\frac{1}{m}\right). \end{align}\tag{11}\]
Before presenting proofs of the main results in Section 2, we perform a number of numerical studies as summarized in the two figures. In Figure 1, we plot average relative entropy of states from the same ensemble, where the left-hand side and right-hand side subfigures correspond to the Hilbert-Schmidt and Bures-Hall ensembles, respectively. For each ensemble, we consider three different values of parameter \(c_{1}\) while fixing \(c_{2}\). The numerical simulations, illustrated by the scatter points with each averaged over \(10^{6}\) realizations of density matrices, match well with the exact formulas of Proposition 1 and Proposition 2. As system dimension \(m\) increases, we observe fast convergence to the dash-dot horizontal lines representing limiting values computed from Corollary 1 and (10 ). It is seen from the figure that for a fixed \(c_{2}\) the average relative entropy decreases as \(c_{1}\) increases, which agrees with the observation in [18]. This phenomenon is consistent with the general principle that relative entropy becomes larger when the two density matrices \(\rho\) and \(\sigma\) are more distinct [20]. Intuitively, as \(c_{1}\) (or \(c_{2}\)) increases, the random density matrix \(\rho\) (or \(\sigma\)) approaches a deterministic matrix, leading to diminished distinguishability between the two density matrices. The case \(c_1=c_2=1\) reaches the largest value of average relative entropy for any given \(m\), which corresponds to maximum randomness in the density matrices. In fact, the monotonic decrease as \(c_1\) or \(c_2\) increases can be verified analytically from the limiting formulas in Corollary 1 and (10 ). We also observe that the average relative entropy of the Bures-Hall ensemble attains a larger value than that of the corresponding Hilbert-Schmidt ensemble, which is due to the wider spectral width of the Bures-Hall ensemble [17].
In Figure 2, we plot average relative entropy of states from two different ensembles pertaining to the case in Proposition 3. We consider several values of parameters \(c_{1}\) and \(c_{2}\). Similarly to Figure 1, one observes even faster convergence to the dash-dot horizontal lines that represent limiting constants calculated from (11 ). As expected, it is seen that for a fixed parameter the average relative entropy decreases as the other parameter increases, where the case \(c_1=c_2=1\) corresponds to the largest value of average relative entropy for a given \(m\). It is also observed that for \(c_{1}\) greater than \(c_{2}\) the average relative entropy is greater than in the case where \(c_{1}\) and \(c_{2}\) are exchanged. This may indicate that the Bures-Hall ensemble is more robust to parameter changes than the Hilbert-Schmidt ensemble in distinguishing quantum states.
In this section, we present proofs of the main results in Propositions 1–3 on average relative entropy. In Section 2.1, we introduce random state models of the Hilbert-Schmidt and Bures-Hall ensembles, and discuss their significance in quantum information processing. In Section 2.2, we first evaluate the required average over the unitary group, which leads to a factorization into averages over the two individual density matrices, before computing the factorized averages in the scenarios considered in the three propositions.
We outline the density matrix formalism [24], introduced by von Neumann, from which different generic state ensembles are constructed. Consider a composite system of two subsystems \(A\) and \(B\) of Hilbert space dimensions \(m\) and \(n\), respectively. Without loss of generality, one assumes \(m\leq n\). A generic state of the bipartite system is written as a linear combination, with random coefficients \(c_{i,j}\), of the product bases of the two subsystems as \[\label{eq:state} \Ket{\psi}=\sum_{i=1}^{m}\sum_{j=1}^{n}c_{i,j}\Ket{i_{\rm{A}}}\otimes\Ket{j_{\rm{B}}},\tag{12}\] where the coefficients \(c_{i,j}\) are independently and identically distributed standard complex Gaussian random variables.
The Hilbert-Schmidt ensemble is the probability density function of the \(m\times m\) reduced density matrix \(\rho_{\rm{A}}=\,\mathrm{tr}_{\rm{B}}(\rho)\) of the smaller subsystem \(A\), obtained through partial trace of the full density matrix \(\rho=\ket{\psi}\bra{\psi}\), as [3], [24] \[\label{eq:HS} f_{\rm{HS}}\!\left(\rho_{\rm{A}}\right)=\frac{1}{C_{\rm{HS}}} \delta\left(1-\,\mathrm{tr}\rho_{\rm{A}}\right)\det\!\!\!~^{\alpha}(\rho_{\rm{A}})\,\mathrm{d}\rho_{\rm{A}},\tag{13}\] where \[\label{eq:cHS} C_{\rm{HS}}=\frac{\pi^{\frac{1}{2}m(m-1)}}{\Gamma\left(m(m+\alpha)\right)}\prod_{i=1}^{m}\Gamma(i+\alpha)\tag{14}\] and \[\label{eq:a} \alpha=n-m\geq0\tag{15}\] denotes the dimension difference of the two subsystems. The Hilbert-Schmidt ensemble (13 ) is also referred to as fixed-trace Wishart ensemble [3], [4], [11] constructed from a Wishart matrix \(YY^{\dagger}\) as \[\label{eq:maHS} \rho_{\rm{A}}=\frac{YY^{\dagger}}{\,\mathrm{tr}\left(YY^{\dagger}\right)}\tag{16}\] with \(Y\) denoting an \(m\times n\) matrix of independent complex Gaussian entries.
The Bures-Hall ensemble generalizes the Hilbert-Schmidt ensemble in that its state \[\label{eq:BHs} \Ket{\varphi}=\Ket{\psi}+\left(U\otimes I_{n}\right)\Ket{\psi}\tag{17}\] is a superposition of the state (12 ) with a rotation by an \(m\times m\) random unitary matrix \(U\), where \(I_{n}\) denotes an identity matrix of dimension \(n\). The probability density function of the unitary matrix \(U\) is proportional to \(\det\left(I_{m}+U\right)^{2\alpha}\). Taking the partial trace \(\rho_{\rm{A}}=\,\mathrm{tr}_{\rm{B}}(\rho)\) over density matrix \(\rho=\Ket{\varphi}\Bra{\varphi}\) of the state (17 ), the Bures-Hall ensemble is written as [5], [24] \[\label{eq:BH} f_{\rm{BH}}\!\left(\rho_{\rm{A}}\right)=\frac{1}{C_{\rm{BH}}} \delta\left(1-\,\mathrm{tr}\rho_{\rm{A}}\right)\det\!\!\!~^{\alpha}(\rho_{\rm{A}})\int_{Z}\,\mathrm{e}^{-\,\mathrm{tr}\left(\rho_{\rm{A}}Z^{2}\right)}\,\mathrm{d}Z\,\mathrm{d}\rho_{\rm{A}},\tag{18}\] where \[\label{eq:cBH} C_{\rm{BH}}=\frac{\pi^{m^2}m!~\!2^{-m(m+2\alpha-1)}}{\Gamma\left(m(m+2\alpha)/2\right)}\prod_{i=1}^{m}\frac{\Gamma(i+2\alpha)}{\Gamma(i+\alpha)}\tag{19}\] and \(Z\) is an \(m\times m\) Hermitian matrix. The matrix model corresponding to the state (17 ) of Bures-Hall ensemble is [5] \[\label{eq:maBH} \rho_{\rm{A}}=\frac{\left(I_{m}+U\right)YY^{\dagger}\left(I_{m}+U^{\dagger}\right)}{\,\mathrm{tr}\left(\left(I_{m}+U\right)YY^{\dagger}\left(I_{m}+U^{\dagger}\right)\right)},\tag{20}\] where \(YY^{\dagger}\) is a Wishart matrix as in (16 ).
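In the special case \(\alpha=0\) (so \(n=m\)), the weight \(\det\left(I_{m}+U\right)^{2\alpha}\) is constant and \(U\) in (20 ) is simply Haar distributed, which makes the Bures-Hall construction straightforward to simulate; the sketch below (illustrative dimension and sample count) compares the sampled average entanglement entropy with the exact formula (5 ).

```python
# Monte Carlo sketch of the Bures-Hall matrix model (20) for alpha = 0
# (m = n), where U is Haar distributed; the average entanglement entropy
# is compared with the exact formula (5). Parameters are illustrative.
import numpy as np
from scipy.special import digamma

rng = np.random.default_rng(2)
m = n = 3
trials = 20000

def haar_unitary(k):
    # QR of a complex Ginibre matrix with phase correction yields Haar measure
    Z = (rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    d = np.diag(R)
    return Q * (d / np.abs(d))

entropies = []
for _ in range(trials):
    Y = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
    A = (np.eye(m) + haar_unitary(m)) @ Y      # (I + U) Y, cf. (20)
    W = A @ A.conj().T
    rho = W / np.trace(W).real
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-14]                     # guard against roundoff zeros
    entropies.append(-np.sum(lam * np.log(lam)))

exact = digamma(m * n - m**2 / 2 + 1) - digamma(n + 0.5)
print(np.mean(entropies), exact)  # the two values agree to a few decimals
```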
The Dirac delta function \(\delta\) in the ensembles (13 ) and (18 ) reflects the defining property of density matrices (1 ) that \[\label{eq:t1} \,\mathrm{tr}\rho_{\rm{A}}=1.\tag{21}\] It is important to point out that the two unitarily invariant densities (13 ) and (18 ) are valid for a non-negative real \(\alpha\), and so are the results obtained involving \(\alpha\). Note also that we only display matrix-variate densities of the two ensembles that are sufficient for the purpose of this work. For computation of entanglement entropy involving eigenvalue densities of the two ensembles, we refer to [3]–[6], [10]–[16].
In principle, other generic state ensembles in the space of density matrices [24] can be constructed, such as the fermionic Gaussian ensemble [7], [8]. The main reasons that the two ensembles (13 ) and (18 ) considered here stand out are discussed below.
The Hilbert-Schmidt ensemble corresponds to the simplest model of generic quantum states, where no prior information of the states needs to be assumed. The randomness of the states comes from the assumption of Gaussian distributed coefficients, which corresponds to the most non-informative distribution. The ensemble can be thought of as the baseline Gaussian model that is universal in the statistical modelling of an unknown variable. In the investigation of quantum information processing tasks, it is desirable to use Gaussian generic states to benchmark performance.
The Bures-Hall ensemble is an improved variant of the Hilbert-Schmidt ensemble that satisfies a few additional properties. The Bures metric, which induces the Bures-Hall ensemble, is the only monotone metric that is simultaneously Fisher adjusted and Fubini-Study adjusted. The Bures metric, related to quantum distinguishability, is the minimal monotone metric. The ensemble is often used as a prior distribution, known as the Bures prior, in reconstructing quantum states from measurements. Generic states from the Hilbert-Schmidt and Bures-Hall ensembles are physical in that they can be generated in polynomial time.
We consider eigenvalue decompositions of the density matrices \[\rho=V\Lambda_{\rho}V^{\dagger},~~~~~~~~\sigma=W\Lambda_{\sigma}W^{\dagger},\] where the diagonal matrices \(\Lambda_{\rho}\), \(\Lambda_{\sigma}\) consist of eigenvalues and \(V\), \(W\) are unitary matrices of eigenvectors. It is seen that the first term of the relative entropy (2 ), \[\,\mathrm{tr}(\rho\ln\rho)=\,\mathrm{tr}\left(\Lambda_{\rho}\ln\Lambda_{\rho}\right)\] does not depend on the eigenvectors, whereas the second term \[\label{eq:rem2a} \,\mathrm{tr}(\rho\ln\sigma)=\,\mathrm{tr}\left(\Lambda_{\rho}U\ln\Lambda_{\sigma}U^{\dagger}\right)\tag{22}\] does depend on eigenvectors through the unitary matrix \(U=V^{\dagger}W\). Since the densities (13 ) and (18 ) are unitarily invariant, evaluating the average (22 ) over eigenvalues and eigenvectors can be performed separately. The average over the latter is shown below via the connection to zonal polynomials [25]–[27].
For an \(m\times m\) Hermitian matrix \(X\), the \(l\)-th positive integer power of the trace can be uniquely decomposed as [27] \[\label{eq:trc} \,\mathrm{tr}^{l}\!\left(X\right)=\sum_{\kappa}C_{\kappa}(X),\tag{23}\] where the sum of zonal polynomials \(C_{\kappa}(X)\) is over partitions \(\kappa=(\kappa_{1},\kappa_{2},\dots,\kappa_{m})\) of \(l\) into no more than \(m\) parts \[\kappa_{1}+\kappa_{2}+\dots+\kappa_{m}=l,~~~~~~\kappa_{1}\geq\kappa_{2}\geq\dots\geq\kappa_{m}\geq0.\] The zonal polynomial \(C_{\kappa}(X)\) is a symmetric polynomial of degree \(l\) in the \(m\) eigenvalues \(\{x_i\}_{i=1}^{m}\) of \(X\) given by \[\label{eq:czp} C_{\kappa}(X)=\chi_{\kappa}(1)\chi_{\kappa}(X),\tag{24}\] where \[\label{eq:czp1} \chi_{\kappa}(1)=\frac{l!\prod_{1\leq i<j\leq m}(\kappa_{i}-\kappa_{j}-i+j)}{\prod_{j=1}^{m}(\kappa_{j}+m-j)!}\tag{25}\] is the dimension of the representation of the symmetric group and \[\label{eq:czp2} \chi_{\kappa}(X)=\frac{\det\left(x_{i}^{\kappa_{j}+m-j}\right)}{\det\left(x_{i}^{m-j}\right)}\tag{26}\] is the character of the representation. In fact, the character (26 ) is the Schur polynomial, which can be written as a determinant of elementary symmetric polynomials \[e_{k}\left(x_{1},\dots,x_{m}\right)=\sum_{1\leq i_1<\cdots<i_k\leq m}x_{i_1}x_{i_2}\cdots x_{i_k}\] as [28] \[\label{eq:czp3} \chi_{\kappa}(X)=\det\left(e_{\kappa'_{i}-i+j}\left(x_{1},x_{2},\dots,x_{m}\right)\right)_{i,j=1}^{l(\kappa')}\tag{27}\] with \(l(\kappa)\) and \(\kappa'\) respectively denoting the length and the conjugate of a partition \(\kappa\).
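As a concrete instance of the decomposition (23 ), for \(l=2\) and \(m=2\) the two partitions \((2,0)\) and \((1,1)\) both have \(\chi_{\kappa}(1)=1\) by (25 ), so that \(\,\mathrm{tr}^{2}(X)=C_{(2,0)}(X)+C_{(1,1)}(X)\). A small numerical sketch, with arbitrary illustrative eigenvalues:

```python
# Deterministic sketch of the decomposition (23) for l = 2 and m = 2:
# tr^2(X) = C_(2,0)(X) + C_(1,1)(X) with C_kappa = chi_kappa(1) chi_kappa(X).
import math

def chi_dim(kappa, l):
    # dimension formula (25) for a partition kappa of l (0-based indices here)
    m = len(kappa)
    num = math.factorial(l)
    for i in range(m):
        for j in range(i + 1, m):
            num *= kappa[i] - kappa[j] - i + j
    den = 1
    for j in range(m):
        den *= math.factorial(kappa[j] + m - 1 - j)
    return num // den

x1, x2 = 1.7, 0.4  # illustrative eigenvalues of a 2 x 2 Hermitian X
# Schur polynomials of the two partitions of l = 2 in two variables
s = {(2, 0): x1**2 + x1 * x2 + x2**2,
     (1, 1): x1 * x2}
total = sum(chi_dim(k, 2) * s[k] for k in s)
print(total, (x1 + x2) ** 2)  # both equal tr^2(X)
```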
Of interest to this work is the following integral identity of zonal polynomials involving two \(m\times m\) Hermitian matrices \(X\), \(Y\) over the invariant measure \(\,\mathrm{d}U\) of the unitary group \(U(m)\) as [25], [26] \[\label{eq:AUBU} \int_{U(m)}C_{\kappa}\!\left(XUYU^{\dagger}\right)\,\mathrm{d}U=\frac{C_{\kappa}\!\left(X\right)C_{\kappa}\!\left(Y\right)}{C_{\kappa}\!\left(I_{m}\right)}\tag{28}\] valid for any partition \(\kappa\). As a result, the unitary matrix in (22 ) is integrated out as \[\begin{align} \int_{U(m)}\,\mathrm{tr}(\rho\ln\sigma)\,\mathrm{d}U &=& \int_{U(m)}C_{1}\!\left(\Lambda_{\rho}U\ln\Lambda_{\sigma}U^{\dagger}\right)\,\mathrm{d}U \tag{29}\\ &=&\frac{C_{1}\!\left(\Lambda_{\rho}\right)C_{1}\!\left(\ln\Lambda_{\sigma}\right)}{C_{1}\!\left(I_{m}\right)} \tag{30} \\ &=&\frac{1}{m}\,\mathrm{tr}\rho\,\mathrm{tr}\left(\ln\sigma\right) \tag{31} \\ &=&\frac{1}{m}\,\mathrm{tr}\left(\ln\sigma\right). \tag{32} \end{align}\] In obtaining (29 ), we have employed the result (23 ) for \(l=1\), \[\,\mathrm{tr}(X)=C_{1}(X),\] where the only partition is \[\label{eq:k1} \kappa_{1}=1,~~~~\kappa_{2}=\dots=\kappa_{m}=0.\tag{33}\] The result (30 ) is an application of (28 ). The equality (31 ) follows from the identity \[C_{1}\!\left(I_{m}\right)=\chi_{\kappa}(1)\chi_{\kappa}\!\left(I_{m}\right)=m,\] which is established by inserting the partition (33 ) into (25 ) and (27 ) leading respectively to \[\chi_{\kappa}(1)=1,~~~~~~\chi_{\kappa}\!\left(I_{m}\right)=e_{1}(1,1,\dots,1)=m.\] Finally, (32 ) is due to the fact that \(\rho\) is a density matrix (21 ).
The above machinery of evaluating unitary integrals (28 ) through the connection to zonal polynomials (23 ) will be a key ingredient in computing higher-order moments of relative entropy. However, if the only goal is to compute the average relative entropy, the result (32 ) can be directly obtained by a standard fact of Weingarten calculus of a unitary matrix \(U=\left(u_{ij}\right)_{i,j=1}^{m}\), \[\int_{U(m)}u_{ij}u_{kl}^{\dagger}\,\mathrm{d}U=\frac{1}{m}\delta_{ik}\delta_{jl}\] as \[\begin{align} \int_{U(m)}\,\mathrm{tr}(\rho\ln\sigma)\,\mathrm{d}U &=& \int_{U(m)}\,\mathrm{tr}\left(\Lambda_{\rho}U\ln\Lambda_{\sigma}U^{\dagger}\right)\,\mathrm{d}U \\ &=&\sum_{i,j=1}^{m}\rho_{i}\ln\sigma_{j}\int_{U(m)}|u_{ij}|^{2}\,\mathrm{d}U \\ &=&\frac{1}{m}\sum_{i=1}^{m}\rho_{i}\sum_{j=1}^{m}\ln\sigma_{j} \\ &=&\frac{1}{m}\,\mathrm{tr}\left(\ln\sigma\right), \label{eq:U5} \end{align}\tag{34}\] where \(\{\rho_i\}_{i=1}^{m}\) and \(\{\sigma_i\}_{i=1}^{m}\) are the set of eigenvalues of \(\rho\) and \(\sigma\), respectively.
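The Weingarten fact above, equivalently the \(\kappa=(1)\) case of (28 ), is readily confirmed by Monte Carlo: the Haar average of \(\,\mathrm{tr}(XUYU^{\dagger})\) equals \(\,\mathrm{tr}(X)\,\mathrm{tr}(Y)/m\). The test matrices below are arbitrary illustrative choices.

```python
# Monte Carlo sketch of the rank-one unitary integral: the Haar average of
# tr(X U Y U^+) equals tr(X) tr(Y) / m. Matrices and sample count are
# illustrative choices.
import numpy as np

rng = np.random.default_rng(3)
m, trials = 3, 20000
X = np.diag([1.0, 2.0, 3.0])
Y = np.diag([4.0, 5.0, 6.0])

def haar_unitary(k):
    # QR of a complex Ginibre matrix with phase correction yields Haar measure
    Z = (rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    d = np.diag(R)
    return Q * (d / np.abs(d))

total = 0.0
for _ in range(trials):
    U = haar_unitary(m)
    total += np.trace(X @ U @ Y @ U.conj().T).real
avg = total / trials
print(avg, np.trace(X) * np.trace(Y) / m)  # both close to 30
```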
Either (32 ) or (34 ) now leads to the result \[\label{eq:rem3} \,\mathbb{E}\!\left[\,\mathrm{tr}(\rho\ln\sigma)\right]=\frac{1}{m}\,\mathbb{E}\!\left[\,\mathrm{tr}(\ln\sigma)\right],\tag{35}\] which is the starting point of the following proofs.
For the Hilbert-Schmidt ensemble (13 ), by (3 ) and (35 ) the remaining task is to compute \[\,\mathbb{E}\!\left[\,\mathrm{tr}(\ln\sigma_{\rm{HS}})\right]\] that amounts to \[\begin{align} \,\mathbb{E}\!\left[\,\mathrm{tr}(\ln\sigma_{\rm{HS}})\right] &=& \,\mathbb{E}\!\left[\ln\left(\det\sigma_{\rm{HS}}\right)\right] \\ &=&\frac{1}{C_{\rm{HS}}}\int\delta\left(1-\,\mathrm{tr}\sigma_{\rm{HS}}\right)\frac{\,\mathrm{d}}{\,\mathrm{d}\alpha}\det\!\!\!~^{\alpha}(\sigma_{\rm{HS}})\,\mathrm{d}\sigma_{\rm{HS}} \\ &=&\frac{1}{C_{\rm{HS}}}\frac{\,\mathrm{d}}{\,\mathrm{d}\alpha}C_{\rm{HS}}, \label{eq:dHSc} \end{align}\tag{36}\] where one has made use of the definition \[C_{\rm{HS}}=\int\delta\left(1-\,\mathrm{tr}\sigma_{\rm{HS}}\right)\det\!\!\!~^{\alpha}(\sigma_{\rm{HS}})\,\mathrm{d}\sigma_{\rm{HS}}.\] The result (36 ) requires the \(\alpha\)-derivative of the constant (14 ) computed as \[\begin{align} \frac{\,\mathrm{d}}{\,\mathrm{d}\alpha}C_{\rm{HS}} &=& \left(-m\psi_{0}(m(m+\alpha))+\sum_{i=1}^{m}\psi_0(i+\alpha)\right)C_{\rm{HS}} \\ &=& \left(-m\psi_{0}(mn)+n\psi_{0}(n)-\alpha\psi_0(\alpha)-m\right)C_{\rm{HS}}, \label{eq:dHScr} \end{align}\tag{37}\] where we have utilized the definitions (6 ), (15 ), and the summation formula [4], [13] \[\label{eq:sum} \sum_{i=1}^{m}\psi_0(i+\alpha)=(m+\alpha)\psi_{0}(m+\alpha)-\alpha\psi_0(\alpha)-m.\tag{38}\] Inserting (37 ) into (36 ), one arrives at \[\label{eq:rem3HS} \,\mathbb{E}\!\left[\,\mathrm{tr}(\ln\sigma_{\rm{HS}})\right]=-m\psi_{0}(mn)+n\psi_{0}(n)-\alpha\psi_0(\alpha)-m.\tag{39}\] Finally, putting together the results (3 ), (4 ), (35 ), and (39 ) while keeping in mind that the respective parameters in (4 ) and (39 ) are \(n_{1}\) and \(n_{2}\) leads to the average relative entropy formula in Proposition 1.
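The intermediate result (39 ) can also be verified by simulation, since \(\,\mathrm{tr}(\ln\sigma_{\rm{HS}})=\ln\det\sigma_{\rm{HS}}\); a sketch with illustrative dimensions:

```python
# Monte Carlo sketch of (39): the Hilbert-Schmidt average of tr(ln sigma)
# equals -m psi0(mn) + n psi0(n) - alpha psi0(alpha) - m. Parameters are
# illustrative choices.
import numpy as np
from scipy.special import digamma

rng = np.random.default_rng(4)
m, n, trials = 3, 5, 20000
alpha = n - m

vals = []
for _ in range(trials):
    # fixed-trace Wishart density matrix, cf. (16)
    Y = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
    W = Y @ Y.conj().T
    sigma = W / np.trace(W).real
    vals.append(np.sum(np.log(np.linalg.eigvalsh(sigma))))  # ln det sigma

exact = -m * digamma(m * n) + n * digamma(n) - alpha * digamma(alpha) - m
print(np.mean(vals), exact)  # the two values agree to a few decimals
```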
For the Bures-Hall ensemble (18 ), the remaining task is also to compute \[\,\mathbb{E}\!\left[\,\mathrm{tr}(\ln\sigma_{\rm{BH}})\right]\] which, similarly to (36 ), boils down to computing the \(\alpha\)-derivative of the constant (19 ) as
\[\begin{align} \,\mathbb{E}\!\left[\,\mathrm{tr}(\ln\sigma_{\rm{BH}})\right] &=& \,\mathbb{E}\!\left[\ln\left(\det\sigma_{\rm{BH}}\right)\right] \nonumber \\ &=&\frac{1}{C_{\rm{BH}}}\int\delta\left(1-\,\mathrm{tr}\sigma_{\rm{BH}}\right)\frac{\,\mathrm{d}}{\,\mathrm{d}\alpha}\det\!\!\!~^{\alpha}(\sigma_{\rm{BH}})\int_{Z}\,\mathrm{e}^{-\,\mathrm{tr}\left(\sigma_{\rm{BH}}Z^{2}\right)}\,\mathrm{d}Z\,\mathrm{d}\sigma_{\rm{BH}} \nonumber \\ &=&\frac{1}{C_{\rm{BH}}}\frac{\,\mathrm{d}}{\,\mathrm{d}\alpha}C_{\rm{BH}} \nonumber \\ &=&-2m\ln2-m\psi_{0}\!\left(\frac{m}{2}(m+2\alpha)\right)+2\sum_{i=1}^{m}\psi_{0}(i+2\alpha)-\sum_{i=1}^{m}\psi_{0}(i+\alpha) \nonumber\\ &=&-m\psi_{0}\!\left(mn-\frac{m^{2}}{2}\right)-n\psi_{0}\!\left(n\right)-\alpha\left(\psi_{0}\!\left(\alpha\right)+2\psi_{0}\!\left(\alpha+\frac{1}{2}\right)\right) \nonumber\\ &&+\left(2n-m\right)\left(\psi_{0}\!\left(n-\frac{m}{2}\right)+\psi_{0}\!\left(n-\frac{m}{2}+\frac{1}{2}\right)\right)-m, \label{eq:rem3BH} \end{align}\tag{40}\] where we have utilized the definitions (6 ), (15 ), the summation formula (38 ), and the identity [23] \[2\psi_{0}(2x)=\psi_{0}\!\left(x+\frac{1}{2}\right)+\psi_{0}(x)+2\ln2.\] Inserting the results (5 ), (35 ), and (40 ) into (3 ) before replacing parameters \(n\) in (5 ) and (40 ) respectively by \(n_{1}\) and \(n_{2}\), we arrive at the formula in Proposition 2.
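The two digamma identities invoked in the last steps, the summation formula (38 ) and the duplication formula, can be checked directly; the arguments below are arbitrary illustrative values.

```python
# Deterministic check of the summation formula (38) and the duplication
# identity 2 psi0(2x) = psi0(x + 1/2) + psi0(x) + 2 ln 2.
import numpy as np
from scipy.special import digamma

m, alpha = 7, 2.5  # illustrative values; (38) holds for real alpha > 0
lhs = sum(digamma(i + alpha) for i in range(1, m + 1))
rhs = (m + alpha) * digamma(m + alpha) - alpha * digamma(alpha) - m
print(lhs - rhs)  # ~0

x = 3.2
print(2 * digamma(2 * x) - digamma(x + 0.5) - digamma(x) - 2 * np.log(2))  # ~0
```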
For average relative entropy between the Bures-Hall and Hilbert-Schmidt ensembles, the corresponding formulas can be read off from the obtained results (5 ), (35 ), and (39 ) as \[\begin{align} \,\mathbb{E}\!\left[D(\rho_{\rm{BH}}||\sigma_{\rm{HS}})\right]&=&\,\mathbb{E}\!\left[\,\mathrm{tr}(\rho_{\rm{BH}}\ln\rho_{\rm{BH}})\right]-\,\mathbb{E}\!\left[\,\mathrm{tr}(\rho_{\rm{BH}}\ln\sigma_{\rm{HS}})\right]\\ &=&\,\mathbb{E}\!\left[\,\mathrm{tr}(\rho_{\rm{BH}}\ln\rho_{\rm{BH}})\right]-\frac{1}{m}\,\mathbb{E}\!\left[\,\mathrm{tr}(\ln\sigma_{\rm{HS}})\right]\\ &=&\psi_{0}\!\left(mn_{2}\right)-\psi_{0}\!\left(mn_{1}-\frac{m^{2}}{2}+1\right)+\psi_{0}\!\left(n_{1}+\frac{1}{2}\right)\nonumber\\ &&-~\!\frac{1}{m}\left(n_{2}\psi_{0}\!\left(n_{2}\right)-\alpha_{2}\psi_{0}\!\left(\alpha_{2}\right)\right)+1, \end{align}\] which is the formula in Proposition 3.
Similarly, an exact formula of average relative entropy of the case \(D(\rho_{\rm{HS}}||\sigma_{\rm{BH}})\) can be extracted as \[\begin{align} \label{eq:r4} \,\mathbb{E}\!\left[D(\rho_{\rm{HS}}||\sigma_{\rm{BH}})\right]&=&\psi_{0}\!\left(mn_{2}-\frac{m^{2}}{2}\right)-\psi_{0}\!\left(mn_{1}+1\right)+\psi_{0}\!\left(n_{1}\right)\nonumber\\ &&+~\!\frac{n_{2}}{m}\psi_{0}\!\left(n_{2}\right)-\frac{2n_{2}-m}{m}\left(\psi_{0}\!\left(n_{2}-\frac{m}{2}\right)+\psi_{0}\!\left(n_{2}-\frac{m}{2}+\frac{1}{2}\right)\right)\nonumber\\ &&+~\!\frac{\alpha_{2}}{m}\left(\psi_{0}\!\left(\alpha_{2}\right)+2\psi_{0}\!\left(\alpha_{2}+\frac{1}{2}\right)\right)+\frac{m+2n_{1}+1}{2n_{1}}. \end{align}\tag{41}\]
The work of Lu Wei was supported by the U.S. National Science Foundation (2306968) and the U.S. Department of Energy (DE-SC0024631).