This article introduces operator on operator regression in quantum probability. In the regression model considered here, the response and the independent variable are operator valued observables, linearly associated through an unknown scalar coefficient, and the error is a random operator. In the course of this study, we propose a quantum version of a class of M-estimators of the unknown coefficient, and we derive the large sample behaviour of these estimators under the assumption that the true model is also linear and the samples are observed eigenvalue pairs of the operator valued observables.
Regression analysis is one of the best-known tools in statistical modelling. More specifically, regression analysis studies the “relationship” between the dependent/response variable and one or more independent/explanatory variables.
This “relationship” can be linear, non-linear or even non-parametric (see, e.g., [1] and a few relevant references therein), and using various
statistical techniques, one may estimate the unknown parameters involved in the “relationship”. Recently, the concept of regression has been extended to function valued random elements as well; for instance, the interested reader may look at [2] and a few relevant references therein regarding function on function regression. However, to the best of our knowledge, regression analysis is an area
unexplored in the setting of quantum probability. This work aims to investigate various statistical issues involved in operator on operator regression in quantum probability. We first provide a literature review and then briefly describe the research problem, without
much technical detail, in the following.
In the context of other statistical concepts studied in the language of quantum probability theory: recently, [3] extended the concepts of sufficient
statistics and Rao-Blackwellization to the framework of quantum probability, and a few years earlier, [4] surveyed the problem of
statistical sufficiency in the setting of quantum statistics. They provided necessary and sufficient conditions for sufficiency and established many results from traditional statistical theory in the language of noncommutative geometry, and at the same
time, [5] derived the noncommutative version of the well-known factorization theorem of traditional statistics (see, e.g., [6]).
Let us now consider a pair of operator valued observables (see, e.g., [7] for details on observables) with
observed pairs of eigenvalues. Observe that in the setting of quantum probability, we observe the eigenvalues of the observables, unlike the observations on the random element itself as in traditional statistics/classical probability. Suppose that the unobserved operators
are linearly associated, in the true model, through a scalar valued coefficient (see 1), and, in the sense of regression modelling, the operator valued observables are assumed to be linearly related through a
scalar valued coefficient with some operator valued error (see 2). In this work, based on the observed eigenvalue pairs, we estimate the unknown parameter involved in the model described in 2. In the course of this study, we establish the connection between
the concepts in quantum probability and classical probability (see Section 2.1), and in view of this relation, we investigate the large sample properties (see
Section 3).
The rest of the article is structured as follows. Section 2 describes the preliminaries of the model in terms of quantum probability; for this problem, a key connection between quantum probability and classical
probability is investigated in Section 2.1, and the problem is reformulated in terms of classical probability in Section 2.2. The large sample properties of the proposed estimator are
studied in Section 3, and Section 4 contains a few relevant concluding remarks. Finally, the necessary technical details are provided in the Appendix in Section 5.
Consider a real separable Hilbert space with its inner product and norm. If two bounded linear operators on this space are related through multiplication by a non-zero real scalar, then given any eigenvalue of the first operator, the scaled value is an eigenvalue
of the second operator with the same eigenvector of unit norm, and conversely. In light of this fact, one may expect that the linear regression model associating two
operator valued observables can be represented through their corresponding eigenvalues. The research problem is formally described in the following.
Let the two observables be compact self-adjoint operators (on the Hilbert space) related through an unknown scalar. We now consider the operator on operator regression model in which the error is a random operator and the coefficient is an unknown scalar constant. In this work, our objective is to propose an
estimator of the unknown coefficient based on the observed eigenvalues of the pair of observables. The technical assumptions are stated in
Section 3, where the large sample properties of the estimator are studied.
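The eigenvalue correspondence noted above can be checked numerically. The following is a minimal sketch with an illustrative 3×3 symmetric matrix of our own choosing (standing in for a compact self-adjoint operator), not a matrix from the paper:

```python
import numpy as np

# If B = c*A for a self-adjoint A and a non-zero real scalar c, then every
# eigenpair (lam, v) of A gives the eigenpair (c*lam, v) of B, with the
# same unit-norm eigenvector.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
c = -1.5
B = c * A

eigvals_A, eigvecs_A = np.linalg.eigh(A)  # A is symmetric, so eigh applies
for lam, v in zip(eigvals_A, eigvecs_A.T):
    assert np.isclose(np.linalg.norm(v), 1.0)   # unit-norm eigenvector
    assert np.allclose(B @ v, (c * lam) * v)    # same vector, scaled eigenvalue
```

Accordingly, the eigenvalues of the two operators occur in matched pairs sharing a common eigenvector, which is the structural fact the regression model below exploits.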
Suppose that we have pairs of observed eigenvalues of the two observables. Motivated by our earlier
observation on deterministic operators, we assume that each pair appears with a common eigenvector of unit norm. As mentioned earlier, using the observed eigenvalue pairs, we would like to estimate the unknown parameter. The estimation procedure is described in the following.
In order to carry out the estimation, we first reformulate the problem from its original statement in terms of the observables in the quantum
probability setup to a statement in terms of their eigenvalue pairs in the classical probability setup. Using the classical probability model thus obtained, the problem is reduced to a standard linear regression problem,
and we can then use classical statistical tools to estimate the unknown parameter. Section 2.1 studies the eigen decomposition of the operator valued observables, which is the key concept in reformulating the problem in the setup of classical probability.
2.1 Distribution of eigen decomposition of the observables: From quantum to classical probability
Assume that the two observables are compact self-adjoint operators on the Hilbert space with discrete spectra. As per our discussion above, the eigenspaces corresponding to each feasible eigenvalue pair are the same. Consequently, we have the corresponding
spectral decompositions (see [8]), where each sum is countable and the projections map onto the eigenspaces of the respective eigenvalues. Here, the eigenvalues are all real and the
eigenspaces corresponding to the non-zero eigenvalues are finite-dimensional. Now, given any state (a positive operator of unit trace), we have the expectations in the setting of quantum
probability as traces of the state against the observables. In particular, the expectation of each observable is the sum of its eigenvalues weighted by the traces of the state against the corresponding spectral projections; a similar statement is true when one observable is replaced by the other.
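The spectral identity just described can be illustrated in finite dimensions. In the sketch below, all matrices (the observable, the state) are chosen by us purely for illustration:

```python
import numpy as np

# For a self-adjoint A with spectral projections P_lam and a state rho
# (positive, unit trace): tr(rho A) = sum over lam of lam * tr(rho P_lam).
rng = np.random.default_rng(0)

A = np.diag([1.0, 2.0, 2.0, 5.0])   # discrete spectrum {1, 2, 5}, one eigenvalue repeated
M = rng.normal(size=(4, 4))
rho = M @ M.T                        # positive semi-definite
rho /= np.trace(rho)                 # normalize to unit trace: a valid state

eigvals, eigvecs = np.linalg.eigh(A)
weights = {}
for lam, v in zip(eigvals, eigvecs.T):
    # accumulate tr(rho P_lam); degenerate eigenvalues share one projection
    weights[round(lam, 9)] = weights.get(round(lam, 9), 0.0) + v @ rho @ v

# The weights tr(rho P_lam) form a probability mass function on the spectrum,
# and the quantum expectation tr(rho A) is the classical mean under that pmf.
assert np.isclose(sum(weights.values()), 1.0)
assert np.isclose(sum(lam * w for lam, w in weights.items()), np.trace(rho @ A))
```

This is exactly the mechanism that lets the quantum expectation be read as a classical expectation over the spectrum, as formalized in Remark 1.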
Remark 1. Observe that one can think of the observed eigenvalues of the first observable as drawn independently from a discrete probability distribution supported on its spectrum, with probability mass function given by the traces of the state against the corresponding spectral projections. Similarly, the observed eigenvalues of the second observable are drawn independently from a discrete probability distribution supported on its spectrum, with the analogous probability mass function. This fact enables us to reformulate the problem in the setup of traditional statistics.
Given the pairs of observed eigenvalues of the two observables, let the corresponding unit-norm eigenvectors be fixed as above. Then, taking inner products along these eigenvectors and using 2, 3 and 4, the operator equation reduces, for each observed pair, to a scalar linear relation, and consequently we obtain the regression equation in 5.
We now estimate the unknown parameter involved in 5, based on the observed eigenvalue pairs, in various ways, by assuming conditions on the scalar error terms. Recall from Remark 1 that the observed eigenvalues of each observable are i.i.d. random variables following a discrete probability distribution supported on the corresponding spectrum, with the probability mass functions described there.
For the observed data, the M-estimator of the unknown parameter in 5 is defined as the minimizer of the sum of a certain convex function of the residuals; a few more assumptions on this function will be stated in the subsequent section. Observe that for the squared error loss, the M-estimator
coincides with the well-known least squares estimator (see, e.g., [9], pp.–43) based on the data,
and for the absolute error loss, it coincides with the well-known least absolute deviation estimator
(see, e.g., [9], pp.–43). Besides, it also includes Huber's
estimator (see, e.g., [9], p.), the Lp regression estimate (see, e.g., [10]) and the well-known regression quantiles (see, e.g., [9], p., equation
(5.5)). This is one of the reasons to consider the technique of M-estimation: it includes many well-known estimators of the unknown regression parameter.
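A minimal numerical sketch of this class of estimators, in our own notation (the grid minimizer and all data choices below are purely illustrative, not the paper's procedure):

```python
import numpy as np

# M-estimator of a scalar regression coefficient: beta_hat minimizes
# sum_i rho(y_i - beta * x_i) over beta, for a convex loss rho. Squared
# error, absolute error and the Huber loss recover the least squares,
# least absolute deviation and Huber estimators mentioned in the text.
def m_estimate(x, y, rho, grid=np.linspace(0.0, 4.0, 4001)):
    residuals = y[None, :] - grid[:, None] * x[None, :]
    return grid[np.argmin(rho(residuals).sum(axis=1))]

squared = lambda r: r ** 2
absolute = np.abs
def huber(r, k=1.345):
    a = np.abs(r)
    return np.where(a <= k, 0.5 * r ** 2, k * (a - 0.5 * k))

rng = np.random.default_rng(1)
x = rng.choice([1.0, 2.0, 3.0], size=500)      # stand-in for observed eigenvalues
y = 2.0 * x + rng.normal(scale=0.1, size=500)  # true scalar coefficient is 2

for rho in (squared, absolute, huber):
    assert abs(m_estimate(x, y, rho) - 2.0) < 0.05
```

All three losses recover the true coefficient on this clean simulated data; they differ, of course, in their robustness to heavy-tailed errors.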
In order to study the large sample properties of the estimator defined in 6, one needs to assume the following technical conditions. (A1)
The convex function in 6 is strictly convex.
(A2) The error random variables are i.i.d., and their common distribution (which depends on the joint distribution of the eigenvalue pairs) assigns probability zero to the set of discontinuity points of the derivative of the convex function in 6.
(A3) as . Here and are two constants.
(A4) exists for some , where is a constant. Moreover, is a continuous function at .
(A5) .
(A6) for some and
Remark 2. Condition (A1) is required for the minimization problem described in 6 to have a unique minimum, and it is a common assumption across the literature on M-estimation (see, e.g., [9]) in traditional statistics. Condition (A2) indicates certain smoothness of the loss function along with the fact that the error terms
(in the form of inner products) are i.i.d. random variables. Conditions (A3) and (A4) imply the existence of the derivative of the criterion function of estimation in some neighbourhood of the true parameter, which is necessary to solve the minimization problem involved in the estimation. Next, Condition (A5) implies that the M-estimator (see 6), after appropriate normalization, weakly converges to a non-degenerate random variable (see the statement of Theorem 2 below), and finally, Condition (A6) is required for the asymptotic normality of the estimator; from a statistical point of view, one can interpret it as saying that the variation explained by any single eigenvalue
will not be a dominating factor.
We now state the consistency and asymptotic normality results associated with the proposed estimator.
Theorem 1. Under (A1)–(A6), as . Here denotes the convergence in probability, and and are the same as defined in 6 and 5 , respectively.
Theorem 2. Under (A1)–(A6), where is a random variable associated with the standard
normal distribution, and denotes the convergence in distribution. Here and , where , and are the same as defined in (A5), (A3) and (A6), respectively.
Remark 3. In the setup of quantum probability, the observations are eigenvalue pairs of the observables, unlike classical probability, where the observations are obtained from the distribution of the random element itself. Despite this issue, Theorems 1 and 2 assert that the proposed estimator is a consistent estimator of the unknown parameter and, after appropriate normalization, converges weakly to a standard normal random variable.
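The consistency statement can be illustrated by a small Monte Carlo experiment in the classical reformulation, with eigenvalue pairs satisfying a linear relation with i.i.d. errors. All numerical choices below (true coefficient 2, the discrete support, the error scale) are our own, and we use the least squares special case of the M-estimator because it has a closed form:

```python
import numpy as np

rng = np.random.default_rng(2)

def beta_hat(n, beta=2.0):
    x = rng.choice([1.0, 2.0, 5.0], size=n)        # draws from a discrete pmf on the spectrum
    y = beta * x + rng.normal(scale=0.5, size=n)   # observed eigenvalues of the response
    return (x @ y) / (x @ x)                       # closed-form least squares estimate

# Consistency: the estimation error becomes negligible as the sample grows.
assert abs(beta_hat(100000) - 2.0) < 0.01
```

Repeating the experiment over many replications and inspecting the normalized errors would likewise illustrate the asymptotic normality in Theorem 2.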
In this work, observe that the true model (see 1) is also linear with a scalar valued coefficient; in other words, we work with a correctly specified model. However, the study would be entirely different if the model became
mis-specified, which is of great interest in many statistical problems (see, e.g., [11] and a few relevant references therein). Investigating this issue
may be of interest for future research. Besides, we have assumed our operators to be compact and self-adjoint; the classical analogue is to assume that the real random variables are discrete. It will be of interest to study the case when the involved operators are merely
self-adjoint: the spectrum may then be continuous, and in that case our technique would not suffice. Moreover, as the parameter in this work is scalar, the operators involved here are a priori simultaneously
diagonalizable, and therefore they can be measured simultaneously.
In the course of this study, it is assumed that the error random variables involved in 5 are i.i.d., and the assertions in the main results (i.e., Theorems 1 and 2)
rely on this assumption. However, this assumption can be weakened by following the approach considered in [12]. In this
work, we did not pursue this issue, as establishing the large sample properties of the estimator under weaker assumptions is not the main theme of the work; we therefore kept the assumption simple so that the reader can conveniently appreciate the
notion of quantum probability in operator on operator regression.
Subhra Sankar Dhar gratefully acknowledges his core research grant (CRG/2022/001489), Government of India. Suprio Bhar gratefully acknowledges the Matrics grant MTR/2021/000517 from the Science and Engineering Research Board (Department of Science &
Technology, Government of India). Soumalya Joardar acknowledges support from the SERB MATRICS grant (MTR 2022/000515).
The arguments in the proof are parallel to the proof of Theorem 2.2 in [13]. Without loss of
generality, we assume that . It is now enough to show that as , where
is such that as . Observe that it follows from (3.11) in [13] that there exists a sequence such that as and for all . Moreover,
in view of this, we also have as , where is the same as defined in the statement of Theorem 2.
Further, when for some , observe that which follows from the definitions of the quantities given in the statement of Theorem 2 and in (A6) before Remark 2. Moreover, observe that using (A4) on 7 , we have and hence,
Therefore, using 7 , 8 , 9 and 10 , we have as , and next, using (A1), we have as .
Finally, by the definition of in 6 , we have as . As for all , the proof follows.
Without loss of generality, we here consider . First, using (A6) and in view of the Lindeberg-Feller CLT (see [9]), we have as , where is a random variable associated with the standard normal distribution, is
the same as defined in the statement of Theorem 2, and ‘’ denotes weak convergence. Hence, in order to prove this theorem, it is
enough to show that as , where and are the same as defined in 14 and 6 , respectively. Observe that using 7 and the definition of in 14 , we have
as , and for some sequence such that as and for some arbitrary constant , we have as . Now, it follows from 13 and 17 that as . Moreover, observe that when , we have Therefore, using 18 and 19 , we have as , where is an arbitrary constant. This completes the proof.
[1]
Dhar, S. S., Jha, P., and Rakshit, P. (2022). The trimmed mean in non-parametric regression function estimation. Theory Probab. Math. Statist., (107):133–158.
[2]
Dette, H. and Tang, J. (2024). Statistical inference for function-on-function linear regression. Bernoulli, 30(1):304–331.
[3]
Sinha, K. B. (2022). Sufficient statistic and Rao-Blackwell theorem in quantum probability. Infin. Dimens. Anal. Quantum Probab. Relat. Top.,
25(4):Paper No. 2240005, 16.
[4]
Jenčová, A. and Petz, D. (2006b). Sufficiency in quantum statistical inference. A survey with examples. Infin. Dimens. Anal. Quantum Probab. Relat. Top.,
9(3):331–351.
[5]
Jenčová, A. and Petz, D. (2006a). Sufficiency in quantum statistical inference. Comm. Math. Phys., 263(1):259–276.
[6]
Shao, J. (2003). Mathematical statistics. Springer Texts in Statistics. Springer-Verlag, New York, second edition.
[7]
Dirac, P. A. M. (1945). On the analogy between classical and quantum mechanics. Rev. Modern Phys., 17:195–199.
[8]
Knapp, A. W. (2005). Advanced real analysis. Cornerstones. Birkhäuser Boston, Inc., Boston, MA. Along with a companion volume Basic real analysis.
[9]
van der Vaart, A. W. (1998). Asymptotic statistics, volume 3 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press,
Cambridge.
[10]
Lai, P. Y. and Lee, S. M. S. (2005). An overview of asymptotic properties of Lp regression under general classes of error distributions. Journal of the American Statistical
Association, 100(470):446–458.
[11]
Bagchi, P. and Dhar, S. S. (2024). Characterization of the least squares estimator: mis-specified multivariate isotonic regression model with dependent errors. Theory Probab. Math.
Statist., (110):143–158.
[12]
Wu, W. B. (2007). M-estimation of linear models with dependent errors. The Annals of Statistics, 35(2):495–521.
[13]
Bai, Z. D., Rao, C. R., and Wu, Y. (1992). M-estimation of multivariate linear regression parameters under a convex discrepancy function. Statistica Sinica, 2(1):237–254.