February 09, 2024
We develop the Perron–Frobenius theory using a variational approach and extend it to arbitrary matrices, including those that are neither irreducible nor essentially positive, and to cones that are not preserved by the matrix. We introduce a new concept, the "quasi-eigenvalue of a matrix", which is invariant under orthogonal transformations of variables and has various useful properties, such as determining the largest real part of the eigenvalues of a matrix. We extend Weyl’s inequality for eigenvalues to arbitrary matrices and prove a new stability result for the Perron root of irreducible nonnegative matrices under arbitrary perturbations. In addition, we obtain new types of estimates for the ranges of the sets of eigenvalues and their real parts.
Keywords: minimax problems, matrix analysis, Perron root, Birkhoff–Varga formula, Rayleigh quotient
MSC: 47A75, 47A08, 47A25, 49J35, 47A10
The Perron–Frobenius theorem, established by O. Perron (1907) and G. Frobenius (1912), states that any irreducible non-negative matrix \(A=(a_{i,j})\) has a simple eigenvalue \(\lambda^*(A) \in \mathbb{R}\) of the largest magnitude; moreover, the corresponding right and left eigenvectors \(\phi^*\), \(\psi^*\) are positive. This theory was later extended by M. Krein and M. Rutman (1948) to operators on a Banach space that preserve cones. Another development was proposed by G. Birkhoff and R. Varga (1958), who extended the theory to essentially positive matrices, including those for which the cone of positive vectors \(S^o_+:=\{u\in \mathbb{R}^n\mid~u_i> 0,i=1,\ldots,n\}\) is not necessarily preserved. Furthermore, they discovered the identity \[\label{BV} \lambda^*(A)=\lambda_{S_+}^*(A):=\sup _{u\in S^o_+}\inf _{v\in S^o_+}{\frac{\left\langle A u, v\right\rangle}{\left\langle u, v\right\rangle}}=\inf _{v\in S^o_+}\sup _{u\in S^o_+}{\frac{\left\langle A u, v\right\rangle}{\left\langle u, v\right\rangle}}\tag{1}\] which is now known as the Birkhoff–Varga formula [1]. Here \(S_+=\overline{S^o_+}\setminus 0\), and \(\left\langle \cdot,\cdot \right\rangle\) denotes the inner product in \(\mathbb{R}^n\). The matrix \(A\) is said to be irreducible if it cannot be conjugated into block upper triangular form by a permutation matrix \(P\), and \(A\) is an essentially positive matrix if it is irreducible and has nonnegative off-diagonal elements [2]. Hereafter, \(\lambda_j(A) \in \mathbb{C}\), \(j=1,\ldots,n\), denote the eigenvalues of \(A\), repeated according to their multiplicity. For the reader’s convenience, we state the Birkhoff–Varga theorem (see Theorem 2.1 and Theorem 8.2 in [2]).
Theorem (Birkhoff–Varga). Assume that \(A\) is an essentially positive matrix. Then (1) holds true; moreover:
a) \(\lambda_{S_+}^*(A)\) is a simple eigenvalue of \(A\), and the corresponding right and left eigenvectors \(\phi^*\), \(\psi^*\) are positive.
b) \(\lambda_{S_+}^*(A)= \max\{\operatorname{Re}\,\lambda_j(A),~j=1,\ldots,n\}\).
c) If, in addition, \(A\) is a nonnegative matrix, then \(\lambda_{S_+}^*(A)= \max\{|\lambda_j(A)|,~j=1,\ldots,n\}\).
d) \(\lambda_{S_+}^*(A)\) increases when any element of \(A\) increases.
It is common to call an eigenvalue of a matrix the Perron root if it equals the spectral radius \(\rho(A)\) of \(A\). Thus, under the hypothesis of assertion c) of the theorem, the quantity \(\lambda_{S_+}^*(A)\) is the Perron root of \(A\). For general essentially positive matrices, however, this quantity is not necessarily equal to the spectral radius; moreover, it can take values of either sign. It therefore makes sense to call \(\lambda_{S_+}^*(A)\) defined by (1) the quasi-Perron root.
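The distinction between the quasi-Perron root and the spectral radius can be seen on a small example. The sketch below is an illustration only: the matrix \(A\) and the helper names `eigvals_2x2` and `rayleigh` are assumptions made here and do not come from [1] or [2]. The matrix is essentially positive (irreducible, with nonnegative off-diagonal entries), its largest real part of an eigenvalue is negative, and the extended quotient evaluated at the positive eigenvector \((1,1)^T\) reproduces that value.

```python
import cmath

def eigvals_2x2(a):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + disc) / 2, (tr - disc) / 2)

def rayleigh(a, u, v):
    """Quotient <Au, v> / <u, v> for vectors u, v with <u, v> != 0."""
    au = [sum(aij * uj for aij, uj in zip(row, u)) for row in a]
    return sum(x * y for x, y in zip(au, v)) / sum(x * y for x, y in zip(u, v))

# Hypothetical example: irreducible, nonnegative off the diagonal.
A = [[-3.0, 1.0], [1.0, -3.0]]

eigs = eigvals_2x2(A)
quasi_perron = max(z.real for z in eigs)      # largest real part
spectral_radius = max(abs(z) for z in eigs)   # largest modulus
value_at_ones = rayleigh(A, [1.0, 1.0], [1.0, 1.0])

print(quasi_perron, spectral_radius, value_at_ones)
```

Here the largest real part and the spectral radius differ both in sign and in magnitude, which is the situation described above.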
There are numerous applications of the Perron–Frobenius theory, see, e.g., [3]–[5] and references therein. This topic has gained our attention in light of the study initiated by the first author in [6]–[10], where the saddle-node bifurcations of nonlinear equations of the form \(F(u,\lambda):=T(u)-\lambda G(u)=0\) are determined in terms of the saddle points of the quotient \[\mathcal{R}(u,v):=\frac{\left\langle T(u), v\right\rangle}{\left\langle G(u), v\right\rangle},~~(u,v) \in U\times V,~\left\langle G(u), v\right\rangle \neq 0.\] Here \(U,V\) are appropriate subsets of a normed space \(X\), \(T,G: X \mapsto X^*\) are given maps. In particular, the saddle points of \(\mathcal{R}(u,v)\) can be found using the minimax formula (see [6], [8], [9]) \[\label{IG} \lambda^*=\sup _{u\in U}\inf _{v\in V}\mathcal{R}(u,v).\tag{2}\]
The minimax formula (2) can be justified by directly substituting the saddle point \((u^*,v^*)\) of \(\mathcal{R}(u,v)\) into (2) if it is known in advance, see [7], [8], [11]. This is the method that Birkhoff and Varga used in [1] to justify the minimax formula (1) (see also [12]). However, for nonlinear and infinite-dimensional equations, finding solutions is itself a challenging problem. This leads to the following problem:
(A) Could one find the saddle point \((u^*,v^*)\) of \(\mathcal{R}(u,v)\) directly through the variational problem (2)?
The reader may find some answers to this problem in [6], [8]–[10], [13], [14], which deal with variational problems of type (2), including those corresponding to nonlinear partial differential equations. At the same time, this field of study is still in its early stages, and many important questions remain unanswered. In this regard, finding the quasi-Perron root of the matrix \(A\) and the eigenvectors \(\phi^*, \psi^*\) directly through the variational problem (1) may be viewed as a first step toward problem (A).
The extension of the Krein–Rutman theory to cone-preserving matrices has received considerable attention (see, e.g., [12], [15]–[20]). In particular, a generalization of the Birkhoff–Varga formula (1) to cone-preserving matrices was studied in [12]. Note that (1) implies the well-known Collatz–Wielandt formula: \[\lambda^*_{S_+}(A)= \max_{u>0}\min_{e\in \{e_1,\ldots,e_n\}} \frac{\left\langle A u, e\right\rangle}{\left\langle u, e\right\rangle},\] where \(e_1:=(1,0,\ldots,0)^T,...,e_n:=(0,\ldots,0,1)^T \in \mathbb{R}^n\) (see, e.g., [7]). Methods based on this formula, the so-called Wielandt approach, were used to generalize the Perron–Frobenius theory to matrices preserving cones [15], [21]–[23]. The first results on generalizing Perron’s theorem to matrices that do not preserve cones were obtained by Birkhoff and Varga (1958). A remarkable feature of their method is that, along with the Perron root, it makes it possible to determine the largest real part of the eigenvalues of a matrix. Finding the largest real part of the eigenvalues of operators is important in many problems, including the stability analysis of solutions to differential equations, detection of the Hopf bifurcation, and analysis of iterative algorithms; see, e.g., [2], [24]–[29]. It is worth noting that in applications, and in particular in finding bifurcations of solutions, matrices arise that are neither essentially positive nor easy to classify (see, e.g., [10], [25]); moreover, solutions are often sought in non-invariant cones. Nevertheless, the Birkhoff–Varga approach has not received as much attention as it deserves. This naturally leads us to the following general problem:
(B) Determine the widest class of matrices and cones for which the Perron–Birkhoff theory still holds, possibly in a generalized form.
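The Collatz–Wielandt formula recalled above can be illustrated numerically. The following is a minimal sketch under stated assumptions: the positive matrix \(A\), the use of plain power iteration to produce a positive vector \(u\), and the helper names `matvec` and `collatz_wielandt_bounds` are chosen here for illustration and are not taken from the cited works. It relies on the classical fact that, for a nonnegative irreducible matrix and any \(u>0\), the Perron root lies between \(\min_i (Au)_i/u_i\) and \(\max_i (Au)_i/u_i\).

```python
def matvec(a, u):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(aij * uj for aij, uj in zip(row, u)) for row in a]

def collatz_wielandt_bounds(a, u):
    """Return (min_i (Au)_i/u_i, max_i (Au)_i/u_i) for a positive vector u."""
    ratios = [x / y for x, y in zip(matvec(a, u), u)]
    return min(ratios), max(ratios)

# Hypothetical example: a positive (hence irreducible) 2x2 matrix.
A = [[2.0, 1.0], [1.0, 3.0]]

u = [1.0, 1.0]
for _ in range(60):
    au = matvec(A, u)
    s = sum(au)
    u = [x / s for x in au]   # power-iteration step, keeps u positive

lower, upper = collatz_wielandt_bounds(A, u)
perron_root = (5.0 + 5.0 ** 0.5) / 2.0   # closed form for this particular matrix

print(lower, perron_root, upper)
```

As \(u\) approaches the Perron vector the two bounds close in on the Perron root, which is the mechanism behind the Wielandt approach mentioned above.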
Let us state our main results. In what follows, \((M_{n\times n}(\mathbb{R}),\|\cdot\|_M)\) denotes the space of real \(n\times n\) matrices endowed with the standard operator norm \(\|\cdot\|_M\), and \(\overline{K}(\mathbb{R}^n)\) is the set of self-dual solid convex cones in \(\mathbb{R}^n\), i.e., \(\overline{C} \in \overline{K}(\mathbb{R}^n)\) if \(\alpha u +\beta w \in \overline{C}\), \(\forall u,w \in \overline{C}\), \(\alpha, \beta \geq 0\), \(\overline{C}^{*}=\{u\in\mathbb{R}^n\mid \langle w,u\rangle \geq 0, \forall w\in \overline{C}\}=\overline{C}\), and \(C^o:=\operatorname{int}\overline{C}\neq \emptyset\) (see [3], [30]). We denote \(C:=\overline{C}\setminus 0\), \(K(\mathbb{R}^n):=\{C\mid ~\overline{C}\in \overline{K}(\mathbb{R}^n)\}\). The subset of cones from \(K(\mathbb{R}^n)\) that are orthogonal transforms of the positive orthant will be denoted by \(K_O(\mathbb{R}^n)\), i.e., \(K_O(\mathbb{R}^n):=\{C \in K(\mathbb{R}^n)\mid C=US_+,~U \in O(n)\}\), where \(O(n)\) is the group of orthogonal matrices. We call \[\displaystyle{\lambda(u,v):=\frac{\left\langle A u, v\right\rangle}{\left\langle u, v\right\rangle}}, ~~ \langle u, v\rangle \neq 0, ~~u,v \in \mathbb{R}^n,\] the extended Rayleigh quotient (cf. [6], [9]). Note that for \(C \in K(\mathbb{R}^n)\), \(\lambda(u,v)\) is well-defined if \((u,v) \in C \times C^o\) or \((u,v) \in C^o \times C\). Let \(X,Y\) be subsets of \(\mathbb{R}^n\) such that \(\lambda(u,v)\) is well-defined on \(X \times Y\). We say that the minimax principle for \(\lambda(u,v)\) is satisfied in \(X \times Y\) if \[\begin{align} \label{MminD} -\infty<\sup_{u\in X}\inf_{v \in Y}\lambda(u,v)=\inf_{v \in Y}\sup_{u\in X}\lambda(u,v)<+\infty.
\end{align}\tag{3}\] Vectors \(u_C(A) \in C\) and \(v_C(A) \in C\) are said to be the right and left quasi-eigenvectors of \(A\) in \(C\) if the following holds: \[\begin{align} \tag{4} &\overline{\lambda}_C(A):=\sup_{u\in C}\inf_{v \in C^o}\lambda(u,v)=\inf_{v \in C^o}\lambda(u_C(A),v),\\ &\underline{\lambda}_C(A):=\inf_{v \in C}\sup_{u\in C^o}\lambda(u,v)=\sup_{u\in C^o}\lambda(u,v_C(A)). \tag{5} \end{align}\] We call \(\overline{\lambda}_C(A)\) the upper quasi-eigenvalue, and \(\underline{\lambda}_C(A)\) the lower quasi-eigenvalue of \(A\) in \(C\). Furthermore, if \[\begin{align} &\partial_u\lambda(u_C(A),v_C(A))=0,~~~\partial_v\lambda (u_C(A),v_C(A))=0,\\ &\lambda_C(A):=\overline{\lambda}_C(A)=\underline{\lambda}_C(A)=\lambda (u_C(A),v_C(A)), \label{sadd} \end{align}\tag{6}\] then \((u_C(A),v_C(A))\) is said to be a saddle point of \(\lambda(u,v)\) in \(C \times C\). Observe that this implies that \(\lambda_C(A)\) is an eigenvalue of \(A\) and \(u_C(A)\), \(v_C(A)\) are the associated right and left eigenvectors of \(A\).
Our first result provides partial answers to problems (A) and (B).
Theorem 1.
Assume that \(A \in M_{n\times n}(\mathbb{R})\), \(C \in K(\mathbb{R}^n)\). Then the minimax principles for \(\lambda(u,v)\) are satisfied in \(C\times C^o\) and \(C^o\times C\) \[\label{eq:mimaPrin} \sup_{u\in C}\inf_{v \in C^o}\lambda(u,v)=\inf_{v \in C^o}\sup_{u \in C}\lambda(u,v)~~and~~\sup_{u\in C^o}\inf_{v \in C}\lambda(u,v)=\inf_{v \in C}\sup_{u \in C^o}\lambda(u,v)\qquad{(1)}\] Moreover, there exist right and left quasi-eigenvectors \(u_C(A), v_C(A) \in C\) of \(A\) in \(C\).
If \(u_C(A) \in C^o\) \((v_C(A) \in C^o)\), then the minimax principle for \(\lambda(u,v)\) is satisfied in \(C^o\times C^o\), and \[\begin{align} &\overline{\lambda}_C(A)=\lambda_C(A):=\sup_{u\in C^o}\inf_{v \in C^o}\lambda(u,v)=\inf_{v \in C^o}\sup_{u \in C^o}\lambda(u,v),\\ (&\underline{\lambda}_C(A)=\lambda_C(A):=\sup_{u\in C^o}\inf_{v \in C^o}\lambda(u,v)=\inf_{v \in C^o}\sup_{u \in C^o}\lambda(u,v)). \end{align}\] Furthermore, if \(u_C(A), v_C(A) \in C^o\), then \(\lambda_C(A)=\lambda(u_C(A),v_C(A))\) is an eigenvalue and \(u_C(A)\), \(v_C(A)\) are the corresponding right and left eigenvectors of \(A\) in \(C\).
Notice that the minimax principle of Theorem 1 implies \(\overline{\lambda}_C(A)\geq \underline{\lambda}_C(A)\). Indeed, by the minimax principle in \(C^o\times C\), \[\begin{align} \underline{\lambda}_C(A)=\sup_{u\in C^o}\inf_{v \in C}\lambda(u,v)\leq \sup_{u\in C^o}\inf_{v \in C^o}\lambda(u,v)\leq \sup_{u\in C}\inf_{v \in C^o}\lambda(u,v)=\overline{\lambda}_C(A). \end{align}\]
Example 1. The minimax principle for \(\lambda(u,v)\) may not hold in \(C^o\times C^o\). Indeed, consider \[A= \left( {\begin{array}{cc} 2 & 0\\ 0 & 1 \\ \end{array} } \right),~ ~C=S_+:=\{x \in \mathbb{R}^2\setminus 0\mid ~x_1\geq 0,~x_2\geq 0\}.\] Observe, \[\sup _{u \in S_+^o}\inf _{v\in S_+^o}\frac{\left\langle A u, v\right\rangle}{\left\langle u, v\right\rangle}= \sup _{u \in S_+^o}\inf _{v\in S_+}\frac{\left\langle A u, v\right\rangle}{\left\langle u, v\right\rangle}=1,~~\inf _{v\in S_+^o}\sup _{u \in S_+^o}\frac{\left\langle A u, v\right\rangle}{\left\langle u, v\right\rangle}=\inf _{v\in S_+^o}\sup _{u \in S_+}\frac{\left\langle A u, v\right\rangle}{\left\langle u, v\right\rangle}=2.\] Meanwhile, \[\inf _{v\in S_+^o}\frac{\left\langle A u, v\right\rangle}{\left\langle u, v\right\rangle}|_{u=(1,0)^T}=2~~\Rightarrow~\sup _{u \in S_+}\inf _{v\in S_+^o}\frac{\left\langle A u, v\right\rangle}{\left\langle u, v\right\rangle}=2,\] and thus, \(\overline{\lambda}_{S_+}=\sup_{u\in S_+}\inf_{v \in S_+^o}\lambda(u,v)=\inf_{v \in S_+^o}\sup_{u\in S_+}\lambda(u,v)=2\), \(u_{S_+}(A)=(1,0)^T\). By the same reasoning, \(\underline{\lambda}_{S_+}=\sup_{u\in S_+^o}\inf_{v \in S_+}\lambda(u,v)=\inf_{v \in S_+}\sup_{u\in S_+^o}\lambda(u,v)=1\), \(v_{S_+}(A)=(0,1)^T\).
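The values in Example 1 can be probed numerically. The sketch below is an illustration under stated assumptions: the sample points and the helper name `lam` are chosen here, and a minimum over finitely many samples only approximates the infimum. It checks that for interior \(u\) the quotient approaches \(1\) as \(v\) tends to the boundary ray spanned by \((0,1)^T\), while on the boundary ray \(u=(1,0)^T\) the quotient is identically \(2\).

```python
def lam(a, u, v):
    """Extended Rayleigh quotient <Au, v> / <u, v>."""
    au = [sum(aij * uj for aij, uj in zip(row, u)) for row in a]
    return sum(x * y for x, y in zip(au, v)) / sum(x * y for x, y in zip(u, v))

A = [[2.0, 0.0], [0.0, 1.0]]

# Interior points u approaching the boundary ray spanned by (1, 0).
interior_us = [(1.0, 10.0 ** -k) for k in range(7)]
# Interior points v approaching the boundary ray spanned by (0, 1).
vs_to_boundary = [(10.0 ** -m, 1.0) for m in range(16)]

# Sampled approximation of inf_v lam(u, v) for each interior u.
inf_over_v = [min(lam(A, u, v) for v in vs_to_boundary) for u in interior_us]

# Values of lam at the boundary point u = (1, 0) for every sampled v.
boundary_values = [lam(A, (1.0, 0.0), v) for v in vs_to_boundary]

print(max(inf_over_v), set(boundary_values))
```

The gap between the two printed quantities reflects the fact that the supremum over \(u\) in \(S_+\) is attained only on the boundary, so restricting \(u\) to \(S_+^o\) loses the value \(2\).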
Example 2. The right and left quasi-eigenvectors \(u_C(A), v_C(A)\) of a matrix need not be eigenvectors in the usual sense, and the quasi-eigenvalues \(\overline{\lambda}_C(A)\), \(\underline{\lambda}_C(A)\) need not be eigenvalues. Indeed, consider \[A= \left( {\begin{array}{cc} 1 & -1\\ 1 & 1 \\ \end{array} } \right),~ ~C=S_+:=\{x \in \mathbb{R}^2\setminus 0\mid ~x_1\geq 0,~x_2\geq 0\}.\] Then \(\lambda_1(A)=1-i\), \(\lambda_2(A)=1+i\) with corresponding eigenvectors \(\phi_1=(1,i)^T\), \(\phi_2=(i,1)^T\). However, by Theorem 5 below, we have \(\overline{\lambda}_C(A)=\underline{\lambda}_C(A)=1=\min_{i=1,2}{\operatorname{Re} \lambda_i(A)}=\max_{i=1,2}{\operatorname{Re} \lambda_i(A)}\), with the right and left quasi-eigenvectors both equal to \((1,0)^T\).
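For the matrix of Example 2 the quotient can be written out by hand: \(\lambda((1,0)^T,v)=1+v_2/v_1\) and \(\lambda(u,(1,0)^T)=1-u_2/u_1\). The short sketch below evaluates these expressions on sample points; the sample points and the helper name `lam` are assumptions made for illustration, and sampled extrema only approximate the infimum and supremum.

```python
def lam(a, u, v):
    """Extended Rayleigh quotient <Au, v> / <u, v>."""
    au = [sum(aij * uj for aij, uj in zip(row, u)) for row in a]
    return sum(x * y for x, y in zip(au, v)) / sum(x * y for x, y in zip(u, v))

A = [[1.0, -1.0], [1.0, 1.0]]
e1 = (1.0, 0.0)

# lam(e1, v) = 1 + v2/v1 >= 1 on the open orthant, approaching 1 as v2 -> 0.
right_values = [lam(A, e1, (1.0, 10.0 ** -m)) for m in range(16)]

# lam(u, e1) = 1 - u2/u1 <= 1 on the open orthant, approaching 1 as u2 -> 0.
left_values = [lam(A, (1.0, 10.0 ** -m), e1) for m in range(16)]

print(min(right_values), max(left_values))
```

Both sampled extrema approach \(1\), the common real part of the two eigenvalues, although \((1,0)^T\) is not an eigenvector of \(A\).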
Remark 1. Theorem 1 entails that if \(C^o\) does not contain right and left eigenvectors of \(A\), then \(v_C(A) \in \partial C\) and/or \(u_C(A)\in \partial C\).
It is important to note that the quasi-eigenvalues \(\overline{\lambda}_C(A)\), \(\underline{\lambda}_C(A)\) of the matrix \(A \in M_{n\times n}(\mathbb{R})\) in \(C \in K(\mathbb{R}^n)\) are invariant under the orthogonal change of variable \(x=U^Ty\), \(U \in O(n)\). Indeed, we have \[\begin{align} &\overline{\lambda}_{C}(A) =\sup _{u \in C}\inf _{v\in C^o}\frac{\left\langle A u, v\right\rangle}{\left\langle u, v\right\rangle}=\sup _{u\in U^TC}\inf _{v\in U^TC^o}{\frac{\left\langle U^TA U u, v\right\rangle}{\left\langle u, v\right\rangle}}=\overline{\lambda}_{U^TC}( U^TA U),\tag{7}\\ &\underline{\lambda}_{C}(A) =\sup _{u \in C^o}\inf _{v\in C}\frac{\left\langle A u, v\right\rangle}{\left\langle u, v\right\rangle}=\sup _{u\in U^TC^o}\inf _{v\in U^TC}{\frac{\left\langle U^TA U u, v\right\rangle}{\left\langle u, v\right\rangle}}=\underline{\lambda}_{U^TC}( U^TA U).\tag{8} \end{align}\]
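The pointwise identity behind (7) and (8) is \(\lambda_A(Uu,Uv)=\lambda_{U^TAU}(u,v)\) for orthogonal \(U\), since \(\langle AUu,Uv\rangle=\langle U^TAUu,v\rangle\) and \(\langle Uu,Uv\rangle=\langle u,v\rangle\). The following sketch checks it numerically; the rotation angle, the matrix \(A\), the test vectors and all helper names are hypothetical choices made for illustration.

```python
import math

def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def lam(a, u, v):
    """Extended Rayleigh quotient <Au, v> / <u, v>."""
    return dot(matvec(a, u), v) / dot(u, v)

theta = 0.7                                   # arbitrary rotation angle
U = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]     # orthogonal matrix
A = [[1.0, 2.0], [-0.5, 3.0]]                 # arbitrary real matrix
B = matmul(transpose(U), matmul(A, U))        # U^T A U

u_new, v_new = [0.3, 1.1], [0.8, 0.4]         # points in the transformed cone
u_old, v_old = matvec(U, u_new), matvec(U, v_new)  # corresponding points in C

lhs = lam(A, u_old, v_old)
rhs = lam(B, u_new, v_new)
print(lhs, rhs)
```

Taking suprema and infima over corresponding sets on both sides of this identity gives the two chains of equalities displayed above.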
Let \(\lambda_i(A)\) for \(i\in \{1,\ldots,n\}\) be a real eigenvalue of \(A\). We denote by \(\phi_{i}, \psi_{i}\) right and left eigenvectors corresponding to \(\lambda_i(A)\). In what follows, we use the notation \(\operatorname{span}(Z)\) for the linear span of a set \(Z \subset \mathbb{R}^n\).
Corollary 1. Let \(\lambda_i(A)\) for \(i\in \{1,\ldots,n\}\) be a real eigenvalue of the matrix \(A\).
(i) If there exists a right (left) eigenvector of \(A\) such that \(\operatorname{span}(\phi_i)\cap C^o \neq \emptyset\) (\(\operatorname{span}(\psi_i)\cap C^o \neq \emptyset\)), then \(\lambda_i(A)\leq \underline{\lambda}_C(A)\) \((\lambda_i(A)\geq \overline{\lambda}_C(A))\).
(ii) If there exist right and left eigenvectors such that \(\operatorname{span}(\phi_i)\cap C^o\neq \emptyset\), \(\operatorname{span}(\psi_i)\cap C^o \neq \emptyset\), then \(\overline{\lambda}_C(A)=\underline{\lambda}_C(A)=\lambda_i(A)\). The converse statement is also true.
Some statements of the Perron–Frobenius and Birkhoff–Varga theorems still hold under the hypotheses of Theorem 1 and certain additional conditions.
Corollary 2.
(i) If \(A\in M_{n\times n}(\mathbb{R})\) is a nonnegative matrix, then \[\label{eq:Noneg1} \overline{\lambda}_{S_+}(A)\geq \max\{|\lambda_j(A)|,~j=1,\ldots,n\}>0,\qquad{(2)}\] moreover, if in addition \(\lambda_{S_+}(A)\) is an eigenvalue of \(A\), then \[\label{eq:Noneg} \overline{\lambda}_{S_+}(A)= \max\{|\lambda_j(A)|,~j=1,\ldots,n\}>0.\qquad{(3)}\]
(ii) If \(A\in M_{n\times n}(\mathbb{R})\) is a matrix with nonnegative elements off the diagonal, then \[\label{eq:Re1} \overline{\lambda}_{S_+}(A)\geq \max\{\operatorname{Re}\lambda_j(A),~j=1,\ldots,n\},\qquad{(4)}\] moreover, if in addition \(\lambda_{S_+}(A)\) is a real part of an eigenvalue of \(A\), then \[\label{eq:Re} \overline{\lambda}_{S_+}(A)= \max\{\operatorname{Re}\lambda_j(A),~j=1,\ldots,n\}.\qquad{(5)}\]
Corollary 3. Assume that \(A\) is a real symmetric matrix and \(C\in K_O(\mathbb{R}^n)\). Then the following holds:
If \(\operatorname{span}(\phi_i)\cap C^o = \emptyset\) for any eigenvector \(\phi_i\), \(i\in \{1,\ldots,n\}\) of \(A\), then \[\overline{\lambda}_{C}(A)= \max_{1\leq j\leq n}\lambda_j(A),~~ \underline{\lambda}_{C}(A)=\min_{1\leq j\leq n}\lambda_j(A)\]
If there exists an eigenvector \(\phi_i\) of \(A\) such that \(\operatorname{span}(\phi_i)\cap C^o \neq \emptyset\), then \(\overline{\lambda}_C(A)=\underline{\lambda}_C(A)=\lambda_i(A)\).
The following result takes advantage of the fact that the Perron–Frobenius theory in Theorem 1 extends to arbitrary matrices and cones.
Theorem 2. Assume that \(A \in M_{n\times n}(\mathbb{R})\), \(C \in K(\mathbb{R}^n)\). \[\begin{align} \text{If}~~v_C(A) \in C^o,~~\text{then}&\nonumber\\ &\overline{\lambda}_{C}(A+D)-\underline{\lambda}_{C}(A)\leq c_1(A,C)\|D\|_M, ~~\forall D \in M_{n\times n}(\mathbb{R}), \label{cont0}\\ \text{moreover, if}~ D\leq 0, ~\text{then}&\nonumber\\ &\overline{\lambda}_{C}(A+D)\leq \underline{\lambda}_{C}(A), \label{cont0AD} \\ \text{If}~~u_C(A) \in C^o,~~\text{then}&\nonumber\\ &\underline{\lambda}_{C}(A+D)-\overline{\lambda}_{C}(A)\geq -c_2(A,C)\|D\|_M, ~~\forall D \in M_{n\times n}(\mathbb{R}), \label{cont00}\\ \text{moreover, if}~ D\geq 0, ~\text{then}&\nonumber\\ &\underline{\lambda}_{C}(A+D)\geq \overline{\lambda}_{C}(A),\label{cont00AD} \end{align}\] where \[c_1(A,C)= \sup _{u \in C}\frac{\| u\| }{\langle u, v_{C}(A)\rangle},~~c_2(A,C)= \sup _{v \in C}\frac{\| v\| }{\langle u_{C}(A), v \rangle}<+\infty.\] In particular, if \(u_C(A), v_C(A) \in C^o\), then \[\begin{align} &|\overline{\lambda}_{C}(A+D)-\lambda_{C}(A)|\leq c_0(A,C) \|D\|_M, ~\forall D \in M_{n\times n}(\mathbb{R}), \label{contOC}\\ &|\underline{\lambda}_{C}(A+D)-\lambda_{C}(A)|\leq c_0(A,C) \|D\|_M, ~\forall D \in M_{n\times n}(\mathbb{R}),\label{contOCO} \end{align}\] where \(c_0(A,C):=\max\{c_1(A,C),c_2(A,C)\}\).
We say that a matrix \(A\) has sign-constant elements off the diagonal if one of the following is satisfied: \(a_{i,j}\geq 0\) or \(a_{i,j}\leq 0\) for all \(i,j \in \{1,\ldots,n\}\), \(i\neq j\). In what follows, we denote by \(M_{n\times n}^{isc}(\mathbb{R})\) the set of irreducible matrices with sign-constant elements off the diagonal.
Theorem 3.
Assume that \(A \in M_{n\times n}^{isc}(\mathbb{R})\), then \(\lambda_{S_+}(A)\) is a simple eigenvalue of \(A\), and \(u_{S_+}, v_{S_+}\) are the corresponding positive right and left eigenvectors. Moreover, \((u_{S_+}, v_{S_+})\) is a saddle point of \(\lambda(u,v)\) in \(S_+\) which is unique up to the multipliers of \(u_{S_+}\) and \(v_{S_+}\). Furthermore, the function \(\lambda_{S_+}(\cdot)\) is continuous on \((M_{n\times n}(\mathbb{R}), \|\cdot\|_M)\) at any point \(A \in M_{n\times n}^{isc}(\mathbb{R})\) in the following sense \[\label{eq:ConC} \max\{|\underline{\lambda}_{S_+}(A+D)-\lambda_{S_+}(A)|,|\overline{\lambda}_{S_+}(A+D)-\lambda_{S_+}(A)|\} \leq c_0(A,C) \|D\|_M,\qquad{(6)}\] \(\forall D \in M_{n\times n}(\mathbb{R})\), where \(c_0(A,C)\) does not depend on \(D\), moreover, \(\lambda_{S_+}(A+D)\geq \lambda_{S_+}^*(A)\) if \(D\geq 0\), and \(\overline{\lambda}_{S_+}(A+D)\leq \lambda_{S_+}^*(A)\) if \(D\leq 0\).
Remark 2. The proofs of Theorems 1 and 3 provide a new approach to proving the Perron–Frobenius and Birkhoff–Varga theorems. Indeed, from these results it follows that if \(A \in M_{n\times n}^{isc}(\mathbb{R})\) is a nonnegative matrix, then ?? is fulfilled, while ?? holds true if \(A \in M_{n\times n}^{isc}(\mathbb{R})\) is a matrix with nonnegative elements off the diagonal. Thus, Theorem 3 and Corollary 2 yield the Perron–Frobenius and Birkhoff–Varga theorems. Moreover, these theorems are supplemented with new properties of essentially positive matrices, such as the continuity property ?? .
Remark 3. The matrices with sign-constant elements off the diagonal arise in a multitude of scientific disciplines and practical applications, see, e.g., [2], [3].
Note that inequalities ?? , ?? , and ?? are related to Weyl’s inequality [31] for the eigenvalues of symmetric matrices, moreover they generalize it to arbitrary matrices. Furthermore, ?? means that the Perron root of irreducible nonnegative matrices is stable under arbitrary perturbations, which may be thought of as a generalization of Weyl’s assertion on the stability of the spectrum of symmetric matrices under perturbations on the manifold of symmetric matrices [31].
Clearly, \(\lambda_{S_+}(A)\) in ?? is the Perron root if \(A\) is an irreducible non-negative matrix. But \(A+D\) may not even be an irreducible non-negative matrix, and thus \(\overline{\lambda}_{S_+}(A+D)\), \(\underline{\lambda}_{S_+}(A+D)\) need not be an eigenvalue or the Perron root of \(A+D\). In this regard, it appears that ?? provides a definite answer to the question posed by Meyer in [29] about the continuity of the Perron root with respect to arbitrary matrix perturbations.
Define \(d(C, C'):=\|I-U\|_M\) for \(C,C'\in K(\mathbb{R}^n)\), \(U \in O(n)\) such that \(C'=UC\). It is easily seen that \(d(C, C')\) is a well-defined metric on \(K(\mathbb{R}^n)\). The topology induced on \(K(\mathbb{R}^n)\) due to this metric will be denoted by \(\tau(O(n))\).
Corollary 4. Assume that \(A \in M_{n\times n}(\mathbb{R})\), \(C \in K(\mathbb{R}^n)\) and \(u_C(A), v_C(A) \in C^o\). Then the map \(\lambda_{(\cdot)}(A)\) is continuous on \((K(\mathbb{R}^n), \tau(O(n)))\) at \(C\). Moreover, if \(C' \in K(\mathbb{R}^n)\) is such that \(d(C, C')\) is sufficiently small, then \[\label{eq:contCone} \max\{|\overline{\lambda}_{C'}(A)- \lambda_{C}(A)|,~|\underline{\lambda}_{C'}(A)- \lambda_{C}(A)|\} \leq c_3(A,C)d(C, C'),\qquad{(7)}\] where \(c_3(A,C)\) does not depend on \(C'\).
Recall that a matrix \(A\) is normal if \(A^TA=A A^T\). It is well known that for any real normal matrix \(A\) there exists a set of invariant subspaces \(V_j\), \(1\leq j \leq n-l\), of \(A\) with an integer \(l \in [0, n/2]\) such that \(\operatorname{dim} V_j=2\), \(\operatorname{Im}\lambda(A)|_{V_j}\neq 0\), \(j=1,\ldots, l\), and \(\operatorname{dim} V_j=1\), \(\operatorname{Im}\lambda(A)|_{V_j}= 0\), \(j=l+1,\ldots,n-l\). Here \(\lambda(A)|_{V_j}\) denotes the eigenvalue of the operator \(A\) restricted to the invariant subspace \(V_j\).
Theorem 4. Assume that \(A\in M_{n\times n}(\mathbb{R})\) and \(C\in K(\mathbb{R}^n)\).
\[\label{eq:norEst12} \min\{\lambda_j(\frac{A+A^T}{2})\mid~j=1,\ldots,n\}\leq ~ \underline{\lambda}_C(A)\leq \overline{\lambda}_C(A)\leq \max\{\lambda_j(\frac{A+A^T}{2})\mid~j=1,\ldots,n\}.\qquad{(8)}\]
If \(A\) is a normal matrix, then \[\min_{1\leq j\leq n}\operatorname{Re}\lambda_j(A) \leq ~ \underline{\lambda}_C(A)\leq \overline{\lambda}_C(A) ~\leq \max_{1\leq j\leq n}\operatorname{Re}\lambda_j(A), ~~\forall C \in K(\mathbb{R}^n).\]
Note that the eigenvalues of the symmetric matrix \((A+A^T)/2\) are real.
Corollary 5. If \(A\) is a skew-symmetric matrix, then \(\underline{\lambda}_C(A)= \overline{\lambda}_C(A)=0\) for any \(C \in K(\mathbb{R}^n)\).
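One elementary observation related to these estimates, offered here as an illustration and not as the argument used in the proofs, is that on the diagonal \(v=u\) the extended Rayleigh quotient coincides with the classical Rayleigh quotient of the symmetric part: \(\lambda(u,u)=\langle \tfrac{A+A^T}{2}u,u\rangle/\langle u,u\rangle\). In particular it vanishes identically for skew-symmetric matrices. The matrices, the vector and the helper names in the sketch below are hypothetical.

```python
def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def lam(a, u, v):
    """Extended Rayleigh quotient <Au, v> / <u, v>."""
    return dot(matvec(a, u), v) / dot(u, v)

def sym_part(a):
    """Symmetric part (A + A^T) / 2."""
    n = len(a)
    return [[(a[i][j] + a[j][i]) / 2.0 for j in range(n)] for i in range(n)]

A = [[1.0, 4.0, 0.0],
     [-2.0, 0.5, 3.0],
     [1.0, -1.0, 2.0]]          # arbitrary non-symmetric matrix
K = [[0.0, 2.0, -1.0],
     [-2.0, 0.0, 3.0],
     [1.0, -3.0, 0.0]]          # skew-symmetric matrix
u = [0.2, 1.0, 0.7]             # a point of the open positive orthant

diag_value = lam(A, u, u)
sym_value = lam(sym_part(A), u, u)
skew_value = lam(K, u, u)

print(diag_value, sym_value, skew_value)
```

Since \(\inf_{v\in C^o}\lambda(u,v)\leq\lambda(u,u)\) for every \(u\in C^o\), this identity at least makes it plausible that the eigenvalues of \((A+A^T)/2\) control the quasi-eigenvalues.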
Theorem 5. Assume that \(A\) is a real normal matrix and \(C\in K_O(\mathbb{R}^n)\). Then the following holds:
If \(V_i\cap C^o = \emptyset\), \(\forall i=1,\ldots n-l\), then \[\overline{\lambda}_{C}(A)= \max_{1\leq j\leq n}\operatorname{Re}\lambda_j(A),~~ \underline{\lambda}_{C}(A)=\min_{1\leq j\leq n}\operatorname{Re}\lambda_j(A).\]
If \(\exists i\in \{1,\ldots,n\}\) such that \(V_i\cap C^o \neq \emptyset\), then \[\overline{\lambda}_C(A)=\underline{\lambda}_C(A)=\operatorname{Re}\lambda_i(A).\]
Remark 4. The Birkhoff–Varga formula (1) and its generalizations (4), (5), being variational, allow us to apply variational methods, such as gradient-based algorithms and other non-iterative methods, to the numerical computation of quasi-eigenpairs of matrices; see, e.g., [7], [10], [11], [32]. Thus, in particular cases, such as irreducible nonnegative matrices, the variational formulas (4), (5) can produce algorithms for finding eigenpairs based on new principles, which may have certain advantages (see, e.g., [7], [11]) over the commonly used ones; see, e.g., [33], [34] and references therein.
Remark 5. We believe that the generalization of the Birkhoff–Varga formula (2) may be developed for other spectral problems in matrix theory, such as nonlinear spectral problems [35], [36], spectral problems for multilinear forms [37], [38], etc. [39].
Remark 6. Identifying the largest real part of the eigenvalues of matrices whose eigenvalues have non-zero imaginary parts is important in finding solutions to various problems, see, e.g., [2], [24]–[28]. For instance, verifying that \(\displaystyle{\max_{1\leq j\leq n}\operatorname{Re}\lambda_j(A)\geq 0}\) is crucial in the detection of the Hopf bifurcation, see, e.g., [25]–[27]. Meanwhile, existing methods of verifying this condition, such as the Routh–Hurwitz criterion, see, e.g., [26], are often difficult to apply, especially for large matrices; see, e.g., [25]. Corollary 2 and Theorems 4, 5 provide an alternative strategy for addressing this problem, and it can be anticipated that its further development will produce the desired outcome.
A function \(f:X\to \mathbb{R}\) defined on a convex subset \(X\) of a real vector space is quasiconvex if for all \(x,y\in X\) and \(\alpha \in [0,1]\) we have \[f(\alpha x+(1-\alpha )y)\leq \max {\big \{}f(x),f(y){\big \}}.\] A function whose negative is quasiconvex is called quasiconcave. A function which is both quasiconvex and quasiconcave is said to be quasilinear (see [40]).
Our method is based on Sion’s minimax theorem [40]:
Theorem (Sion) Let \(X\) be a compact convex subset of a linear topological space and \(Y\) a convex subset of a linear topological space. If \(f\) is a real-valued function on \(X\times Y\) with
\(f(x,\cdot)\) upper semicontinuous and quasi-concave on \(Y\), \(\forall x\in X\), and
\(f(\cdot,y)\) lower semicontinuous and quasi-convex on \(X\), \(\forall y\in Y\),
then the minimax principle is satisfied \[\sup_{y\in Y}\min_{x\in X}f(x,y)=\min_{x\in X}\sup_{y\in Y} f(x,y).\]
Let us prove
Proposition 1. Let \(A \in M_{n\times n}(\mathbb{R})\), \(C \in K(\mathbb{R}^n)\). Then the functions \(\lambda(\cdot, v)\), \(\forall v \in C\) and \(\lambda(u,\cdot)\), \(\forall u \in C\) are quasilinear on \(C^o\).
Proof. Let \(v \in C^o\), \(u,w \in C\). Consider \[\begin{align} f(\alpha):=&\lambda(\alpha u+(1-\alpha)w,v)=\frac{\alpha\left\langle A u, v\right\rangle+(1-\alpha)\left\langle A w, v\right\rangle}{\left\langle \alpha u+(1-\alpha) w ,v\right\rangle},~~\alpha \in [0,1]. \end{align}\] Calculate \[\begin{align} f'(\alpha)= \frac{\left\langle A u, v\right\rangle \left\langle w, v\right\rangle-\left\langle A w, v\right\rangle\left\langle u, v\right\rangle}{\left\langle \alpha u+(1-\alpha) w ,v\right\rangle^2},~~\alpha \in (0,1). \end{align}\] Hence if \(\left\langle A u, v\right\rangle \left\langle w, v\right\rangle-\left\langle A w, v\right\rangle\left\langle u, v\right\rangle \neq 0\), then \(f'(\alpha)> 0\) for all \(\alpha \in (0,1)\) or \(f'(\alpha)< 0\) for all \(\alpha \in (0,1)\), whereas if \(\left\langle A u, v\right\rangle \left\langle w, v\right\rangle-\left\langle A w, v\right\rangle\left\langle u, v\right\rangle = 0\), then \(f'(\alpha)\equiv 0\), \(\forall \alpha \in (0,1)\). In either case \(f\) is monotone on \([0,1]\), which implies that \[\min\{\lambda(u,v),\lambda(w,v)\}\leq \lambda(\alpha u+(1-\alpha)w,v)\leq \max\{\lambda(u,v),\lambda(w,v)\}, ~~\alpha \in (0,1).\] Hence, \(\lambda(\cdot, v)\) is a quasiconvex and quasiconcave function, and therefore is quasilinear on \(C\). The proof for \(\lambda(u, \cdot)\) is similar. ◻
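The monotonicity along segments established in this proof can be observed numerically. The sketch below is an illustration only; the matrix, the three vectors of the positive orthant and the helper names are hypothetical choices. It samples \(f(\alpha)=\lambda(\alpha u+(1-\alpha)w,v)\) on a uniform grid and checks that the samples are monotone and stay between the endpoint values.

```python
def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def lam(a, u, v):
    """Extended Rayleigh quotient <Au, v> / <u, v>."""
    return dot(matvec(a, u), v) / dot(u, v)

A = [[1.0, -2.0], [0.5, 3.0]]                 # arbitrary real matrix
u, w, v = [1.0, 0.2], [0.3, 1.0], [0.6, 0.9]  # points of the open orthant

ts = [k / 100 for k in range(101)]
values = [lam(A, [t * ui + (1 - t) * wi for ui, wi in zip(u, w)], v) for t in ts]

# Consecutive differences: all of one sign for a monotone function.
steps = [b - a for a, b in zip(values, values[1:])]
is_monotone = all(s >= 0 for s in steps) or all(s <= 0 for s in steps)

# Quasilinearity: every sample lies between the two endpoint values.
lo, hi = min(values[0], values[-1]), max(values[0], values[-1])
stays_between = all(lo - 1e-12 <= x <= hi + 1e-12 for x in values)

print(is_monotone, stays_between)
```

The same check can be repeated with the roles of the two arguments exchanged to probe the quasilinearity of \(\lambda(u,\cdot)\).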
We need also
Proposition 2. \(\lambda_{inf}(u):=\inf_{v \in C^o}\lambda(u,v)\) is an upper semicontinuous functional in \(C\).
Proof. Let \((\phi_n)_{n=1}^\infty\) be a countable dense set in \(C^o\). Clearly, \[\lambda_{inf}(u)=\inf_{n\ge 1}\lambda(u,\phi_n), ~~u \in C.\] Hence, for any \({\displaystyle \tau \in \mathbb{R} }\), \[\{u \in C \mid \lambda_{inf}(u)< \tau\}=\bigcup_{n=1}^{\infty}\{u \in C \mid \lambda(u,\phi_n)< \tau\}.\] It is easily seen that \(\lambda(\cdot,\phi_n)\) is a continuous function in \(C\), for \(n=1,2,\ldots\). Hence the set \(\{u \in C \mid \lambda(u,\phi_n)< \tau\}\) is open in \(C\), and therefore \(\{u \in C\mid \lambda_{inf}(u)< \tau\}\) is open for any \({\displaystyle \tau \in \mathbb{R} }\). This means that \(\lambda_{inf}(u)\) is an upper semicontinuous functional. ◻
Proof of Theorem 1
Clearly, \(\sup_{u\in C}\inf_{v \in C^o}\lambda(u,v)>-\infty\) and \(\inf_{v \in C^o}\sup_{u\in C}\lambda(u,v)<+\infty\). Since there holds \(\sup_{u\in C}\inf_{v \in C^o}\lambda(u,v)\leq \inf_{v \in C^o}\sup_{u\in C}\lambda(u,v)\), we have \[-\infty<\sup_{u\in C}\inf_{v \in C^o}\lambda(u,v)\leq \inf_{v \in C^o}\sup_{u\in C}\lambda(u,v)<+\infty.\]
Define \(B_1=\{x \in C:~||x||\leq 1\}\). Clearly, \(\partial \dot{B}_1:=\{x \in C:~||x||=1\}\) is compact in \(\mathbb{R}^n\). By homogeneity of \(\lambda(\cdot,v)\) in \(C\), we have \(\overline{\lambda}_C(A):=\sup_{u\in C}\inf_{v \in C^o}\lambda(u,v)=\sup_{u\in \partial \dot{B}_1}\inf_{v \in C^o}\lambda(u,v)\). By Proposition 2, \(\lambda_{inf}(u):=\inf_{v \in C^o}\lambda(u,v)\) is upper semicontinuous on \(B_1\), and therefore there exists \(u_{C} \in \partial \dot{B}_1\) such that \(\overline{\lambda}_{C}(A)=\inf_{v \in C^o}\lambda(u_{C},v)=\sup_{u\in C}\inf_{v \in C^o}\lambda(u,v)\).
It is easy to find a family of compact convex subsets \(B_1(\epsilon) \subset B_1\), \(\epsilon\in (0,1)\), such that \(B_1(\epsilon) \subset B_1(\epsilon')\), \(\forall \epsilon\geq \epsilon'\), and \(\cup_{\epsilon>0} B_1(\epsilon) =B_1\). Since \(\lambda(\cdot,\cdot) \in C( B_1(\epsilon)\times C^o)\), Proposition 1 and Sion’s theorem yield \[\sup_{u\in B_1(\epsilon)}\inf_{v \in C^o}\lambda(u,v)= \inf_{v \in C^o}\sup_{u\in B_1(\epsilon)}\lambda(u,v),~~\epsilon\in (0,1).\] Using \(\lambda(\cdot,\cdot) \in C( B_1(\epsilon)\times C^o)\) it is not hard to show that \[\label{eq:converg1} \inf_{v \in C^o}\sup_{u\in B_1(\epsilon)}\lambda(u,v) \to \inf_{v \in C^o}\sup_{u\in B_1}\lambda(u,v),~~ as~~\epsilon \to 0.\tag{9}\] Since \(\cup_{\epsilon>0} B_1(\epsilon) =B_1\), there exists \(\epsilon_0>0\) such that \(u_{C} \in B_1(\epsilon)\) for any \(\epsilon<\epsilon_0\). Hence, \[\sup_{u\in B_1(\epsilon)}\inf_{v \in C^o}\lambda(u,v)=\sup_{u\in B_1}\inf_{v \in C^o}\lambda(u,v)=\overline{\lambda}_C(A), ~~\forall \epsilon \in (0,\epsilon_0),\] and consequently, by (9) we get the minimax principle for \(\lambda(u,v)\) in \(C\times C^o\): \[\sup_{u\in C}\inf_{v \in C^o}\lambda(u,v)= \inf_{v \in C^o}\sup_{u\in C}\lambda(u,v).\] Similar arguments prove the existence of the left quasi-eigenvector \(v_C(A)\in C\) and the minimax principle for \(\lambda(u,v)\) in \(C^o\times C\).
Assume that \(u_C(A) \in C^o\). This implies \(\overline{\lambda}_C(A)=\sup_{u\in C}\inf_{v \in C^o}\lambda(u,v)=\sup_{u\in C^o}\inf_{v \in C^o}\lambda(u,v)\). Hence, we have \[\begin{align} \overline{\lambda}_C(A)=\inf_{v \in C^o}\sup_{u \in C}\lambda(u,v)\geq \inf_{v \in C^o}\sup_{u \in C^o}\lambda(u,v)\geq \sup_{u\in C^o}\inf_{v \in C^o}\lambda(u,v)=\overline{\lambda}_C(A), \end{align}\] and thus, \[\overline{\lambda}_C(A)= \inf_{v \in C^o}\sup_{u \in C^o}\lambda(u,v)= \sup_{u\in C^o}\inf_{v \in C^o}\lambda(u,v),\] that is, the minimax principle in \(C^o\times C^o\) holds true. The same reasoning applies to the case \(v_C(A) \in C^o\).
Thus, if \(u_C(A), v_C(A) \in C^o\), then \(\lambda_C(A):=\overline{\lambda}_C(A)=\underline{\lambda}_C(A)\), and \[\begin{align} \lambda_C(A)=\inf_{v \in C^o}\lambda(u_C(A),v)\leq \lambda(u_C(A),v_C(A)),\\ \lambda_C(A)=\sup_{u\in C^o}\lambda(u,v_C(A))\geq \lambda(u_C(A),v_C(A)). \end{align}\] Thus, \(\lambda_C(A)=\lambda(u_C(A),v_C(A))=\inf_{v \in C^o}\lambda(u_C(A),v)=\sup_{u\in C^o}\lambda(u,v_C(A))\). Since \(u_C(A), v_C(A)\) are interior points of the open set \(C^o\), this implies \(\partial_u \lambda(u_C(A),v_C(A))=0\), \(\partial_v \lambda(u_C(A),v_C(A))=0\), and thus, \(u_C(A)\), \(v_C(A)\) are right and left eigenvectors of \(A\) with eigenvalue \(\lambda_C(A)\).
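The stationarity conditions used in the last step can be written out explicitly. Assuming, as in the formulas of this section, that \(\lambda(u,v)=\langle Au,v\rangle/\langle u,v\rangle\), a direct differentiation gives \[\partial_u \lambda(u,v)=\frac{A^Tv-\lambda(u,v)\,v}{\langle u,v\rangle},\qquad \partial_v \lambda(u,v)=\frac{Au-\lambda(u,v)\,u}{\langle u,v\rangle},\] so that \(\partial_v \lambda(u,v)=0\) is equivalent to \(Au=\lambda(u,v)u\), and \(\partial_u\lambda(u,v)=0\) is equivalent to \(A^Tv=\lambda(u,v)v\).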
Proof of Corollary 1.
(i) By Theorem 1 there exists a left quasi-eigenvector \(v_C(A) \in C\) of \(A\). Hence, since \(\operatorname{span}(\phi_i )\cap C^o \neq \emptyset\), \[\underline{\lambda}_{C}(A) =\sup _{u \in C^o}\frac{\left\langle A u, v_C(A)\right\rangle}{\left\langle u, v_C(A)\right\rangle}\geq \frac{\left\langle A \phi_i, v_C(A)\right\rangle}{\left\langle \phi_i, v_C(A)\right\rangle}=\lambda_i(A).\] The case \(\operatorname{span}(\psi_i )\cap C^o \neq \emptyset\) is handled similarly.
(ii) If \(\operatorname{span}(\phi_i )\cap C^o \neq \emptyset,\operatorname{span}(\psi_i )\cap C^o \neq \emptyset\), then by the above \(\overline{\lambda}_C(A)\leq \lambda_i(A)\leq \underline{\lambda}_C(A)\). Since \(\overline{\lambda}_C(A)\geq \underline{\lambda}_C(A)\), we derive \(\overline{\lambda}_C(A)= \lambda_i(A)=\underline{\lambda}_C(A)=\lambda(\phi_i,\psi_i)\). Hence, using (4), (5), we infer that \(u_C(A)=\phi_i\), \(v_C(A)=\psi_i\) up to multipliers. The converse statement can be proved similarly.
Proof of Corollary 2.
(i) Let \(\lambda_i(A) \in \mathbb{C}\), \(i\in\{1,\ldots,n\}\), be an eigenvalue of \(A\). Suppose \(\phi_i \in \mathbb{C}^n\) is a corresponding right eigenvector, i.e., \(A\phi_i=\lambda_i(A) \phi_i\). Since \(A\geq 0\), \(|\lambda_i(A)||\phi_i|=|\lambda_i(A) \phi_i|=|A\phi_i|\leq A|\phi_i|\), and consequently, \[\lambda(|\phi_i|,v)= \frac{\langle A |\phi_i|, v\rangle}{\langle |\phi_i|, v\rangle} \geq |\lambda_i(A)|, ~~\forall v \in S_+^o,~~i=1,\ldots,n.\] Hence, by (4), we have \[\overline{\lambda}_{S_+}(A)=\sup _{u\in S_+}\inf _{v\in S_+^o}\frac{\left\langle A u, v\right\rangle}{\left\langle u, v\right\rangle}\geq \inf _{v\in S_+^o} \frac{\langle A |\phi_i|, v\rangle}{\langle |\phi_i|, v\rangle}\geq |\lambda_i(A)|\] \(\forall i=1,\ldots,n\), and thus, we obtain ?? . Assume that \(\overline{\lambda}_{S_+}(A)\) is an eigenvalue. Then by ?? , \(\overline{\lambda}_{S_+}(A)>0\), and therefore, \(\overline{\lambda}_{S_+}(A) \in \{|\lambda_j(A)|,~j=1,\ldots,n\}\). Hence ?? holds true.
(ii) Clearly, there exists a sufficiently large \(\gamma>0\) such that \(A+\gamma I\) is a nonnegative matrix, where \(I\) is the identity matrix. Then by Corollary 2 (i) we obtain \(\overline{\lambda}_{S_+}(A)+\gamma\geq |\lambda_i(A)+\gamma|\geq \operatorname{Re}(\lambda_i(A)) +\gamma\), \(\forall i=1,\ldots,n\), and consequently, \(\overline{\lambda}_{S_+}(A)\geq \operatorname{Re}(\lambda_i(A))\), \(i=1,\ldots,n\), which yields ?? . The proof of ?? is similar to that of ?? .
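As a small illustration of both parts (the matrix below is chosen only for this example), take \(A=\begin{pmatrix}0&1\\1&0\end{pmatrix}\) with eigenvalues \(\pm 1\). For \(\lambda_i(A)=-1\) one has \(\phi_i=(1,-1)^T\), \(|\phi_i|=(1,1)^T\), and \[\lambda(|\phi_i|,v)=\frac{v_1+v_2}{v_1+v_2}=1=|\lambda_i(A)|,~~\forall v\in S_+^o,\] in agreement with (i). In (ii), the shift enters the quotient additively, \[\frac{\langle (A+\gamma I)u,v\rangle}{\langle u,v\rangle}=\lambda(u,v)+\gamma,\] which is why \(\overline{\lambda}_{S_+}(A+\gamma I)=\overline{\lambda}_{S_+}(A)+\gamma\) can be used in the argument.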
Proof of Corollary 3. Given (7)-(8), it is sufficient to prove the assertions of the corollary for a diagonal matrix \(A= \operatorname{diag}(\lambda_1, \ldots,\lambda_n)\). In this case, \(\phi_i= e_i\), \(i\in \{1,\ldots,n\}\), and we may assume that \(\lambda_1\leq \ldots\leq \lambda_n\). Let \(C\in K_O(\mathbb{R}^n)\). It is easily seen that only the following two cases are possible: 1) \(\operatorname{span}(e_i)\cap \partial C \neq \emptyset\), \(\forall i\in \{1,\ldots,n\}\); 2) \(\exists e_i\) such that \(\operatorname{span}(e_i)\cap C^o \neq \emptyset\). Note that cases 1) and 2) imply that assumptions (a) and (b) of the corollary are satisfied, respectively. Moreover, case 1) means that \(C\) coincides with some orthant of \(\mathbb{R}^n\).
Assume that 1) is satisfied. Since \(u_iv_i\geq 0\), \(i=1,\ldots,n\), for any \(u:=(u_1,\ldots, u_n)\), \(v:=(v_1,\ldots, v_n)\) belonging to the orthant \(C\), we have \[\overline{\lambda}_{C}(A)=\sup _{u \in C}\inf _{v\in C^o}\frac{\left\langle A u, v\right\rangle}{\left\langle u, v\right\rangle}\leq \lambda_n+\sup _{u \in C}\inf _{v\in C^o}\frac{ \sum_{i=1}^n (\lambda_i-\lambda_n) u_i v_i}{\left\langle u, v\right\rangle}\leq \lambda_n.\] On the other hand, \[\overline{\lambda}_{C}(A)=\sup _{u \in C}\inf _{v\in C^o}\frac{\left\langle A u, v\right\rangle}{\left\langle u, v\right\rangle}\geq \inf _{v\in C^o}\frac{\left\langle A e_n, v\right\rangle}{\left\langle e_n, v\right\rangle}= \lambda_n.\] Hence, we get \(\overline{\lambda}_{C}(A)= \lambda_n= \max_{1\leq j\leq n}\lambda_j(A)\). By a similar argument, we have \(\underline{\lambda}_{C}(A)=\min_{1\leq j\leq n}\lambda_j(A)\).
Assuming that 2) is true, assertion (b) follows directly from Corollary 1.
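The estimate in case 1) rests on the fact that, for a diagonal matrix and vectors from the same orthant, the quotient \(\langle Au,v\rangle/\langle u,v\rangle\) is a weighted average of the \(\lambda_i\) with nonnegative weights \(u_iv_i/\langle u,v\rangle\). The following is a minimal numerical sketch of this fact, not part of the proof; the eigenvalues, the sampling range, and the helper names `quotient` and `sample_quotients` are assumptions made only for illustration, and \(C\) is taken to be the nonnegative orthant.

```python
import random

def quotient(lams, u, v):
    """Return <A u, v> / <u, v> for the diagonal matrix A = diag(lams)."""
    num = sum(l * ui * vi for l, ui, vi in zip(lams, u, v))
    den = sum(ui * vi for ui, vi in zip(u, v))
    return num / den

def sample_quotients(lams, trials=1000, seed=0):
    """Evaluate the quotient at random pairs u, v from the open positive orthant."""
    rng = random.Random(seed)
    n = len(lams)
    values = []
    for _ in range(trials):
        u = [rng.uniform(0.01, 1.0) for _ in range(n)]
        v = [rng.uniform(0.01, 1.0) for _ in range(n)]
        values.append(quotient(lams, u, v))
    return values

# Hypothetical eigenvalues, sorted increasingly (illustration only).
lams = [-2.0, 0.5, 3.0]
vals = sample_quotients(lams)

# At u = e_n the quotient equals lambda_n for every positive v.
at_e_n = quotient(lams, [0.0, 0.0, 1.0], [0.3, 0.7, 0.2])
```

Every sampled value lies between the smallest and the largest \(\lambda_i\), and the choice \(u=e_n\) attains \(\lambda_n\) independently of \(v\), which mirrors the two inequalities used in case 1).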
Since \(v_C(A) \in C^o\), \(\inf_{u\in\partial \dot{B}_1}\langle u, v_{C}(A)\rangle >0\), where \(\partial \dot{B}_1=\{x \in C:~||x||=1\}\). Thus, \(\lambda(\cdot, v_C(A))\) is a continuous bounded function on \(C\), and therefore, we have \(\underline{\lambda}_{C}(A)=\sup _{u \in C^o}\lambda(u, v_C(A))=\sup _{u \in C}\lambda(u, v_C(A))\). From this and by Theorem 1 we derive \[\begin{align} \overline{\lambda}_{C}(A+D)= & \sup _{u \in C}\inf _{v\in C^o}\frac{\langle (A+D) u, v\rangle}{\langle u, v\rangle}\leq \sup _{u \in C}\frac{\langle (A+D) u, v_{C}(A)\rangle}{\langle u, v_{C}(A)\rangle}\leq \nonumber \\ &\sup _{u \in C}\frac{\langle A u, v_{C}(A)\rangle}{\langle u, v_{C}(A)\rangle}+\sup _{u \in C}\frac{\langle D u, v_{C}(A)\rangle}{\langle u, v_{C}(A)\rangle}=\underline{\lambda}_{C}(A)+\sup _{u \in C^o}\frac{\langle D u, v_{C}(A)\rangle}{\langle u, v_{C}(A)\rangle}.\label{eq:IneqF} \end{align}\tag{10}\] Note that \[\sup _{u \in C^o}\frac{\langle D u, v_{C}(A)\rangle}{\langle u, v_{C}(A)\rangle}\leq \|D\|_M \|v_{C}(A)\|\sup _{u \in C}\frac{\| u\| }{\langle u, v_{C}(A)\rangle}.\] Since \(\inf_{u\in \partial \dot{B}_1}\langle u, v_{C}(A)\rangle >0\), by the homogeneity of \(\frac{\| \cdot\| }{\langle \cdot, v_{C}(A)\rangle}\) we have \[0<\sup _{u \in C^o}\frac{\| u\| }{\langle u, v_{C}(A)\rangle}=c_1(A,C)<+\infty.\] Thus we have proved ?? . Clearly, if \(D\leq 0\), then (10) implies ?? . The proofs of ?? , ?? are similar.
Let \(u_C(A), v_C(A) \in C^o\). Then by Theorem 1, \(\lambda_{C}(A)=\overline{\lambda}_{C}(A)=\underline{\lambda}_{C}(A)\). Since \(\overline{\lambda}_{C}(A+D)\geq \underline{\lambda}_{C}(A+D)\), ?? , ?? imply \[\begin{align} -c_2(A,C)\|D\|_M\leq \underline{\lambda}_{C}(A+D)-\lambda_{C}(A)\leq \overline{\lambda}_{C}(A+D)-\lambda_{C}(A)\leq c_1(A,C)\|D\|_M, \end{align}\] \(\forall D \in M_{n\times n}(\mathbb{R})\). Thus, we get ?? , ?? .
To be specific, consider the case \(a_{ij}\leq 0\), \(\forall i\neq j\). Since \(A -\overline{\lambda}_{S_+}(A) I \in M_{n\times n}^{isc}(\mathbb{R})\), the inequality \(A u_{S_+}-\overline{\lambda}_{S_+}(A) u_{S_+}\geq 0\) for \(u_{S_+} \in S_+\) implies that \(u_{S_+} \in S_+^o\). Therefore, \(\langle u_{S_+}, v_{S_+} \rangle \neq 0\), and thus, \(\lambda(u_{S_+}, v_{S_+})\) is well-defined. Hence, by Theorem 1, \[\overline{\lambda}_{S_+}(A)=\lambda_{S_+}(A)=\sup_{u \in S_+^o}\inf_{v \in S_+}\lambda(u,v) =\inf_{v \in S_+}\lambda(u_{S_+} ,v)\leq \lambda(u_{S_+}, v_{S_+}).\] On the other hand, \[\overline{\lambda}_{S_+}(A) \geq \underline{\lambda}_{S_+}(A)=\sup_{u\in S_+^o}\lambda(u,v_{S_+})\geq \lambda(u_{S_+}, v_{S_+}).\] Thus, \[\label{eq:irre5} \lambda_{S_+}(A)=\underline{\lambda}_{S_+}(A)=\inf_{v \in S_+}\lambda(u_{S_+} ,v)=\sup_{u\in S_+}\lambda(u,v_{S_+})= \lambda(u_{S_+}, v_{S_+}).\tag{11}\] From this, since \(u_{S_+} \in S_+^o\), we have \(\partial_u\lambda(u_{S_+},v_{S_+})= 0\), i.e., \(A^Tv_{S_+}= \lambda_{S_+}(A) v_{S_+}\). Since \(A^T -\lambda_{S_+}(A) I \in M_{n\times n}^{isc}(\mathbb{R})\), the equality \(A^T v_{S_+}-\lambda_{S_+}(A) v_{S_+}= 0\) can only be achieved if \(v_{S_+} \in S_+^o\), and hence, by (11), we get \(\partial_v\lambda(u_{S_+},v_{S_+})=0\). Thus, the first part of the theorem is proved.
The uniqueness of \(u_{S_+}(A)\) and \(v_{S_+}(A)\) up to multipliers is a consequence of the Perron–Frobenius theorem. Indeed, since by assumption \(A\) is an irreducible matrix with sign-constant elements off the diagonal, one can find a sufficiently large \(\gamma>0\) such that \((A+\gamma I)\) or \((-A+\gamma I)\) is an irreducible matrix with non-negative elements. Since \(u_{S_+}(A)\) and \(v_{S_+}(A)\) are positive right and left eigenvectors of \((A+\gamma I)\) or \((-A+\gamma I)\), the Perron–Frobenius theorem implies that they are unique up to multipliers.
Since \(u_{S_+}(A), v_{S_+}(A) \in S_+^o\), Theorem 2 yields ?? .
Proof of Corollary 4.
Let \(C'=UC\) with \(U \in O(n)\). Observe that \[\overline{\lambda}_{C'}(A)=\overline{\lambda}_{UC}(A)=\sup _{u\in C}\inf _{v\in C^o}{\frac{\left\langle U^TA U u, v\right\rangle}{\left\langle u, v\right\rangle}}=\overline{\lambda}_{C}( U^TA U).\] Since \(u_C(A), v_C(A) \in C^o\), Theorem 2 yields \[\label{eq:324462} |\overline{\lambda}_{C'}(A)- \lambda_{C}(A)|=|\overline{\lambda}_{C}( U^TA U)- \lambda_{C}(A)|\leq c_0(A,C)\| U^TA U-A\|.\tag{12}\] By Stone’s theorem, for \(U\) close to \(I\) there exists a skew-symmetric matrix \(G\) (that is, \(G^T=-G\)) such that \(U=\exp(G)\), and therefore, \(U=I+G+\bar{o}(\|G\|)\) for sufficiently small \(\|G\|\), where \(\bar{o}(\|G\|)\) is a little-o of \(\|G\|\) as \(\|G\| \to 0\). From this, \(U^TA U=A+[A,G]+\bar{o}(\|G\|)\) with \([A,G]:=AG-GA\), and thus \[\label{eq:UTU} \| U^TA U-A\| =\|[A,G]+\bar{o}(\|G\|)\|\leq \|G\|\|[A,\tilde{G}]+\bar{o}(1)\|\leq c_3(A,C) \|G\|\tag{13}\] for sufficiently small \(\|G\|\). Here \(\tilde{G}=G/\|G\|\), and the constant \(c_3(A,C) \in (0,+\infty)\) does not depend on \(G\). Since \(d(C, C')=\|U-I\|=\|G+\bar{o}(\|G\|)\|\), we have \(\|G\| \to 0\) as \(d(C, C') \to 0\). Together with (12), (13), this implies \[|\overline{\lambda}_{C'}(A)- \lambda_{C}(A)| \leq c_3(A,C)d(C, C').\] The rest of ?? is obtained similarly.
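The first-order expansion \(U^TAU=A+[A,G]+\bar{o}(\|G\|)\) can be checked numerically in the simplest case. The sketch below is only an illustration under stated assumptions: \(n=2\), \(G=tJ\) with \(J=\begin{pmatrix}0&-1\\1&0\end{pmatrix}\) skew-symmetric (so that \(\exp(G)\) is the rotation by the angle \(t\)), a test matrix \(A\) chosen arbitrarily, and the Frobenius norm in place of the norm used in the text. The helper names are hypothetical.

```python
import math

def matmul(X, Y):
    """Product of two matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def sub(X, Y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def fro(X):
    """Frobenius norm."""
    return math.sqrt(sum(a * a for row in X for a in row))

def first_order_error(A, t):
    """Return ||U^T A U - A - [A, G]||_F for G = t*J and U = exp(G)."""
    G = [[0.0, -t], [t, 0.0]]
    # exp(t*J) is the plane rotation by the angle t.
    U = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
    conj = matmul(matmul(transpose(U), A), U)
    comm = sub(matmul(A, G), matmul(G, A))  # [A, G] = AG - GA
    return fro(sub(sub(conj, A), comm))

# Arbitrary test matrix (illustration only).
A = [[1.0, 2.0], [3.0, 4.0]]
errors = {t: first_order_error(A, t) for t in (1e-1, 1e-2, 1e-3)}
```

In this example the remainder shrinks roughly like \(t^2\) when \(t\) is reduced by a factor of ten, which is consistent with the \(\bar{o}(\|G\|)\) term in (13).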
First we prove \((2^o)\). It easily follows from (4) that \[\begin{align} \label{ss1} &\overline{\lambda}_C(A)=\sup _{u \in C}\inf _{v\in C^o}\frac{\left\langle A u, v\right\rangle}{\left\langle u, v\right\rangle}\leq \sup _{u \in \mathbb{R}^n\setminus 0}\frac{\left\langle A u, u\right\rangle}{\left\langle u, u\right\rangle}=\sup _{x\in \mathbb{C}^n\setminus 0: \operatorname{Im} x=0}\frac{\langle\langle A x, x\rangle\rangle}{\langle\langle x, x\rangle\rangle}. \end{align}\tag{14}\] Here \(\langle\langle x,y\rangle\rangle:=\sum_{i=1}^n x_i \overline{y}_i\), \(x,y \in \mathbb{C}^n\). The set \(J:=\{z =\langle\langle Ax,x\rangle\rangle\mid x \in \mathbb{C}^n, ~\langle\langle x, x\rangle\rangle = 1\}\) is called the field of values of \(A\) (see [41]). It is known that for a normal matrix \(A\), \(J\) coincides with the convex hull \(H(\lambda(A))\) of the set of eigenvalues \(\{\lambda_i(A),~i=1,\ldots,n\}\) (see [41], p. 168). By (14), this implies that \[\overline{\lambda}_C(A)\leq \sup \operatorname{Re}(H(\lambda(A))) =\max\{\operatorname{Re}(\lambda_i(A)),~i=1,\ldots,n\}.\] Similarly, from (4) and by the minimax principle ?? in \(C\times C^o\) we have \[\begin{align} \label{ss2} &\overline{\lambda}_C(A)=\inf _{v\in C^o}\sup _{u \in C}\frac{\left\langle A u, v\right\rangle}{\left\langle u, v\right\rangle}\geq \inf _{v \in \mathbb{R}^n\setminus 0}\frac{\left\langle A v, v\right\rangle}{\left\langle v, v\right\rangle}, \end{align}\tag{15}\] and therefore \(\overline{\lambda}_C(A)\geq \inf \operatorname{Re}H(\lambda(A)) =\min\{\operatorname{Re}\lambda_i(A),~i=1,\ldots,n\}\). In the same way, using the minimax principle ?? in \(C^o\times C\) we derive \[\min\{\operatorname{Re}\lambda_i(A),~i=1,\ldots,n\}\leq \underline{\lambda}_C(A) \leq \max\{\operatorname{Re}(\lambda_i(A)),~i=1,\ldots,n\}.\]
Let us now prove \((1^o)\). Observe that \[\left\langle\frac{A+A^T}{2}u,u \right\rangle =\left\langle A u, u\right\rangle,~~\forall u \in \mathbb{R}^n.\] Hence, by (14), (15), \[\inf _{u \in C^o}\frac{\left\langle (A+A^T) u, u\right\rangle}{2\left\langle u, u\right\rangle}\leq \overline{\lambda}_C(A)\leq \sup _{u \in C}\frac{\left\langle (A+A^T) u, u\right\rangle}{2\left\langle u, u\right\rangle}.\] Note that \(A+A^T\) is a symmetric, and consequently normal, matrix. Hence, the same arguments as in the above proof of \((2^o)\) imply that \[\min\{\lambda_i(\frac{A+A^T}{2})\mid~i=1,\ldots,n\}\leq \overline{\lambda}_{C}(A)\leq \max\{\lambda_i(\frac{A+A^T}{2})\mid~i=1,\ldots,n\},\] where we take into account that the eigenvalues of the symmetric matrix \(A+A^T\) are real. The inequality for \(\underline{\lambda}_{C}(A)\) in ?? is obtained similarly. ◻
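For a non-normal matrix, the bounds obtained from \(\frac{A+A^T}{2}\) can be strictly wider than the range of \(\operatorname{Re}\lambda_i(A)\). As an illustration (the matrix is chosen only for this example), let \(A=\begin{pmatrix}1&2\\0&1\end{pmatrix}\), so that \(\lambda_1(A)=\lambda_2(A)=1\). Then \[\frac{\langle Au,u\rangle}{\langle u,u\rangle}=\frac{(u_1+u_2)^2}{u_1^2+u_2^2}\in[0,2],~~u\in\mathbb{R}^2\setminus 0,\] and the endpoints \(0\) and \(2\) are exactly the eigenvalues of \(\frac{A+A^T}{2}=\begin{pmatrix}1&1\\1&1\end{pmatrix}\).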
It is well known (see, e.g., [41]) that to any normal matrix \(A\) there corresponds an orthogonal matrix \(U_A\) that reduces \(A\) to the canonical form, i.e., \[\label{eq:canNorm} {\displaystyle U^T_AA U_A= \mathcal{O}_A={\begin{bmatrix}{\begin{matrix}Q(r_1, \theta_1)&&\\&\ddots &\\&&Q(r_l, \theta_l)\end{matrix}}&0\\0&{\begin{matrix}\mu_{2l+1}&&\\&\ddots &\\&&\mu_{n}\end{matrix}}\\\end{bmatrix}},}\tag{16}\] where \(\theta_i \in [0,2\pi)\), \(r_i \in \mathbb{R}\), \(i=1,\ldots, l\), \(0\leq l\leq n/2\), \(\mu_i \in \mathbb{R}\), \(i=2l+1,\ldots,n\), and the matrix \(Q(r,\theta)\) is defined as follows: \[Q(r,\theta):= r \left[ {\begin{array}{cc} \cos\,\theta & -\sin\, \theta \\ \sin\,\theta & \cos\,\theta \\ \end{array} } \right], ~~\theta \in [0,2\pi), ~~r \in \mathbb{R}.\] Note that \(\lambda_{1}(Q(r,\theta))= r(\cos\, \theta +i \sin\,\theta)\), \(\lambda_2(Q(r,\theta))= r(\cos\,\theta -i \sin\,\theta)\), and thus, \(\operatorname{Re} \lambda_{j}(Q(r,\theta))=r\cos\, \theta\), \(j=1,2\).
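The eigenvalues of \(Q(r,\theta)\) quoted above follow directly from its characteristic polynomial: \[\det(Q(r,\theta)-\lambda I)=(r\cos\theta-\lambda)^2+r^2\sin^2\theta=0 ~~\Longleftrightarrow~~ \lambda=r\cos\theta\pm i\,r\sin\theta.\]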
In light of (16) and (7)-(8), it is sufficient to prove the assertions of the theorem for the matrix in the canonical form \(\mathcal{O}_A\) with \(C\in K_O(\mathbb{R}^n)\). Without loss of generality, we may assume that \(l=n/2\). Note that in this case we have \(V_i=\operatorname{span}(\{e_{2i-1},e_{2i}\})\), \(i=1,\ldots, l\).
Let \(C\in K_O(\mathbb{R}^n)\). It is easily seen that only the following two cases are possible: 1) \(V_i\cap \partial C \neq \emptyset\), \(\forall i\in \{1,\ldots,n\}\), or 2) \(\exists V_i\) such that \(V_i\cap C^o \neq \emptyset\). Note that 1) and 2) imply the assumptions of \((1^o)\) and \((2^o)\) of the theorem, respectively. Additionally, 1) implies that \(C\) coincides with some orthant, which without loss of generality may be assumed to be the nonnegative orthant of \(\mathbb{R}^n\).
Assume that 1) is satisfied. Observe that \[\overline{\lambda}_{C}(\mathcal{O}_A)=\sup _{u \in C}\inf _{v\in C^o}\frac{\left\langle \mathcal{O}_A u, v\right\rangle}{\left\langle u, v\right\rangle}\geq \sup _{u \in V_i}\inf _{v\in C^o}\frac{\left\langle \mathcal{O}_A u, v\right\rangle}{\left\langle u, v\right\rangle}\geq \operatorname{Re}\lambda_i(\mathcal{O}_A),~i=1,\ldots,l,\] and thus, \(\overline{\lambda}_{C}(\mathcal{O}_A)\geq \max_{1\leq j\leq n}\operatorname{Re}\lambda_j(\mathcal{O}_A)\). On the other hand, by Theorem 4, we have \(\overline{\lambda}_{C}(\mathcal{O}_A)\leq \max_{1\leq j\leq n}\operatorname{Re}\lambda_j(\mathcal{O}_A)\). Hence, we get \(\overline{\lambda}_{C}(\mathcal{O}_A)= \max_{1\leq j\leq n}\operatorname{Re}\lambda_j(\mathcal{O}_A)\). By a similar argument, \(\underline{\lambda}_{C}(\mathcal{O}_A)=\min_{1\leq j\leq n}\operatorname{Re}\lambda_j(\mathcal{O}_A)\).
Assume that 2) is true. Without loss of generality we may assume that \(i=1\), i.e., \(V_1\cap C^o \neq \emptyset\), and therefore, \(\operatorname{span}(e_1)\cap C^o \neq \emptyset\) or \(\operatorname{span}(e_2)\cap C^o \neq \emptyset\). Suppose \(\operatorname{span}(e_1)\cap C^o \neq \emptyset\) is fulfilled, and \(r\sin\,\theta>0\).
By Theorem 1 there exists a left quasi-eigenvector \(v_C(A) \in C\), and thus, \[\langle A^Tv_C(A)- \underline{\lambda}_C(A) v_C(A), w\rangle\leq 0,~\forall w \in C^o.\] Hence, \(\langle A^Tv_C(A)- \underline{\lambda}_C(A) v_C(A), e_1 \rangle\leq 0\), and consequently, \((r\cos\,\theta -\underline{\lambda}_C(A) )v_1+v_2 r \sin\,\theta \leq 0\). Since \(v_1,v_2, r\sin\,\theta\geq 0\), we derive from this that \(r\cos\,\theta\leq \underline{\lambda}_C(A)\). Similarly, using \(u_C(A) \in C\) we derive \(\langle Au_C(A)- \overline{\lambda}_C(A) u_C(A), e_1 \rangle\geq 0\), and consequently, \((r\cos\,\theta -\overline{\lambda}_C(A) )u_1-u_2 r \sin\,\theta \geq 0\). Hence, by \(u_1,u_2, r\sin\,\theta\geq 0\), we obtain \(r\cos\,\theta\geq \overline{\lambda}_C(A)\). Since \(\overline{\lambda}_C(A) \geq \underline{\lambda}_C(A)\), we infer \(\overline{\lambda}_C(A)=\underline{\lambda}_C(A)=r\cos\,\theta=\operatorname{Re} \lambda_{1}(\mathcal{O}_A)\). The case \(r\sin\,\theta<0\) is handled in the same way.