We complete a uniform construction of canonical absolute parallelism for bracket generating rank \(2\) distributions with \(5\)-dimensional cube on an \(n\)-dimensional manifold with \(n\geq 5\) by showing that the condition of maximality of class that was assumed previously by Doubrov and Zelenko for such a construction holds automatically at
generic points. This also gives analogous constructions in the case when the cube is not \(5\)-dimensional but the distribution is not Goursat through the procedure of iterative Cartan deprolongation. This together with the
classical theory of Goursat distributions covers in principle the local geometry of all bracket generating rank 2 distributions in a neighborhood of generic points. As a byproduct, for any \(n\geq 5\) we describe the
maximally symmetric germs among bracket generating rank \(2\) distributions with \(5\)-dimensional cube, as well as among those which reduce to such a distribution under a fixed number of
Cartan deprolongations. Another consequence of our results on maximality of class is for optimal control problems with constraint given by a rank \(2\) distribution with \(5\)-dimensional
cube: it implies that for a generic point \(q_0\) of \(M\), there are plenty of abnormal extremal trajectories of corank \(1\) (which is the minimal possible
corank) starting at \(q_0\). The set of such points contains all points where the distribution is equiregular.
This paper is devoted to the proof of the existence of a canonical absolute parallelism and to the characterization of maximally symmetric models for bracket generating rank \(2\) distributions that satisfy a natural
condition of maximal growth for iterated Lie brackets of length at most \(3\) of their sections, with further consequences for rank \(2\) distributions that are not Goursat.
A rank \(l\) vector distribution \(D\) on an \(n\)-dimensional manifold \(M\), or an \((l,n)\)-distribution (where \(l<n\)), is a subbundle of the tangent bundle \(TM\) with \(l\)-dimensional fibers. The weak
derived flag at \(q\in M\) of the distribution \(D\) is the flag \(\{D^{i}\}_{i=1}^\infty\) defined by \[\label{week95derived} D^{1}(q) = D(q)\quad \text{and}\quad D^{i}(q) = D^{i-1}(q)+[D,D^{i-1}](q)\;\text{for } i>1.\tag{1}\] The space \(D^i(q)\) is called the \(i\)th power of the distribution \(D\) at the point \(q\). In particular, \(D^2(q)\) (respectively, \(D^3(q)\)) is called the square (respectively, the cube) of the distribution \(D\) at \(q\). A distribution \(D\) is called
bracket generating if for every \(q\), \(D^k(q)\) coincides with the whole tangent space \(T_q M\) for sufficiently large \(k\). The tuple \((\dim D(q), \dim D^2(q), \ldots, \dim D^i(q), \ldots)\) is called the small growth vector of the distribution \(D\) at the point \(q\). The distribution \(D\) is called equiregular at a point \(q_0\) if there exists a neighborhood \(U\) of \(q_0\) such that the small growth vector of \(D\) is the same for all \(q\in U\). If \(D\) is bracket generating, the set of
points at which \(D\) is equiregular is generic.
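Elementary counting implies that for a rank \(2\) distribution \(D\), \(\dim D^3(q)\) is at most \(5\): if \(X_1\) and \(X_2\) locally span \(D\), then \(D^2\) is spanned by \(X_1\), \(X_2\), \([X_1,X_2]\), and \(D^3\) by these three fields together with \([X_1,[X_1,X_2]]\) and \([X_2,[X_1,X_2]]\). One of the main
results of the paper can be formulated as follows: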
Theorem 1. For any bracket generating rank \(2\) distribution \(D\) on an \(n\)-dimensional manifold \(M\), \(n>5\), with \[\label{D94395eq}
\mathop{\mathrm{rank}}D^3=5\qquad{(1)}\] at a generic point, the following statements hold:
(1) One can assign to \(D\) a canonical frame on a \((2n-1)\)-dimensional bundle over a neighborhood of a generic point of \(M\), which implies that
the group of symmetries of \(D\) is at most \((2n-1)\)-dimensional;
(2) Any \((2,n)\)-distribution with \((2n-1)\)-dimensional group of symmetries is locally equivalent to the distribution associated with the Monge equation \[\label{Monge} z'(x)=\bigl(y^{(n-3)}(x)\bigr)^2,\qquad{(2)}\] or equivalently, with the rank \(2\) distribution on \({\mathbb{R}}^n\) with coordinates
\((x,y_0,\ldots, y_{n-3},z)\) given by the intersection of the annihilators of the following \(n-2\) one-forms: \[\label{Pfaff} \begin{align}
~&dy_i-y_{i+1} dx , \,\,0\leq i\leq n-4,\\ ~&dz-y_{n-3}^2dx.
\end{align}\qquad{(3)}\]
Note that for \(n>5\) the infinitesimal symmetry algebra of the distribution associated with the Monge equation (2) is isomorphic to the natural semidirect sum of \(\mathfrak{gl}_2(\mathbb{R})\) with the \((2n-5)\)-dimensional Heisenberg algebra (for details, see [1] or [2]).
É. Cartan proved an analogue of Theorem 1 for the case \(n = 5\) [3]: the dimension of the bundle in item (1) and the corresponding upper bound for the dimension of the symmetry algebra are both 14 (equal to the dimension of the exceptional Lie algebra \(G_2\)).
The maximally symmetric model in this case is given by equation (2) with \(n = 5\) (the Cartan–Hilbert equation) or, equivalently, by the Pfaffian system (3) with \(n = 5\), and its infinitesimal symmetry algebra is isomorphic to the split
real form of \(G_2\).
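As a quick check of the condition \(\mathop{\mathrm{rank}}D^3=5\) in the lowest dimension, the following worked computation (a sketch; the frame \(X\), \(Y\) and the coordinate names are our choice of notation, not taken from the references) verifies that the distribution of the Cartan–Hilbert equation \(z'=(y'')^2\) has small growth vector \((2,3,5)\). On \(\mathbb{R}^5\) with coordinates \((x,y_0,y_1,y_2,z)\), the common annihilator of the forms \(dy_0-y_1dx\), \(dy_1-y_2dx\), \(dz-y_2^2dx\) is spanned by \[X=\partial_x+y_1\partial_{y_0}+y_2\partial_{y_1}+y_2^2\partial_z,\qquad Y=\partial_{y_2},\] and a direct computation gives \[\begin{aligned} {}[Y,X]&=\partial_{y_1}+2y_2\,\partial_z,\\ {}[Y,[Y,X]]&=2\,\partial_z,\\ {}[X,[Y,X]]&=-\partial_{y_0}. \end{aligned}\] The five fields \(X\), \(Y\), \([Y,X]\), \([Y,[Y,X]]\), \([X,[Y,X]]\) are linearly independent at every point, so \(\dim D^2=3\) and \(\dim D^3=5\) everywhere.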
Theorem 1 strengthens Theorems 1 and 3 of [1] by
removing the additional assumption that the distribution \(D\) is of so-called maximal class (see Definitions 3 and 4 in Section 2 for the precise geometric definition). Thus, Theorem 1 is a direct consequence of those theorems from [1] and of the following result, which forms the main technical core of the present paper:
Theorem 2. Any bracket generating distribution with \(5\)-dimensional cube is of maximal class at a generic point. The set of such points contains all points where the distribution is
equiregular.
A precise description of the bundle and the canonical frame of item (1) of Theorem 1, using the theory of Jacobi curves [4] and geometry of curves in projective spaces, can be found in [1]. An
alternative construction of this bundle and the canonical frame, using Tanaka-Morimoto theory, can be found in section 3 of [2].
Theorem 2 was conjectured in [5] (see also [1]) nearly 20 years ago, but before the present paper, it had been confirmed only in the following very limited specific cases (see, e.g., [6]):
(1) \(5\leq n\leq 8\);
(2) for distributions with small growth vector \((2,3,5,6,7,\ldots, n)\) at every point, i.e., when \(\dim D^j(q)=j+2\) for \(4\leq j\leq n-2\);
(3) in the case of \((2, 14)\)-distributions with “free” small growth vector \((2,3,5,8,14)\);
(4) for distributions associated with Monge equations \[z^{(m)} = F\big(x, y, y', \ldots, y^{(n-2-m)}, z,\ldots, z^{(m-1)}\big), \quad \frac{\partial ^2 F}{\partial (y^{(n-2-m)})^2}\neq 0.\]
Note that in [7] it was demonstrated that the local model of item (2) of Theorem 1 is the most symmetric among Monge distributions. This was done by showing
first that Monge distributions with fixed \(1\leq m\leq \lfloor{\frac{n-2}{2}}\rfloor\) have fixed Tanaka symbol. In the sequel, those Tanaka symbols will be called Monge symbols. Then by computing the universal
Tanaka prolongation of such symbols, the authors found that the maximally symmetric model corresponds to \(m=1\) and is locally equivalent to the model of item (2) of Theorem 1. However, for \(n\geq 6\)
generic germs of rank 2 distributions are not Monge: for \(n\geq 7\) this follows from the fact that generic Tanaka symbols of \((2,n)\)-distributions are not Monge symbols. For \(n=6\) the statement follows from the fact that certain nontrivial invariants vanish for Monge distributions.
The previous approaches to proving Theorem 2 relied on attempts to compute the filtration 5 (below) directly in terms of
the original distribution \(D\). A key insight in [6] was that it suffices to prove Theorem 2 for flat distributions with prescribed Tanaka symbols (for
the definition of the Tanaka symbols and flat distributions, see [8], [9] and
also the end of section 3, after formula 21 ). However, this method depends on the classification of Tanaka symbols, which becomes infeasible due to the combinatorial explosion of
possibilities as the dimension increases. To circumvent this obstacle, a two-stage strategy was proposed in [6]. The first stage involves computing the
filtration 5 for free truncated nilpotent Lie algebras with two generators of a given step. The second stage aims to use the result of these computations for all nilpotent graded Lie algebras of the same step. However, the
first stage quickly becomes computationally infeasible as the dimension grows, even for computer algebra systems. Moreover, even when the first stage was successfully completed (as in the 5-step case of item (3) above), it remained unclear how to use it to
implement the second stage effectively.
In this paper, we adopt a completely different strategy, which enables us to prove Theorem 2 in full generality, without relying on any computer
algebra computations. Rather than attempting to compute the filtration 5 directly from the original distribution \(D\) on \(M\), we begin with an abstract
filtration of the type 5 on the corresponding submanifold of the projectivized cotangent bundle \(\mathbb{P}T^*M\). Assuming that the filtration enjoys the properties of one associated with a
distribution of a given constant class greater than \(1\), we show that it corresponds to a bracket-generating distribution only if the
class is maximal. A key factor that convinced us of the promise of this method, and motivated us to pursue it, was the realization, based on general reasoning, that if the method were to fail, it would effectively yield a counterexample. Moreover, we
recognized that the analysis should not strongly depend on the dimension, but from item (1) above, no counterexamples arise in low-dimensional cases. Finally, this approach is completely independent of the Tanaka symbols of the original distribution.
The next two subsections will expound the implications of Theorem 1 for non-Goursat rank 2 distributions and the implications of Theorem 2 for optimal control problems with constraint given by a rank 2 distribution with 5-dimensional cube, respectively.
1.1 Canonical Frames for Non-Goursat Rank 2 Distributions
We now explain why assumption (1) of Theorem 1 is, in fact, not restrictive. In short, given a distribution which is not Goursat, one can apply iterative Cartan deprolongations (outlined below) at a generic point to reduce to a distribution satisfying
(1). We thereby justify the title of the paper.
First, for \(n = 3\) and \(n = 4\), all generic germs of \((2, n)\)-distributions are locally equivalent to each other, as established by the classical
Darboux and Engel theorems. These distributions are modeled by the Cartan (or “contact”) distributions on the jet spaces \(J^1(\mathbb{R}, \mathbb{R})\) and \(J^2(\mathbb{R}, \mathbb{R})\),
respectively, where \(J^k(\mathbb{R}, \mathbb{R})\) denotes the space of \(k\)-jets of functions from \(\mathbb{R}\) to itself. For arbitrary \(n \geq 3\), the Cartan distribution on \(J^{n-2}(\mathbb{R}, \mathbb{R})\) provides a canonical model of a \((2, n)\)-distribution, possessing an
infinite-dimensional Lie algebra of infinitesimal symmetries, corresponding to the group of contact transformations. It is also worth noting that for \(n \geq 4\), all such distributions have a \(4\)-dimensional
cube. This shows that without assumption (1) of Theorem 1 we may get models with infinite-dimensional symmetries.
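The claim about the cube can be checked directly; the following is a sketch in coordinates of our choosing. On \(J^{n-2}(\mathbb{R},\mathbb{R})\) with coordinates \((x,y_0,\ldots,y_{n-2})\), \(n\geq 4\), the Cartan distribution is spanned by \[X=\partial_x+\sum_{i=0}^{n-3}y_{i+1}\,\partial_{y_i},\qquad Y=\partial_{y_{n-2}},\] and one computes \[ [Y,X]=\partial_{y_{n-3}},\qquad [Y,[Y,X]]=0,\qquad [X,[Y,X]]=-\partial_{y_{n-4}}. \] Hence the cube is spanned by \(X\), \(Y\), \(\partial_{y_{n-3}}\), \(\partial_{y_{n-4}}\) and has rank \(4\) at every point, in contrast with the rank \(5\) obtained when the second bracket \([Y,[Y,X]]\) does not vanish.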
On the other hand, even if assumption (1) of Theorem 1 does not hold at generic points, Theorem 1 is applicable (at a generic point) after a certain reduction procedure, also called deprolongation (see [1]).
Indeed, because \(D\) is bracket generating, its cube has dimension at least \(4\) at generic points. Suppose that \(D\) satisfies \(\dim D^3(q)=4\) on an open set \(M^\circ\) of \(M\). Then the rank \(3\) distribution \(D^2\)
on \(M^\circ\) has a one-dimensional characteristic sub-distribution lying in \(D\). At any \(q_0\in M^\circ\), we can consider the quotient \(\Pi:U\to \mathop{\mathrm{depr}}^1_{q_0}U\) of a neighborhood \(U\) of \(q_0\) by the corresponding one-dimensional foliation together with a new bracket
generating rank \(2\) distribution \(\mathop{\mathrm{depr}}^1_{q_0}D\) on \(\mathop{\mathrm{depr}}^1_{q_0}U\) obtained by the factorization of \(D^2\). In fact, the germ of \(D\) at \(q_0\) can be uniquely reconstructed from \(\mathop{\mathrm{depr}}_{q_0}^1D\), because it
is equivalent to its Cartan prolongation (see [2] for details). Therefore \(\mathop{\mathrm{depr}}^1_{q_0}D\) is called the
(first) deprolongation of \(D\) at the point \(q_0\).
In the case that \(\mathop{\mathrm{depr}}_{q_0}^1D\) has cube of rank 4 in a neighborhood of \(\Pi(q_0)\), we can repeat this process at \(\Pi(q_0)\) to
obtain \(\mathop{\mathrm{depr}}^1_{\Pi(q_0)}\big(\mathop{\mathrm{depr}}^1_{q_0}D\big)\), which we shall denote by \(\mathop{\mathrm{depr}}^2_{q_0}D\) and call the second deprolongation
of \(D\) at \(q_0\). The \(s\)th deprolongation is defined inductively, provided that \(\mathop{\mathrm{depr}}^{s-1}_{q_0}D\) has cube of rank 4 in a neighborhood of the image of \(q_0\) under the corresponding composition of quotients. We denote this distribution by \(\mathop{\mathrm{depr}}^s_{q_0}D\).
At a generic point \(q_0\) of \(M\), there exists \(s\) so that after iterating this procedure \(s\) times, one arrives
at one of the following two cases:
(i) The \((2,n-s)\)-distribution \(\mathop{\mathrm{depr}}^s_{q_0}D\) has \(5\)-dimensional cube in a neighborhood of the image of \(q_0\) under the corresponding composition of quotients (so \(s\leq n-5\)).
(ii) The \((n-4)\)th deprolongation \(\mathop{\mathrm{depr}}^{n-4}_{q_0}D\) is well defined (and in this case is locally equivalent to the Engel distribution).
We call the \(s\) appearing in case (i) the deprolongation degree of \(D\) at \(q_0\). We also set the deprolongation degree for case (ii) equal
to \(n-4\). Note that the deprolongation degree is defined at a generic point of \(M\), but in general, this degree is only locally constant. It is not greater than \(n-5\) in case (i).
In case (ii), the original distribution \(D\) is locally equivalent to a Goursat distribution [10] at \(q_0\). Goursat distributions are defined using the strong derived flag \(\{D^{[j]}\}_{j=1}^\infty\), where instead of 1 we have
\[\label{strong95derived} D^{[1]}(q) = D(q)\quad \text{and}\quad D^{[i]}(q) = D^{[i-1]}(q)+[D^{[i-1]},D^{[i-1]}](q)\;\text{for } i>1.\tag{2}\] The distribution \(D\) is called Goursat if \(\dim D^{[i]}(q)=i+1\) for every \(i\geq 1\) and every \(q\in M\). It is very well known
[10] that any \((2,n)\) Goursat distribution at a generic point is locally equivalent to the Cartan distribution on \(J^{n-2}(\mathbb{R}, \mathbb{R})\) and therefore the germs at such points have an infinite-dimensional infinitesimal symmetry algebra.
In case (ii), the germ of \(D\) at \(q_0\) is Goursat. In case (i) our original distribution is not Goursat near \(q_0\), and we can apply Theorem 1 or Cartan’s result in [3] to \(\mathop{\mathrm{depr}}_{q_0}^sD\) with \(n\) replaced by \(n-s\), namely:
Theorem 3. Let \(D\) be a bracket-generating rank \(2\) distribution on an \(n\)-dimensional manifold \(M\), which is nowhere Goursat. For a point \(q_0\) for which the deprolongation degree \(s\) is defined, the following two statements are true:
(1) If \(s<n-5\), then one can canonically assign to \(\mathop{\mathrm{depr}}_{q_0}^s D\) a frame on a \((2n - 2s - 1)\)-dimensional bundle over a
neighborhood of a generic point in the ambient manifold of \(\mathop{\mathrm{depr}}_{q_0}^s D\). In particular, this implies that the symmetry group of the germ of \(D\) at \(q_0\) has dimension at most \(2n - 2s - 1\).
(2) If \(s=n-5\), then one can canonically assign to \(\mathop{\mathrm{depr}}_{q_0}^s D\) a frame on a \(14\)-dimensional bundle over a neighborhood
of a generic point in the ambient manifold of \(\mathop{\mathrm{depr}}_{q_0}^s D\). In particular, this implies that the symmetry group of the germ of \(D\) at \(q_0\) has dimension at most \(14\).
In either case, the maximally symmetric germ among \((2,n)\)-distributions with deprolongation degree \(s\) is locally equivalent to the generic germ of the \(s\)th
Cartan prolongation of the rank \(2\) distribution associated with the Monge equation \[\label{Monge95k} z'(x) = \left( y^{(n - 3 - s)}(x)
\right)^2,\qquad{(4)}\] or, equivalently, with the rank \(2\) distribution on \(\mathbb{R}^{n-s}\) given by the Pfaffian system (3), with \(n\)
replaced by \(n-s\).
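For orientation, the dimension bounds in the two items arise from applying Theorem 1, or Cartan's result for \(n=5\), to a \((2,n-s)\)-distribution, which gives \(2(n-s)-1\) when \(n-s>5\) and \(14\) when \(n-s=5\). As an illustration (the choice \(n=7\) is ours and serves only as an example): \[ s=0:\ 2\cdot 7-1=13,\qquad s=1:\ 2\cdot 6-1=11,\qquad s=2=n-5:\ 14. \] In particular, the bound is not monotone in \(s\), since the case of a \((2,5)\)-distribution is governed by \(G_2\) rather than by the formula \(2(n-s)-1\).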
The last theorem together with the classical theory of Goursat distributions covers in principle (i.e., modulo analysis of the invariants coming from the canonical frame in non-Goursat case) the local geometry of all bracket generating rank 2
distributions in a neighborhood of generic points: In the Goursat case, the germs at generic points are all locally equivalent. In the non-Goursat case, after a suitable number of deprolongations, one can construct a canonical structure of absolute
parallelism.
We emphasize that our analysis concerns arbitrary bracket generating rank \(2\) distributions, but only in neighborhoods of generic points, where the notion of genericity is explicitly defined by the
assumption that the class of the distribution (or its appropriate deprolongation) at a point is maximal (see Definition 4 below).
We do not address the equivalence problem for germs of distributions at points that do not satisfy this genericity condition (which we refer to as singular points). Even in the case of Goursat distributions, the moduli space of all germs is quite wild
and consists of all germs appearing in the so-called Monster Tower (see [10])—even though the generic germ is unique up to local equivalence.
In the non-Goursat case, the situation is much more intricate: even generic germs admit functional invariants coming from the canonical frames. Nonetheless, if these functional invariants are nontrivial, the canonical frame constructed at generic points
may still provide valuable information about the equivalence of germs at singular points, for instance, by studying how the invariants of the distribution behave as one approaches these singularities.
1.2 The Existence of Corank 1 Abnormal Extremals through Generic Points
Finally, note that Theorem 2 is of independent interest from the perspective of optimal control theory. On the space of Lipschitz curves that are
almost everywhere tangent to a given distribution \(D\), called horizontal curves of \(D\), consider any variational problem that assigns a cost to each such curve—for example, the
problem of minimizing length with respect to a sub-Riemannian metric. The Pontryagin Maximum Principle [11], [12] characterizes, through the Hamiltonian formalism, a class of curves, known as Pontryagin extremal trajectories, among which the minimizers of such problems (with fixed endpoints) must
lie. Extremal trajectories for which the Lagrange multiplier which multiplies the cost vanishes are called abnormal extremal trajectories; they depend only on the distribution \(D\) and not on the specific cost
functional, and they are also called singular curves of the distribution \(D\) [13].
While abnormal extremal trajectories can be described purely geometrically using the canonical symplectic form on the cotangent bundle of \(M\) ([13]–[15], see also subsection 2.1
below), for brevity we use here an equivalent description as critical points of the endpoint map: given a point \(q_0\) and a time \(T\), denote by \(\Omega_{q_0}(T)\) the set of all horizontal curves of \(D\) starting at \(q_0\) and defined on \([0,T]\), and by \(F_{q_0, T} : \Omega_{q_0}(T) \to M\) the endpoint map that takes each \(\gamma \in \Omega_{q_0}(T)\) to the endpoint \(\gamma(T)\). Note that the set
\(\Omega_{q_0}(T)\) has the structure of a \((L^\infty([0,T]))^l\)-manifold, where \(l\) is the rank of \(D\).
Definition 1. A horizontal curve \(\gamma : [0, T] \to M\) with \(\gamma(0)=q_0\) is an abnormal extremal trajectory of the distribution \(D\) if it is a critical point of the mapping
\(F_{q_0,T}\), that is, if \(\operatorname{Im} \left( d(F_{q_0,T})_\gamma \right) \neq T_{\gamma(T)} M\), where \(d (F_{q_0,T})_\gamma\) denotes the
differential of the endpoint map \(F_{q_0,T}\) at \(\gamma\). The corank of the abnormal extremal trajectory \(\gamma\) is defined as the
codimension of \(\operatorname{Im} \left( d(F_{q_0,T})_\gamma \right)\) in \(T_{\gamma(T)}M\) and is denoted by \(\mathrm{corank}(\gamma)\).
By definition, the corank of an abnormal extremal trajectory is at least \(1\). Theorem 2 implies the following
theorem:
Theorem 4. Given a bracket generating distribution \(D\) with \(5\)-dimensional cube on a manifold \(M\), for a generic point \(q_0\in M\) there exists an abnormal extremal trajectory of corank \(1\) starting at \(q_0\). The set of such points contains all points where the distribution is
equiregular.
Remark 5. In fact, Theorem 2 implies a stronger statement: first, such an abnormal extremal is regular in the sense of [14] (equivalently, satisfies the generalized Legendre–Clebsch condition in the terminology of [12], [15]); and second, there are plenty of such abnormal extremal trajectories; see
Theorem 9 below.
We conjecture that Theorem 4 (and its stronger version, Theorem 9 below) is valid at every point, that is, the phrase “at a generic point” can be omitted, although this remains beyond our current reach. See Remark 10 below for discussion of this issue.
The paper is organized as follows: In Section 2 we recall the definition of the class of a distribution from [1], which is used in Theorem 2. In Section 3 we prove Theorem 2 and therefore Theorems 1 and 3. Finally, in Section 4 we prove a stronger version of Theorem 4, which is labeled Theorem 9.
Acknowledgment We would like to thank Boris Doubrov for several valuable comments, which clarified and simplified some arguments.
The class of a distribution was defined in [1], [5] in the development of
the so-called symplectification procedure. For completeness, we define the notion of class here, reproducing the constructions of our previous work [2],
which deviates only modestly from the original construction. This is especially important because, in contrast to [1], [2], [5], which were focused mainly on the case of maximal class, here we need to consider an arbitrary class.
The symplectification procedure utilizes the natural contact structure on the projectivized cotangent bundle \(\mathbb{P}T^*M\) to construct a so-called “even contact structure” on a submanifold \(\mathcal{M}\subseteq \mathbb{P}T^*M\) of codimension 3. The kernel of this even contact structure is a canonical line distribution \(\mathcal{C}\). Lifting \(D\)
to \(\mathcal{M}\) and osculating with \(\mathcal{C}\) yields a flag at each point of \(\mathcal{M}\). The class of the distribution will then be
defined by considering the growth of the dimensions associated with this flag.
2.1 The Characteristic Line Distribution and Regular Abnormal Extremals
Let \(D\) be a bracket generating rank 2 distribution on a smooth manifold \(M\) of dimension \(n\geq5\). Define the annihilator of \(D^{\ell}\) by \[\big(D^{\ell}\big)^\perp = \big\{ (p,q)\in T^*M : p\cdot v = 0\;\forall\;v\in D^{\ell}(q)\big\}.\]
Consider the fiberwise projectivization \(\mathbb{P}T^*M\) of the cotangent bundle. Since each \((D^{\ell})^\perp\) is a linear subbundle of \(T^*M\), we
may define a codimension 3 submanifold \[\mathcal{M} = \mathbb{P}\Big((D^{2})^\perp\setminus (D^{3})^\perp\Big)\subseteq \mathbb{P}T^*M.\] Let \(\mathfrak{s}\) be the tautological
(Liouville) one-form on \(T^*M\); explicitly, for coordinates \((q^i)\) on \(M\) with conjugate variables \(p_i\), \(\mathfrak{s}=\sum p_i\text{d}q^i\). Recall that \(d\mathfrak{s}\) is the canonical symplectic form on \(T^*M\). The form \(\mathfrak{s}\) passes to a conformal class \(\overline{\mathfrak{s}}\) of 1-forms on \(\mathbb{P}T^*M\) which defines a contact structure.
Since \(\text{rank}(D^{2})=3\), the submanifold \(\mathcal{M}\) has codimension 3 in the contact manifold \(\mathbb{P}T^*M\). Restricting the contact
forms \(\overline{\mathfrak{s}}\) to \(\mathcal{M}\) gives a hyperplane distribution \[\label{evencontact95H}
H=\ker\left(\overline{\mathfrak{s}}|_{\mathcal{M}}\right)\tag{3}\] with a conformal class of skew-symmetric forms \(\overline{\sigma}=d\overline{\mathfrak{s}}|_{H}\) well-defined on this hyperplane
distribution. Since \(\mathcal{M}\) has dimension \(2n-4\), the hyperplane distribution \(H\) has odd rank \(2n-5\), so the kernel of the skew-symmetric form \(\overline{\sigma}\) must have odd rank. In [1], the authors show that \(\ker(\overline{\sigma})\) has the minimal possible rank \(1\), so that \(\mathcal{M}\) is equipped with a so-called even contact structure. We shall write \(\mathcal{C}\) for the line distribution \(\ker(\overline{\sigma})\),
called the characteristic line distribution of \(D\). Following [14]–[16], we make the following definition.
Definition 2. The integral curves of \(\mathcal{C}\) are called regular abnormal extremals of the distribution \(D\), and their projections onto \(M\) are called regular abnormal extremal trajectories.
The reason why regular abnormal extremal trajectories are indeed abnormal extremal trajectories in the sense of Definition 1 is explained at the beginning of the proof of
Theorem 9 in Section 4. In particular, see relation 22 .
Let \(\pi:\mathcal{M}\to M\) be the canonical projection. The lift of \(D\) to \(\mathcal{M}\) is denoted by: \[\label{J0} \mathcal{J}(\lambda) = \big\{v\in T_\lambda \mathcal{M} : \pi_*(v)\in D\bigl(\pi(\lambda)\bigr)\big\}\tag{4}\] which is a distribution of rank \(n-2\). Osculating with the
characteristic line distribution \(\mathcal{C}\), we obtain from \(\mathcal{J}\) a flag at each point of \(\mathcal{M}\). Write \(\mathcal{J}^{(0)} = \mathcal{J}\) and define recursively \[\label{geod95flag} \mathcal{J}^{(i)}(\lambda) = \mathcal{J}^{(i-1)}(\lambda) + [\mathcal{C},
\mathcal{J}^{(i-1)}](\lambda)\quad \text{for}\;i\geq 1\tag{5}\] In Proposition 1 of the paper [1], the authors show that for
each \(0\leq i\) and each \(\lambda\in \mathcal{M}\), we have \[\label{jump} \text{dim}\big(\mathcal{J}^{(i+1)}(\lambda)\big) -
\text{dim}\big(\mathcal{J}^{(i)}(\lambda)\big) \leq 1.\tag{6}\] so that \[\label{max95dim95i} \dim \mathcal{J}^{(i)}(\lambda) \leq n-2+i.\tag{7}\] Let \(H\) be as in 3 . Since \(\mathcal{C}\) is a Cauchy characteristic of \(H\), and since \(\mathcal{J}\) is contained within this distribution, so too is each \(\mathcal{J}^{(i)}\). Hence \[\label{max95dim}
\dim\mathcal{J}^{(i)}(\lambda)\leq \text{rank}(H)= 2n-5.\tag{8}\] Define the integer-valued functions on \(\mathcal{M}\) and \(M\), respectively: \[\begin{gather} \tag{9} \nu(\lambda) = \min\{i\in \mathbb{N}: \mathcal{J}^{(i+1)}(\lambda) = \mathcal{J}^{(i)}(\lambda)\}, \\ \tag{10} m(q) = \max\{\nu(\lambda): \lambda\in \pi^{-1}(q)\}.
\end{gather}\] One can show that the set \(\{\lambda\in \pi^{-1}(q): \nu(\lambda)=m(q)\}\) is nonempty and
Zariski open in the fiber \(\pi^{-1}(q)\) and that \(\nu(\cdot)\) and \(m(\cdot)\) are lower semicontinuous.
Definition 3. The value \(\nu(\lambda)\) defined by 9 is called the class at \(\lambda\) of the regular abnormal extremal
passing through \(\lambda\). The value \(m(q)\) defined by 10 is called the class of \(D\) at \(q\).
The relation between the class at \(\lambda\) of the regular abnormal extremal passing through \(\lambda\) and the corank of the corresponding abnormal extremal trajectory is given in
section 4; see 24 . Note that 7 and 8 imply that \(\nu(\lambda)\leq n-3\), and therefore
\(m(q)\leq n-3\). The equality \(\nu(\lambda)=n-3\) holds if and only if \(\mathcal{J}^{(n-3)}(\lambda)=H(\lambda)\).
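For concreteness, here is the count behind the bound \(\nu(\lambda)\leq n-3\), assembled as a sketch from the ranks stated above: \[ \dim \mathbb{P}T^*M=2n-1,\qquad \dim\mathcal{M}=2n-4,\qquad \mathrm{rank}\,H=2n-5,\qquad \mathrm{rank}\,\mathcal{J}=n-2. \] Each step of the flag before stabilization raises the dimension by exactly one, so \(\dim\mathcal{J}^{(\nu)}(\lambda)=n-2+\nu(\lambda)\leq 2n-5\), which gives \(\nu(\lambda)\leq n-3\). For example, for \(n=5\) one has \(\dim\mathcal{M}=6\), \(\mathrm{rank}\,H=5\) and \(\mathrm{rank}\,\mathcal{J}=3\), so the maximal possible class is \(2\), attained exactly when the flag \(\mathcal{J}\subset\mathcal{J}^{(1)}\subset\mathcal{J}^{(2)}=H\) has ranks \(3,4,5\).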
Definition 4. We say that \(D\) is of maximal class at \(q\in M\) if its class \(m(q)\) is equal to \(n-3\). On the other hand, we say that \(D\) is of minimal class at \(q\in M\) if \(m(q)=1\).
The following lemma is proven as Remark 3.4 in [4] and will be used in the proof of Theorem 2.
Lemma 1. Let \(D\) be a bracket generating rank 2 distribution on an \(n\)-dimensional manifold \(M\), \(n>5\). For each \(q_0\in M\), \(D\) is of minimal class at \(q_0\) if and only if \(D^{3}(q)\) has dimension \(4\) for all \(q\) in a neighborhood of \(q_0\).
Proposition 3.4 of [4] demonstrates that germs of \((2,n)\)-distributions of maximal class are generic. Theorem 2 is the much stronger statement that we want to prove.
Note that the class of the distribution is constant in a neighborhood of a generic point of \(M\). To prove the first sentence of Theorem 2, it suffices to restrict our considerations to such neighborhoods, as we do in Theorem 7 below. Therefore, instead of
introducing special notation for these neighborhoods, we will, from now on, assume that the distribution \(D\) has constant class \(m\) on \(M\). Then the
set \[\mathcal{R}_{m}=\{\lambda\in \mathcal{M}: \nu(\lambda)=m\}\] is open and dense in the space \(\mathcal{M}\).
Now let \(D\) be a rank 2 distribution of constant class \(m\) on an \(n\)-dimensional manifold \(M\), \(n\geq5\). Then by 6 for any \(q\in M\) and any \(\lambda\in \mathcal{R}_{m}\), the flag \[\mathcal{J}(\lambda)\subseteq\mathcal{J}^{(1)}(\lambda)\subseteq \cdots\subseteq \mathcal{J}^{(m)}(\lambda) \subseteq H(\lambda)\subseteq T_\lambda\mathcal{R}_{m}\] has the property that \(\text{rank}(\mathcal{J}^{(i+1)})= \text{rank}(\mathcal{J}^{(i)})+1\) for each \(0\leq i \leq m-1\). Further, from the assumption that the class is equal to \(m\)
it follows that \[\label{X95stab}[\mathcal{C},\mathcal{J}^{(m)}]\subseteq \mathcal{J}^{(m)}.\tag{11}\] We can use the conformal class of 2-forms \(\overline{\sigma}\) defined in section 2.1 to continue this flag: for each \(i\geq 1\) and each \(\lambda\in
\mathcal{R}_{m}\), define \[\mathcal{J}_{(i)}(\lambda) = \big\{v\in T_\lambda \mathcal{R}_{m} : \overline{\sigma}(v,w) = 0\;\forall\;w\in \mathcal{J}^{(i)}(\lambda)\big\},\] the skew complement of \(\mathcal{J}^{(i)}(\lambda)\) with respect to \(\overline{\sigma}\). From the definition of \(\mathcal{M}\), it follows quickly that \(\overline{\sigma}\left(\mathcal{J},\mathcal{J}\right) = 0\); we obtain a flag: \[\label{calJFlag} \mathcal{C}(\lambda)\subseteq \mathcal{J}_{(m)}(\lambda)\subseteq
\cdots \subseteq \mathcal{J}_{(1)}(\lambda)\subseteq \mathcal{J}(\lambda)\subseteq \mathcal{J}^{(1)}(\lambda)\subseteq \cdots \subseteq \mathcal{J}^{(m)}(\lambda)\subseteq H(\lambda)\subseteq T_\lambda\mathcal{R}_{m}.\tag{12}\] By 6 , we have for each \(0 < i \leq m\) that \[\label{flag32dims} \dim\big(\mathcal{J}_{(i)}(\lambda)\big) = n-2-i\quad\text{and}\quad
\dim\big(\mathcal{J}^{(i)}(\lambda)\big) = n-2+i.\tag{13}\]
At each \(\lambda\in \mathcal{R}_{m}\), one can show that \[\mathcal{J}^{(1)}(\lambda) = \big\{v\in T_\lambda \mathcal{M} : \pi_*(v)\in D^{2}\bigl(\pi(\lambda)\bigr)\big\},\] which implies (with some
computation) that \(\mathcal{J}_{(1)}(\lambda) = \ker(T_\lambda\pi)\oplus \mathcal{C}(\lambda)\). Define \(V_1(\lambda) = \ker(T_\lambda\pi)\), the vertical subspace over \(\lambda\). For \(i=0\) and for \(2\leq i\leq m,\) define \[\label{Splitting} V_i(\lambda) =
\mathcal{J}_{(i)}(\lambda)\cap V_1(\lambda),\tag{14}\] the vertical component of the \((n-2-i)\)-dimensional piece of the flag at \(\lambda\). From 12 it is clear that \[\label{vert95inclusion}
V_{i}(\lambda)\subset V_{i-1}(\lambda), \quad i\geq 1.\tag{15}\] Further, observe that \(V_0(\lambda)=V_1(\lambda)\). From 12, one can also observe that for each \(1\leq i\leq m\) \[\label{J95i326132C324332V95i} \mathcal{J}_{(i)}(\lambda) = V_i(\lambda)\oplus \mathcal{C}(\lambda).\tag{16}\] The
\(V_i\) also satisfy involutivity conditions; in the following proposition, which is Lemma 2 in [1], recall that
\(V_0=V_1\).
Proposition 6. For each \(q\in M\) and any \(0\leq i \leq m\), we have involutivity conditions \[\begin{gather} \label{invol1} [V_i,V_i]
\subseteq V_i, \\ \label{invol2} [V_i,\mathcal{J}^{(i)}] \subseteq \mathcal{J}^{(i)}
\end{gather}\] for the flag of distributions on \(\mathcal{R}_{m}\).
Although the statement is only proven for distributions of maximal class in [1], the proof does not rely on maximality of class. In
order to prove the main theorem, we shall need one more fact about the flag 12 .
The following lemma is a small extension of Remark 2 of [1]. In that remark, only one inclusion of the equality (5) below is demonstrated. Later in that paper, the equality is shown assuming maximality of class. We provide a detailed proof here because the lemma is
crucial in the proof of Theorem 2.
Lemma 2. For each \(1\leq i \leq m-1\) and each \(\lambda\in \mathcal{R}_{m}\), we have that \[\label{Alt32J95i}
\mathcal{J}_{(i)}(\lambda)+ [\mathcal{C},\mathcal{J}_{(i)}](\lambda) = \mathcal{J}_{(i-1)}(\lambda)\qquad{(5)}\] Further, \[\label{Alt32J95m} [\mathcal{C},\mathcal{J}_{(m)}] (\lambda)
\subseteq \mathcal{J}_{(m)}(\lambda).\qquad{(6)}\]
Proof. Fix \(\lambda\in \mathcal{R}_{m}\) and choose a nonvanishing local section \(X\) of \(\mathcal{C}\) near \(\lambda\). Also choose a skew-symmetric 2-form \(\sigma\) on \(\mathcal{R}_{m}\) from the conformal class \(\overline{\sigma}\)
defined in Section 2.1. Since the Lie derivative \(L_X\sigma=f\sigma\) for some \(f\in C^\infty(\mathcal{R}_{m})\), we have for any sections
\(Y\) and \(Z\) of the contact distribution \(H\) satisfying \(\sigma(Y,Z)\equiv 0\) that \[\sigma\big([X,Y],Z\big) = -\sigma\big(Y,[X,Z]\big).\]
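The identity just displayed is the Leibniz rule for the Lie derivative of a 2-form. Spelled out, with \(f\) as above and using \(\sigma(Y,Z)\equiv 0\) twice (once on the left-hand side and once to discard the term \(X\big(\sigma(Y,Z)\big)\)): \[0 = f\,\sigma(Y,Z) = (L_X\sigma)(Y,Z) = X\big(\sigma(Y,Z)\big) - \sigma\big([X,Y],Z\big) - \sigma\big(Y,[X,Z]\big) = - \sigma\big([X,Y],Z\big) - \sigma\big(Y,[X,Z]\big).\]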
Now fix \(1\leq i \leq m-1\); let us begin with the rightward inclusion of (5). Fix a section \(Y\) of \(\mathcal{J}_{(i)}\) and a section \(Z\) of \(\mathcal{J}^{(i-1)}\). Since \(\mathcal{J}_{(i)}\subseteq \mathcal{J}_{(i-1)}\), we have \(\sigma(Y,Z)\equiv 0\), so
that \[\sigma\big([X,Y],Z\big) = -\sigma\big(Y,[X,Z]\big) = 0,\] where the last equality holds because \([X,Z]\) is a section of \(\mathcal{J}^{(i)}\). Hence \([X,Y]\) is a section of \(\big(\mathcal{J}^{(i-1)}\big)^\angle=\mathcal{J}_{(i-1)}\), which
demonstrates the rightward inclusion of (5).
For the leftward inclusion of (5), we show for arbitrary \(\lambda\in \mathcal{R}_{m}\) that \[\label{[C,J_i] not in J_i} \mathcal{J}_{(i)}(\lambda)+
[\mathcal{C},\mathcal{J}_{(i)}](\lambda) \not\subseteq \mathcal{J}_{(i)}(\lambda),\tag{17}\] so that the conclusion follows from the rightward inclusion of (5) and from 13, because \(\dim \mathcal{J}_{(i-1)}(\lambda) = \dim \mathcal{J}_{(i)}(\lambda)+1\). By 13, we can
choose a section \(Z\) of \(\mathcal{J}^{(i)}\) so that \([X,Z](\lambda)\in \mathcal{J}^{(i+1)}(\lambda)\setminus\mathcal{J}^{(i)}(\lambda)\). Since \([X,Z](\lambda)\notin \mathcal{J}^{(i)}(\lambda)\), we can find a section \(Y\) of \(\mathcal{J}_{(i)}\) so that \(\sigma\big(Y(\lambda),[X,Z](\lambda)\big) \neq 0\). Since \(\sigma(Y,Z)\equiv 0\), we then have \[\sigma\big([X,Y](\lambda),Z(\lambda)\big) =
-\sigma\big(Y(\lambda),[X,Z](\lambda)\big) \neq 0.\] Thus \([X,Y](\lambda)\) is not in \(\big(\mathcal{J}^{(i)}(\lambda)\big)^\angle=\mathcal{J}_{(i)}(\lambda)\), and we have proven 17.
Finally, let us prove (6). For any section \(Y\) of \(\mathcal{J}_{(m)}\) and any section \(Z\) of \(\mathcal{J}^{(m)}\),
we have that \(\sigma(Y,Z)\equiv 0\), so that \[\sigma\big([X,Y],Z\big) = -\sigma\big(Y,[X,Z]\big) = 0,\] where the last equality follows because \([\mathcal{C},\mathcal{J}^{(m)}]\subseteq \mathcal{J}^{(m)}\). Hence \([X,Y]\) is a section of \(\big(\mathcal{J}^{(m)}\big)^\angle=\mathcal{J}_{(m)}\). ◻
As was noted at the end of Section 2.2, the class of a rank 2 distribution is constant in a neighborhood of a generic point of the base manifold. Therefore, to prove the first sentence of Theorem 2, it suffices to prove the following theorem.
Theorem 7. If a bracket generating distribution with \(5\)-dimensional cube has constant class, then it is of maximal class.
Proof. Assume that \(D\) has constant class \(m\). Since \(\mathop{\mathrm{rank}}D^3 = 5\), we have by Lemma 1 that \(m>1\). Consider again the flag 12 on \(\mathcal{R}_{m}\). Each piece of this flag
has constant rank, as does \(V_i\) for each \(0\leq i \leq m\). Now fix a \(\lambda_0\) in \(\mathcal{R}_{m}\). Let
\[\label{E95def}
E:=\mathcal{J}_{(m-1)}.\tag{18}\]
Choose a nonvanishing section \(X\) of \(\mathcal{C}\) near \(\lambda_0\). By 16 , we have that \(E=\mathcal{C}\oplus V_{m-1}\). Choose a section \(\varepsilon_1\) of \(V_{m-1}\) so that \([X,\varepsilon_1](\lambda_0)
\notin\mathcal{J}_{(m-1)}(\lambda_0)\). Write \[\mathrm{pr}_V: \mathcal{J}_{(1)}=\mathcal{C}\oplus V_1\to V_1\] for the projection onto the vertical subspace \(V_1\) parallel to
\(\mathcal{C}\). For each \(2\leq i \leq 2m\), define \[\varepsilon_i= \begin{cases} \mathrm{pr}_V([X,\varepsilon_{i-1}]) & \text{if }2\leq i \leq m-1 \\
[X,\varepsilon_{i-1}] &\text{if }m\leq i\leq 2m \end{cases}\] Then \[\begin{align} &\mathcal{J}_{(i)} = \mathcal{J}_{(m)}\oplus\langle \varepsilon_1,\ldots, \varepsilon_{m-i}\rangle\text{ for all }0\leq i \leq m \\
&\mathcal{J}^{(i)} = \mathcal{J}_{(m)}\oplus \langle \varepsilon_1,\ldots, \varepsilon_{m+i}\rangle\text{ for all }0\leq i \leq m
\end{align}\] Notice that by Proposition 6, we have \[\label {w_i involutivity} V_{i} = V_{m}\oplus
\langle\varepsilon_1,\ldots, \varepsilon_{m-i}\rangle\text{ is involutive for each }1\leq i \leq m.\] Since \(D\) is bracket generating, so too is the distribution \(\mathcal{J}_{(0)}\) (\(=\mathcal{J}\)) on \(\mathcal{R}_{m}\), as \(\mathcal{J}\) is the lift of \(D\). By (5), this implies that the distribution \(E\), defined by 18, is also bracket generating. We claim that \[\label{E32osc32flag} E^{i} = \begin{cases} \mathcal{J}_{(m-i)} & \text{for }1\leq i \leq m \\ \mathcal{J}^{(i-m)}& \text{for }m+1\leq i \leq 2m \end{cases}\tag{19}\] The claim holds for \(i=1\) by construction of \(E\). To prove the claim for \(2\leq i\leq m\), use induction, relations 16, 15, (5), and the first involutivity condition of Proposition 6, to get: \[\begin{gather} E^i =
E^{i-1}+[E,E^{i-1}] = \mathcal{J}_{(m-i+1)}+[\mathcal{J}_{(m-1)},\mathcal{J}_{(m-i+1)}] \\
\stackrel{\eqref{J95i326132C324332V95i}}{=} \mathcal{J}_{(m-i+1)}+[\mathcal{C}\oplus V_{m-1}, \mathcal{C}\oplus V_{m-i+1}] \stackrel{\eqref{vert95inclusion} \&\eqref{invol1}}{=}
\mathcal{J}_{(m-i+1)}+[\mathcal{C},\mathcal{J}_{(m-i+1)}]\stackrel{\eqref{Alt32J95i}}{=}
\mathcal{J}_{(m-i)},
\end{gather}\] where in the first line we used the induction hypothesis. Similarly, for each \(m+1\leq i \leq 2m\), use induction, relations 16, 15, 5, and the second involutivity condition of Proposition 6 to get \[\begin{gather} E^i = E^{i-1}+[E,E^{i-1}] = \mathcal{J}^{(-m+i-1)} + [\mathcal{J}_{(m-1)},\mathcal{J}^{(-m+i-1)}] \\ \stackrel{\eqref{J95i326132C324332V95i}}{=}\mathcal{J}^{(-m+i-1)} + [\mathcal{C}\oplus
V_{m-1},\mathcal{J}^{(-m+i-1)}] \stackrel{\eqref{vert95inclusion} \&\eqref{invol2}}{=}\mathcal{J}^{(-m+i-1)} + [\mathcal{C},\mathcal{J}^{(-m+i-1)}] \stackrel{\eqref{geod95flag}}{=}\mathcal{J}^{(-m+i)},
\end{gather}\] where again in the first line we used the induction hypothesis. This demonstrates 19 .
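Combining 19 with the dimensions in 13 (and \(\operatorname{rank}\mathcal{J}=n-2\)) gives the ranks along the flag generated by \(E\). The following bookkeeping is a direct restatement of those two relations rather than a new result: \[\operatorname{rank} E^{i} = \begin{cases} \operatorname{rank}\mathcal{J}_{(m-i)} = n-2-(m-i) & \text{for }1\leq i \leq m \\ \operatorname{rank}\mathcal{J}^{(i-m)} = n-2+(i-m) & \text{for }m+1\leq i \leq 2m \end{cases} \;=\; n-2-m+i.\] In particular, each bracket with \(E\) adds exactly one dimension for \(1\leq i\leq 2m\), with \(\operatorname{rank} E = n-1-m\) and \(\operatorname{rank} E^{2m} = n-2+m\).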
Now define \[\label{eta95def}
\eta = [\varepsilon_1,\varepsilon_{2m}].\tag{20}\] For each \(\lambda\) near \(\lambda_0\) we have \(\varepsilon_1(\lambda)\notin\mathcal{J}_{(m)}(\lambda)= \big(\mathcal{J}^{(m)}(\lambda)\big)^\angle\), while \(\varepsilon_1\) is a section of \(\mathcal{J}_{(m-1)}=\big(\mathcal{J}^{(m-1)}\big)^\angle\) and \(\mathcal{J}^{(m)}=\mathcal{J}^{(m-1)}\oplus\langle\varepsilon_{2m}\rangle\). Consequently \(\overline{\sigma}\big(\varepsilon_1(\lambda),\varepsilon_{2m}(\lambda)\big)\neq 0\), and we have that \(\eta(\lambda)\notin H(\lambda)\), where \(H\) is the even contact distribution defined in 3. Therefore, \(\eta(\lambda)\notin\mathcal{J}^{(m)}(\lambda)\subseteq H(\lambda)\). Since \(D\) has constant class \(m\), we
have that \([\mathcal{C},\mathcal{J}^{(m)}]\subseteq \mathcal{J}^{(m)}\). Along with the second involutivity condition of Proposition 6, this implies \[E^{2m+1} = [E,\mathcal{J}^{(m)}] = [V_{m}\oplus \langle X,\varepsilon_1\rangle, \mathcal{J}^{(m)}] = \mathcal{J}^{(m)}\oplus \langle \eta\rangle.\]
We ultimately aim to show that \[[E,E^{2m+1}]\subseteq E^{2m+1},\] so that \(E^{2m+1}\) is involutive. To demonstrate this, we prove a sequence of lemmas.
Lemma 3. For each \(1\leq i\leq m\), \[[\varepsilon_i,\varepsilon_{2m+1-i}] \equiv (-1)^{i+1}\eta \mod E^{2m}\]
Proof. The claim holds by definition 20 of \(\eta\) for \(i=1\). Now for the induction step, apply the Jacobi identity to obtain \[\begin{align} [\varepsilon_{i},\varepsilon_{2m+1-i}] &\equiv \big[[X,\varepsilon_{i-1}] \text{ mod } \langle X \rangle,\varepsilon_{2m+1-i}\big] \\ &\equiv \big[X,[\varepsilon_{i-1},\varepsilon_{2m+1-i}]\big]+
\big[[X,\varepsilon_{2m+1-i}],\varepsilon_{i-1}\big] \\ &\equiv \big[X,[\varepsilon_{i-1},\varepsilon_{2m+1-i}]\big] + [\varepsilon_{2m+2-i},\varepsilon_{i-1}] \\ &\equiv \big[X,[\varepsilon_{i-1},\varepsilon_{2m+1-i}]\big] + (-1)^{i+1}\eta \\
&\equiv (-1)^{i+1}\eta \mod E^{2m}
\end{align}\] where the fourth equivalence uses the induction hypothesis, and the last equivalence holds because, first, the second involutivity condition of Proposition 6 implies \([\varepsilon_{i-1},\varepsilon_{2m+1-i}]\subset \mathcal{J}^{(m)}\) and, second, since \(D\) has constant class \(m\), we have \[\big[X,[\varepsilon_{i-1},\varepsilon_{2m+1-i}]\big]\subseteq [\mathcal{C},\mathcal{J}^{(m)}]\subseteq \mathcal{J}^{(m)} = E^{2m}.\] ◻
Lemma 4. \[[\varepsilon_2,\varepsilon_{2m}] \equiv (1-m)[X,\eta] \mod E^{2m+1}\]
Proof. Observe that for each \(2\leq i \leq m+1\), the Jacobi identity and Lemma 3 give
\[\begin{align} [\varepsilon_i,\varepsilon_{2m+2-i}] &\equiv \big[\varepsilon_{i},[X,\varepsilon_{2m+1-i}]\big] \\ &\equiv \big[[\varepsilon_{i},X],\varepsilon_{2m+1-i}\big] +
\big[X,[\varepsilon_i,\varepsilon_{2m+1-i}]\big] \\ &\equiv -[\varepsilon_{i+1} \text{ mod } \langle X\rangle ,\varepsilon_{2m+1-i}] + (-1)^{i+1}[X,\eta] \mod E^{2m+1} \\ &\equiv -[\varepsilon_{i+1},\varepsilon_{2m+1-i}] + (-1)^{i+1}[X,\eta] \mod
E^{2m+1}
\end{align}\] It is then easy to show by induction that for each \(2\leq i \leq m+1,\)\[\begin{align} [\varepsilon_2,\varepsilon_{2m}] &\equiv
(-1)^i[\varepsilon_i,\varepsilon_{2m+2-i}] - (i-2)[X,\eta]\mod E^{2m+1}
\end{align}\] In particular, for \(i=m+1,\) we obtain the result, as \([\varepsilon_{m+1}, \varepsilon_{m+1}]=0\). ◻
Remark 8. Note that for distributions of minimal class we have \(m=1\), so that Lemma 4 becomes trivial and cannot be
used to prove the next lemma. In view of Lemma 1, this is the place where we use the condition that \(\mathop{\mathrm{rank}}\,
D^3=5\).
Lemma 5. \[[X,\eta] \subset E^{2m+1}\]
Proof. We again apply the Jacobi identity to obtain \[\begin{align} [\varepsilon_2,\varepsilon_{2m}] &\equiv \big[[X,\varepsilon_1]\text{ mod } \langle X\rangle,\varepsilon_{2m}\big] \\ & \equiv
\big[[X,\varepsilon_{2m}],\varepsilon_1\big] + \big[X,\underbrace{[\varepsilon_1,\varepsilon_{2m}]}_{\eta \text{ by \eqref{eta95def}}}\big] \\ & \equiv [X,\eta]\mod E^{2m+1},
\end{align}\] where we use in each equivalence the fact that \([X, \varepsilon_{2m}] \subset E^{2m}\), which follows from 11 . Using Lemma 4, this then implies that \[(1-m)[X,\eta] \equiv [X,\eta]\mod E^{2m+1}.\] Since \(m>1\), we have that \([X,\eta] \subset
E^{2m+1}\). ◻
Lemma 6. \[[V_{m},E^{2m+1}]\subseteq E^{2m+1}\]
Proof. Recall that \(E^{2m+1} = \mathcal{J}^{(m)}\oplus \langle \eta\rangle\); the second involutivity condition of Proposition 6 gives that \([V_{m},\mathcal{J}^{(m)}]\subseteq E^{2m+1}\). Also, applying both involutivity conditions of Proposition 6 and the Jacobi identity gives \[\begin{align} [V_{m},\eta] &= \big[V_{m},[\varepsilon_1,\varepsilon_{2m}]\big] \\ &\subseteq \big[[V_{m},\varepsilon_1],\varepsilon_{2m}\big] + \big[\varepsilon_1,[V_{m},\varepsilon_{2m}]\big] \\ & \subseteq
\big[[V_{m},V_{m-1}],\mathcal{J}^{(m)}\big] + \big[\varepsilon_1,[V_{m},\mathcal{J}^{(m)}]\big] \\ &\subseteq [V_{m-1},\mathcal{J}^{(m)}] + [\mathcal{J}_{(m-1)},\mathcal{J}^{(m)}] \\ &= [E,E^{2m}] \subseteq E^{2m+1}.
\end{align}\] ◻
Lemma 7. \[[\varepsilon_1,\eta] \subset E^{2m+1}\]
Proof. Note that by the involutivity conditions of Proposition 6 with \(i=m-2\), \[[\varepsilon_1,\varepsilon_2] \subset E^2.\] Similarly, by the second involutivity condition with \(i=m-1\), \[[\varepsilon_1,\varepsilon_{2m-1}] \subset
E^{2m-1}.\] Therefore, applying Lemma 3 and then the Jacobi identity yields \[\begin{align} [\varepsilon_1,\eta] &\equiv
-\big[\varepsilon_1,[\varepsilon_2,\varepsilon_{2m-1}] \text{ mod } E^{2m}\big] \\ & \equiv -\big[[\varepsilon_1,\varepsilon_2],\varepsilon_{2m-1}\big] - \big[\varepsilon_2,[\varepsilon_1,\varepsilon_{2m-1}]\big] \\ &\equiv 0 \mod E^{2m+1}.
\end{align}\] ◻
Now, in order to prove the theorem, note that \[\begin{align} [E,E^{2m+1}] &= E^{2m+1} + [E,\langle \eta\rangle] \\ &= E^{2m+1} + [\langle X\rangle,\langle \eta\rangle] + [V_{m},\langle \eta\rangle] + [\langle
\varepsilon_1\rangle,\langle \eta\rangle].
\end{align}\] However, the terms on the right-hand side are included in \(E^{2m+1}\) by Lemmas 5, 6, and 7, respectively. Therefore, \([E,E^{2m+1}]\subseteq E^{2m+1}\), and
the distribution \(E^{2m+1}\) is involutive.
Because \(E\) is bracket generating, this implies that \(E^{2m+1}=T\mathcal{R}_m\). Comparing ranks, we have that \[\text{Rank}(E^{2m+1}) = n+m-1 =
\text{Rank}(T\mathcal{R}_m) = 2n-4.\] Therefore, \(m=n-3\), and \(D\) is of maximal class, which completes the proof of Theorem 7 and therefore of the first sentence of Theorem 2. ◻
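The rank comparison that closes the proof can be unpacked as follows. The first line uses 13 for \(\mathcal{J}^{(m)}\) together with the extra direction \(\eta\). The second line relies on an assumption taken from the general setup rather than from this proof, namely that \(\mathcal{R}_m\) is an open subset of the bundle \(\mathcal{M}\), whose fibers over the \(n\)-dimensional base \(M\) are \((n-4)\)-dimensional: \[\begin{align} \operatorname{rank} E^{2m+1} &= \operatorname{rank}\mathcal{J}^{(m)} + 1 = (n-2+m)+1 = n+m-1, \\ \dim \mathcal{R}_m &= n + (n-4) = 2n-4, \\ n+m-1 = 2n-4 &\iff m = n-3. \end{align}\]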
We now prove the second sentence of Theorem 2. Let \(q_0\) be a point where \(D\) is
equiregular and assume that \(\mu\) is the minimal integer such that \(D^\mu(q_0)=T_{q_0}M\). Denote \[\label{Tanaka95comp}
\mathfrak{g}_{-1}(q_0):=D(q_0), \quad \mathfrak{g}_{-j}(q_0) :=D^{j}(q_0)/D^{j-1}(q_0), \,\, \forall\, 1< j\leq \mu.\tag{21}\] Then the graded space \(\mathfrak{m}(q_0)=
\displaystyle{\bigoplus_{j=-\mu}^{-1}\mathfrak{g}_j(q_0)}\), associated with the filtration 1, is endowed with the structure of a graded nilpotent Lie algebra, called the Tanaka symbol of the distribution \(D\) at the point \(q_0\). The flat distribution \(D_{\mathfrak m(q_0)}\) of constant symbol (or type) \(\mathfrak
m(q_0)\) is defined to be the left-invariant distribution corresponding to the \((-1)\)-graded component \(\mathfrak g_{-1}(q_0)\) on the simply connected Lie group with Lie
algebra \(\mathfrak{m}(q_0)\). Since the flat distribution \(D_{\mathfrak m(q_0)}\) is left-invariant, its germs at different points are equivalent,⁷ and therefore \(D_{\mathfrak m(q_0)}\) has constant class. Therefore, by Theorem 7
it is of maximal class at every point. Then by [6], the original distribution \(D\) at \(q_0\) is of maximal class.
We are going to prove the following stronger theorem:
Theorem 9. Let \(D\) be a bracket generating distribution with \(5\)-dimensional cube on a manifold \(M\) and let \(D\) have maximal class at \(q_0\in M\); i.e., \(m(q_0)=n-3\) (the set of such points is generic by Theorem 2 and contains all points where the distribution is equiregular). Then there exists a regular abnormal extremal trajectory of corank \(1\) starting at
\(q_0\). Moreover, the set of regular abnormal extremal trajectories of corank \(1\) starting at \(q_0\), considered as unparametrized curves, is not only
nonempty but also open and dense—in fact, Zariski open—in the space of regular abnormal extremal trajectories starting at \(q_0\), under the identification of this space with the space \(\mathbb{P}\Bigl((D^2)^\perp(q_0)\backslash (D^3)^\perp(q_0)\Bigr)\) (here \((D^\ell)^\perp(q_0):=(D^\ell)^\perp\cap T_{q_0}^*M\)).
Proof. First, consider a parametrized regular abnormal extremal \(\Gamma\colon [0, T]\to \mathcal{M}\) and let \(X\) be a vector field in a neighborhood of \(\Gamma\) in \(\mathcal{M}\) that generates the characteristic line distribution \(\mathcal{C}\) and such that \(\Gamma\) is an
integral curve of \(X\). As before, let \(\pi:\mathcal{M} \to M\) be the canonical projection. Set \(\gamma:=\pi(\Gamma)\) and let \(e^{tX}\) be the flow generated by \(X\). Then [16], more specifically,
relations (4.6) and (4.7) there, imply the following relation between the differential of the endpoint map \(F_{\gamma(0), T}\) at \(\gamma\) and the lift \(\mathcal{J}\) of the distribution \(D\), defined by 4 :
\[\label{diff95end95point95jac}
\operatorname{Im} \left( d_\gamma F_{q_0,T} \right)=\mathrm{span}\left\{ d(\pi\circ e^{(T-t)X})_{\Gamma(t)}\mathcal{J}\left(\Gamma(t)\right): 0\leq t\leq T\right\}\subset d\pi_{\Gamma(T)}H\bigl(\Gamma(T)\bigr).\tag{22}\] The latter inclusion
follows from the fact that \(X\) generates the Cauchy characteristic distribution \(\mathcal{C}\) of \(H\) and it shows why \(\gamma\) is an abnormal extremal trajectory in the sense of Definition 1. From relation 22 , the definition of
\(\mathcal{J}^{(i)}\) as in 5 , and the properties of Lie derivatives, it follows that \[\label{class95corank95ineq951}
d\pi_{\Gamma(T)}\mathcal{J}^{\bigl(\nu\bigl(\Gamma(T)\bigr)\bigr)}\bigl(\Gamma(T)\bigr)\subset \operatorname{Im} \left( d_\gamma F_{q_0,T} \right),\tag{23}\] where \(\nu(\lambda)\) is defined by 9.⁸ Since by construction \(\dim\, \mathcal{J}^{(i)}(\lambda)=n-2+i\) for \(0\leq i\leq
\nu(\lambda)\), the subspace \(\mathcal{J}^{(i)}(\lambda)\) contains \(\ker(T_\lambda\pi)\), and the fiber of the bundle \(\mathcal{M}\) is \((n-4)\)-dimensional, we have that \(\dim\, d\pi_{\lambda}
\mathcal{J}^{(i)}(\lambda)=i+2\) in this range of \(i\). Consequently, 23 implies that \(\dim\, \operatorname{Im} \left( d_\gamma F_{q_0,T}
\right)\geq \nu\bigl(\Gamma(T)\bigr)+2\). Hence, \[\label{class95corank95ineq953}
\mathrm{corank}(\gamma)\leq n-2-\nu\bigl(\Gamma(T)\bigr).\tag{24}\] In particular, if \(\nu\bigl(\Gamma(T)\bigr)=n-3\) (i.e., if \(\nu\) takes its maximal value), then \(\mathrm{corank}(\gamma)\leq 1\). On the other hand, since \(\gamma\) is an abnormal extremal trajectory in the sense of Definition 1, we have
\(\mathrm{corank}(\gamma)>0\), so we conclude that \(\mathrm{corank}(\gamma)=1\) in this case. If we assume that \(\nu\bigl(\Gamma(0)\bigr) = n-3\), then
we can take sufficiently small \(T\) so that \(\nu\big(\Gamma(T)\bigr)=n-3\), so the corresponding \(\gamma\) has corank 1. This proves our theorem, because
if \(m(q_0) = n - 3\), then the set of points \(\lambda\) in the fiber \(\pi^{-1}(q_0)\) such that \(\nu(\lambda) = n - 3\)
is a nonempty Zariski open subset, so, by the above arguments, the projection of a sufficiently small segment of a regular abnormal extremal starting at such a \(\lambda\) will have corank 1. ◻
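The dimension count behind 24 can be summarized in a single chain. This is only a restatement of the steps in the proof above, assuming (as we read Definition 1) that the corank of \(\gamma\) is the codimension of the image of the differential of the endpoint map in the \(n\)-dimensional space \(T_{\gamma(T)}M\): \[\mathrm{corank}(\gamma) = n - \dim \operatorname{Im}\left(d_\gamma F_{q_0,T}\right) \leq n - \Bigl(\nu\bigl(\Gamma(T)\bigr)+2\Bigr) = n-2-\nu\bigl(\Gamma(T)\bigr),\] so that \(\nu\bigl(\Gamma(T)\bigr)=n-3\) forces \(0<\mathrm{corank}(\gamma)\leq 1\).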
Remark 10. Finally, note that if instead of assuming that the class of the distribution \(D\) at \(q_0\) is maximal, we assume that there exists a regular abnormal
extremal starting at \(q_0\) such that the class of the distribution at its endpoint is maximal, then—by the same arguments as in the proof of the previous theorem—the corank of this abnormal extremal trajectory is equal to
\(1\). However, we do not know how to exclude the possibility that any regular abnormal extremal trajectory starting at a point where the class of the distribution \(D\) is not maximal
remains entirely within the locus of points with the same property. Therefore we do not know yet how to remove the assumption on the starting point \(q_0\) in Theorems 4 and 9. Perhaps the fact that all points in this locus are points where
the distribution \(D\) is not equiregular can be used in some way.
[1]
Boris Doubrov and Igor Zelenko. On local geometry of non-holonomic rank 2 distributions. Journal of the London Mathematical Society, 80(3):545–566, 2009.
[2]
Nicklas Day, Boris Doubrov, and Igor Zelenko. Symplectification of rank 2 distributions, normal Cartan connections, and Cartan prolongations, 2025.
arXiv:2506.09232[math.DG].
[3]
Elie Cartan. Les systèmes de Pfaff, à cinq variables et les équations aux dérivées partielles du second ordre. Annales scientifiques de l’École Normale Supérieure, 3e
série, 27:109–192, 1910.
[4]
Igor Zelenko. On variational approach to differential invariants of rank two distributions. Differential Geometry and its Applications, 24(3):235–259, May 2006.
[5]
Boris Doubrov and Igor Zelenko. A canonical frame for nonholonomic rank two distributions of maximal class. C. R. Math. Acad. Sci. Paris, 342(8):589–594, 2006.
[6]
Eric Duong Ba Wendel. On the maximality of the class of rank-2 distributions with 5-dimensional cube. Texas A&M University Master’s Thesis, 2015.
[7]
Ian Anderson and Boris Kruglikov. Rank 2 distributions of Monge equations: Symmetries, equivalences, extensions. Advances in Mathematics, 228(3):1435–1465, 2011.
[8]
Noboru Tanaka. On differential systems, graded Lie algebras and pseudo-groups. Journal of Mathematics of Kyoto University, 10(1):1–82, 1970.
[9]
Igor Zelenko. On Tanaka’s prolongation procedure for filtered structures of constant type. SIGMA Symmetry Integrability Geom. Methods Appl., October 2009.
[10]
Richard Montgomery and Michail Zhitomirskii. Points and curves in the Monster tower. Mem. Amer. Math. Soc., 203(956):x+137, 2010.
[11]
L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko. The mathematical theory of optimal processes. Interscience Publishers John Wiley & Sons, Inc.,
New York-London, 1962.
[12]
Andrei A. Agrachev and Yuri L. Sachkov. Control theory from the geometric viewpoint, volume 87 of Encyclopaedia of Mathematical Sciences. Springer-Verlag, Berlin, 2004.
Control Theory and Optimization, II.
[13]
Richard Montgomery. A tour of subriemannian geometries, their geodesics and applications, volume 91 of Mathematical Surveys and Monographs. American Mathematical
Society, Providence, RI, 2002.
[14]
Wensheng Liu and Héctor J. Sussmann. Shortest paths for sub-Riemannian metrics on rank-two distributions. Mem. Amer. Math. Soc., 118(564):x+104, 1995.
[15]
Igor Zelenko. Nonregular abnormal extremals of \(2\)-distribution: existence, second variation, and rigidity. J. Dynam. Control
Systems, 5(3):347–383, 1999.
[16]
Andrei A. Agrachev and Andrei V. Sarychev. Abnormal sub-Riemannian geodesics: Morse index and rigidity. Ann. Inst. H. Poincaré C Anal. Non Linéaire,
13(6):635–690, 1996.
[17]
André Bellaïche. The tangent space in sub-Riemannian geometry. In Sub-Riemannian geometry, volume 144 of Progr. Math., pages 1–78. Birkhäuser,
Basel, 1996.
[18]
Frédéric Jean. Control of nonholonomic systems: from sub-Riemannian geometry to motion planning. SpringerBriefs in Mathematics. Springer, Cham, 2014.
[19]
S. Yu. Ignatovich. Realizable growth vectors of affine control systems. J. Dyn. Control Syst., 15(4):557–585, 2009.
[20]
I. Zelenko. [book review of MR3308372]. Bull. Amer. Math. Soc. (N.S.), 53(1):151–158, 2016.
A shorter optimal control description of this notion of maximal class via minimal corank of abnormal extremals is given by the conclusion of Theorem 4 below, if one takes into account Remark 5.↩︎
A heuristic explanation of this is that generic germs of \((2,n)\)-distributions are described, up to local equivalence, by \(\dim\mathrm {Gr}(2,n)-n=n-4\) functions of \(n\) variables, whereas a Monge distribution is described by only one such function. Here, \(\mathrm {Gr}(2,n)\) denotes the Grassmannian of planes in \(\mathbb{R}^n\)↩︎
We believe that for \(n\geq 6\) non-Monge distributions are generic even among distributions with Tanaka symbol isomorphic to a fixed Monge symbol.↩︎
We also use this observation in the proof of the second sentence of Theorem 2.↩︎
A distribution with \(5\)-dimensional cube has class greater than \(1\); see Lemma 1.↩︎
Note that the deprolongation degree is defined at a generic point. Since \(D\) is nowhere Goursat, we have that \(s\leq n-5\).↩︎
Note that for non-equiregular points, there is a notion of flat distribution (or nilpotent approximation) as well ([17]–[20]): it can be seen as a distribution on a homogeneous space, but in contrast to the equiregular case there are pairs of points at which the germs of the distribution
are not equivalent, so the present arguments do not work.↩︎
Note that in the real-analytic category the inclusion in 23 is an equality.↩︎