Universal vector and matrix optimal transport


Abstract

In this paper we propose a gauge-theoretic approach to the problems of optimal mass transport for vector and matrix densities. This resolves both the issues of positivity and action transitivity constraints. Bures-type metrics on the corresponding semi-direct product groups of diffeomorphisms and gauge transformations are related to Wasserstein-type metrics on vector half-densities and matrix densities via Riemannian submersions. We also describe their relation to Poisson geometry and demonstrate how the momentum map allows one to prove the Riemannian submersion properties. The obtained geodesic equations turn out to be vector versions of the Burgers equations.

1 Introduction↩︎

The geometric setting of optimal transport relies on the differential geometry of diffeomorphism groups. In particular, the problem of moving one mass to another by a diffeomorphism while minimizing a certain cost can be understood as a construction of geodesics for an appropriate metric in the space of normalized densities; see, for example, Benamou and Brenier [1] or Villani [2] and references therein. Many applications include similar problems where one needs to measure the proximity between different shapes; see, for example, Michor and Mumford [3] or Younes [4]. However, the diffeomorphism action by itself does not allow one to change the total mass of the density. One arrives at the problem of constructing a natural extension of the action which would allow one to connect in the most economical way densities of different total masses, thus leading to the theory of unbalanced optimal transport [5]. Furthermore, other applications, for instance to the theory of spin liquids, require distances between densities that are not scalar- but vector- or matrix-valued. The latter problems are the main subject of this work.

There are several obstructions even to properly posing the problem of the mass transport for vector/matrix densities. Namely, a unique diffeomorphism might not allow one to move simultaneously all coordinate densities in a required way. Moreover, another major difficulty is the positivity constraint for each coordinate density. This constraint is a serious obstacle for many previously suggested scenarios for admissible paths between densities.

In this paper we resolve these difficulties by proposing the gauge-theoretic approach to the problem. Furthermore, we regard the vector densities as half-densities, with the \({\rm SO}(k)\)-gauge transformations acting on them along with diffeomorphisms, which resolves both the positivity and action transitivity issues. The corresponding group acting on densities is enlarged from diffeomorphisms to a semidirect product group. Such groups differ from one problem to another, and they are described below in the various cases of balanced and unbalanced vector and matrix densities.

Next, we propose a tight connection to Poisson geometry and demonstrate how the momentum map helps compute the horizontal distributions and prove the Riemannian submersion properties. Finally, it turned out that natural Bures-type metrics on the acting groups and vector/matrix densities lead to a familiar form of the geodesic equations. The geodesic equations turn out to be various ramifications of the Burgers equations of fully compressible fluids.

Overview of the main results↩︎

Here is a quick overview of the main results, first for the vector half-densities and then for the matrix densities.

Vector-valued optimal transport

To present a sample of the results we start with a manifold \(M\) of any dimension \(n\) and consider the space of normalized vector-valued half-densities \[\mathrm{VHProb} =\{ \mathbf{v} \in C^\infty(M,\mathbb{R}^k ) \otimes \operatorname{HProb}(M)~|~ \int_M \lvert \mathbf{v} \rvert^2 = 1 ~{\rm and ~} \mathbf{v}\not=0 ~{\rm on ~} M\}\] where \(k\) and \(n\) are unrelated. These half-densities can be represented as \(\mathbf{v}={v}\otimes \sqrt\varrho\in \mathrm{VHProb}\) with \(\int_M\varrho=1\) and \(\varrho > 0\). The group acting on half-densities in \(\mathrm{VHProb}\) is the semidirect product \(\widetilde{G}= \operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{SO}(k))\ni (\varphi, A)\), where the diffeomorphism \(\varphi\) acts on half-densities \(\mathbf{v}\) by the pullback with its inverse and with the square root of its Jacobian appearing as a factor, while the \({\rm SO}(k)\)-current \(A\) on \(M\) is applied point-wise to the obtained vector-valued half-density. This action turns out to be transitive and preserves positivity of the densities, see Section 3.1.

One can introduce natural (weak) Riemannnian metrics on both the group \(\widetilde{G}\) and the density space \(\mathrm{VHProb}\), and the natural projection becomes a Riemannian submersion (Theorem 8). In particular, the vector Wasserstein distance \(\mathit{WV}(\mathbf{v}_0,\mathbf{v}_1)\) between densities \(\mathbf{v}_0,\mathbf{v}_1 \in \mathrm{VHProb}\) is \[\mathit{WV}^2(\mathbf{v}_0,\mathbf{v}_1) = \inf_{u,a,\mathbf{v}}\int_0^1 \int_M \left( \lvert u\rvert^2 + \operatorname{tr}(aa^\top) \right) \lvert\mathbf{v}\rvert^2 \, dt\,,\] over time-dependent vector fields \(u(t)\in \mathfrak{X}(M)\), vector half-densities \(\mathbf{v}(t) \in \mathrm{VHProb}\), and matrix valued functions \(a(t) \in C^\infty(M, \mathfrak{so}(k))\) related by the constraints \[\dot{\mathbf{v}} = -L_u \mathbf{v} + a \mathbf{v} ,\quad \mathbf{v}(0) = \mathbf{v}_0, \quad \mathbf{v}(1) = \mathbf{v}_1,\] see Definition 3. Essentially, the distance between any two vector densities increases whenever more mass has been moved and more mass was necessary to “rotate” in the gauge group to relate their vector values.

The corresponding geodesic equations are given by the following:

Theorem 1 (= Theorem 9\('\)). The reduced geodesic equations on the corresponding (co)tangent bundle \(T^{(*)}\widetilde{G}= T^{(*)}(\operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{SO}(k)))\), corresponding to an optimal transport of vector densities, are given by \[\dot{u} + \nabla_{{u}}{u} = 0, \quad \dot{a} + \nabla_u a = 0, \quad \dot{\mathbf{v}} + L_{{u}} \mathbf{v} = a \mathbf{v}\] where \(\mathbf{v} = (\varphi,A)\cdot \mathbf{v}_0 = {v}\otimes\sqrt\varrho\). In particular, the scalar density \(\varrho\) defined by \(\varrho = \lvert\mathbf{v}\rvert^2\) is transported by \(u\): \[\dot{\varrho} +L_u \varrho=0.\]

These geodesic equations become natural extensions of the classical Burgers equation for the velocity \(u\) and transported scalar or gauge field \(a\). However, due to complexity of the horizontal distribution in the group \(\operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{SO}(k))\) intertwining different components, one cannot split out completely the velocity part and describe solutions to these seemingly simple equations solely via gradient fields as in the classical scalar optimal transport.

Analogous equations describe the unbalanced vector transport, but with the \({\rm Conf}(k)\)-gauge group of linear conformal transformations replacing \(\operatorname{SO}(k)\), see Section 4.2. Since we are interested in the local questions and the corresponding geodesic equations, everywhere in the paper we assume that vector half-densities on \(M\) under consideration are homotopy equivalent and that the corresponding purely topological obstructions for their global gauge equivalence vanish, as we discuss below in Remark 12.

Matrix-valued optimal transport

Similar metrics can be defined in the matrix density case. Namely, we consider the space of normalized matrix densities on \(M\) (of any dimension \(n\)) \[\mathrm{MProb}:=\{\mathbf{S} \in \Omega^n(M, {\rm Sym}_+(k))~|~ \, \int_M \operatorname{tr}(\mathbf{S})=1 \}\] of unit total mass with values in positive-definite quadratic forms of unit trace in \({\mathbb{R}}^k\). Here the \({\rm Sym}_+(k)\) stands for the cone of positive-definite quadratic forms of \(k\) variables. The scalar density associated with \(\mathbf{S}\) is \(\operatorname{tr}(\mathbf{S})\), and the total mass is normalized: \(\int_M \operatorname{tr}(\mathbf{S})=1\).

The semidirect product group acting on this space is \[\widehat{PG}={\rm Diff}(M)\ltimes C^\infty(M, {\rm PGL}(k, \mathbb{R})),\] with possibly different \(k\) and \(n\). The group action is the composition of the diffeomorphism action and a pointwise “rescaling” as follows: For an element \[(\varphi, A)\in {\rm Diff}(M)\ltimes C^\infty(M, {\rm PGL}(k, \mathbb{R}))\] the action on a matrix density \(\mathbf{S}\) is \((\varphi, A)\cdot \mathbf{S} = \pi\circ A (\varphi_*\mathbf{S}) A^\top\), where \(\varphi_*\) acts on each matrix element thought of as a top-form, \(A\) acts by pointwise change of basis in the quadratic form, and the pointwise rescaling \(\pi\) takes positive-definite quadratic forms to those of unit trace. Similarly to the above, the action of the group \(\widehat{PG}\) on matrix densities \(\mathrm{MProb}\) of unit total masses is transitive.

As above, let now \(M\) be a Riemannian manifold with metric \((\cdot\,, \cdot)\). We fix an initial matrix density \(\mathbf{S}_0 \in \mathrm{MProb}\). A matrix analogue of the Bures-type \(L^2\)-metric on the group leads to the following definition of the matrix Wasserstein distance \(\mathit{WM}(\mathbf{S}_0,\mathbf{S}_1)\) between two matrix densities \(\mathbf{S}_0,\mathbf{S}_1 \in \mathrm{MProb}\): \[\mathit{WM}^2(\mathbf{S}_0, \mathbf{S}_1) = \inf_{u,a,\mathbf{S}}\int_0^1 \int_M \left( \lvert u\rvert^2 + \operatorname{tr}(aa^\top) \right) \operatorname{tr}(\mathbf{S}) \, dt\,,\] where the infimum is taken over time-dependent vector fields \(u(t)\in \mathfrak{X}(M)\), matrix densities \(\mathbf{S}(t) \in \mathrm{MProb}\), and matrix valued functions \(a(t) \in C^\infty(M, \mathfrak{gl}(k))\) related by the constraints \[\dot{\mathbf{S}} = -L_u\mathbf{S} + a\mathbf{S} + \mathbf{S} a^\top ,\quad \mathbf{S}(0) = \mathbf{S}_0, \quad \mathbf{S}(1) = \mathbf{S}_1, \quad \operatorname{tr}(a\mathbf{S})=0\] Here again this metric naturally extends the classical Wasserstein metric by penalizing both whenever some mass is to be moved over \(M\) and whenever two matrix valued masses need an additional transformation in the fibers to identify them. An appropriate metric on the group \(\widehat{PG}\) projects to the above one via Riemannian submersion. By introducing factorization \(\mathbf{S} = S\otimes\varrho\) where \(\operatorname{tr}S = 1\) one can write the corresponding reduced geodesic equations on (the cotangent space of) the group \(\widehat{PG}\) as follows:

Theorem 2 (= Theorem 19\('\)). The Hamiltonian equations of motion, corresponding to the reduced geodesic equations on \(\widehat{PG}\) for an optimal transport of matrix densities, are \[\begin{align} & \dot{u} + \nabla_{u}{u} = 0, \qquad ~\qquad [\dot{a}] + \nabla_u [a] = 0, \\ & \dot{S} + \nabla_u S = a S + Sa^\top , \quad \dot{\rho} + \operatorname{div}(\rho u) = 0\,, \end{align}\] with the constraint \(\operatorname{tr}(aS)\equiv 0\) for selecting \(a\) from the coset \([a]\in C^\infty(M,\mathfrak{pgl}(k))\).

Note that the geodesic equations turn out to be rather explicit and can be regarded as a Burgers-type intertwined version of compressible chiral fluids. As before, the difficulty of solving them explicitly is in the complexity of the horisontal distribution for the corresponding Riemannian submersion. We discuss the equations in more detail in Section 5.2, as well as an unbalanced version of this transport in Section 5.1. It is worth mentioning that unlike the vector case, all matrix densities on \(M\) are homotopy equivalent since the space of positive-definite quadratic forms is contractible. One might still encounter topological obstructions for their gauge equivalence which we assume vanishing, see Remark 12.

Notations Throughout the paper we adopt the following notations: \(\mathrm{Prob}\) and \(\mathrm{HProb}\) stand respectively for normalized densities and half-densities, while for non-normalized ones we reserve \(\mathrm{Dens}\) and \(\mathrm{HDens}\). The vector and matrix objects are marked by V or M, so that, e.g., normalized vector half-densities are \(\mathrm{VHProb}\), while all matrix densities are \(\mathrm{MDens}\).

Motivation One of the most straightforward applications of optimal transport of vector densities includes colored medical imaging, for example multi parametric magnetic resonance imaging where the color channels correspond to different types of imaging data [6]. By making a colored picture of an organ, one can see “how far” it is from its healthier shape, size, and color. Medical imaging is also relevant for matrix-valued densities, for example, for diffusion tensor imaging where the imaging data consists of point-wise symmetric matrices that describe the diffusion of water in the tissue.

As another application we mentioned the theory of quantum spin liquids in condensed matter physics. It describes a phase of matter that can be formed by interacting quantum spins in certain magnetic materials. This state is intuitively described as a “liquid” of disordered spins, much in the way liquid water is in a disordered state compared to crystalline ice. In this paper we propose a possible measure of the distance between any two such vector-valued states. We discuss several other motivations in the last section.

The paper is organized as follows. In section 2 we give background on the geometric setting of classical optimal transport. In section 3 we introduce vector half-densities and their corresponding transport maps (based on a semi-direct product), and we give a geometric description of the corresponding transport problem, including the geodesic equations. In section 4 we first review the framework of unbalanced optimal transport, and then discuss how to generalize it to vector half-densities. Finally, in section 5 we present a universal point-of-view, which deals with matrix densities and encapsulates all the other frameworks. In section 6 we discuss various open questions and further directions. In appendices we describe how the moment map manifests itself in optimal transport (appendix 7) and demostrate the geodesic equations for an alternative metric on matrix densities (appendix 8).

Acknowledgements

The authors are indebted to Sergei Tabachnikov, Anton Izosimov and François-Xavier Vialard, as well as other participants of the workshops “Math en plein air", for fruitful discussions. B.K. was partially supported by an NSERC Discovery Grant. K.M. was supported by the Swedish Research Council (grant number 2022-03453), the Knut and Alice Wallenberg Foundation (grant numbers WAF2019.0201), and the Göran Gustafsson Foundation for Research in Natural Sciences and Medicine.

2 Optimal transport and semidirect products↩︎

2.1 Geometric setting of scalar optimal transport↩︎

Let \(\mu\) be a smooth reference volume form (or density) of unit total mass on a manifold \(M\), and consider the projection \(\pi : \operatorname{Diff}(M) \to \operatorname{Prob}(M)\) of diffeomorphisms onto the space \(\operatorname{Prob}(M)\) of smooth probability (i.e. normalized) densities on \(M\). The diffeomorphism group \(\operatorname{Diff}(M)\) is fibered over \(\operatorname{Prob}(M)\) by means of this projection \(\pi\) as follows: the fiber over a volume form \(\varrho\) consists of all diffeomorphisms \(\eta\) that push \(\mu\) to \(\varrho\), \(\eta_*\mu=\varrho\). In other words, two diffeomorphisms \(\eta_1\) and \(\eta_2\) belong to the same fiber if and only if \(\eta_1=\eta_2\circ\varphi\) for some diffeomorphism \(\varphi\) preserving the volume form \(\mu\). Diffeomorphisms from \(\operatorname{Diff}(M)\) act transitively on smooth densities, according to the Moser theorem.

Suppose now that \(M\) is a compact \(n\)-dimensional Riemannian manifold with metric \((~,~)\). Then one can define (weak) Riemannian metrics on both the diffeomorphism group \(\operatorname{Diff}(M)\) and the density space \(\operatorname{Prob}(M)\). Namely, at each point \(\eta\in\operatorname{Diff}(M)\) of the diffeomorphism group is defined in the following straightforward way: given \(X,Y\in \mathfrak{X}(M)\), the inner product of two vectors \(X\circ\eta,Y\circ\eta\in T_\eta\operatorname{Diff}(M)\) is \[\label{eq:diff95metric} \langle\langle X\circ\eta,Y\circ\eta\rangle\rangle_{\eta} =\int_M \left(X\circ\eta(x),Y\circ\eta(x)\right)\,\mu(x).\tag{1}\] This metric is right-invariant when restricted to the subgroup \(\operatorname{Diff}_\mu(M)\) of volume-preserving diffeomorphisms, although it is not right-invariant on the whole group \(\operatorname{Diff}(M)\). It is worth mentioning that for a flat manifold \(M\) this metric is a flat metric on \(\operatorname{Diff}(M)\): a neighborhood of the identity \(e\in \operatorname{Diff}(M)\) with the metric \(\langle\langle~,~\rangle\rangle_{\operatorname{Diff}}\) is isometric to a neighborhood in the pre-Hilbert space of smooth “vector-functions” \(\eta: M\to M\) with the \(L^2\) inner product \(\langle~,~\rangle_{L^2(M)}\).

On the other hand, on the density space \(\operatorname{Prob}(M)\) there also exists a natural metric inspired by the following optimal mass transport problem: find a (sufficiently regular) map \(\eta:M\to M\) that pushes the measure \(\mu\) forward to \(\varrho\) and attains the minimum of the \(L^2\)-cost functional \(\int_M \operatorname{dist}^2(x,\eta(x))\mu\) among all such maps, where \(\operatorname{dist}\) is the Riemannian distance on \(M\). The minimal cost of transport defines the \(L^2\) Wasserstein distance \({\operatorname{Dist}}\) on densities \(\operatorname{Prob}(M)\): \[{\operatorname{Dist}}^2(\mu, \varrho) :=\inf_\eta\Big\{\int_M \operatorname{dist}^2(x,\eta(x))\mu~ |~\eta_*\mu=\varrho\Big\}\,.\] One can see that the Wasserstein distance is formally generated by the (weak) Riemannian metric on the space \(\operatorname{Prob}(M)\) of smooth densities: \[\label{eq:dens95metric} \langle\dot{\varrho},\dot{\varrho}\rangle_\varrho = \int_M |{\nabla {\theta}}|^2 \varrho \quad\text{for } \dot{\rho} + {\rm div}_\mu({\rho} \nabla {\theta})=0,\tag{2}\] where \(\dot{\rho} \in C^\infty(M)\) is a tangent vector to \({\rm Prob}(M)\) at the point \(\varrho=\rho\mu\). (The gradient vector field \(\nabla{\theta}\) has the minimal \(L^2\) energy among all vector fields moving the volume form \(\varrho\) in the direction \(\dot{\varrho}\) due to the Hodge decomposition.) Thus both \(\operatorname{Diff}(M)\) and \({\rm Prob}(M)\) can be regarded as infinite-dimensional Riemannian manifolds.

Theorem 3 ([7]). The bundle map \(\pi:\operatorname{Diff}(M)\to {\rm Prob}(M)\) is a Riemannian submersion of the metric 1 on the diffeomorphism group \(\operatorname{Diff}(M)\) to the metric 2 on the density space \({\rm Prob}(M)\). The horizontal distribution consists of right-translated gradient fields.

Recall that for two Riemannian manifolds \(P\) and \(B\) a Riemannian submersion \(\pi: P\to B\) is a mapping onto \(B\) that has maximal rank and preserves lengths of horizontal tangent vectors to \(P\). For a bundle \(P\to B\) this means that on \(P\) there is a distribution of horizontal spaces that are orthogonal to fibers and projected isometrically to the tangent spaces to \(B\).

2.2 Geodesics in the scalar transport↩︎

One of the main problems in optimal mass transport is to describe geodesics on the space \(\operatorname{Prob}(M)\) of densities, that is, on the base of the bundle \(\operatorname{Diff}(M)\to \operatorname{Prob}(M)\). However, it is easier to start instead with the description of geodesics on \(\operatorname{Diff}(M)\) and then restrict to horizontal geodesics, which correspond to geodesics on the base.

Define the (inviscid) Burgers equation as the evolution equation \[\label{eq:burgers} \partial_t u+\nabla_u u=0\,\tag{3}\] for a vector field \(u\) on \(M\), where \(\nabla_u u\) stands for the covariant derivative on \(M\). Consider the flow \((t,x)\mapsto \eta(t,x)\) corresponding to velocity field \(u\): \[\partial_t \eta (t,x) = u(t,\eta(t,x)), \qquad \eta(0,x)=x.\]

Proposition 4 ([7]). (1) Solutions of the Burgers equation are time-dependent vector fields on \(M\) that describe the following flows of fluid particles: each particle moves with constant velocity (defined by the initial condition) along a geodesic in \(M\).

(2) Geodesics in the group \(\operatorname{Diff}(M)\) with respect to the above \(L^2\)-metric correspond to solutions of the Burgers equation. Horizontal geodesics in \(\operatorname{Diff}(M)\) are normal to the submanifold \(\operatorname{Diff}_\mu(M)\) and have potential initial conditions.

Remark 5. Recall that the Euler equation \(\partial_t u+\nabla_u u=-\nabla p\) on a divergence-free field \(u\) on \(M\), \({\rm div} \,u=0\), describes the motion of an ideal incompressible fluid filling \(M\). It corresponds to the equation of the geodesic flow of the right-invariant \(L^2\) metric on the group \(\operatorname{Diff}_\mu(M)\), regarded as a submanifold \(\operatorname{Diff}_\mu(M)\subset \operatorname{Diff}(M)\). Here the term \(-\nabla p\) stands for the constraining force orthogonal to \(\operatorname{Diff}_\mu(M)\) and keeping the geodesic on this submanifold.

2.3 Semi-direct products↩︎

Let \(G\) be a Lie group acting from the left on another Lie group \(L\) such that the action map \(g\cdot L\to L\) is an automorphism: \(g\cdot (\ell_1 \ell_2) = (g\cdot \ell_1)(g\cdot \ell_2)\). Then we can form the semi-direct product group \(G\ltimes L\) with product \[(g_1,\ell_1)\cdot (g_2, \ell_2) = \big(g_1g_2, \ell_1(g_1\cdot \ell_2)\big).\] Assume further that both \(G\) and \(L\) act from the left on the vector space \(V\). An action of \(G\ltimes L\) on \(V\) is then given by \((g,\ell)\cdot v = \ell \cdot (g\cdot v)\). Assume that this action is transitive.

Example 1. The group \[{\rm Aff}(n) = {\rm GL}(n)\ltimes {\mathbb{R}}^n\] of affine transformations of the \(n\)-dimensional space is a semidirect product. In this group, an affine transformation is described by a pair \((A,b)\) of a linear operator \(A\in {\rm GL}(n)\) and a translation by a vector \(b \in {\mathbb{R}}^n\), so that the group multiplication between two such pairs, \[(A_1, b_1)\circ (A_2, b_2) :=(A_1 A_2, A_1b_2+b_1)\,,\] is determined by the consecutive application of transformations on elements of \({\mathbb{R}}^n\).

In particular, the semidirect group \(E(3)= {\rm SO}(3)\ltimes {\mathbb{R}}^3\) of motions of the 3-dimensional space is related to the Kirchhoff equations for the motion of a rigid body in a fluid.

Example 2. Similarly, the equations of gas dynamics are related to the semidirect product Lie group \[\widetilde{G}=\operatorname{Diff}(M)\ltimes C^\infty(M, \mathbb{R})\,,\] which consists of pairs \((\phi, f)\) with a diffeomorphism \(\phi\) of the manifold \(M\), and a smooth function \(f\) on \(M\), see e.g. [8]. The product in the group \(\operatorname{Diff}(M)\ltimes C^\infty(M, \mathbb{R})\) is given by \[(\phi, f) \cdot (\psi, g) := (\phi\circ\psi, \phi_*g+f)\,,\] where \(\phi_*g = g \circ \phi^{-1}\) denotes the pushforward of the function \(g\) by the diffeomorphism \(\phi\) (or, equivalently, the pullback by \(\phi^{-1}\)).

The corresponding Lie algebra is the semidirect product \(\tilde{\mathfrak g}=\mathfrak{X}(M)\ltimes C^\infty(M, \mathbb{R})\,,\) which is \(\mathfrak{X}(M)\otimes C^\infty(M, \mathbb{R})\) as a vector space with the commutator given by \[[(u,a), (v,b)]=(L_u v ,-L_u b + L_v a),\] where \(L_u\) stands for the Lie derivative of functions or commutator of vector fields on the manifold \(M\).

This is the type of semidirect Lie groups and algebras to be used below, with the group of \(\operatorname{Diff}(M)\) acting on the spaces of functions \(C^\infty(M, R)\) with values in various vector and matrix spaces \(R\).

3 Mass transport of vector half-densities↩︎

3.1 The group action on vector half-densities↩︎

Now we consider the corresponding setup for vector densities and describe a similar semi-direct product group acting on them in a transitive way.

Given a compact orientable manifold \(M\) of dimension \(n\) define the semi-direct product group \(\widetilde{G}=G\ltimes N\) where \(G= \operatorname{Diff}(M)\) and \(N = C^\infty(M,\operatorname{SO}(k))\).3 We define its action on the space \[\mathrm{VHProb} =\{ \mathbf{v} \in C^\infty(M,\mathbb{R}^k ) \otimes \operatorname{HProb}(M)~\Big|~ \int_M \lvert \mathbf{v} \rvert^2 = 1 ~{\rm and ~} \mathbf{v}\not=0 ~{\rm on ~} M\}\] of normalized vector-valued half-densities. (Half-densities mean top degree forms which under the action of diffeomorphisms are getting multiplied by the square root of the Jacobian, as we discuss below.) They can be represented as \(\mathbf{v}={v}\otimes \sqrt\varrho\in \mathrm{VHProb}\) with \(\int_M\varrho=1\) and \(\varrho > 0\). Here the tensor product is over \(C^\infty(M, \mathbb{R})\), and one can also assume that \({v}(x)\in S^{k-1}\subset \mathbb{R}^k\) for all \(x\in M\).

The action of \(\varphi\in \operatorname{Diff}(M)\) on \(\mathbf{v} \in \mathrm{VHProb}\) is the component-wise half-density pushforward: \[\varphi\cdot \mathbf{v} = \varphi_*\mathbf{v} = {v}\circ\varphi^{-1} \sqrt{\varphi_*\varrho}\] The corresponding infinitesimal action of \(u\in \mathfrak{X}(M)\) is the component-wise Lie derivative: \(u\cdot \mathbf{v} = -L_u \mathbf{v}\). In particular, relative to the representation \(\mathbf{v} = v\otimes \sqrt{\varrho}\), we have \[L_u (v\otimes \sqrt{\varrho}) = (L_u v) \otimes \sqrt{\varrho} + v\otimes \frac{1}{2}\frac{L_u\varrho}{\sqrt{\varrho}}= (L_u v + \frac{L_u\varrho}{2\varrho}v )\otimes\sqrt{\varrho}.\]

The action of the gauge transformation \(A \in N\) on \(\mathbf{v}\in \mathrm{VHProb}\) is pointwise: \[(A\cdot \mathbf{v})(x) = A(x)\mathbf{v}(x).\] The corresponding infinitesimal action is \((a\cdot \mathbf{v})(x) = a(x)\mathbf{v}(x)\) for \(a \in T_e N = C^\infty(M,\mathfrak{so}(k))\).

Definition 1. The combined group action for \(\widetilde{G}= \operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{SO}(k))\) on half-densities in \(\mathrm{VHProb}\) is as follows. The action of an element \((\varphi, A)\in \widetilde{G}= \operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{SO}(k))\) is \[(\varphi, A):\, \mathbf{v}={v}\otimes \sqrt\varrho \mapsto A(\varphi_*\mathbf{v}) = A(v\circ\varphi^{-1})\sqrt{\varphi_*\varrho}\, ,\] where the latter expression at a point \(x\in M\), relative to a reference density \(\mu\), is \[\Big(A(v\circ\varphi^{-1})\sqrt{\varphi_*\varrho}\Big)(x) =A(x)v(\varphi^{-1}(x)) \sqrt{\operatorname{Jac}_\mu(\varphi^{-1})(x)}\,\rho(\varphi^{-1}(x))\mu(\varphi^{-1}(x))\,.\] In other words, the diffeomorphism acts on half-densities by the pullback with its inverse and with the square root of the Jacobian appearing as a factor, while the \({\rm SO}(k)\)-current \(A\) on \(M\) is applied point-wise to the obtained vector-valued half-density.

The combined infinitesimal action of \((u,a) \in T_{(id,e)}(G\ltimes N)\) is \[\label{eq:inf95action95on95v} (u,a)\cdot \mathbf{v} = -L_u \mathbf{v} + a \mathbf{v} .\tag{4}\]

Lemma 1. The action of the group \(\widetilde{G}= \operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{SO}(k))\) on the half-densities \(\mathrm{VHProb}\) is transitive.

Proof. Indeed, by Moser’s theorem, the action of a diffeomorphsim on the densities \(\varrho\) is transitive, as all such densities in \(\operatorname{Prob}(M)\) have the same total volume. This means that any vector half-density \(\mathbf{v}={v}\otimes \sqrt\varrho\) can be mapped to any other while matching the density \(\varrho\) and obtaining another vector current \(\tilde{v}(x)\) of the same pointwise length as \({v}(x)\). The currents \(A\in C^\infty(M, {\rm SO}(k))\), being pointwise orthogonal transformations, do not change the vector length and act transitively on vectors of the same length at each point \(x\in M\), which completes the proof. (Here we use the assumption on the absence of purely topological obstructions, see Remark 12, so the local transitivity in the bundle extends to a global one.) ◻

Note that the fibers of the corresponding fibration \(\widetilde{G}\to \mathrm{VHProb}\) depend on the choice of the reference density \(\mathbf{v}\in \mathrm{VHProb}\) and are given by its stabilizer. For instance, for a vector density \(\mathbf{v}={v}\otimes \sqrt\varrho\) whose \(v\)-components are all the same in a certain coordinate system in \(\mathbb{R}^k\), its stabilizer is the subgroup \(\mathrm{Stab}_\mathbf{v}:=\operatorname{Diff}_\mu(M) \ltimes C^\infty(M, \operatorname{SO}(k-1))\). Indeed, it consists of those diffeomorphisms \(\varphi\) which preserve density \(\varrho=\lvert\mathbf{v}\rvert^2\) and at each point of \(x\in M\) the subgroup \({\rm SO}(k-1)\subset {\rm SO}(k)\) fixes a given nonzero vector \({v}(x)\).

Consider now the principal bundle \(\pi\colon G\ltimes N \to \mathrm{VHProb}\) given by the action on a fixed element \(\mathbf{v}_0\in \mathrm{VHProb}\), i.e., \(\pi(\varphi,A) = A\, \varphi_* \mathbf{v}_0\). This fixed element corresponds, as we shall see, to the initial vector half-density in the transport problem.

Lemma 2. The vertical distribution at \((\varphi,A)\in G\ltimes N\) is given by \[V_{(\varphi,A)}(G\ltimes N) = \{(u,a)\cdot (\varphi,A)\mid -L_u \mathbf{v} + a \mathbf{v} = 0, \; \mathbf{v} = A\, \varphi_* \mathbf{v}_0 \}.\] In particular, if \((u,a)\cdot (\varphi,A)\) is vertical, then \[L_u \lvert\mathbf{v}\rvert^2 = 0,\] i.e., \(u\) is divergence-free with respect to the density \(\lvert\mathbf{v}\rvert^2\).

Proof. The first part follows since the projection \(\pi\) is by left action on \(\mathrm{VHProb}\). For the second part, notice that if \((u,a)\) is vertical one has \[L_u \lvert\mathbf{v}\rvert^2 = 2 \mathbf{v}\cdot L_u \mathbf{v} = \mathbf{v}\cdot (L_u \mathbf{v} - a \mathbf{v} ) = 0\,,\] since \(a\) is skew-symmetric. ◻

3.2 Riemannian metrics and submersion↩︎

Now we define a Riemannian metric on \(\widetilde{G}= G\ltimes N\).

Figure 1: Illustration of the relation between the group, the vector half-densities, and the densities. The projections, first by the action of \widetilde{G} on \mathbf{v}_0 and then by taking the squared norm of the vector half-density, give rise to Riemannian submersions.

Definition 2. The Bures-type \(L^2\)-metric on the group \(\widetilde{G}= \operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{SO}(k))\) associated with the fixed initial vector half-density \(\mathbf{v}_0\) is given by \[\label{eq:Bures95metric95vectors} \langle (u,a)\cdot (\varphi,A),(u,a)\cdot (\varphi,A)\rangle_{(\varphi,A)} = \int_M \left( \lvert u\rvert^2 + \operatorname{tr}(aa^\top) \right) \varphi_* \lVert \mathbf{v}_0 \rVert^2 \,.\tag{5}\]

This is a natural combination of the non-invariant \(L^2\) metric on \(\mathrm{Diff}(M)\) and the Bures metric on matrices. The first term of this metric evaluates how much mass has been moved, while the second term measures how much mass was necessary to rotate in the gauge group to relate their vector values.4

Proposition 6. The Bures-type \(L^2\)-metric on the group \(\widetilde{G}= \operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{SO}(k))\) is nondegenerate.

Proof. If two group elements move the reference vector density to two densities which differ by the underlying scalar one, the distance between them is positive due to the first term in the distance function. Indeed, the first term in the distance 5 corresponds to the standard \(L^2\) metric on diffeomorphisms, and the above metric projects to the classical Wasserstein metric on scalar densities, while the latter is known to be nondegenerate.

Now assume that two different group elements belong to the same fiber, and hence differ at some point on \(M\) by a nontrivial gauge transformation. Due to the smooth depence on the point, there is some neighborhood on \(M\) where they differ. Now the nondegeneracy of the metric 5 along the fiber, i.e. in its second term, follows from that in the finite-dimensional setting, the nondegeneracy of the metric on \(\operatorname{SO}(k)\) induced by the Killing form. ◻

Once the metric on the group is fixed, one can give a dynamical formulation of the corresponding optimal transport problem on the space of vector densities.

Definition 3. The vector Wasserstein distance \(\mathit{WV}(\mathbf{v}_0,\mathbf{v}_1)\) between densities \(\mathbf{v}_0,\mathbf{v}_1 \in \mathrm{VHProb}\) is given by the following formula: \[\mathit{WV}^2(\mathbf{v}_0,\mathbf{v}_1) = \inf_{u,a,\mathbf{v}}\int_0^1 \int_M \left( \lvert u\rvert^2 + \operatorname{tr}(aa^\top) \right) \lvert\mathbf{v}\rvert^2 \, dt\,,\] over time-dependent vector fields \(u(t)\in \mathfrak{X}(M)\), vector half-densities \(\mathbf{v}(t) \in \mathrm{VHProb}\), and matrix valued functions \(a(t) \in C^\infty(M, \mathfrak{so}(k))\) related by the constraints \[\dot{\mathbf{v}} = -L_u \mathbf{v} + a \mathbf{v} ,\quad \mathbf{v}(0) = \mathbf{v}_0, \quad \mathbf{v}(1) = \mathbf{v}_1.\]

The corresponding horizontal distribution on the group is described as follows.

Proposition 7. The horizontal distribution at \((\varphi,A)\in G\ltimes N=\operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{SO}(k))\) is given by \[\begin{gather} H_{(\varphi,A)}(G\ltimes N) = \\ \left\{ \left( \langle \nabla\theta,{v}\rangle - \langle \nabla{v},\theta\rangle , \theta\otimes{v}-{v}\otimes\theta \right) \mid {\it for ~all~} \,\theta\in C^\infty(M,\mathbb{R}^k) \right\}, \end{gather}\] where \(\mathbf{v} = (\varphi,A)\cdot \mathbf{v}_0 = {v}\otimes\sqrt\varrho\).

Note that this description of the horisontal distribution is much less explicit than the one for the group \(\operatorname{Diff}(M)\) with the \(L^2\)-metric fibered over scalar densities: in that case \(H_{\varphi}(G)\) is given by gradient vector fields \(\{\nabla\theta \mid {\it for ~all~} \,\theta\in C^\infty(M,\mathbb{R})\}\) and it does not depend on the base point. In the semidirect product setting above, the horisontal distribution can be understood as a continuous combination of gradients and, moreover, it depends on the base point via the direction vector \(v\), but not on the density part \(\varrho\).

We are proving this proposition below, but now note that such a description allows one to define a metric on vector densities inherited from the metric on the group via a Riemannian submersion. As its corollary we obtain the following key statement.

Theorem 8. The metric defined on the group \(\widetilde{G}=G\ltimes N\) descends to the space of vector half-densities \(\mathrm{VHProb}\) as a Riemannian submersion, where, for \(\dot{\mathbf{v}}\in T_\mathbf{v} \mathrm{VHProb}\) with \(\mathbf{v} = {v}\otimes\sqrt\varrho\), it is given by \[\langle \dot{\mathbf{v}}, \dot{\mathbf{v}}\rangle_{v} = \int_M \left( \lvert \langle \underbrace{\nabla\theta,{v}\rangle - \langle \nabla{v},\theta\rangle}_{u}\rvert^2 + \lvert \theta\rvert^2 \right)\varrho,\] where \(\theta\in C^\infty(M,\mathbb{R}^k)\) is (implicitly) defined by \[\dot{\mathbf{v}} = -L_{u}\mathbf{v} + a \mathbf{v}\] with \(u\) and \(a\) given, respectively, by \(u=\langle\nabla\theta,{v}\rangle - \langle \nabla{v},\theta\rangle\) and \(a=\theta\otimes{v}-{v}\otimes\theta\).

Proof. Let \((\eta,B) \in \widetilde{G}\) be an element of the isotropy group for \(\mathbf{v}\in \mathrm{VHProb}\). Then, from Lemma 2 it follows that \(\eta\) is volume preserving with respect to the density \(\lvert\mathbf{v}\rvert^2\), i.e., \(\eta_*\lvert\mathbf{v}\rvert^2 = \lvert\mathbf{v}\rvert^2\). Then from the formula 5 it follows that the metric descends to the quotient and therefore corresponds to a Riemannian submersion. From Proposition 7 we then get an explicit form of the descending metric. ◻

A proof of Proposition 7 on the description of horizontal spaces, implying the Riemannian submersion property, can be obtained from the following explicit form of the momentum map for that group.

Lemma 3. The momentum map \(\Phi\colon T^* \mathrm{VHProb}\to \big(\mathfrak{X}(M)\ltimes C^\infty(M,\mathfrak{so}(k))\big)^*\) for the Hamiltonian action of \(\widetilde{G}=G\ltimes N\) on \(T^*\mathrm{VHProb}\), under the natural \(L^2\)-pairing of \(\mathrm{VHProb}\) with itself, is given by \[\label{eq:half} \Phi({v}\otimes \sqrt\varrho, \theta\otimes\sqrt\varrho) = \Bigg( \frac{1}{2}\Big(\langle d\theta,{v}\rangle - \langle d{v},\theta\rangle \Big)\otimes\varrho, \frac{1}{2}\Big(\theta\otimes{v}-{v}\otimes\theta\Big)\otimes\varrho\Bigg).\tag{6}\]

Proof. Recall that elements \(\mathbf{v}\in \mathrm{VHProb}\) can be expressed as a vector current \({v}\otimes \sqrt{\varrho}\) for \({v}\colon M\to S^{k-1}\) and normalized density \(\varrho\in\operatorname{Prob}(M)\). Correspondingly, we write \(\boldsymbol{\theta} \in T_\mathbf{v}^*\mathrm{VHProb}\) as \(\boldsymbol{\theta} = \theta\otimes\sqrt\varrho\) for some function \(\theta\colon M\to \mathbb{R}^k\). Take now \((u,a) \in \mathfrak{g}\ltimes\mathfrak{n}\). Recall that the cotangent lifted action is always Hamiltonian, i.e., given by Hamiltonian vector fields (cf. Appendix 7). The momentum map is then defined by \[\langle (u,a), \Phi(\mathbf{v},\boldsymbol{\theta})\rangle = \langle (u,a)\cdot \mathbf{v}, \boldsymbol{\theta} \rangle \qquad \forall \, (u,a) \in \mathfrak{g}\ltimes \mathfrak{n}.\] Since \((u,a)\cdot \mathbf{v} = -L_u ({v}\otimes \sqrt\varrho) + a {v}\otimes\sqrt\varrho\), we get \[\begin{gather} \langle (u,a)\cdot \mathbf{v}, \boldsymbol{\theta}\rangle = \int_M \left( -L_u{v} + \frac{-L_u\varrho}{2\varrho} + a{v} \right)\cdot \theta\; \varrho = \\ \int_M \langle u, -\frac{1}{2}(d {v}\cdot \theta - d\theta\cdot {v})\rangle \, \varrho - \int_M \operatorname{tr}\big(a(\theta\otimes{v})\big)\, \varrho . \end{gather}\] Since \[-\operatorname{tr}(a(\theta\otimes v)) = \frac{1}{2}\operatorname{tr}(a (\theta\otimes v - v\otimes \theta)^\top)\] the result now follows. ◻

Proof of Proposition 7 now immediately follows from the above description of the momentum map \(\Phi\) since, by construction, the horizontal distribution \(H_{(\varphi,A)}(G\ltimes N)\) is given by the condition \[\langle (u,a), \Phi(\mathbf{v},\boldsymbol{\theta}) \rangle = 0\] if and only if \((u,a)\cdot (\varphi,A)\) is vertical. \(\square\)

3.3 Geodesics and Hamiltonian equations↩︎

We start by deriving the coadjoint operator for the Lie algebra \(\tilde{\mathfrak{g}}=\mathfrak{g}\ltimes\mathfrak{n}\) of the group \(\widetilde{G}= \operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{SO}(k))\).

Lemma 4. The \(\operatorname{ad}^*\) map for the Lie algebra \(\tilde{\mathfrak{g}}=\mathfrak{g}\ltimes\mathfrak{n}\) is given by \[\operatorname{ad}^*_{(u,a)}(m\otimes\varrho, \beta\otimes\varrho) = \left( -L_u (m\otimes\varrho) + \operatorname{tr}(\beta d a)\otimes\varrho, -L_u (\beta\otimes\varrho), \right)\] where \((m\otimes\varrho, \beta\otimes\varrho)\in \tilde{\mathfrak{g}}^*\simeq\mathfrak{g}^*\oplus\mathfrak{n}^*\) and \(m\in \Omega^1(M)\), \(\beta \in C^\infty(M, {\rm SO}(k))\), while \(\varrho\in \Omega^n(M)\), and the tensor products are over \(C^\infty(M,\mathbb{R})\).

Proof. We have that \(\operatorname{ad}_{(u_1,a_1)}(u_2,a_2) = (L_{u_1} u_2, L_{u_1}a_2 - L_{u_2}a_1)\). Thus, for variables \((m\otimes\varrho, \beta\otimes\varrho)\) dual to the Lie algebra, \[\begin{align} &\langle \operatorname{ad}_{(u_1,a_1)}^*(m\otimes\varrho, \beta\otimes\varrho), (u_2,a_2)\rangle = \langle L_{u_1}u_2, m\otimes\varrho \rangle + \langle L_{u_1} a_2 - L_{u_2}a_1 , \beta\otimes\varrho \rangle = \\ & \langle u_2, -L_{u_1}(m\otimes \varrho)\rangle + \langle a_2, -L_{u_1}(\beta\otimes\varrho)\rangle - \langle \iota_{u_2} da_1, \beta\otimes\varrho\rangle = \\ & \langle u_2, -L_{u_1}(m\otimes \varrho) + \operatorname{tr}(\beta da_1)\otimes\varrho\rangle + \langle a_2, -L_{u_1}(\beta\otimes\varrho)\rangle \,, \end{align}\] which implies the result. ◻

The geodesic equations for the Bures-type metric in Definition 2, as a Hamiltonian system on the Lie–Poisson space \(\tilde{\mathfrak{g}}^*\simeq\mathfrak{g}^*\oplus\mathfrak{n}^*\), is given by the following theorem.

Theorem 9. Consider the Hamiltonian on \(T^*\widetilde{G}= T^*(\operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{SO}(k)))\) of the form \[H(u^\flat\otimes\varrho, a\otimes\varrho, \mathbf{v}) = \frac{1}{2}\int_M \frac{\lvert u^\flat\otimes \varrho \rvert^2}{\lvert\mathbf{v}\rvert^2} + \frac{1}{2}\int_M \frac{\varrho}{\lvert\mathbf{v}\rvert^2}\operatorname{tr}\left(aa^\top\right) \varrho .\] corresponding to the kinetic energy Lagrangian on \(T\widetilde{G}\) and given in reduced coordinates \(u = \dot{\varphi}\circ\varphi^{-1}\), \(\varrho = \varphi_*|\mathbf{v}_0|^2\), \(\mathbf{v} = \varphi_*\mathbf{v}_0\), and \(a = \dot{A} A^{-1}\). Then the Hamiltonian equations of motion corresponding to the geodesic equations on \(\widetilde{G}\) for the optimal transport on vector densities are \[\dot{u} + \nabla_{{u}}{u} = 0 , \quad \dot{a} + \nabla_u a = 0, \quad \dot{\mathbf{v}} + L_{{u}} \mathbf{v} = a \mathbf{v} .\] In particular, the scalar density \(\varrho\) defined by \(\varrho = \lvert\mathbf{v}\rvert^2\) is transported by \(u\): \[\dot{\varrho} +L_u \varrho=0.\]

Remark 10. The geodesic equations in Theorem 9 are natural vector extensions of the classical Burgers equation \(\dot{u} + \nabla_{{u}}{u} = 0\) and the passive scalar transport \(\dot{a} + \nabla_u a = 0\). Indeed, the \(L^2\)-metric of Definition 2 does not depend on the orientation of the vectors \(v\), so the geodesic path in the \(\operatorname{Diff}(M)\)-part of the group should also be independent of that orientation, and hence it does not affect the evolution of \(u\). Note however, that the horizontal distribution \(H_{(\varphi,A)}(G\ltimes N)\) in the group \(\operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{SO}(k))\) found in Proposition 7 is rather complicated, which does not allow one to describe the geodesics on vector densities, i.e. solutions of those equations, as explicitly as for scalar ones as the flows of gradient fields.

Note also that the equations in Theorem 9 are on \(\tilde{\mathfrak{g}}^*\times \mathrm{VHProb}\), which is a reduced form of \(T^*\widetilde{G}=\tilde{\mathfrak{g}}^*\times (\operatorname{Diff}(M)\ltimes C^\infty(M,\mathbb{R}^k))\) taking into account the reduction of base isometries.

Proof. In the normalization \(\varrho = \lvert\mathbf{v}\rvert^2\) one has \[\dot{\varrho} = 2 \langle \mathbf{v}, \dot{\mathbf{v}}\rangle = - 2\langle \mathbf{v},L_u \mathbf{v} \rangle = - L_u \lvert\mathbf{v}\rvert^2 = -L_u \varrho,\] which implies that \(\varrho\) is transported by \(u\).

The variational derivatives of the Hamiltonian are \[\frac{\delta H}{\delta (u^\flat\otimes\varrho)} = u, \quad \frac{\delta H}{\delta (a\otimes\varrho)} = a, \quad \frac{\delta H}{\delta \mathbf{v}} = -\lvert u \rvert^2 \mathbf{v} - \operatorname{tr}(a a^\top) \mathbf{v}\,.\] Indeed, the explicit form of these variational derivatives is obtained by rewriting the Hamiltonian as \[H = \frac{1}{2}\langle u^\flat\otimes\varrho, u \rangle + \frac{1}{2}\langle a\otimes\varrho, a \rangle = \frac{1}{2}\langle u^\flat\otimes\lvert\mathbf{v}\rvert^2, u \rangle + \frac{1}{2}\langle a\otimes \lvert\mathbf{v}\rvert^2, a \rangle .\]

Next, the general form of the corresponding Hamiltonian equations is obtained in Proposition 21 in Appendix, namely \[\label{eq:ham95eq95abstract} \frac{d}{dt}(u^\flat\otimes\varrho, a\otimes\varrho) = \operatorname{ad}^*_{(\frac{\delta H}{\delta u^\flat\otimes\varrho},\frac{\delta H}{\delta a\otimes\varrho})}(u^\flat\otimes\varrho, a\otimes\varrho) - \Phi(\mathbf{v}, \frac{\delta H}{\delta \mathbf{v}}).\tag{7}\] Then from Lemma 3 we have \[\begin{align} &\Phi(\mathbf{v}, \frac{\delta H}{\delta \mathbf{v}}) = \Phi\left( v\otimes\sqrt{\varrho}, -(|u|^2 {v}+ \operatorname{tr}(aa^\top) {v}) \otimes\sqrt{\varrho}\right) = \\ &\left(-\frac{1}{2}\left(\langle d (|u|^2 v+ \operatorname{tr}(aa^\top) v) ,{v}\rangle - \langle d v, (|u|^2 v+ \operatorname{tr}(aa^\top) v) \rangle \right) \otimes\varrho, \right. \\ &\left. -\frac{1}{2} \left((|u|^2 v+ \operatorname{tr}(aa^\top) v)\otimes v - v\otimes (|u|^2 v+ \operatorname{tr}(aa^\top) v) \right) \right) \\ &= \left(-\frac{1}{2} d \left(|u|^2 + \operatorname{tr}(aa^\top)\right)\otimes\varrho , 0\right), \end{align}\] and from Lemma 4 \[\begin{align} &\operatorname{ad}^*_{(\frac{\delta H}{\delta u^\flat\otimes \varrho}, \frac{\delta H}{\delta a\otimes\varrho})}(u^\flat\otimes\varrho,a\otimes\varrho) = \big( -L_u (u^\flat\otimes\varrho) + \operatorname{tr}(a \, d a), -L_u (a\otimes\varrho) \big) = \\ & \big( -L_u (u^\flat\otimes\varrho) -\frac{1}{2} d \operatorname{tr}(aa^\top), -L_u (a\otimes\varrho) \big). \end{align}\] Since \(\varrho = \lvert\mathbf{v}\rvert^2\) is transported by \(u\), it decouples from the equation, so from equation 7 we get \[\dot{u}^\flat = \underbrace{- L_u u^\flat - \frac{1}{2} d \operatorname{tr}(aa^\top)}_{\text{from }\operatorname{ad}^*} \underbrace{+\frac{1}{2}d|u|^2 + \frac{1}{2}d\operatorname{tr}(aa^\top)}_{\text{from }-\Phi} = -L_u u^\flat - \frac{1}{2}d\lvert u\rvert^2 .\] Thus, the equation for the one-form \(u^\flat\) is \[\dot{u}^\flat + \underbrace{L_u u^\flat - \frac{1}{2}d \lvert u \rvert^2}_{(\nabla_u u)^\flat} = 0,\] which yields the equation for \(u\) as \[\dot{u} + \nabla_u u = 0.\] For the evolution of \(a\otimes\varrho\) we get \[\frac{d}{dt}a\otimes\varrho = - L_u (a\otimes\varrho)\] which decouples as \[\dot{a} + \nabla_u a = 0 .\] The evolution for \(v\) is obtained from the infinitesimal action formula 4 . ◻

Problems 11. The above consideration was related to a particular Bures-type \(L^2\)-metric. It would be interesting to see if other metrics give natural equations of geodesics or the corresponding transport problem. In particular, do the \(\dot{H}^1\)- and the Fisher–Rao metrics manifest themselves in the vector setting?

Remark 12. To spell out the assumptions on homotopy equivalence of vector half-densities, note that by considering the normalization \(v\in S^{k-1}\) the question of fixing a connected component for \({\rm VHProb}\) boils down to the homotopy classes of maps \(M^n \to S^{k-1}\). For \(n < k-1\), all maps are homotopically trivial, and hence one has a unique connected component of vector half-densities on \(M\). When \(n=k-1\) the connected components are distinguished by the degree of the map \(M^{n}\to S^n\). If \(n > k-1\) the description of such connected components is, in general, an open question. For example, let \(M=S^3\) and \(k=3\), the connected components of vector half-densities are enumerated by \(\pi_3(S^2) = \mathbb{Z}\), the Hopf invariant. And, more generally, if \(M\) is a sphere, one deals with the homotopy groups of spheres that, in general, are still not known.

An extension of the map between vector functions (or half-densities) to their \({\rm SO}(k)\)-gauge equivalence is a more subtle question. Now one needs to lift vector half-densities from \({\rm VHProb}\), which are \(S^{k-1}\)-sections over \(M\), to gauge transformations between them, which are \({\rm SO}(k)\)-sections over \(M\). The latter can be regarded as sections of the \({\rm SO}(k-1)\)-bundle over \({\rm VHProb}\). The corresponding theory implies that obstructions to such an extension belong to the groups \(H^j(M, \pi_{j-1}(F))\) for \(F={\rm SO}(k-1)\). In the paper, as we study local questions and geodesic equations, we assume that all those obstructions are trivial, so that the extensions are topologically possible.

Note also that for matrix densities discussed in detail below, the first extension holds automatically. Indeed there is just one connected component in \({\rm MProb}\), i.e. all maps \(M\to {\rm Sym}_+(k)\) are homotopically equivalent, since positive-definite quadratic forms form a contractible set. On the other hand, the second part of extending identification of matrix densities to a gauge equivalence is based on essentially the same obstruction theory as above, since the stabilizer of a positive quadratic form in \(k\) variables is isomorphic to \({\rm SO}(k)\).

Remark 13. In the past a typical approach to define optimal transport on vector densities has been as follows (see e.g. [9]): one considers an \(k\)-tuple (or equivalently, a vector) of densities \((\varrho_1, ..., \varrho_k)\) on \(M\) and an \(k\)-tuple of diffeomorphisms \((\varphi_1, ..., \varphi_k)\) acting on them component-wise. In addition, one introduces a matrix of allowed density exchanges between the components, while making sure that the densities after the exchanges do not lose their positivity. The corresponding distance between \(k\)-tuples of densities defined as the cost of such an exchange is, generally speaking, only a divergence: it is not symmetric and requires an additional symmetrization.

Apparently, this setting might turn out to be related to the flows of \(k\)-phase fluids, where the densities and flows are introduced separately for each phase but related by a common constraint, such as the total incompressibility, see [10]. Symmetries of multiphase fluids are described by the corresponding multiphase diffeomorphism groupoid. The groupoid approach also allows one to introduce metrics and distances on the space of \(k\)-tuples of densities, which have the fluid dynamical background. It would be interesting to find a precise relation between multiphase fluid groupoids and more general optimal transport on vector densities.

One should also note that in the approach of the present paper via the semidirect group, where a common diffeomorphism is acting on vector half-densities, the exchanges between components are achieved by gauge transformations. It requires neither additional positivity adjustment (as it is automatically achieved by squaring half-densities to obtain the densities themselves) nor additional symmetrization (as it follows from the invariance of the metric introduced on the group and the Riemannian submersion).

Futhermore, geodesics in the semidirect group setting are naturally related to Burgers-type fluid equations, while the approach itself easily extends to other cases, such as, e.g., matrix-valued densities, as we explain in Section 5.

4 Unbalanced mass transport↩︎

4.1 Various settings of the unbalanced transport↩︎

Unbalanced mass transport considers geodesics between densities of different total masses. Let \({\rm Dens}(M)=\Omega^n(M, {\mathbb{R}}_+)\) stand for all densities with arbitrary total masses on an \(n\)-dimensional manifold \(M\).

It became natural now to associate an unbalanced transport with consideration of the semidirect product group \[G=\operatorname{Diff}(M)\ltimes C^\infty(M, \mathbb{R}_+),\] see [11], Section 3.2.1. Then \((\varphi, \lambda)\in \operatorname{Diff}(M)\ltimes C^\infty(M, \mathbb{R}_+)\) acts on a density \(\mu \in {\rm Dens}(M)\) as follows: \[\mu\mapsto (\varphi^*\mu)\cdot \lambda^2.\] This action is transitive: one can get an arbitrary density from any initial one by choosing an appropriate factor \(\lambda\). (Here and in [11] one uses the multiplicative group of nonvanishing functions, but there is also an additive version: \((\varphi, f)\in \operatorname{Diff}(M)\ltimes C^\infty(M, \mathbb{R}),\) where \(\lambda:=\exp(f)\) and the action is \(\mu\mapsto \varphi^*\mu\cdot \exp(2f)\).)

One introduces a metric on the group \(G\), which projects (as a Riemannian submersion) to a metric on \({\rm Dens}(M)\), see details in [11]. The obtained metric on \({\rm Dens} (M)= \operatorname{Prob}(M) \times \mathbb{R}_+\) is a cone over the metric on the space of normalized densities \(\operatorname{Prob}(M)\) on \(M\) and quadratic scaling in \(\mathbb{R}_+\), see formula (3.22) in [11] (see also earlier papers [5], [12]).

Remark 14. In [13] a simple unbalanced optimal transport was described, where one uses much smaller conic group \(G=\operatorname{Diff}(M)\times \mathbb{R}_+\), which projects with a Riemannian submersion to the conic extension of densities \(\operatorname{Prob}(M) \times \mathbb{R}_+\). However, to see the similarity with vector densities described above one needs the “large" group \(G=\operatorname{Diff}(M)\ltimes C^\infty(M, \mathbb{R}_+)\).

4.2 Vector unbalanced transport↩︎

By replacing \({\rm SO}(k)\) with the conformal linear group \({\rm Conf}(k):={\rm SO}(k)\times {\mathbb{R}}_+\) (rotations and dilations) as the main gauge group, one can consider an unbalanced vector optimal transport. The standard unbalanced transport becomes a particular case of this more universal framework for \(k=1\).

Namely, in the unbalanced setting we define the semi-direct product group \({\boldsymbol{G}}= \operatorname{Diff}(M) \ltimes C^\infty(M, {\rm Conf}(k))\) It acts on the space of all (non-normalized) vector-valued half-densities \[\mathrm{{\boldsymbol{V}}HDens} =\{ \mathbf{v} \in C^\infty(M,\mathbb{R}^k ) \otimes \operatorname{HDens}(M)~|~ \mathbf{v}\not=0 ~{\rm on ~} M\}\] We still represent those densities as \(\mathbf{v}={v}\otimes \sqrt\varrho\in \mathrm{{\boldsymbol{V}}HDens}\) with \(v(x)\in S^{k-1}\) in a unique way. The combined group action is given by the same formula as before.

Definition 4. The combined group action for \({\boldsymbol{G}}= \operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{Conf}(k))\) on half-densities in \(\mathrm{{\boldsymbol{V}}HDens}\) is as follows: for \((\varphi, A)\in \operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{Conf}(k))\) \[(\varphi, A):\, \mathbf{v}={v}\otimes \sqrt\varrho \mapsto (A{v})\circ \varphi^{-1} \cdot\sqrt{ \varphi_*\varrho}\,.\]

In the same way one can see that the action of the group \({\boldsymbol{G}}= \operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{Conf}(k))\) half-densities in \(\mathrm{{\boldsymbol{V}}HDens}\) is transitive.

The corresponding combined infinitesimal action of \((u,a) \in T_{(id,e)}\boldsymbol{G}\) is given by the same formula \[(u,a)\cdot \mathbf{v} = -L_u \mathbf{v} + a \mathbf{v} \,,\] where \(a\in C^\infty(M, \frak{conf}(k))\), i.e. \(a\) has the scalar diagonal part, since the Lie algebra \(\frak{conf}(k)\) is a direct product: \(\frak{conf}(k)=\frak{so}(k)\oplus I \mathbb{R}\), where \(I\) is the \(k\times k\) identity matrix.

All the above considerations, including the geodesic equations in Theorem 9, have the same form, where the matrix term \(a\), which before was skew-symmetric and hence zeros on the diagonal, may now have a scalar diagonal part: \[\dot{u} + \nabla_{{u}}{u} = 0 , \quad \dot{a} + \nabla_u a = 0, \quad \dot{\mathbf{v}} + L_{{u}} \mathbf{v} = a \mathbf{v} .\] Here, however, the scalar density \(\varrho = \lvert\mathbf{v}\rvert^2\) is not transported by \(u\) any longer, but changes as in the scalar unbalanced transport: \[\dot{\varrho} +L_u \varrho= -\frac{2}{k}\operatorname{tr}( a) \varrho\] due to the present diagonal part of \(a\).

5 Universal mass transport of matrix densities↩︎

The setting of matrix-valued densities requires a different acting group. Let \(M\) be a manifold of dimension \(n\), and fix a (normalized) reference density \(\mu\).

It turns out that, unlike the vector case, it is more natural to start with the unbalanced transport of matrix densities.

5.1 Unbalanced optimal transport of matrix-valued densities↩︎

Definition 5. The space of matrix densities is the space \({\rm MDens}:=\Omega^n(M, {\rm Sym}_+(k))\) of (non-normalized) densities with values in positive-definite quadratic forms in \({\mathbb{R}}^k\).

Now we consider the semidirect product group \[\widehat G={\rm Diff}(M)\ltimes C^\infty(M, {\rm GL}(k, \mathbb{R})),\] where \(k\) and \(n\) could be different. This group acts on the spaces of matrix densities \({\rm MDens}=\Omega^n(M, {\rm Sym}_+(k))\) as follows. For \[(\varphi, A)\in {\rm Diff}(M)\ltimes C^\infty(M, {\rm GL}(k, \mathbb{R}))\] the action on a matrix density \[\mathbf{S} \in \Omega^n(M, {\rm Sym}_+(k))\subset \Omega^n(M) \otimes {\rm Sym}(k)\] is \[(\varphi, A)\cdot \mathbf{S} = A (\varphi_*\mathbf{S}) A^\top \,,\] where \(\varphi_*\) acts on each matrix element thought of as a top-form. The density associated with \(\mathbf{S}\) is thus \(\operatorname{tr}(\mathbf{S})\), so the total mass is \[{\rm Mass}(\mathbf{S}):=\int_M \operatorname{tr}\mathbf{S}.\]

Remark 15. Note that Definition 5 is consistent with the definition of vector half-densities: if \(\mathbf{S}:=\mathbf{v}\otimes \mathbf{v}\), a matrix density of rank 1 (and hence, being only positive semi-definite, it belongs to the closure of \({\rm Sym}_+(k)\)), then the vector half-density \(\mathbf{v}\) gives rise to a matrix density \(\mathbf{S}\). Furthermore, the action of the group \(\widetilde{G}\) on vector half-density \(\mathbf{v}\) is consistent with the action of the group \(\widehat G\) on the corresponding matrix density \(\mathbf{S}=\mathbf{v}\otimes \mathbf{v}\). See Figure 2 for an illustration.

Lemma 5. The action of the group \(\widehat G= {\rm Diff}(M) \ltimes C^\infty(M, {\rm GL}(k, \mathbb{R}))\) on matrix densities \({\rm MDens}\) of arbitrary total masses is transitive.

Proof. The proof is similar to that of Lemma 1 above. Transitivity on densities of fixed total mass is provided by Moser’s theorem. On the other hand, given two matrix densities with the same scalar density part the statement follows from transitivity of the pointwise \({\rm GL}(k)\)-action on positive-definite quadratic forms: any such form is reduced to the sum of squares in an appropriate basis. (Here again we use the assumption on the absence of purely topological obstructions, see Remark 12.) ◻

Remark 16. It is worth noticing that such a group action naturally leads to an unbalanced version of the matrix optimal transport. Preservation of the total mass \({\rm Mass}(\mathbf{S})=\int_M \operatorname{tr}(\mathbf{S})\) would require the action of currents preserving the trace of matrices. This could be achieved by the action of \({\rm SO}(k)\)-currents (i.e. the group \(\widetilde{G}\)) on matrix densities, but their action would not be transitive. Indeed, the action of such a group \(\widetilde{G}\) (instead of \(\widehat G\), but on matrix densities rather than on vector ones) would preserve not only the trace, but the whole pointwise spectrum of \(\mathbf{S}\), and hence would reduce a positive-definite quadratic form to the sum of squares with positive coefficients, where the coefficients, being the quadratic form’s eigenvalues, remain invariants of the density under this action. The forms with different eigenvalues (depending on a point in \(M\)) prevent transitivity of the group \(\widetilde{G}\)-action. Also note that the stabilizer subgroup for a matrix-valued density \(\mu\otimes \Lambda\in {\rm MDens}\) again depends on the choice of this reference density, since for a quadratic form \(\Lambda\in {\rm Sym}_+(k)\) it is the group \({\rm SO}_\Lambda(k)\), the special orthogonal group for the inner product defined by \(\Lambda\).

Lift the action of the group \(\widehat G\) from the space of densities \(\mathrm{MDens}\) to the Hamiltonian action on its cotangent bundle \(T^*\mathrm{MDens}\). We identify \(T^*_{\mathbf{S}}\mathrm{MDens}\) with \(C^\infty(M, \mathfrak{gl}(k)/\mathfrak{so}(k))\) via the point-wise Frobenius inner product. For an element \(P\in C^\infty(M, \mathfrak{gl}(k))\) consider its coset \([P]\in C^\infty(M, \mathfrak{gl}(k)/\mathfrak{so}(k))\) defined by pointwise addition of an arbitrary skew-symmetric matrix.

Lemma 6. Let \((\mathbf{S},[P])\in T^*\mathrm{MDens}\). Then the momentum map \[\Phi\colon T^* \mathrm{MDens}\to \big(\mathfrak{X}(M)\ltimes C^\infty(M,\mathfrak{gl}(k))\big)^*\] for the action of \(\widehat G\) on \(T^*\mathrm{MDens}\) is given by \[\Phi(\mathbf{S}, [P]) = \Big( \operatorname{tr}(\mathbf{S} dP), (P+P^\top)\mathbf{S} \Big).\]

Proof. The infinitesimal action of \((u,a)\in \mathfrak{X}(M)\ltimes C^\infty(M,\mathfrak{gl}(k))\) on \(\mathbf{S}\in \mathrm{MDens}\) is \[(u,a)\cdot \mathbf{S} = - L_u \mathbf{S} + a\mathbf{S} + \mathbf{S}a^\top .\] By definition, the momentum map fulfills \[\langle \Phi(\mathbf{S}, [P]), (u,a)\rangle = \langle [P], (u,a)\cdot \mathbf{S} \rangle = \int_M \operatorname{tr}\Big(P\big(-L_u \mathbf{S} + a \mathbf{S} + \mathbf{S}a^\top\big) \Big),\] which gives \[\begin{align} &= \int_M \operatorname{tr}\Big(P\big( -d\iota_u \mathbf{S} + (a \mathbf{S} + \mathbf{S}a^\top) \big)\Big) \\ &= \int_M \iota_u \operatorname{tr}(\mathbf{S} dP) + \int_M \operatorname{tr}\big( (P^\top+P) \mathbf{S} a^\top\big) \\ &= \Big\langle \big(\operatorname{tr}(\mathbf{S} d P), (P^\top+P)\mathbf{S} \big), (u,a)\Big\rangle . \end{align}\] The latter expression of the momentum map implies the result. ◻

As above, let now \(M\) be a Riemannian manifold with metric \((\cdot\,, \cdot)\). We fix an initial matrix density \(\mathbf{S}_0 \in \mathrm{MDens}\).

Definition 6. The Bures-type \(L^2\)-metric on the group \(\widehat G= \operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{GL}(k))\) is given by \[\label{eq:metric95Bures95MDens} \langle (u,a)\cdot (\varphi,A),(u,a)\cdot (\varphi,A)\rangle_{(\varphi,A)} = \int_M \left( \lvert u\rvert^2 + \operatorname{tr}(aa^\top) \right) \operatorname{tr}(A (\varphi_*\mathbf{S}_0 ) A^\top) \,.\tag{8}\]

The consideration mimicking Proposition 6 shows that this metric is nondegenerate.

Definition 7. The matrix Wasserstein distance \(\mathit{WM}(\mathbf{S}_0,\mathbf{S}_1)\) between two matrix densities \(\mathbf{S}_0,\mathbf{S}_1 \in \mathrm{MDens}\) is given by \[\mathit{WM}^2(\mathbf{S}_0, \mathbf{S}_1) = \inf_{u,a,\mathbf{S}}\int_0^1 \int_M \left( \lvert u\rvert^2 + \operatorname{tr}(aa^\top) \right) \operatorname{tr}(\mathbf{S}) \, dt\,,\] over time-dependent vector fields \(u(t)\in \mathfrak{X}(M)\), matrix densities \(\mathbf{S}(t) \in \mathrm{MDens}\), and matrix valued functions \(a(t) \in C^\infty(M, \mathfrak{gl}(k))\) related by the constraints \[\dot{\mathbf{S}} = -L_u\mathbf{S} + a\mathbf{S} + \mathbf{S} a^\top ,\quad \mathbf{S}(0) = \mathbf{S}_0, \quad \mathbf{S}(1) = \mathbf{S}_1.\]

As before, these two metrics are defined as natural extensions of the classical Wasserstein metric: For any path between two matrix densities the first term penalizes if much mass is to be moved over \(M\), while the second evaluates how much two matrix valued masses differ by measuring how far one needs to rotate the coordinates to identify them.

For convenience, we now introduce the factorization \(\mathbf{S} = S\otimes\varrho\) where \(\operatorname{tr}S = 1\) (this corresponds to the factorization \(\mathbf{v} = v\otimes \sqrt{\varrho}\) of the vector half-density above). In this factorization, we have that \[\label{eq:rho95evolution95MDens} \dot{\varrho} = \frac{d}{dt}\operatorname{tr}(\mathbf{S}) = - L_u\varrho + 2\operatorname{tr}(a S)\varrho\tag{9}\] and \[\dot{S} = -L_u S + a S + S a^\top - 2 S \operatorname{tr}(a S).\]

Theorem 17. Consider the projection \(\pi\colon \widehat G \to \mathrm{MDens}\) given by the action \(\pi(\varphi,A) = A(\varphi_* \mathbf{S}_0)A^\top\). The projection \(\pi\) is a Riemannian submersion of two Riemannian manifolds: the group \(\widehat G= \operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{GL}(k))\) equipped with the metric 8 descends to the space of matrix densities \(\mathrm{MDens}\) equipped with the Riemannian metric corresponding to the distance function \(WM\).

Namely, for \(\mathbf{S} = S\otimes \varrho\) and \(\dot{\mathbf{S}}\in T_{\mathbf{S}} \mathrm{MDens}\), the Riemannian metric on \(\mathrm{MDens}\) for the distance function \(WM\) is given by \[\label{eq:metric95MDens} \langle \dot{\mathbf{S}}, \dot{\mathbf{S}}\rangle_{\mathbf{S}} = \int_M \Big( | \underbrace{\operatorname{tr}(S\nabla P)}_{u}|^2 + \operatorname{tr}\big( \underbrace{(P+P^\top)S}_{a} \underbrace{S(P+P^\top)}_{a^\top} \big) \Big)\varrho,\tag{10}\] where \([P]\in T_{\mathbf{S}}^*\mathrm{MDens}\) is (implicitly) defined by \[\dot{\mathbf{S}} = -L_u\mathbf{S} + a\mathbf{S} + \mathbf{S} a^\top\] with \(u\) and \(a\) as given in equation 10 .

Proof. The statement that the metric 8 decends to the base \(\mathrm{MDens}\) is implied by the fact that its dependence on the group element \((\varphi,A)\) only comes via the projection to the base point \(\mathbf{S}\). The Legendre transformation for the metric 8 is given by the inertia operator \[\label{eq:legendre95MDens} \widehat{\mathfrak{g}} \ni (u,a) \mapsto (u^\flat\otimes\varrho, a \otimes \varrho)\in \widehat{\mathfrak{g}}^*.\tag{11}\] The formula 10 for the metric then follows from the momentum map in Lemma 6 and the definition of the metric 8 . ◻

We are now in a position to study geodesics. To this end, consider the quadratic Hamiltonian on \(\widehat{\mathfrak{g}}^*\times \mathrm{MDens}\) obtained via the Legendre transform 11 :

\[\label{eq:Hamiltonian95MDens} \widehat{H}(u^\flat\otimes\varrho, a\otimes\varrho, S\otimes\varrho) = \frac{1}{2}\int_M \Big( |u|^2 + \operatorname{tr}(aa^\top) \Big)\varrho\tag{12}\]

Theorem 18. The Hamiltonian equations of motion, corresponding to the reduced geodesic equations on \(\widehat{G}\) for an unbalanced optimal transport of matrix densities, are \[\begin{align} & \dot{u} + \nabla_{u}{u} = 2\operatorname{tr}(a S)u , \quad \quad \dot{a} + \nabla_u a = (|u|^2 + \operatorname{tr}(aa^\top))S+ 2\operatorname{tr}(a S)a , \\ & \dot{\rho} + \operatorname{div}(\rho u) = 2\operatorname{tr}(aS)\rho, \quad \dot{S} + \nabla_u S = a S + Sa^\top - 2 \operatorname{tr}(aS) S. \end{align}\]

Note that if \(a\) is skew-symmetric then all the terms with \(\operatorname{tr}(aS)\) vanish (as well as the term \((|u|^2 + \operatorname{tr}(aa^\top))S\)), and thus the equations simplify noticeably. We will also observe similar simplifications for the balanced version of matrix transport in the next section. Again, the equations are on \(\hat{\mathfrak{g}}^*\times \mathrm{MDens}\), which is a reduced form of \(T^*\hat{G}=\hat{\mathfrak{g}}^*\times (\operatorname{Diff}(M)\ltimes C^\infty(M,\mathrm{Sym}_+(k)))\) taking into account the reduction of base isometries.

Proof. The variational derivatives of the Hamiltonian are \[\frac{\delta \widehat H}{\delta (u^\flat\otimes\varrho)} = u, \quad \frac{\delta \widehat H}{\delta (a\otimes\varrho)} = a, \quad \frac{\delta \widehat H}{\delta \mathbf{S}} = -\frac{1}{2}\big( \lvert u \rvert^2 + \operatorname{tr}(aa^\top)\big) [I] \,,\] where \([I] \in T^*_{\mathbf{S}}\mathrm{MDens}\) is the co-set of the identity matrix \(I\).

Next, the general form of the corresponding Hamiltonian equations is obtained in Proposition 21 in Appendix, namely \[\label{eq:ham95eq95abstract95MDens} \frac{d}{dt}(u^\flat\otimes\varrho, a\otimes\varrho) = \operatorname{ad}^*_{(\frac{\delta H}{\delta u^\flat\otimes\varrho},\frac{\delta H}{\delta a\otimes\varrho})}(u^\flat\otimes\varrho, a\otimes\varrho) - \Phi(\mathbf{S}, \frac{\delta H}{\delta \mathbf{S}}).\tag{13}\] Then from Lemma 6 we have \[\begin{align} \Phi(\mathbf{S}, \frac{\delta H}{\delta \mathbf{S}})& = \Big( -\frac{1}{2}\operatorname{tr}\Big(\mathbf{S} d \big( \lvert u \rvert^2 + \operatorname{tr}(aa^\top)\big) \Big), -\big( \lvert u \rvert^2 + \operatorname{tr}(aa^\top)\big)\mathbf{S} \Big) \\ &= \left(-\frac{1}{2} d \left(|u|^2 + \operatorname{tr}(aa^\top)\right)\otimes\varrho , -\left(|u|^2 + \operatorname{tr}(aa^\top)\right)\mathbf{S}\right), \end{align}\] and a calculation as in Lemma 4 shows that \[\operatorname{ad}^*_{(u,a)}(m\otimes\varrho, \beta\otimes\varrho) = \Big( -L_u (m\otimes\varrho) - \operatorname{tr}(\beta d a^\top)\otimes\varrho, -L_u (\beta\otimes\varrho) \Big).\] Thus, \[\begin{align} \operatorname{ad}^*_{(\frac{\delta H}{\delta u^\flat\otimes \varrho}, \frac{\delta H}{\delta a\otimes\varrho})}(u^\flat\otimes\varrho,a\otimes\varrho) &= \big( -L_u (u^\flat\otimes\varrho) - \operatorname{tr}(a \, d a^\top), -L_u (a\otimes\varrho) \big) \\ & =\big( -L_u (u^\flat\otimes\varrho) -\frac{1}{2} d \operatorname{tr}(aa^\top), -L_u (a\otimes\varrho) \big). \end{align}\] From equation 13 we then get \[\begin{align} \frac{d}{dt}(u^\flat\otimes\varrho) &= -L_u(u^\flat\otimes\varrho) + \frac{1}{2}d|u|^2 \otimes\varrho \\ &= -(\nabla_u u)^\flat \otimes\varrho - u^\flat\otimes L_u\varrho . \end{align}\] Combined with equation 9 this yields \[\begin{align} \dot{u} + \nabla_u u = 2\operatorname{tr}(a S)u. \end{align}\] The term \(\operatorname{tr}(a S)\) can be regarded as a Lagrange multiplier for the constraint \(\operatorname{tr}(S)=1\) related to our choice of normalization of \(\mathbf{S}\).

Likewise, for the \(a\otimes\varrho\) component we get \[\begin{align} \frac{d}{dt}(a\otimes\varrho) = -L_u(a\otimes\varrho) + (|u|^2 + \operatorname{tr}(aa^\top))S\otimes\varrho, \end{align}\] which yields \[\begin{align} \dot{a} + \nabla_u a = 2\operatorname{tr}(a S)a + (|u|^2 + \operatorname{tr}(aa^\top))S. \end{align}\] Finally, the evolutions for \(S\) and \(\varrho\) are obtained from the infinitesimal action formula as derived earlier. ◻

5.2 Balanced matrix transport↩︎

While the matrix version intrinsically favors the unbalanced version, one can develop the balanced matrix transport by normalizing the matrix-valued densities and changing the gauge group from \(\mathrm{GL}(k)\)-currents to \(\mathrm{PGL}(k)\)-currents.

Definition 8. The space of normalized matrix densities is \[\mathrm{MProb}:=\{\mathbf{S} \in \Omega^n(M, {\rm Sym}_+(k))~|~ \, \int_M \operatorname{tr}\mathbf{S}=1. \}\] In other words, the space of densities with values in positive-definite quadratic forms on \({\mathbb{R}}^k\) of unit trace.

The semidirect product group acting on this space is \[\widehat{PG}={\rm Diff}(M)\ltimes C^\infty(M, {\rm PGL}(k, \mathbb{R})).\] The group action is the composition of the previously defined (unbalanced) action and a pointwise rescaling as follows: For an element \[(\varphi, A)\in {\rm Diff}(M)\ltimes C^\infty(M, {\rm PGL}(k, \mathbb{R}))\] the action on a matrix density \(\mathbf{S}\) is \[(\varphi, A)\cdot \mathbf{S} = \pi\circ A (\varphi_*\mathbf{S}) A^\top \,,\] where \(\varphi_*\) acts on each matrix element thought of as a top-form, \(A\) acts by pointwise change of basis in the quadratic form, and the pointwise rescaling \(\pi\) takes positive-definite quadratic forms to those of unit trace. As before, the action of the group \(\widehat{PG}= {\rm Diff}(M) \ltimes C^\infty(M, {\rm PGL}(k, \mathbb{R}))\) on matrix densities \(\mathrm{MProb}\) of unit total masses is transitive.

The unbalanced results in Section 5.1 are readily adapted to the balanced case. In particular, the geodesic equations, given by Theorem 18 in the unbalanced case, now look as follows:

Theorem 19. The Hamiltonian equations of motion, corresponding to the reduced geodesic equations on \(\widehat{PG}\) for a balanced optimal transport of matrix densities, are \[\begin{align} & \dot{u} + \nabla_{u}{u} = 0, \quad \quad \dot{a} + \nabla_u a = \lambda I, \quad \operatorname{tr}(aS) = 0, \\ & \dot{\rho} + \operatorname{div}(\rho u) = 0, \quad \dot{S} + \nabla_u S = a S + Sa^\top \,, \end{align}\] where \(\lambda \in \mathbb{R}\) is a Lagrange multiplier for the constraint \(\operatorname{tr}(aS)=0\).

Proof. The Lie algebra element \([a]\in C^\infty(M,\mathfrak{pgl}(k))\) is the coset \([a] = a + \mathbb{R}I\) for \(a\in C^\infty(M,\mathfrak{gl}(k))\). The action on \(\mathbf{S}\) of a coset \([a]\) with representative \(a'\in [a]\) is given by \((a'+cI) \mathbf{S} + \mathbf{S}(a'+cI)^\top\), where \(c\) is determined by the condition \(\operatorname{tr}((a'+cI) \mathbf{S} + \mathbf{S}(a'+cI)^\top) = 0\). Thus, we can uniquely choose \(a := a'+cI \in [a]\) such that \(\operatorname{tr}(a\mathbf{S}) = 0\). By the same calculations as in Theorem 18, but taking the constraint \(\operatorname{tr}(aS)=0\) into account, we get the results. ◻

Note that for the balanced matrix transport we have the same alternative as with the scalar one, namely one can use only one parameter, the total mass, to normalize the density, as in simple unbalanced transport [13]. Above, for \(\mathrm{PGL}(k)\)-currents, we use the pointwise normalization.

Remark 20. One can also consider a different action of diffeomorphisms on matrix-valued densities for the case \(k=n\). Namely, if matrix valued densities are defined on the tangent space to \(M\) then one can regard elements of \({\rm MDens} := \Omega^n(M, Sym_+(n))\) as metric-valued densities. The action of diffeomorphisms in that case will be not only by multiplication by the Jacobian, but via the action of the change of coordinates on the metric on \(M\). The corresponding metric will involve the Ebin metric on densities and differ from the one described above. This setting is carefully studied in [14].

6 Further ramifications of transport↩︎

Figure 2: Commutative diagram for the relation between spaces for scalar, vector, and matrix valued optimal transport, in both balanced (upper) and unbalanced (lower) settings. (More precisely, the map from vector densities to matrix ones takes the former to positive semi-definite matrices {\rm Sym}_+(k), and hence to the closure of \rm MDens, as discussed in Remark 15.)

The relations between different parts of this paper are summarized in Figure 2. We continue here with several further directions which, in our opinion, would be interesting to explore.

  1. Do there exist interesting finite-dimensional models of the vector optimal transport, extending the action of linear transformations on Gaussians, cf. [15]?

  2. By changing the discrete index (the density coordinates) to a continuous parameter, the velocity \(v\in C^\infty(M, \mathbb{R}^k)\) changes to \(v\in C^\infty(M\times \mathbb{R}, \mathbb{R})\), and this covers a new variety of the equations, with only cosmetic changes. It would be interesting to develop this setting and its applications, as it might be related to generalized flows as suggested by Brenier [16].

  3. Adding a potential \(U\) to the Lagrangian we extend the domain of applicability of transports of vector densities. Namely, they are now governed not by geodesics, but by the Newton equation. The potential can encode various constraints and obstructions for the transport.

  4. The groups considered above were related to densities and gauge transformations in a trivial bundle over an orientable manifold. A similar optimal transport theory should exist for a topologically non-trivial bundle, as well as for densities over a non-orientable manifold.

  5. Is there a relation to the vector-, 1-form-, or matrix- densities considered in [17], [18]?

  6. It would be intersting to see if the relativistic Burgers described by Brenier [19] could be recovered as a particular case of the matrix Burgers equation (or, rather, of a compressible fluid equation) described above.

  7. Consider the Kähler setting, with vector half-densities over complex scalars, which gives a generalized Madelung transform via Fusca’s map [20]. It would be interesting to unify those two approaches in matrix transport and Kähler vector transport. They present essentially the same group action in two similar ways, in the real and complex directions. A natural question arises whether there is an analogue of Fusca’s momentum map for the unbalanced mass transport. And if so, what are its Riemannian submersion and Kähler properties?

More on motivation In the introduction we mentioned applications of optimal transport of vector and matrix densities to colored medical imaging and quantum spin liquids in physics. One more motivation comes from the spectral analysis of multi-variable time series (e.g., [9]). The related spectral content is assessed based on certain statistics related to moments of the corresponding power spectral density. It may drift across frequencies over time as well as across principle directions (e.g., echo locating a moving target using antenna arrays), as captured by vector-valued time series. Therefore, in the transport between matrix-valued densities, we consider both the cost of shifting power across frequencies and the cost of rotating the corresponding principle axes.

Another application is related to problems of optimal distribution in economics. Consider for instance the task of optimal distribution of vacation packages by wholesalers: assuming that one can change the length of stays at several resorts between a few standard ones, at a cost (i.e., the personnel’s salary) as compared to the set of various packages originally bought by the wholesaler in bulk. Minimization of the corresponding cost becomes a problem of optimal transport of vector densities, where the coordinates are packages of a given length and at a given place. The metrics proposed in this paper are straightforward analogs of the Wasserstein metric, being based on the differential geometry of the problem.

7 Semi-invariant Hamiltonians for Lie groups↩︎

The purpose of this appendix is to demonstrate, in an abstract setting, how the momentum map in symplectic geometry naturally appear in the reduced governing equations for Hamiltonian systems on the cotangent bundle of a Lie group. We start with the definition of the momentum map.

Definition 9. Let \((M,\omega)\) be a symplectic manifold and let \(G\) be a Lie group acting on \(M\) by symplectic diffeomorphisms. The action is called Hamiltonian if there exists a momentum map \(\Phi\colon M\to \mathfrak{g}^*\) defined as follows. Assume that the infinitesimal action for the Lie algebra element \(\xi\in\mathfrak{g}\) is a Hamiltonian vector field, i.e. \(x\mapsto \xi\cdot x \in T_xM\) is given by a Hamiltonian vector field \(X_{H_\xi}\) for some Hamiltonian \(H_\xi\) which depends linearly on \(\xi\). Then the existence of the momentum map \(\Phi\) means a consistent choice of Hamiltonians so that \[\langle \Phi(x), \xi \rangle := H_\xi(x)\] and \(\Phi\colon M\to \mathfrak{g}^*\) is equivariant with respect to the coadjoint action on \(\mathfrak{g}^*\).

The result below, which we use in specific settings throughout the paper, is concerned with the special case of cotangent bundles \(M=T^*Q\) with a cotangent lifted action of \(G\) originating from an action of \(G\) on \(Q\). We recall that cotangent lifted actions are always Hamiltonian, and that the momentum map \(\Phi\colon T^*Q \to \mathfrak{g}^*\) in this case is defined by \[\langle \Phi(q, p), \xi \rangle = \langle p,\xi\cdot q\rangle .\]

Proposition 21. Consider a Lie group \(G\) acting on a space \(S\) from the left. Take a Hamiltonian on \(T^*G\) of the form \(\bar H(g,m_g) = H(m, g\cdot s_0)\), where \(m_g = m \cdot g\), \(s_0\in S\), and \(s = g\cdot s_0\). Then the Poisson reduced motion is given by \[\dot{m} + \operatorname{ad}^*_{\xi}m = -\Phi(s,\frac{\delta H}{\delta s}),\qquad \dot{s} + \xi\cdot s = 0,\qquad \xi = \frac{\delta H}{\delta m}\] where \(\Phi\colon T^*S \to \mathfrak{g}^*\) is the momentum map for the cotangent lifted action of \(G\) on \(T^*S\).

Proof. The canonical form of the equations are \[\begin{align} \dot{g} &= \frac{\delta \bar H}{\delta m_g}, \\ \dot{m}_g &= -\frac{\delta \bar H}{\delta g}. \end{align}\] By standard calculations we have \[\dot{m}_g = (\dot{m} + \operatorname{ad}^*_\xi m)\cdot g \qquad\text{and}\qquad \dot{g} = \frac{\delta H}{\delta m} g .\] Let \(g_\epsilon = \operatorname{exp}(\epsilon\xi)g\) be a variation. Then \[\frac{d}{d\epsilon} H(m,g_\epsilon\cdot s_0) = \langle \frac{\delta H}{\delta s}, \xi\cdot s \rangle = \langle \Phi(s,\frac{\delta H}{\delta s}),\xi\rangle = \langle \Phi(s,\frac{\delta H}{\delta s})\cdot g,\xi\cdot g\rangle\] Thus, \[\frac{\delta \bar H}{\delta g} = \Phi(s,\frac{\delta H}{\delta s})\cdot g \,,\] which proves the result. ◻

8 An alternative metric on matrix densities↩︎

In addition to extending optimal transport from scalar to matrix valued densities, there is another approach to matrices by restricting optimal transport on \(\mathbb{R}^n\) to multivariate Gaussian densities (represented by positive definite matrices) and linear transport maps (cf. [15] and references therein). As was suggested by F.-X. Vialard [21], this approach provides the following alternative to the Bures-type \(L^2\)-metric 8 on the group \(\widehat G= \operatorname{Diff}(M)\ltimes C^\infty(M,\operatorname{GL}(k))\ni (\varphi,A)\), which is consistent with its finite-dimensional analog: \[\label{eq:metric95gaussian} \begin{align} & \langle (\dot{\varphi},\dot{A}), (\dot{\varphi},\dot{A})\rangle_{(\varphi,A)} = \int_M \left(|\dot{\varphi}\circ\varphi^{-1}|^2 \operatorname{tr}(\underbrace{A\varphi_*\mathbf{S}_0A^\top}_{\mathbf{S}}) + \operatorname{tr}(\dot{A} \varphi_*\mathbf{S}_0 \dot{A}^\top) \right) \\ & = \int_M \left(|u|^2 \operatorname{tr}(\mathbf{S}) + \operatorname{tr}(\dot{A} A^{-1}\underbrace{A\varphi_*\mathbf{S}_0A^\top}_{\mathbf{S}} (\dot{A} A^{-1})^\top) \right) \\ & = \int_M \left(|u|^2 \operatorname{tr}(\mathbf{S}) + \operatorname{tr}(a \mathbf{S} a^\top) \right)\,, \end{align}\tag{14}\] where \((u,a)\in \mathfrak{X}(M)\ltimes C^\infty(M,\mathfrak{gl}(k))\) are the reduced coordinates, \(u = \dot{\varphi}\circ\varphi^{-1}\) and \(a = \dot{A} A^{-1}\), while \(\mathbf{S}_0 \in \mathrm{MDens}\) is a fixed initial matrix density.

For this metric, the Legendre transform under the \(L^2\)-Frobenius pairing is given by \[(u,a) \mapsto \left( u^\flat\otimes \operatorname{tr}(\mathbf{S}), [a\mathbf{S}] \right).\] In turn, this gives the Hamiltonian on \(\widehat{\mathfrak{g}}^*\times \mathrm{MDens}\) as

\[\label{eq:Hamiltonian95gaussian} \widehat{H}(u^\flat\otimes \operatorname{tr}(\mathbf{S}), [a\mathbf{S}], \mathbf{S}) = \frac{1}{2}\int_M \left(\frac{|u^\flat\otimes\operatorname{tr}(\mathbf{S})|^2 }{\operatorname{tr}(\mathbf{S})} + \operatorname{tr}( (a \mathbf{S}) \mathbf{S}^{-1}(a \mathbf{S})^\top) \right).\tag{15}\]

Proposition 22. The Hamiltonian equations of motion, corresponding to the reduced geodesic equations on \(\widehat{G}\), are \[\begin{align} & \dot{u} + \nabla_u u = 2\operatorname{tr}(aS) - \operatorname{tr}\big([a,S]\nabla a^\top\big), \\ & \dot{\mathbf{S}} + L_u \mathbf{S} = a\mathbf{S} + \mathbf{S} a^\top, \end{align}\] where, as before, we use the decomposition \(\mathbf{S} = S\otimes\varrho = S \otimes \operatorname{tr}\mathbf{S}\).

Note that if \(a\) is skew-symmetric then all the terms in the right-hand side of the equation for \(\dot{u}\) vanish. This reflects that the alterative metric 14 on \(\mathrm{MDens}\) descends to the vector half-density metric 5 .

Proof. The variational derivatives of the Hamiltonian are \[\begin{gather} \frac{\delta \widehat H}{\delta (u^\flat\otimes\operatorname{tr}(\mathbf{S}))} = u, \quad \frac{\delta \widehat H}{\delta (a\mathbf{S})} = a, \\ \frac{\delta \widehat H}{\delta \mathbf{S}} = -\frac{1}{2}\big( \lvert u \rvert^2 I + \operatorname{tr}(\mathbf{S}^{-1} (a\mathbf{S})^\top (a\mathbf{S})\mathbf{S}^{-1}\delta\mathbf{S} )\big) = -\frac{1}{2}\big( \lvert u \rvert^2 I + a^\top a \big)\,, \end{gather}\] where \(I \in T^*_{\mathbf{S}}\mathrm{MDens}\) is the identity matrix. Next, the general form of the corresponding Hamiltonian equations is obtained in Proposition 21 in Appendix, namely \[\label{eq:ham95eq95abstract95gaussian} \frac{d}{dt}(u^\flat\otimes\operatorname{tr}(\mathbf{S}), a\mathbf{S}) = \operatorname{ad}^*_{(\frac{\delta H}{\delta u^\flat\otimes\operatorname{tr}(\mathbf{S})},\frac{\delta H}{\delta a \mathbf{S}})}(u^\flat\otimes\operatorname{tr}(\mathbf{S}), a\mathbf{S}) - \Phi(\mathbf{S}, \frac{\delta H}{\delta \mathbf{S}}).\tag{16}\] Then from Lemma 6 we have \[\begin{align} &\Phi(\mathbf{S}, \frac{\delta H}{\delta \mathbf{S}}) = \Big( -\frac{1}{2}\operatorname{tr}\Big(\mathbf{S} d \big( \lvert u \rvert^2 + a^\top a \big) \Big), -\big( \lvert u \rvert^2 + a^\top a\big)\mathbf{S} \Big) \\ &= \Big( - \frac{1}{2}d \lvert u \rvert^2\otimes \operatorname{tr}(\mathbf{S}) -\frac{1}{2}\operatorname{tr}\big((a\mathbf{S} + \mathbf{S} a) d a^\top \big), -\big( \lvert u \rvert^2 + a^\top a\big)\mathbf{S} \Big) , \end{align}\] and a calculation as in Lemma 4 shows that \[\operatorname{ad}^*_{(u,a)}(\boldsymbol{m}, \boldsymbol{\beta}) = \Big( -L_u (\boldsymbol{m}) - \operatorname{tr}(\boldsymbol{\beta} d a^\top), -L_u (\boldsymbol{\beta}) \Big).\] Thus, \[\begin{align} &\operatorname{ad}^*_{(\frac{\delta H}{\delta u^\flat\otimes \operatorname{tr}(\mathbf{S})}, \frac{\delta H}{\delta a\otimes\operatorname{tr}(\mathbf{S})})}(u^\flat\otimes\operatorname{tr}(\mathbf{S}),a\mathbf{S}) = \big( -L_u (u^\flat\otimes\operatorname{tr}(\mathbf{S})) - \operatorname{tr}(a\mathbf{S} \, d a^\top), -L_u (a\mathbf{S}) \big). \end{align}\] From equation 16 we then get \[\begin{align} \frac{d}{dt}(u^\flat\otimes\operatorname{tr}(\mathbf{S})) &= -L_u(u^\flat\otimes\operatorname{tr}(\mathbf{S})) + \frac{1}{2}d|u|^2 \otimes\operatorname{tr}(\mathbf{S}) -\frac{1}{2}\operatorname{tr}\big((a\mathbf{S} - \mathbf{S} a) d a^\top \big) . \end{align}\] Likewise, for the \(a\mathbf{S}\) component we obtain \[\begin{align} \frac{d}{dt}(a\mathbf{S}) = -L_u(a\mathbf{S}) + (|u|^2 + a^\top a)\mathbf{S}. \end{align}\] Furthermore, the transported variables are \[\dot{\mathbf{S}} = -L_u \mathbf{S} + a\mathbf{S} + \mathbf{S} a^\top.\] Thus, the trace density evolves as \[\frac{d}{dt}\operatorname{tr}(\mathbf{S}) = -L_u \operatorname{tr}(\mathbf{S}) + 2\operatorname{tr}(a\mathbf{S}).\] Moreover, the evolution of the vector field \(u\) is given by \[\dot{u} + \nabla_u u = 2 \frac{\operatorname{tr}(a \mathbf{S}) u}{\operatorname{tr}(\mathbf{S})} - \frac{1}{\operatorname{tr}(\mathbf{S})} \operatorname{tr}\big((a\mathbf{S} - \mathbf{S} a) \nabla a^\top \big).\] In the coordinates for matrix density \(\mathbf{S} = S\otimes\varrho\) as before, this gives \[\dot{u} + \nabla_u u = 2\operatorname{tr}(aS) - \operatorname{tr}\big([a,S]\nabla a^\top\big) .\] Notice that the right-hand side term vanishes when \(a\) is skew-symmetric, since \(\operatorname{tr}(aSb) = \operatorname{tr}(Sab)\) whenever \(S\) is symmetric and \(a,b\) are skew-symmetric. ◻

References↩︎

[1]
J.-D. Benamou and Y. Brenier. A computational fluid mechanics solution to the Monge–Kantorovich mass transfer problem. Numer. Math., 84(3):375–393, 2000.
[2]
C. Villani. Optimal Transport: Old and New, volume 338 of Grundlehren der Mathematischen Wissenschaften. Springer-Verlag, Berlin, 2009.
[3]
P. W. Michor and D. Mumford. An overview of the Riemannian metrics on spaces of curves using the Hamiltonian approach. Appl. Comput. Harmon. Anal., 23(1):74–113, 2007.
[4]
L. Younes. Shapes and Diffeomorphisms. Springer-Verlag, Berlin, 2010.
[5]
L. Chizat, G. Peyré, B. Schmitzer, and F.-X. Vialard. Unbalanced optimal transport: Dynamic and Kantorovich formulations. J. Funct. Anal., 274(11):3090–3123, 2018.
[6]
J. N. Kather, A. Weidner, U. Attenberger, Y. Bukschat, C.-A. Weis, M. Weis, L. R. Schad, and F. G. Zöllner. Color-coded visualization of magnetic resonance imaging multiparametric maps. Scientific Reports, 7(1):41107, 2017.
[7]
F. Otto. The geometry of dissipative evolution equations: the porous medium equation. Comm. Partial Differential Equations, 26(1-2):101–174, 2001.
[8]
B. Khesin, G. Misiolek, and K. Modin. Geometric hydrodynamics and infinite-dimensional Newton’s equations. Bulletin of the Amer. Math. Soc., 58(3):377–442, 2021.
[9]
Y. Chen, T. T. Georgiou, and A. Tannenbaum. Vector-valued optimal mass transport. SIAM Journal on Applied Mathematics, 78(3):1682–1696, 2018.
[10]
A. Izosimov and B. Khesin. Geometry of generalized fluid flows. Calc. Var. Partial Diff. Equations, 63(3):30pp., 2023.
[11]
T. Gallouet, R. Ghezzi, and F.-X. Vialard. Regularity theory and geometry of unbalanced optimal transport. Journal of Functional Analysis, 289(7):111042, 2025.
[12]
S. Kondratyev, L. Monsaingeon, and D. Vorotnikov. A new optimal transport distance on the space of finite Radon measures. Advances in Differential Equations, 21(11/12):1117–1164, 2016.
[13]
B. Khesin, K. Modin, and L. Volk. Simple unbalanced optimal transport. Int. Math. Research Notices (IMRN), 2024(10):8839–8855, 2024.
[14]
M. Bauer, P. Michor, and F.-X. Vialard. The Wasserstein-Ebin metric: A geometric lift of unbalanced optimal transport to the space of Riemannian metrics. in preparation.
[15]
K. Modin. Geometry of matrix decompositions seen through optimal transport and information geometry. Journal of Geometric Mechanics, 9(3):335–390, 2017.
[16]
Y. Brenier. Minimal geodesics on groups of volume-preserving maps and generalized solutions of the euler equations. Comm. Pure and Appl. Math., 52(4):411–452, 1999.
[17]
Y. Brenier and D. Vorotnikov. On optimal transport of matrix-valued measures. SIAM J. Math. Anal., 52(3):2849–2873, 2020.
[18]
K. Ciosmak. Optimal transport of vector measures. Calc. Var. Partial Diff. Equations, 60(6):1–22, 2021.
[19]
Y. Brenier. Une interprétation variationnelle de la relativité générale dans le vide en termes de transport optimal. Comptes Rendus. Mathématique, 360:25–33, 01 2022.
[20]
D. Fusca. The Madelung transform as a momentum map. J. Geom. Mech., 9:157–165, 2017.
[21]
F.-X. Vialard. Metrics on matrix densities. personal communication.

  1. Department of Mathematics, University of Toronto, Canada; e-mail: khesin@math.toronto.edu↩︎

  2. Department of Mathematical Sciences, Chalmers University of Technology and University of Gothenburg, Sweden; e-mail: klas.modin@chalmers.se↩︎

  3. One can generalize the setting to non-compact manifolds, e.g.\(M = \mathbb{R}^n\), by imposing decaying conditions on the objects involved.↩︎

  4. One can also introduce a family of Bures-type metrics with different weighting of the these terms for specific applications, but it can be absorbed by rescaling of the second term and is omited below.↩︎