Geometry and Probability

May 13, 2025
By Aathreya Kadambi

When thinking about what to do for my final project in my Riemannian geometry class, I had two potential topics in mind:

  • Information geometry, the Fisher information metric, etc.
  • A review of recent work on optimal transport, based on the material in Cédric Villani’s books.

I became interested in the first topic because it has the word “information” in it. Supposedly this may be a misnomer, but my gut instinct is that it’s not. I was particularly interested because Fisher information is a Hessian, which is essentially curvature. In my mind (especially after studying a tad bit of physics) curvature is all that matters since we usually only care about second-order Taylor expansions and local studies. So this felt like a mix of my favorite things: information and curvature. Not only that, but one of my friends had been learning about Fisher information in his statistics class, so I was a little jealous and curious. After enrolling in Bayesian statistics and Riemannian geometry this semester, I learned a bit more about Fisher information and also the Fisher information metric, and the field of information geometry has been stuck in the back of my mind ever since.

I became interested in the second topic when I was reading about TRELLIS and a method called rectified flow. Supposedly, optimal transport finds amazing applications in computer vision and graphics, and it’s really manifested in some amazing technologies. I wanted to learn more about these. In addition, I noticed that optimal transport seemed to have an extremely rich mathematical theory behind it. I found Cédric Villani’s books on the subject and decided to start reading. Again, it seemed to fit right in with Riemannian geometry, and I loved that it also considered measure theory (it’s always felt very strange to me that there are almost two theories of integration and interpretations of symbols like “dx”).

After watching this talk by Villani, I have a strong gut feeling that there is a much deeper connection between all of these ideas. After all, it seems to me that probability theory and geometry set out to do the same thing: measure the world. I think that in some way, they should yield extremely similar theory and ideas.

In this post, I will give an overview of what I am learning about both subjects, and maybe some small details about why I think they are so interesting and related. My Riemannian geometry professor mentioned RCD spaces, so I’ll focus on a paper from 2009 by John Lott and Cédric Villani, Ricci curvature for metric-measure spaces via optimal transport. In this paper, the authors show how to extend optimal transport theory to metric-measure spaces, and how to develop a notion of Ricci curvature bounded below.

Remark. For transparency, I used an LLM to speedily convert my original LaTeX notes into this blog post in MDX (I was originally going to do it directly in MDX but decided to go with LaTeX for the better PDF convertibility). Once I refine them further, my notes will also be on my professional website! If you notice any errors in this post, however, let me know!

In a future post I’ll consider some of the theorems and inequalities that can be generalized to metric spaces based on this work. Somehow I saw these theorems on nonsmooth spaces before I saw them on smooth spaces. :-) But they were theorems I’ve been looking forward to since I first heard about Fisher information and entropy.

For context, I will also discuss some background on optimal transport theory based on Villani’s book, Optimal transport: old and new, and a lecture by Luigi Ambrosio. I will borrow notation and ideas from the references above. $\mathcal{X}$ will denote a complete, separable, measured length space.

Background on Optimal Transport

Definition (Image Measure/Push-Forward). Let $\mu$ be a Borel measure on $\mathcal{X}$. If $T : \mathcal{X} \rightarrow \mathcal{Y}$ is a Borel map, $T_\# \mu$ is the push-forward of $\mu$ by $T$:

$$(T_\# \mu)(A) = \mu(T^{-1}(A)) \quad \forall A \in \mathcal{B}(\mathcal{Y}).$$
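To make the push-forward concrete, here is a minimal sketch for a finitely supported measure, where it reduces to adding up the masses of all points that land on the same image. The helper name `push_forward` and the example measure are my own choices for illustration, not something from the references.

```python
from collections import defaultdict
from fractions import Fraction

def push_forward(mu, T):
    """Push a finitely supported measure mu (dict: point -> mass) forward by T."""
    image = defaultdict(Fraction)
    for x, mass in mu.items():
        image[T(x)] += mass  # (T_# mu)({y}) = mu(T^{-1}({y}))
    return dict(image)

# Hypothetical example: uniform measure on {-2, -1, 0, 1, 2}, pushed forward by x -> x^2.
mu = {x: Fraction(1, 5) for x in (-2, -1, 0, 1, 2)}
nu = push_forward(mu, lambda x: x * x)
print(nu)
```

The points $\pm 1$ and $\pm 2$ collapse in pairs, so the image measure puts mass $2/5$ on each of $1$ and $4$ and mass $1/5$ on $0$, and the total mass is still $1$.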

Definition (Law of a Random Variable). For a random variable $X$ on $(\Omega, P)$, $\text{law}(X) = X_\# P$.

Definition (Coupling). Let $(\mathcal{X}, \mu)$ and $(\mathcal{Y}, \nu)$ be probability spaces. A coupling is a pair of random variables $X$ and $Y$ such that $\text{law}(X) = \mu$ and $\text{law}(Y) = \nu$. Via notation abuse, $(X,Y)$ and $\text{law}(X, Y)$ are also called couplings. The set of all couplings of $\mu$ and $\nu$ is denoted $\Pi(\mu, \nu)$.

Villani’s Remark (Story of Couplings).

  • Uninformative couplings always exist. Take the space $(\mathcal{X} \times \mathcal{Y}, \mu \otimes \nu)$: the trivial coupling. Here “$X$ does not give any information about the value of $Y$” (Villani 6).
  • Really good couplings don’t always exist. When $X$ has all information about $Y$, we have a deterministic coupling: a measurable function $T : \mathcal{X} \rightarrow \mathcal{Y}$ so that $Y = T(X)$.
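Both kinds of coupling can be written down explicitly on finite sets. The sketch below is only an illustration: the two-point spaces, the map `T`, and the helper `marginals` are all hypothetical choices of mine. It builds the trivial coupling and a deterministic one, and checks that both have the right marginals.

```python
from fractions import Fraction

def marginals(pi):
    """Return the two marginals of a coupling pi given as a dict {(x, y): mass}."""
    first, second = {}, {}
    for (x, y), mass in pi.items():
        first[x] = first.get(x, 0) + mass
        second[y] = second.get(y, 0) + mass
    return first, second

half = Fraction(1, 2)
mu = {"a": half, "b": half}  # hypothetical measure on X = {a, b}
nu = {0: half, 1: half}      # hypothetical measure on Y = {0, 1}

# Trivial coupling: the product measure of mu and nu.
trivial = {(x, y): mu[x] * nu[y] for x in mu for y in nu}

# Deterministic coupling: Y = T(X) for the map T(a) = 0, T(b) = 1.
T = {"a": 0, "b": 1}
deterministic = {(x, T[x]): mu[x] for x in mu}

print(marginals(trivial) == (mu, nu))
print(marginals(deterministic) == (mu, nu))
```

For an example where no deterministic coupling exists, take $\mu$ to be a Dirac mass and $\nu$ the uniform measure on two points: any map pushes a Dirac mass to another Dirac mass, so it can never produce $\nu$.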

Optimal Transport. Let $c : \mathcal{X} \times \mathcal{Y} \rightarrow \mathbb{R}$ be a cost function. The Monge-Kantorovich problem is:

$$\inf_{\Pi(\mu, \nu)} \mathbb{E}\, c(X, Y) = \inf_{\pi = \text{law}(X,Y)} \int_{\mathcal{X} \times \mathcal{Y}} c(x, y) \, d\pi(x,y)$$

and the solution couplings $(X,Y)$ are called optimal transports.
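In the simplest discrete setting the Monge-Kantorovich problem can be solved by brute force. The sketch below assumes both measures are uniform on the same number of points; in that case the infimum is attained by a permutation (by the Birkhoff-von Neumann theorem, since the cost is linear in the coupling), so it is enough to try every permutation. The function name and the sample points are made up for illustration.

```python
from itertools import permutations

def optimal_assignment(xs, ys, cost):
    """Brute-force the optimal coupling between the uniform measures on xs and ys
    (same number of points) by trying every permutation."""
    n = len(xs)
    best_cost, best_perm = None, None
    for perm in permutations(range(n)):
        # Expected cost of the deterministic coupling x_i -> y_perm[i].
        total = sum(cost(xs[i], ys[perm[i]]) for i in range(n)) / n
        if best_cost is None or total < best_cost:
            best_cost, best_perm = total, perm
    return best_cost, best_perm

xs = [0.0, 1.0, 3.0]
ys = [2.0, 0.5, 4.0]
best_cost, best_perm = optimal_assignment(xs, ys, lambda x, y: (x - y) ** 2)
print(best_cost, best_perm)  # 0.75 (1, 0, 2)
```

The optimal permutation matches the points in increasing order ($0 \mapsto 0.5$, $1 \mapsto 2$, $3 \mapsto 4$), which is what one expects for a convex cost on the line. This approach is only viable for tiny examples since it enumerates $n!$ permutations.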

Wasserstein Distance. Let $P(\mathcal{X})$ be the space of Borel probability measures on metric space $\mathcal{X}$. The square of the Wasserstein distance between two measures $\mu, \nu \in P(\mathcal{X})$ is the infimal cost of the Monge-Kantorovich problem with cost $c(x, y)= d(x,y)^2$.
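On the real line the Wasserstein distance between two empirical measures is easy to compute. A minimal sketch, assuming both measures are uniform on the same number of points in $\mathbb{R}$, so that the monotone (sorted) matching is optimal for the quadratic cost; `wasserstein2_1d` is a name I made up for this illustration.

```python
import math

def wasserstein2_1d(xs, ys):
    """W_2 between the uniform measures on xs and ys (equal length, points in R),
    using the sorted matching, which is optimal for the cost |x - y|^2 in 1D."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally many points")
    pairs = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(xs))

xs = [0.0, 1.0, 3.0]
shifted = [x + 3.0 for x in xs]
print(wasserstein2_1d(xs, shifted))  # 3.0
```

Translating a measure by $3$ moves it a Wasserstein distance of exactly $3$, which is a nice sanity check that $W_2$ sees the geometry of the underlying space.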

Ricci Curvature Bounds on Nonsmooth Spaces

The trick to defining nonnegative $N$-Ricci curvature is to consider the notion of weak displacement convexity: convexity along Wasserstein geodesics. Define $P_2(\mathcal{X})$ as $P(\mathcal{X})$ equipped with the Wasserstein distance $W_2$.

Consider $U : [0,\infty)\rightarrow \mathbb{R}$ a continuous convex function with $U(0) = 0$.

Definition ($\mathcal{DC}_N$, $\mathcal{DC}_\infty$). Consider $N \in [1, \infty)$. We define the following collections of “entropy-type” functions:

$$\mathcal{DC}_N := \{ U : \lambda \mapsto \lambda^N U(\lambda^{-N})\text{ is convex on }(0, \infty)\},$$

$$\mathcal{DC}_\infty := \{ U : \lambda \mapsto e^\lambda U(e^{-\lambda})\text{ is convex on }(-\infty, \infty)\}.$$
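As a quick check of these definitions, here is a worked computation for two model functions. I am assuming the choices $U_N(r) = N r (1 - r^{-1/N})$ for $1 < N < \infty$ and $U_\infty(r) = r \log r$, which I believe are the standard examples in the Lott-Villani paper. Both are continuous and convex with value $0$ at $r = 0$.

```latex
% Assumed model examples, with 1 < N < \infty:
%   U_N(r) = N r (1 - r^{-1/N}),   U_\infty(r) = r \log r.
\begin{align*}
\lambda^N U_N(\lambda^{-N})
  &= \lambda^N \cdot N \lambda^{-N}\bigl(1 - (\lambda^{-N})^{-1/N}\bigr)
   = N(1 - \lambda), \\
e^{\lambda} U_\infty(e^{-\lambda})
  &= e^{\lambda} \cdot e^{-\lambda} \log(e^{-\lambda})
   = -\lambda.
\end{align*}
```

Both right-hand sides are affine in $\lambda$, hence convex, so $U_N \in \mathcal{DC}_N$ and $U_\infty \in \mathcal{DC}_\infty$. In this sense they sit right at the boundary of each class.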

For any reference probability measure $\nu \in P(\mathcal{X})$, $\mu \in P_2(\mathcal{X})$ can be decomposed as $\mu = \rho \nu + \mu_s$ where $\mu_s$ is singular w.r.t. $\nu$. Then, the “entropy-type” functional corresponding to $U$ is the map $U_\nu : P_2(\mathcal{X}) \rightarrow \mathbb{R} \cup \{\infty\}$ given by

$$U_\nu(\mu) := \int_\mathcal{X} U(\rho(x))\;d\nu(x) + U'(\infty)\mu_s(\mathcal{X}),$$

where $U'(\infty) = \lim_{r\rightarrow \infty} \frac{U(r)}{r}$.
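For finitely supported measures the functional can be evaluated directly, which shows what the singular term does. This is only a sketch: the two choices of $U$ (namely $U(r) = r \log r$ with $U'(\infty) = \infty$, and $U(r) = N r (1 - r^{-1/N})$ with $N = 2$ and $U'(\infty) = 2$) are illustrative assumptions, and all function names and example measures are hypothetical.

```python
import math

def u_inf(r):
    """U(r) = r log r, extended by U(0) = 0."""
    return r * math.log(r) if r > 0 else 0.0

def u_N(r, N=2.0):
    """U(r) = N r (1 - r^(-1/N)), written as N (r - r^(1 - 1/N)) so that U(0) = 0."""
    return N * (r - r ** (1.0 - 1.0 / N))

def entropy_functional(U, U_prime_inf, mu, nu):
    """U_nu(mu) for finitely supported mu, nu given as dicts point -> mass."""
    total = 0.0
    for x, nu_mass in nu.items():
        if nu_mass > 0:
            rho = mu.get(x, 0.0) / nu_mass  # density of the absolutely continuous part
            total += U(rho) * nu_mass
    # Mass of mu sitting on points that nu does not charge (the singular part).
    singular_mass = sum(m for x, m in mu.items() if nu.get(x, 0.0) == 0)
    if singular_mass > 0:
        total += U_prime_inf * singular_mass
    return total

nu = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}  # reference measure
mu_ac = {0: 0.5, 1: 0.5}                   # absolutely continuous w.r.t. nu
mu_mixed = {0: 0.5, 9: 0.5}                # half of the mass is singular

relative_entropy = entropy_functional(u_inf, math.inf, mu_ac, nu)
blown_up = entropy_functional(u_inf, math.inf, mu_mixed, nu)
finite_value = entropy_functional(u_N, 2.0, mu_mixed, nu)
print(relative_entropy, blown_up, finite_value)
```

With $U(r) = r \log r$ the first value is the relative entropy $\log 2$, and any singular mass makes the functional infinite. With the $N = 2$ choice the singular part only costs $U'(\infty)\mu_s(\mathcal{X}) = 1$.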

Ricci Curvature Bounds on Nonsmooth Spaces. A compact measured length space $(\mathcal{X}, d, \nu)$ has nonnegative $N$-Ricci curvature for $N \in [1, \infty)$ if for every $\mu_0,\mu_1 \in P_2(\mathcal{X})$ with support contained in $\text{supp}(\nu)$ we have weak displacement convexity for every $U_\nu$ functional: there exists a Wasserstein geodesic $\mu_t$ connecting $\mu_0$ and $\mu_1$ such that

$$U_\nu(\mu_t) \le (1-t)U_\nu(\mu_0) + tU_\nu(\mu_1) \quad \forall t \in [0,1],\ U \in \mathcal{DC}_N.$$

For $N = \infty$ and $K \in \mathbb{R}$, $(\mathcal{X}, d, \nu)$ has $\infty$-Ricci curvature bounded below by $K$ if, for the same kind of $\mu_0, \mu_1$, there exists a Wasserstein geodesic $\mu_t$ connecting them such that

$$U_\nu(\mu_t) \le (1-t)U_\nu(\mu_0) + tU_\nu(\mu_1) - \frac{1}{2}\lambda(U)t(1-t)W_2(\mu_0, \mu_1)^2 \quad \forall t \in [0,1],\ U \in \mathcal{DC}_\infty,$$

where $\lambda : \mathcal{DC}_\infty \rightarrow \mathbb{R} \cup \{-\infty\}$ is given by $U\mapsto \inf_{r > 0} K\frac{rU_+'(r) - U(r)}{r}$.
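Displacement convexity can be seen numerically in a case where everything is explicit. The sketch below steps slightly outside the definition, and that is an assumption on my part: it uses $\mathbb{R}$ with Lebesgue measure as the reference (which is neither compact nor a probability measure) and $U(r) = r \log r$. The $W_2$ geodesic between two Gaussians on $\mathbb{R}$ is again Gaussian, with mean and standard deviation interpolated linearly, and the entropy of $N(m, \sigma^2)$ is $-\frac{1}{2}\log(2\pi e \sigma^2)$.

```python
import math

def gaussian_entropy(sigma):
    """Integral of rho * log(rho) for the density rho of N(m, sigma^2) on R."""
    return -0.5 * math.log(2 * math.pi * math.e * sigma ** 2)

def entropy_along_geodesic(sigma0, sigma1, t):
    """Entropy of the W_2 geodesic at time t; the standard deviation interpolates linearly."""
    sigma_t = (1 - t) * sigma0 + t * sigma1
    return gaussian_entropy(sigma_t)

sigma0, sigma1 = 0.5, 3.0  # hypothetical endpoint standard deviations
ts = [k / 10 for k in range(11)]
# Chord minus value: nonnegative everywhere iff the entropy is convex along the geodesic.
gaps = [
    (1 - t) * gaussian_entropy(sigma0) + t * gaussian_entropy(sigma1)
    - entropy_along_geodesic(sigma0, sigma1, t)
    for t in ts
]
print(min(gaps) >= -1e-12)
```

The gaps are nonnegative because $-\log$ is convex, which matches the $K = 0$ inequality one would expect from a flat space.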

Remark (Eulerian/Lagrangian duality). This definition has a heavy Lagrangian flavor in the sense that we consider convexity along geodesics, whereas the notion in the Riemannian case feels fundamentally more Eulerian and algebraic in nature. Maybe this provides another example of what Ambrosio mentions in his talk?

Theorem (Lott-Villani 0.12 with constant $\Psi$). For any smooth compact connected Riemannian manifold $M$ of dimension $n$,

  • for $N \in [n, \infty)$, the space $(M, g, \frac{d\mathrm{vol}_M}{\mathrm{vol}(M)})$ has nonnegative $N$-Ricci curvature if and only if $\mathrm{Ric} \ge 0$.
  • $(M, g, \frac{d\mathrm{vol}_M}{\mathrm{vol}(M)})$ has $\infty$-Ricci curvature bounded below by $K$ if and only if $\mathrm{Ric} \ge K g$.

Theorem (Lott-Villani 0.13). The properties of having nonnegative $N$-Ricci curvature or $\infty$-Ricci curvature bounded below by $K$ are preserved under measured Gromov-Hausdorff limits.

Remark (Role of Optimal Transport). The role of optimal transport in all of this is providing a metric on $P(\mathcal{X})$ from which we get $P_2(\mathcal{X})$, where we can formulate displacement convexity more carefully as convexity along Wasserstein geodesics.

$\mathrm{RCD}(K,N)$ Spaces

Definition ($\mathrm{CD}(K,N)$). A space is called $\mathrm{CD}(K,N)$ if it has Ricci curvature bounded below by $K$ and dimension at most $N$.

With the theory above, we can at least define $\mathrm{CD}(K, \infty)$ spaces.

Remark (Finsler Spaces). A Finsler space is a differentiable manifold with a Minkowski norm on tangent spaces. We don’t like them because their properties are not as nice as those of Riemannian manifolds (“anisotropic” geometry, issues relating to flows, no Bochner inequality).

$\mathrm{RCD}(K,N)$ spaces exclude Finsler spaces by requiring spaces to be “infinitesimally Hilbertian”:

Definition ($\mathrm{RCD}(K,N)$). A metric-measure space $(\mathcal{X}, d, \mu)$ is called $\mathrm{RCD}(K,N)$ if it is $\mathrm{CD}(K,N)$ and $W^{1,2}(\mathcal{X}, d, \mu)$ is a Hilbert space.

Alternatively, in his talk, Ambrosio defines infinitesimally Hilbertian to mean that the “Cheeger Dirichlet energy” is quadratic.
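A finite-dimensional analogy helps me see why “Hilbertian” and “quadratic” are the same kind of condition: a norm comes from an inner product exactly when it satisfies the parallelogram law (Jordan-von Neumann). The sketch below only checks norms on $\mathbb{R}^2$, not the actual Sobolev space or Cheeger energy, and the function names are hypothetical.

```python
import math

def parallelogram_defect(norm, u, v):
    """||u+v||^2 + ||u-v||^2 - 2||u||^2 - 2||v||^2.
    This vanishes for all u, v exactly when the norm comes from an inner product."""
    plus = [a + b for a, b in zip(u, v)]
    minus = [a - b for a, b in zip(u, v)]
    return norm(plus) ** 2 + norm(minus) ** 2 - 2 * norm(u) ** 2 - 2 * norm(v) ** 2

def euclidean_norm(w):
    return math.sqrt(sum(a * a for a in w))

def sup_norm(w):
    return max(abs(a) for a in w)

u, v = [1.0, 0.0], [0.0, 1.0]
euclidean_defect = parallelogram_defect(euclidean_norm, u, v)
sup_defect = parallelogram_defect(sup_norm, u, v)
print(euclidean_defect, sup_defect)
```

The Euclidean norm has zero defect (up to rounding), while the sup norm has defect $-2$ on this pair of vectors. A non-Euclidean norm on $\mathbb{R}^2$ is the simplest model of a Finsler structure, which is the kind of space the Hilbertian condition is meant to rule out.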

Remark (GH Convergence). Ambrosio, Gigli, and Savaré show that $\mathrm{RCD}(K,N)$ spaces are also more stable under GH convergence than $\mathrm{CD}(K, N)$ spaces. Here they use the alternative definition for infinitesimally Hilbertian spaces.

One other result Lott and Villani show is the following:

Theorem (Lott-Villani 5.31). If a compact measured length space $(\mathcal{X}, d, \nu)$ has nonnegative $N$-Ricci curvature for some $N \in [1, \infty)$, then for all $x \in \text{supp}(\nu)$ and all $0 < r_1 \le r_2$,

$$\nu (B_{r_2}(x)) \le \left(\frac{r_2}{r_1}\right)^N \nu(B_{r_1}(x))$$

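and from class we had the following Bishop-Gromov result: if $M$ is complete and has nonnegative Ricci curvature,

$$\frac{\text{vol}(B(p,r))}{r^N}\text{ is non-increasing in } r,$$

where $N$ is the dimension of $M$. Dividing the inequality in the theorem by $r_2^N$ shows that it is the same kind of monotonicity statement, with $\nu$ in place of the volume.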
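The monotonicity is easy to check numerically on a space where ball volumes are explicit. A minimal sketch, assuming the unit sphere $S^2$ (which has positive Ricci curvature, so $N = 2$ applies), where a geodesic ball of radius $r \le \pi$ has area $2\pi(1 - \cos r)$; the function name and the grid of radii are my own choices.

```python
import math

def sphere_ball_area(r):
    """Area of a geodesic ball of radius r (0 <= r <= pi) on the unit sphere S^2."""
    return 2 * math.pi * (1 - math.cos(r))

radii = [0.1 * k for k in range(1, 32)]  # radii in (0, pi)
ratios = [sphere_ball_area(r) / r ** 2 for r in radii]

# Bishop-Gromov with N = 2: area(B_r) / r^2 should never increase with r.
is_non_increasing = all(b <= a + 1e-12 for a, b in zip(ratios, ratios[1:]))
print(is_non_increasing)
```

The ratio starts just below $\pi$, the Euclidean value, and decreases as the ball wraps around the sphere. The same numbers also satisfy the doubling-type inequality from the Lott-Villani theorem above.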

References

  1. Villani, Cédric. Optimal transport: old and new. Springer, 2009.
  2. Lott, John, and Cédric Villani. Ricci curvature for metric-measure spaces via optimal transport. Annals of Mathematics 169.3 (2009): 903-991.
  3. “Optimal Transport and Ricci Curvature.” YouTube, uploaded by Rio ICM2018, https://www.youtube.com/watch?v=JCHNQWhcSLs.
  4. Gigli, Nicola. On the differential structure of metric measure spaces and applications. Memoirs of the American Mathematical Society 236.1113 (2015): 1-91. DOI: https://doi.org/10.1090/memo/1113.


Copyright © 2024, Aathreya Kadambi
