The Statistics of Circular Optimal Transport

Shayan Hundrieser, Marcel Klatt, Axel Munk


Optimal transport (OT) based distances compare probability measures while incorporating the geometry of the underlying ground space. For this reason, OT has recently been recognized as an informative and effective tool for statistical data analysis and inference. For more details on OT and its applications we refer to Villani (2003, 2008), Santambrogio (2015), Peyré and Cuturi (2019), and Panaretos and Zemel (2020).

In this work, we discuss statistical properties of OT on the circle ${S^1}$. For this purpose, we parametrize ${S^1}$ by $[0,1)$, equipped with its geodesic distance $\rho_{S^1}(x,y) \;{:=}\;\min(\vert x-y\vert, 1-\vert x-y\vert)$, and consider the circular OT (COT) distance between probability measures $\mu$ and $\nu$ on ${S^1}$ defined by

$\displaystyle COT(\mu, \nu) \;{:=}\;\inf_{\pi}\int_{{S^1}\times {S^1}} \rho_{S^1}(x,y) d\pi(x,y).$

Here, the infimum is taken over all couplings $\pi$ of $\mu$ and $\nu$, i.e., all probability measures on ${S^1}\times {S^1}$ with marginals $\mu$ and $\nu$. Intuitively, the COT distance quantifies the minimum effort required to transport mass from one distribution to the other (Video 1).
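For two samples, the COT distance between their empirical measures can be computed without solving a linear program. The sketch below is not taken from the circularOT package; it assumes the known closed form for the geodesic cost on the circle, $COT(\mu,\nu) = \min_{\alpha}\int_0^1 \vert F_\mu(t) - F_\nu(t) - \alpha\vert\, dt$, where $F_\mu$, $F_\nu$ are the distribution functions on $[0,1)$ and the minimum is attained at a median of $F_\mu - F_\nu$. The function names `rho_circle` and `cot_empirical` are hypothetical.

```r
# Geodesic distance on the circle parametrized by [0, 1)
rho_circle <- function(x, y) {
  d <- abs(x - y)
  pmin(d, 1 - d)
}

# COT distance between the empirical measures of two samples x, y in [0, 1)
# (sketch based on the closed form stated above; not the package implementation)
cot_empirical <- function(x, y) {
  stopifnot(all(x >= 0 & x < 1), all(y >= 0 & y < 1))
  # F_x - F_y is constant between consecutive breakpoints
  breaks <- sort(unique(c(0, x, y, 1)))
  left   <- breaks[-length(breaks)]   # left endpoint of each interval
  w      <- diff(breaks)              # interval lengths (sum to 1)
  f_x    <- vapply(left, function(t) mean(x <= t), numeric(1))
  f_y    <- vapply(left, function(t) mean(y <= t), numeric(1))
  d      <- f_x - f_y
  # Weighted median of d with weights w minimizes sum(w * |d - alpha|)
  ord    <- order(d)
  cw     <- cumsum(w[ord])
  alpha  <- d[ord][which(cw >= 0.5)[1]]
  sum(w * abs(d - alpha))
}

# For two point masses the COT distance reduces to the geodesic distance
cot_empirical(0.1, 0.9)   # approximately 0.2
rho_circle(0.1, 0.9)      # approximately 0.2
```

Note that the points 0.1 and 0.9 are at distance 0.2 on the circle, not 0.8 as they would be on the unit interval; the shift $\alpha$ is what accounts for transport across the point $0 \equiv 1$.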


Video 1. Circular optimal transport (COT). Mass transportation from $\mu$ (blue) to $\nu$ (red) represented via the displacement interpolation $\mu_t$ for $t \in [0,1]$ with respect to COT on the circle (left) and the corresponding Cartesian plot (right).

Top: Probability measures $\mu$, $\nu$ with unimodal characteristics.
Bottom: Probability measures $\mu$, $\nu$ with multimodal characteristics.

In applications, the true population measure $\mu$ is typically not available and has to be estimated, e.g. by the empirical measure $\hat \mu_n \;{:=}\;\frac{1}{n}\sum_{i = 1}^{n}\delta_{X_i}$ where $X_1, \dots, X_n \stackrel{i.i.d.}{\sim}\mu$. This yields the random quantity $COT(\hat \mu_n, \nu)$ whose asymptotic $(n \rightarrow \infty)$ fluctuation around $COT(\mu, \nu)$ is characterized by a central limit theorem. This paves the way for a variety of statistical inference tasks based on empirical OT for circular data. In particular, we formulate an asymptotically consistent goodness-of-fit test for circular data, the circular optimal transport test (COTT). The COTT concludes that a sample $X_1, \dots, X_n$ on ${S^1}$ is not drawn from $\mu_0$ if the statistic $COT(\hat \mu_n, \mu_0)$ is too large. Statistically speaking, we test the null hypothesis

$\displaystyle \mathcal{H}_0: X_1, \dots, X_n \sim \mu_0$

at a prespecified significance level $\alpha \in (0,1)$ (typically $\alpha = 0.05$), where $\mathcal{H}_0$ is rejected if $COT(\hat \mu_n, \mu_0)$ exceeds a critical value, namely the $(1-\alpha)$-quantile of its (asymptotic) distribution under $\mathcal{H}_0$.
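The decision rule can be illustrated for $\mu_0 = $ Unif$(S^1)$ with a simulation-based variant of the test. This is a minimal sketch under stated assumptions and not the COTT as implemented in circularOT: the critical value is replaced by a Monte Carlo p-value obtained from uniform samples, the statistic is approximated on a grid via the distribution-function representation of COT, and the names `cot_to_uniform`, `cot_uniformity_mc`, `n_sim` and `grid_size` are hypothetical.

```r
# Approximate COT distance between the empirical measure of x and Unif[0, 1),
# using COT = min over alpha of the integral of |F_n(t) - t - alpha| on a grid
cot_to_uniform <- function(x, grid_size = 2000) {
  t <- (seq_len(grid_size) - 0.5) / grid_size       # midpoints of grid cells
  d <- findInterval(t, sort(x)) / length(x) - t     # F_n(t) - t
  mean(abs(d - median(d)))                          # median is an optimal alpha
}

# Monte Carlo test of H0: X_1, ..., X_n ~ Unif(S^1)
cot_uniformity_mc <- function(x, n_sim = 500, grid_size = 2000) {
  t_obs  <- cot_to_uniform(x, grid_size)
  # Simulate the statistic under the null hypothesis
  t_null <- replicate(n_sim, cot_to_uniform(runif(length(x)), grid_size))
  # Reject for large values; "+1" keeps the Monte Carlo p-value valid
  p_value <- (1 + sum(t_null >= t_obs)) / (n_sim + 1)
  list(statistic = t_obs, p.value = p_value)
}

set.seed(1)
# A sample concentrated around 0.5 should be rejected at level 0.05
res <- cot_uniformity_mc(seq(0.45, 0.55, length.out = 30))
res$p.value < 0.05
```

Because the null hypothesis is simple, simulating from it gives a valid reference distribution for any fixed $n$; the asymptotic theory described above avoids this simulation step.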


For testing uniformity $\big(\mu_0 =$ Unif$(S^1)\big)$, it turns out that the COTT performs particularly well for unimodal alternatives and is almost as powerful as Rayleigh's test, which is known to be the most powerful invariant test in the case of von Mises alternatives (Figure 2). For alternatives with many modes, the COTT is found to be less powerful, which is explained by the shape of the corresponding transport plan.


Figure 2. Statistical power of tests for uniformity under von Mises alternatives. Empirical rejection probabilities for tests for uniformity with significance level $\alpha = 0.05$ based on 10,000 repetitions of sample size $n = 30$ from von Mises distributions with mean $\gamma = 0$ and concentration parameter $\kappa\in \{0,0.1, \dots, 2.5\}$. The dashed black line represents the level $\alpha = 0.05$.
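For reference, the Rayleigh test used as a benchmark here can be sketched in a few lines of base R. The sketch assumes data in $[0,1)$ and uses the standard large-sample approximation that $2n\bar R^2$ is approximately $\chi^2_2$-distributed under uniformity, where $\bar R$ is the mean resultant length; it is not the implementation used to produce Figure 2, and the name `rayleigh_test` is hypothetical.

```r
# Rayleigh test for uniformity of circular data x in [0, 1)
# (large-sample chi-square approximation; illustrative sketch only)
rayleigh_test <- function(x) {
  n     <- length(x)
  theta <- 2 * pi * x                                   # convert to radians
  r_bar <- sqrt(mean(cos(theta))^2 + mean(sin(theta))^2) # mean resultant length
  statistic <- 2 * n * r_bar^2
  p_value   <- pchisq(statistic, df = 2, lower.tail = FALSE)
  list(r_bar = r_bar, statistic = statistic, p.value = p_value)
}

# Equally spaced points have (numerically) zero mean resultant length
rayleigh_test((0:9) / 10)$p.value
```

The statistic depends on the data only through the first trigonometric moment, which is why the test is well suited to unimodal (von Mises) alternatives.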

This behavior of the COTT when testing for uniformity is a direct consequence of the nature of OT. Spreading mass uniformly around the circle is more costly when that mass is concentrated almost exclusively around a single mode than when it is already sufficiently spread out (Video 3). As a result, the associated COT distance tends to be larger, which leads to a higher rejection probability.
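A short calculation makes this concrete in the limiting case of infinite concentration (an illustrative computation, not part of the original analysis). If all mass sits in a single point $x$, every point $y$ of the circle must be served from $x$, so

$\displaystyle COT\big(\delta_x, \text{Unif}(S^1)\big) = \int_0^1 \rho_{S^1}(x,y)\, dy = 2\int_0^{1/2} s\, ds = \frac{1}{4}.$

If instead the mass is split equally among the four points $1/8$, $3/8$, $5/8$, $7/8$, i.e. $\mu = \frac{1}{4}\sum_{k=1}^{4}\delta_{(2k-1)/8}$, each point only needs to serve the arc of length $1/4$ centered at it, so

$\displaystyle COT\big(\mu, \text{Unif}(S^1)\big) = 4\cdot 2\int_0^{1/8} s\, ds = \frac{1}{16}.$

The multimodal distribution is thus four times closer to the uniform distribution in COT distance, even though both are equally far from uniform in the sense of being fully concentrated.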

Video 3. Circular optimal transport between a uniform distribution and single/multiple mode distributions. Circular density plot for distributions $\mu$ and $\nu$ (left) and their respective COT distance (right).

Top: Uniform distribution $\nu$ (red) as well as von Mises distribution $\mu$ (blue) with mean at 0.5 and concentration parameter $\kappa\in [0,5]$ (left).
Bottom: Uniform distribution $\nu$ (red) and mixture of four von Mises distributions $\mu$ (blue) with respective means at $1/8$, $3/8$, $5/8$, $7/8$ and concentration parameter $\kappa \in [0,25]$ (left).

We provide the R package circularOT for circular data analysis with OT in the GitLab repository https://gitlab.gwdg.de/shundri/circularOT. Besides the computation of COT distances, the package includes an implementation of the COTT for uniformity as well as a bivariate bootstrap-based COTT to assess whether two samples stem from the same distribution.

Code Examples

       
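      ### Install package "circularOT"
      install.packages("circularOT")
      ### If the package is not available from your CRAN mirror, installing from the
      ### GitLab repository may work instead (assumption, requires "remotes"):
      # remotes::install_git("https://gitlab.gwdg.de/shundri/circularOT")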
      ### Load package
      library(circularOT)
      library(circular)      # Package for random number generation of 
                           # von Mises distribution
                           
      ### Test for uniformity using COTT
      set.seed(0)
      cot.test_Uniformity(runif(15, 0.2, 0.8), typeOfData="UnitInt") 
      #       
      #          One-Sample COTT for Uniformity
      # Test statistic:  0.3717834  
      # P-Value:          0.046 

      ### Test if two samples stem from identical distribution
      set.seed(5)
      cot.test_Bivariate_Bootstrap(
               rvonmises(10, circular(0),  3, control.circular=list(units="radians")), 
               rvonmises(10, circular(pi), 3, control.circular=list(units="radians")), 
               typeOfData = "Radian")
      #
      #          Bivariate (Bootstrap) COTT for Goodness of Fit
      # Test statistic:  0.4943681 
      # P-Value:          0.0046
   

Bibliography

V. Panaretos and Y. Zemel.
An invitation to statistics in Wasserstein space.
Springer, Cham, 2020.
ISBN 978-3-030-38437-1.
URL https://www.springer.com/gp/book/9783030384371.

G. Peyré and M. Cuturi.
Computational optimal transport: With applications to data science.
Foundations and Trends in Machine Learning, 11 (5-6): 355-607, 2019.
URL https://www.nowpublishers.com/article/Details/MAL-073.

F. Santambrogio.
Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling.
Progress in Nonlinear Differential Equations and Their Applications. Springer International Publishing, 2015.
ISBN 9783319208282.
URL https://www.springer.com/gp/book/9783319208275.

C. Villani.
Topics in optimal transportation.
Graduate Studies in Mathematics. American Mathematical Society, 2003.
ISBN 9780821833124.
URL https://bookstore.ams.org/gsm-58.

C. Villani.
Optimal transport: old and new.
A Series of Comprehensive Studies in Mathematics. Springer, 2008.
ISBN 9783540710509.
URL https://www.springer.com/de/book/9783540710493.