Sliced-Regularized Optimal Transport
Sliced-Regularized Optimal Transport
In this post, we break down a new regularized optimal transport problem Sliced-Regularized Optimal Transport (SROT).
Overview
Optimal Transport (OT) is a powerful framework for comparing probability distributions, but it is often computationally expensive. A widely used workaround is entropic regularization (EOT), which makes OT scalable via Sinkhorn iterations. However, EOT introduces bias because it regularizes toward an uninformative prior — the independent coupling.
This post introduces a different idea:
What if we regularize OT toward a meaningful prior instead?
This leads to Sliced-Regularized Optimal Transport (SROT):
- Replace the independent coupling with a sliced OT (SOT) plan
- Keep the same KL regularization structure
- Retain Sinkhorn-style efficiency
The result is a method that is as scalable as EOT, but closer to true OT.
1. From OT to Regularized OT
Optimal Transport
Given distributions $\mu $ and $\nu $, OT solves:
\[\pi^\star = \text{arg}\min_{\pi \in \Pi(\mu,\nu)} \int c(x,y)\, d\pi(x,y)\]This finds the most efficient way to move mass from $\mu $ to $\nu $.
Entropic OT (EOT)
EOT adds a KL penalty:
\[\pi^\star_\varepsilon = \text{arg}\min_{\pi \in \Pi(\mu,\nu)} \int c(x,y)\, d\pi(x,y) + \varepsilon \, \mathrm{KL}(\pi \mid \mu \otimes \nu)\]Key idea:
- Encourages smooth plans
- Reference = independent coupling $\mu \otimes \nu $
Limitation:
The prior is not informative about the true OT structure.
2. Sliced OT as a Prior
Sliced OT (SOT) approximates OT by:
- Projecting data onto 1D directions
- Solving 1D OT (cheap, closed-form)
- Lifting back to high dimensions
- Averaging over projections
Final SOT plan:
\[\pi^{\mathrm{SOT}} = \mathbb{E}_\theta[\pi_\theta]\]Properties:
- Fast (sorting-based)
- Captures geometry better than independent coupling
- Still approximate
3. SROT: The Core Idea
Definition
SROT replaces the EOT prior with a SOT plan:
\[\pi^\star_{\varepsilon,\mathrm{SOT}} = \text{arg}\min_{\pi \in \Pi(\mu,\nu)} \int c(x,y)\, d\pi(x,y) + \varepsilon \, \mathrm{KL}(\pi \mid \pi^{\mathrm{SOT}})\]Interpretation
- EOT: shrink toward independence
- SROT: shrink toward a geometric proxy of OT
As $\varepsilon \to 0 $:
- Recover exact OT
As $\varepsilon \to \infty $:
- Recover SOT plan
Key Insight
In discrete form:
\[C'_{ij} = C_{ij} - \varepsilon \log P^{\mathrm{SOT}}_{ij}\]So SROT is equivalent to EOT with a modified cost:
- Pairs favored by SOT become cheaper
- Geometry is encoded directly in the optimization
4. Sinkhorn Algorithm for SROT
SROT keeps the same computational structure as EOT.
Kernel
\[K = P^{\mathrm{SOT}} \odot \exp(-C/\varepsilon)\]Iterations
\[u \leftarrow \alpha \oslash (K v), \quad v \leftarrow \beta \oslash (K^\top u)\]Transport Plan
\[P = \mathrm{diag}(u)\, K \, \mathrm{diag}(v)\]Enjoy Reading This Article?
Here are some more articles you might like to read next: