# Khai Nguyen

Ph.D. Candidate at the Department of Statistics and Data Sciences, University of Texas at Austin

Hi! I’m Khai, a third-year Ph.D. candidate in the Department of Statistics and Data Sciences at the University of Texas at Austin. I am fortunate to be advised by Professor Nhat Ho and Professor Peter Müller. I am associated with the Institute for Foundations of Machine Learning (IFML), and I am a visiting student at the Statistical Information Lab at The University of Texas MD Anderson Cancer Center. I graduated from Hanoi University of Science and Technology with a Bachelor’s degree in Computer Science. Before joining UT Austin, I was an AI Research Resident at VinAI Research under the supervision of Dr. Hung Bui.

**Research:** My current work focuses on making optimal transport scalable in statistical inference (low time complexity, low space complexity, low sample complexity) via one-dimensional projections, an approach known as sliced optimal transport (sliced Wasserstein distance).
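As a rough illustration of the one-dimensional projection idea, the following minimal sketch estimates the vanilla sliced Wasserstein distance by Monte Carlo. It assumes two equally weighted empirical measures with the same number of samples, so that each projected one-dimensional Wasserstein distance reduces to comparing sorted projections. The function name `sliced_wasserstein` and its parameters are illustrative and are not taken from any of the papers listed on this page.

```python
import numpy as np

def sliced_wasserstein(x, y, n_projections=100, p=2, seed=0):
    """Monte Carlo estimate of the sliced Wasserstein-p distance.

    x, y: arrays of shape (n, d) holding the same number n of
    equally weighted samples (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    # Uniform directions on the unit sphere: normalize Gaussian vectors.
    theta = rng.standard_normal((n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project the samples; column j is the projection onto direction j.
    x_proj = np.sort(x @ theta.T, axis=0)
    y_proj = np.sort(y @ theta.T, axis=0)
    # For equal-size samples, the 1D Wasserstein-p distance (to the power p)
    # is the mean p-th power gap between sorted projections.
    w_pp = np.mean(np.abs(x_proj - y_proj) ** p, axis=0)
    # Average over directions, then take the p-th root.
    return np.mean(w_pp) ** (1.0 / p)

rng = np.random.default_rng(1)
x = rng.standard_normal((50, 3))
y = x + np.array([1.0, 2.0, 2.0])  # translated copy of x
print(sliced_wasserstein(x, y))
```

Each projection costs one sort, so with L directions and n samples this estimate takes O(L n log n) time, which is what makes the sliced approach attractive compared with solving a full optimal transport problem.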

## Research Summary

My work focuses on three aspects of the SW distance:

**Slicing distributions.** The vanilla sliced Wasserstein (SW) distance naively treats all one-dimensional projections identically and independently by using the uniform distribution over projecting directions. To improve and generalize SW, I propose searching for the best distribution over projecting directions (the slicing distribution), namely the one that maximizes the expected projected distance. In particular, a regularized implicit family of distributions is introduced in [ICLR'21] and explicit families (von Mises-Fisher and Power Spherical) are introduced in [ICLR'21]. Moreover, I introduce the use of amortized optimization to predict the optimal slicing distribution given two input probability measures, for settings that involve many pairs of probability measures, in [NeurIPS'22] and [ICML'23]. To further enhance the quality of projecting directions, I break the independence between them by imposing a first-order Markov structure in [NeurIPS'23]. To avoid unstable optimization and model misspecification in designing the slicing model, I propose the energy-based slicing distribution, which is parameter-free and has a density proportional to an energy function of the projected one-dimensional Wasserstein distance, in [NeurIPS'23]. To push the optimization-free direction further, I propose the random-path projecting direction in [ICML'24].

**Projecting operators.** The vanilla sliced Wasserstein distance uses the Radon Transform as the projecting operator. The Radon Transform simply takes the inner product between the supports of a probability measure and a projecting direction as the supports of the one-dimensional projected probability measure. To generalize the projecting operator to tensor spaces, I use the convolution operator to project probability measures over tensors to one dimension in [NeurIPS'22]. In addition, I connect deep learning (neural network) techniques to sliced Wasserstein by proposing the Overparameterized Radon Transform and the Hierarchical Radon Transform in [ICLR'23]. Recently, I proposed the hierarchical hybrid Radon Transform and the hierarchical hybrid sliced Wasserstein distance for dealing with heterogeneous joint distributions in [Arxiv'24].

**Numerical approximation.** The SW distance is usually estimated by Monte Carlo integration because the expectation with respect to the slicing distribution is intractable. To reduce the variance of the Monte Carlo estimator, I first propose control variates based on the closed form of the Wasserstein-2 distance between two Gaussians in [ICLR'24]. Importantly, the proposed control variates have linear time and space complexity. In addition, I propose using low-discrepancy sequences on the sphere (Quasi-Monte Carlo) to approximate sliced Wasserstein in [ICLR'24]. Moreover, we propose Randomized Quasi-sliced Wasserstein, an unbiased estimator of sliced Wasserstein that is based on randomizing low-discrepancy sequences.

Moreover, I aim to push forward the **application** of optimal transport, Wasserstein distance, and sliced Wasserstein distance in probabilistic machine learning models such as point-cloud applications [ICML'23], 3D mesh deformation [ICLR'24], generative models (GANs, Diffusion Models) [NeurIPS'22] [Arxiv'24], domain adaptation [ICML'22], [ICML'22], multimodal representation learning [ICLR'24], 3D shape correspondence learning [CVPR'24], and other tasks that need to deal with probability measures.

## News

| Date | News |
|---|---|
| May 1, 2024 | 1 paper, *Sliced Wasserstein with Random-Path Projecting Directions*, is accepted at ICML 2024. |
| Feb 27, 2024 | 1 paper, *Integrating Efficient Optimal Transport and Functional Maps For Unsupervised Shape Correspondence Learning*, is accepted at CVPR 2024. |
| Jan 19, 2024 | 2 papers, *Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts* and *On Parameter Estimation in Deviated Gaussian Mixture of Experts*, are accepted at AISTATS 2024. |
| Jan 16, 2024 | 4 papers, *Quasi-Monte Carlo for 3D Sliced Wasserstein* (Spotlight Presentation), *Sliced Wasserstein Estimation with Control Variates*, *Diffeomorphic Deformation via Sliced Wasserstein Distance Optimization for Cortical Surface Reconstruction*, and *Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation*, are accepted at ICLR 2024. |
| Sep 21, 2023 | 4 papers, *Energy-Based Sliced Wasserstein Distance*, *Markovian sliced Wasserstein distances: Beyond independent projections*, *Designing robust Transformers using robust kernel density estimation*, and *Minimax optimal rate for parameter estimation in multivariate deviated models*, are accepted at NeurIPS 2023. |
| Apr 24, 2023 | 1 paper, *Self-Attention Amortized Distributional Projection Optimization for Sliced Wasserstein Point-Cloud Reconstruction*, is accepted at ICML 2023. |
| Jan 20, 2023 | 1 paper, *Hierarchical Sliced Wasserstein Distance*, is accepted at ICLR 2023. |
| Sep 14, 2022 | 4 papers, *Revisiting Sliced Wasserstein on Images: From Vectorization to Convolution*, *Amortized Projection Optimization for Sliced Wasserstein Generative Models*, *Improving Transformer with an Admixture of Attention Heads*, and *FourierFormer: Transformer Meets Generalized Fourier Integral Theorem*, are accepted at NeurIPS 2022. |
| Apr 24, 2022 | 2 papers, *Improving Mini-batch Optimal Transport via Partial Transportation* and *On Transportation of Mini-batches: A Hierarchical Approach*, are accepted at ICML 2022. |
| Jan 24, 2021 | 2 papers, *Distributional Sliced-Wasserstein and Applications to Generative Modeling* (Spotlight Presentation) and *Improving Relational Regularized Autoencoders with Spherical Sliced Fused Gromov Wasserstein*, are accepted at ICLR 2021. |