AIDaS: CAD Seminar: Saptarshi Chakraborty
2:30 p.m. to 3:30 p.m.
Speaker: Saptarshi Chakraborty, Department of Statistics, University of Michigan
Time: Wednesday, April 15, from 2:30 PM to 3:30 PM
Location for in-person participants: 1146 FAB
Zoom link for online audience: https://wayne-edu.zoom.us/j/92845590121?pwd=CpRA5Wa5gzSMn2xiVkR2abD83O5nrH.1
Title: Generalization Properties of Score-matching Diffusion Models for Intrinsically Low-dimensional Data
Abstract: Despite the remarkable empirical success of score-based diffusion models, their theoretical guarantees for statistical accuracy remain relatively underdeveloped. Existing analyses often yield pessimistic convergence rates that fail to reflect the intrinsic low-dimensional structure commonly present in real-world data distributions, such as those arising in natural images. In this work, we analyze the statistical convergence of score-based diffusion models for learning an unknown data distribution from finitely many samples. Under mild regularity conditions on both the forward diffusion process and the data distribution, we establish finite-sample error bounds on the learned generative distribution measured in the Wasserstein-p distance. In contrast to prior results, our guarantees hold for arbitrary p >= 1 and require only a finite-moment condition on the target measure, without compact-support, manifold, or smooth-density assumptions. In particular, we show that given n independent and identically distributed (i.i.d.) samples from an unknown target measure with a finite q-th moment, and with appropriately chosen network architectures, hyperparameters, and discretization scheme, the expected Wasserstein-p distance between the learned distribution and the true distribution converges at a rate governed by the (p, q)-Wasserstein dimension. Our analyses demonstrate that diffusion models naturally adapt to the intrinsic geometry of the data and effectively mitigate the curse of dimensionality, in the sense that the convergence exponent depends only on the (p, q)-Wasserstein dimension rather than the ambient data dimension. Furthermore, our results conceptually bridge the theoretical understanding of diffusion models with that of GANs and the sharp minimax rates established in optimal transport theory.
The proposed (p, q)-Wasserstein dimension also extends the classical notion of Wasserstein dimension to distributions with unbounded support, which may be of independent theoretical interest.
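For readers unfamiliar with the error metric named in the abstract, the following is a minimal illustrative sketch, not taken from the talk. It computes the Wasserstein-p distance between two equal-size one-dimensional empirical samples, using the standard fact that in one dimension the optimal coupling matches sorted values. The function name empirical_wasserstein_p is hypothetical, and the one-dimensional, equal-sample-size setting is an assumption made only to keep the example short.

```python
import numpy as np


def empirical_wasserstein_p(x, y, p=1.0):
    """Wasserstein-p distance between two equal-size 1-D empirical measures.

    Assumption for this sketch: both samples are one-dimensional and have
    the same number of points, so the optimal coupling pairs the i-th
    smallest value of x with the i-th smallest value of y.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D arrays of equal length")
    if p < 1:
        raise ValueError("p must be at least 1")
    # Average the p-th powers of the matched gaps, then take the p-th root.
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.normal(size=1000)
    # Shifting every point by 0.5 moves the empirical measure by exactly 0.5.
    print(empirical_wasserstein_p(sample, sample + 0.5, p=2))
```

In higher dimensions no such sorting shortcut exists, and the distance is typically computed with an optimal transport solver.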
Bio: Saptarshi Chakraborty is an Assistant Professor of Statistics at the University of Michigan. He received his Ph.D. in Statistics from the University of California, Berkeley, where he was advised by Professor Peter Bartlett. Before joining Berkeley, he completed his M.Stat. and B.Stat. (Hons.) degrees at the Indian Statistical Institute (ISI), Kolkata. Saptarshi's research focuses on the theoretical and methodological foundations of machine learning, with particular emphasis on deep learning theory, unsupervised learning, dimensionality reduction, optimal transport, and optimization.
______________________________________________________________________
AIDaS: CAD Seminar Series
Advancing Knowledge, Innovation, and Collaboration in Computation, AI, and Data Science (CAD)
The CAD Seminar Series is a primary seminar series at Wayne State University’s Institute for AI and Data Science (AIDaS). It is a dedicated platform for advancing knowledge, fostering innovation, and promoting collaboration across the fields of Computation, Artificial Intelligence, and Data Science. This series brings together leading experts, researchers, and professionals to explore the latest developments, tackle emerging challenges, and drive forward-thinking solutions at the convergence of these critical disciplines.
Objectives:
• Advance Knowledge: Share cutting-edge research and insights that push the boundaries of what is known in CAD.
• Foster Innovation: Encourage the development of novel ideas and solutions through interdisciplinary dialogue and creative thinking.
• Promote Collaboration: Unite expertise across disciplines and build bridges between academia, industry, and government to address complex problems and create opportunities for joint ventures.
Target Audience: The CAD Seminar Series is designed for a diverse audience, including faculty, researchers, students, and professionals in Computation, AI, Data Science, and related fields. It serves as a forum for exchanging ideas, networking, and contributing to the growth of these rapidly evolving areas. We highly recommend in-person attendance to enhance engagement and networking opportunities with speakers and fellow participants.
Call for Participation: We welcome contributions from researchers, practitioners, and students. Whether you are presenting your work, participating in discussions, or attending as a learner, your involvement is crucial to the success of this collaborative initiative.
Contact
Yan Wang
wangyan@wayne.edu