A Unified Theoretical and Algorithmic Framework for Diffusion-Based Generative Modeling Across Continuous and Discrete Domains
Abstract
Diffusion-based generative models have emerged as a dominant paradigm in modern probabilistic modeling, demonstrating remarkable empirical success across image synthesis, video generation, language modeling, graph generation, audio creation, and molecular design. Despite this empirical progress, theoretical understanding of their convergence properties, statistical efficiency, and algorithmic structure remains fragmented across continuous and discrete formulations. This article develops a unified theoretical and algorithmic framework for diffusion-based generative modeling by synthesizing insights from score matching, stochastic differential equations, Markov chain theory, stochastic localization, concentration inequalities, and discrete diffusion processes. We examine the foundational equivalence between denoising diffusion probabilistic models and score-based generative modeling, highlighting their connections to nonequilibrium thermodynamics and Markovian transitions. Building upon recent convergence analyses that establish polynomial and nearly dimension-linear bounds, we reinterpret diffusion sampling through the lens of log-concave sampling and probability flow ordinary differential equations.
We further analyze discrete diffusion models, including multinomial diffusion, concrete score matching, blackout diffusion, and continuous-time discrete processes, showing how uniformization techniques for nonhomogeneous Markov chains provide a unifying mathematical substrate. Theoretical advances in generalization, sample-efficient training, and the manifold hypothesis are examined in detail, revealing structural properties that explain empirical robustness. We extend the framework to structured domains such as graphs, molecular structures, categorical data, language tokens, and audio waveforms, demonstrating how diffusion mechanisms adapt to combinatorial state spaces through ratio estimation and reversible inductive constructions.
By synthesizing concentration theory, stochastic process analysis, and recent advances in diffusion convergence, we articulate a comprehensive view of diffusion as a generalized Markov transport mechanism. This perspective clarifies the interplay among score learning, sampling efficiency, and generalization guarantees. The discussion concludes with open theoretical challenges, including tight nonasymptotic bounds under minimal smoothness assumptions, discrete-continuous duality, and the geometry of high-dimensional diffusion trajectories. The unified framework developed herein aims to consolidate theoretical foundations while guiding future research on scalable, controllable, and domain-aware generative modeling.