Stochastic Interpolants: A Unifying Framework for Flows and Diffusions
1. Stochastic Interpolants: A Unifying Framework for Flows and Diffusions
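Notation: We denote probability density functions as $\rho_0(x)$, $\rho_1(x)$, and $\rho(t, x)$, with $t \in [0, 1]$ and $x \in \mathbb{R}^d$, omitting the function arguments when clear from the context. $C^1([0,1])$ is the space of continuously differentiable functions from $[0,1]$ to $\mathbb{R}$, $(C^2(\mathbb{R}^d))^d$ is the space of twice continuously differentiable functions from $\mathbb{R}^d$ to $\mathbb{R}^d$, and $C_0^p(\mathbb{R}^d)$ is the space of compactly supported functions from $\mathbb{R}^d$ to $\mathbb{R}$ that are continuously differentiable $p$ times.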
1.1. Stochastic Interpolants
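Stochastic interpolant: Given two probability density functions $\rho_0, \rho_1 : \mathbb{R}^d \to \mathbb{R}_{\geq 0}$, a stochastic interpolant between $\rho_0$ and $\rho_1$ is a stochastic process $x_t$ defined as
$$x_t = I(t, x_0, x_1) + \gamma(t) z, \qquad t \in [0, 1],$$
where
- $I \in C^2([0,1], (C^2(\mathbb{R}^d \times \mathbb{R}^d))^d)$ satisfies the boundary conditions $I(0, x_0, x_1) = x_0$ and $I(1, x_0, x_1) = x_1$, as well as
  $$\exists\, C_1 < \infty : \quad |\partial_t I(t, x_0, x_1)| \leq C_1 |x_0 - x_1| \quad \forall (t, x_0, x_1) \in [0,1] \times \mathbb{R}^d \times \mathbb{R}^d.$$
  We can think of $I$ as a smooth planned path from $x_0$ to $x_1$. The bound states that $I$ does not move too fast along the way from $x_0$ at $t = 0$ to $x_1$ at $t = 1$, and as a result does not wander too far from either endpoint. This assumption is made for convenience but is not necessary for most arguments below.
- $\gamma : [0,1] \to \mathbb{R}$ satisfies $\gamma(0) = \gamma(1) = 0$, $\gamma(t) > 0$ for all $t \in (0,1)$, and $\gamma^2 \in C^2([0,1])$.
- The pair $(x_0, x_1)$ is drawn from a probability measure $\nu$ that marginalizes on $\rho_0$ and $\rho_1$, i.e. $\nu(dx_0, \mathbb{R}^d) = \rho_0(x_0)\,dx_0$, $\nu(\mathbb{R}^d, dx_1) = \rho_1(x_1)\,dx_1$. The measure $\nu$ allows for a coupling between the two densities $\rho_0$ and $\rho_1$, which affects the properties of the stochastic interpolant, but a simple choice is to take the product measure $\nu(dx_0, dx_1) = \rho_0(x_0)\rho_1(x_1)\,dx_0\,dx_1$, in which case $x_0$ and $x_1$ are independent.
- $z$ is a Gaussian random variable independent of $(x_0, x_1)$, i.e. $z \sim \mathcal{N}(0, \mathrm{Id})$ and $z \perp (x_0, x_1)$.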
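The definition can be made concrete with a small sampling sketch. The choices below are illustrative assumptions, not part of the definition: a linear path $I(t, x_0, x_1) = (1 - t) x_0 + t x_1$, $\gamma(t) = \sqrt{2t(1-t)}$, Gaussian endpoint densities, and the product (independent) coupling. The function name `interpolant` is hypothetical.

```python
import numpy as np

def interpolant(t, x0, x1, z):
    """Sample x_t = I(t, x0, x1) + gamma(t) * z for an assumed linear path I."""
    path = (1.0 - t) * x0 + t * x1          # I(0) = x0 and I(1) = x1
    gamma = np.sqrt(2.0 * t * (1.0 - t))    # gamma(0) = gamma(1) = 0, positive in between
    return path + gamma * z

rng = np.random.default_rng(0)
n, d = 10_000, 2
x0 = rng.standard_normal((n, d))              # assumed rho_0 = N(0, Id)
x1 = 3.0 + 0.5 * rng.standard_normal((n, d))  # assumed rho_1 = N(3, 0.25 Id)
z = rng.standard_normal((n, d))               # z ~ N(0, Id), independent of (x0, x1)

x_start = interpolant(0.0, x0, x1, z)  # equals x0, since gamma(0) = 0
x_mid = interpolant(0.5, x0, x1, z)    # a noisy mixture of both endpoints
x_end = interpolant(1.0, x0, x1, z)    # equals x1, since gamma(1) = 0
```

Because $\gamma$ vanishes at both endpoints, the latent noise $z$ only affects intermediate times, which is what lets the process interpolate exactly between samples of the two densities.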
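Given the above definition, we want to characterize the properties of the time-dependent probability distribution $\mu(t, dx)$¹ such that
$$\forall t \in [0,1]: \quad \int_{\mathbb{R}^d} \phi(x)\, \mu(t, dx) = \mathbb{E}[\phi(x_t)] \quad \text{for any test function } \phi \in C_0^\infty(\mathbb{R}^d),$$
and we have the following property: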
1.2. Stochastic Interpolant Properties
The most important property of the probability distribution of the stochastic interpolant is:
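Stochastic interpolant properties: The probability distribution of the stochastic interpolant $x_t$ is absolutely continuous with respect to the Lebesgue measure at all times $t \in [0,1]$ and its time-dependent density $\rho(t)$ satisfies $\rho(0) = \rho_0$ and $\rho(1) = \rho_1$, $\rho \in C^1([0,1]; C^p(\mathbb{R}^d))$ for any $p \in \mathbb{N}$, and $\rho(t, x) > 0$ for all $(t, x) \in [0,1] \times \mathbb{R}^d$. In addition, $\rho$ solves the transport equation (TE)
$$\partial_t \rho + \nabla \cdot (b \rho) = 0,$$
where we defined the velocity
$$b(t, x) = \mathbb{E}[\dot{x}_t \mid x_t = x] = \mathbb{E}[\partial_t I(t, x_0, x_1) + \dot{\gamma}(t) z \mid x_t = x].$$
This velocity is in $C^0([0,1]; (C^p(\mathbb{R}^d))^d)$ for any $p \in \mathbb{N}$, and such that
$$\forall t \in [0,1]: \quad \int_{\mathbb{R}^d} |b(t, x)|^2 \rho(t, x)\, dx < \infty.$$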
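- For flow-based models (Objective), the objective is
  $$L_b[\hat{b}] = \int_0^1 \mathbb{E}\left( \tfrac{1}{2} |\hat{b}(t, x_t)|^2 - \big(\partial_t I(t, x_0, x_1) + \dot{\gamma}(t) z\big) \cdot \hat{b}(t, x_t) \right) dt.$$
- For score-based/diffusion models (Score), the score is given by
  $$s(t, x) = \nabla \log \rho(t, x) = -\gamma^{-1}(t)\, \mathbb{E}[z \mid x_t = x], \qquad t \in (0, 1),$$
  and the objective is
  $$L_s[\hat{s}] = \int_0^1 \mathbb{E}\left( \tfrac{1}{2} |\hat{s}(t, x_t)|^2 + \gamma^{-1}(t)\, z \cdot \hat{s}(t, x_t) \right) dt.$$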
-
For energy-based models (Energy), if we model ,
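A short worked step may help explain why a quadratic flow objective of this kind recovers the velocity. The sketch below assumes the objective has the form $L_b[\hat{b}] = \int_0^1 \mathbb{E}\big(\tfrac{1}{2}|\hat{b}(t, x_t)|^2 - \dot{x}_t \cdot \hat{b}(t, x_t)\big)\,dt$ with $\dot{x}_t = \partial_t I(t, x_0, x_1) + \dot{\gamma}(t) z$ and $b(t, x) = \mathbb{E}[\dot{x}_t \mid x_t = x]$. It uses only the tower property of conditional expectation.

```latex
\begin{aligned}
\mathbb{E}\big[\dot{x}_t \cdot \hat{b}(t, x_t)\big]
  &= \mathbb{E}\big[\mathbb{E}[\dot{x}_t \mid x_t] \cdot \hat{b}(t, x_t)\big]
   = \mathbb{E}\big[b(t, x_t) \cdot \hat{b}(t, x_t)\big], \\
L_b[\hat{b}]
  &= \int_0^1 \mathbb{E}\Big(\tfrac{1}{2}|\hat{b}(t, x_t)|^2 - b(t, x_t) \cdot \hat{b}(t, x_t)\Big)\,dt \\
  &= \tfrac{1}{2}\int_0^1 \mathbb{E}\,\big|\hat{b}(t, x_t) - b(t, x_t)\big|^2\,dt
   \;-\; \tfrac{1}{2}\int_0^1 \mathbb{E}\,|b(t, x_t)|^2\,dt .
\end{aligned}
```

The last term does not depend on $\hat{b}$, so under these assumptions the objective is minimized exactly when $\hat{b} = b$ on the support of $x_t$. The same argument applied to $-\gamma^{-1}(t) z$ in place of $\dot{x}_t$ gives the corresponding statement for the score.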
Having access to the score immediately allows us to rewrite the TE as forward and backward Fokker-Planck equations, which we state as:
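Fokker-Planck equations (FPE): For any $\epsilon \in C^0([0,1])$ with $\epsilon(t) \geq 0$ for all $t \in [0,1]$, the probability density $\rho$ satisfies:
- The forward Fokker-Planck equation
  $$\partial_t \rho + \nabla \cdot (b_F \rho) = \epsilon(t) \Delta \rho,$$
  where we defined the forward drift
  $$b_F(t, x) = b(t, x) + \epsilon(t) s(t, x).$$
  The forward Fokker-Planck equation is well-posed when solved forward in time from $t = 0$ to $t = 1$, and its solution for the initial condition $\rho(t = 0) = \rho_0$ satisfies $\rho(t = 1) = \rho_1$.
- The backward Fokker-Planck equation
  $$\partial_t \rho + \nabla \cdot (b_B \rho) = -\epsilon(t) \Delta \rho,$$
  where we defined the backward drift
  $$b_B(t, x) = b(t, x) - \epsilon(t) s(t, x).$$
  The backward Fokker-Planck equation is well-posed when solved backward in time from $t = 1$ to $t = 0$, and its solution for the final condition $\rho(t = 1) = \rho_1$ satisfies $\rho(t = 0) = \rho_0$.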
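The step from the transport equation to the two Fokker-Planck equations is short enough to write out. The sketch assumes the score is $s = \nabla \log \rho$, the TE reads $\partial_t \rho + \nabla \cdot (b\rho) = 0$, and the drifts are $b_F = b + \epsilon s$ and $b_B = b - \epsilon s$ for a nonnegative function $\epsilon(t)$.

```latex
\begin{aligned}
\rho\, s &= \rho\, \nabla \log \rho = \nabla \rho, \\
\nabla \cdot (b_F\, \rho) &= \nabla \cdot (b \rho) + \epsilon(t)\, \nabla \cdot (\rho\, s)
  = \nabla \cdot (b \rho) + \epsilon(t)\, \Delta \rho, \\
\partial_t \rho + \nabla \cdot (b_F\, \rho)
  &= \underbrace{\partial_t \rho + \nabla \cdot (b \rho)}_{=\,0 \text{ by the TE}} + \epsilon(t)\, \Delta \rho
  = \epsilon(t)\, \Delta \rho, \\
\partial_t \rho + \nabla \cdot (b_B\, \rho)
  &= \partial_t \rho + \nabla \cdot (b \rho) - \epsilon(t)\, \Delta \rho
  = -\epsilon(t)\, \Delta \rho .
\end{aligned}
```

In other words, adding $\epsilon s$ to the drift is exactly compensated by a diffusion term of strength $\epsilon$, so the same density $\rho$ solves all three equations.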
We design generative models using the stochastic processes associated with the TE, the forward FPE, and the backward FPE:
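At any time $t \in [0,1]$, the law of the stochastic interpolant $x_t$ coincides with the law of the three processes $X_t$, $X_t^F$, and $X_t^B$, respectively defined as:
- The solutions of the probability flow associated with the transport equation
  $$\frac{d}{dt} X_t = b(t, X_t),$$
  solved either forward in time from the initial data $X_{t=0} = x_0 \sim \rho_0$ or backward in time from the final data $X_{t=1} = x_1 \sim \rho_1$.
- The solutions of the forward SDE associated with the forward FPE
  $$dX_t^F = b_F(t, X_t^F)\,dt + \sqrt{2\epsilon(t)}\,dW_t,$$
  solved forward in time from the initial data $X_{t=0}^F = x_0 \sim \rho_0$ independent of $W$.
- The solutions of the backward SDE associated with the backward FPE
  $$dX_t^B = b_B(t, X_t^B)\,dt + \sqrt{2\epsilon(t)}\,dW_t^B, \qquad W_t^B = -W_{1-t},$$
  solved backward in time from the final data $X_{t=1}^B = x_1 \sim \rho_1$ independent of $W^B$; the solution is by definition $X_t^B = Z_{1-t}^F$, where $Z_t^F$ satisfies
  $$dZ_t^F = -b_B(1 - t, Z_t^F)\,dt + \sqrt{2\epsilon(1 - t)}\,dW_t,$$
  solved forward in time from the initial data $Z_{t=0}^F = x_1 \sim \rho_1$ independent of $W$.²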
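The claim that the probability flow and the forward SDE share the same marginals can be checked numerically in a case where everything is available in closed form. The sketch below is a toy under stated assumptions, not a general implementation: one dimension, $\rho_0 = \mathcal{N}(0, 1)$, $\rho_1 = \mathcal{N}(m, s_1^2)$, independent coupling, the linear path $I = (1-t)x_0 + t x_1$, $\gamma^2(t) = 2t(1-t)$, and a constant $\epsilon$. Then $x_t$ is Gaussian with mean $tm$ and variance $v(t) = (1-t)^2 + t^2 s_1^2 + \gamma^2(t)$, which gives $s(t,x) = -(x - tm)/v(t)$ and $b(t,x) = m + \tfrac{\dot{v}(t)}{2v(t)}(x - tm)$. All function and variable names are hypothetical.

```python
import numpy as np

m, s1, eps = 2.0, 0.5, 0.5  # assumed target mean, target std, constant epsilon

def var_t(t):
    # Variance of x_t = (1 - t) x0 + t x1 + gamma(t) z with gamma(t)^2 = 2 t (1 - t)
    return (1 - t) ** 2 + (t * s1) ** 2 + 2 * t * (1 - t)

def dvar_t(t):
    # Time derivative of var_t
    return -2 * (1 - t) + 2 * t * s1 ** 2 + 2 - 4 * t

def score(t, x):
    # s(t, x) = grad log rho(t, x) for the Gaussian marginal N(t m, var_t(t))
    return -(x - t * m) / var_t(t)

def b(t, x):
    # Closed-form velocity E[dx_t/dt | x_t = x] in this Gaussian toy case
    return m + 0.5 * dvar_t(t) / var_t(t) * (x - t * m)

def simulate(x_init, n_steps, stochastic, rng):
    """Euler (ODE) or Euler-Maruyama (forward SDE) integration from t = 0 to t = 1."""
    x, dt = x_init.copy(), 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        if stochastic:
            drift = b(t, x) + eps * score(t, x)  # forward drift b_F = b + eps * s
            x = x + drift * dt + np.sqrt(2 * eps * dt) * rng.standard_normal(x.shape)
        else:
            x = x + b(t, x) * dt                 # probability flow
    return x

rng = np.random.default_rng(0)
x0 = rng.standard_normal(20_000)                 # samples from rho_0
x_ode = simulate(x0, 500, stochastic=False, rng=rng)
x_sde = simulate(x0, 500, stochastic=True, rng=rng)
print(x_ode.mean(), x_ode.std(), x_sde.mean(), x_sde.std())
```

Both sets of samples should end up with mean close to `m` and standard deviation close to `s1`, up to discretization and Monte Carlo error. In a learned model, `b` and `score` would be replaced by estimates obtained from the objectives.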
1.3. Instantiation
We connect the diffusion bridge perspective to the stochastic interpolant perspective by setting , where $B_t$ is a standard Brownian bridge process³, independent of $x_0$ and $x_1$. With some deduction we can know that and , i.e.
Using Itô calculus and we can get the drift , but this requires many tedious calculations.
Given the stochastic interpolant perspective, we can write out
And using , we have
- ¹ We use the probability measure $\mu(t, dx)$ instead of a density function because a density is not well defined when the distribution has no smooth density (for example, a Dirac delta). In most cases, you can think of $\mu(t, dx)$ as $\rho(t, x)\,dx$.
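- ² To avoid repeated applications of the transformation $t \mapsto 1 - t$, it is convenient to directly use the reversed Itô calculus rules stated in the following lemma:
  Reverse Itô Calculus: If $X_t^B$ solves the backward SDE:
  - For any $f \in C^1([0,1]; C_0^2(\mathbb{R}^d))$ and $t \in [0,1]$, the backward Itô formula holds
    $$df(t, X_t^B) = \partial_t f(t, X_t^B)\,dt + \nabla f(t, X_t^B) \cdot dX_t^B - \epsilon(t)\,\Delta f(t, X_t^B)\,dt.$$
  - For any $g \in C^0([0,1]; (C_0^0(\mathbb{R}^d))^d)$ and $t \in [0,1]$, the backward Itô isometries hold:
    $$\mathbb{E}_B^x \int_t^1 g(u, X_u^B) \cdot dW_u^B = 0, \qquad \mathbb{E}_B^x \left| \int_t^1 g(u, X_u^B) \cdot dW_u^B \right|^2 = \int_t^1 \mathbb{E}_B^x |g(u, X_u^B)|^2\,du,$$
    where $\mathbb{E}_B^x$ denotes expectation conditioned on the event $X_{t=1}^B = x$.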
- ³ A Brownian bridge $B_t$ is a stochastic process that describes a random path, similar to a standard Brownian motion, but with the crucial constraint that it is "pinned" to a specific value (usually zero) at both its start and end times, which can be written as $B_0 = B_1 = 0$. Consequently, its randomness, or variance, is zero at the beginning and end, and reaches its maximum in the middle of the time interval.
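The pinning and the variance profile of a Brownian bridge are easy to check by simulation. The sketch below uses the standard construction $B_t = W_t - t W_1$ from a Brownian motion $W$ on $[0, 1]$, for which $\mathrm{Var}(B_t) = t(1 - t)$. The grid size and number of paths are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps = 20_000, 100
dt = 1.0 / n_steps
t = np.linspace(0.0, 1.0, n_steps + 1)

# Standard Brownian motion W on the grid, with W_0 = 0.
dW = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)

# Pin the path at both ends: B_t = W_t - t * W_1, so B_0 = B_1 = 0.
B = W - t * W[:, -1:]

# Empirical variance across paths at each grid time; theory gives t * (1 - t).
empirical_var = B.var(axis=0)
```

Comparing `empirical_var` against `t * (1 - t)` shows the variance vanishing at both endpoints and peaking near $t = 1/2$.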