Research

1. Research

1.1. Speculative Flow Matching (SFM)

  • Aim: Accelerate inference for flow matching models.

  • Intuition:

    • The "Easy-Hard" Nature of the Generative Path: The process of transforming noise into a structured image is not uniformly difficult. Some parts of the path are "easy" and smooth (e.g., forming a blurry outline), where a simple model can predict the trajectory. Other parts are "hard" and require high precision (e.g., forming intricate details like eyes). The goal is to use a fast, draft model for the easy parts and the powerful, main model only when necessary.
    • Changing the Main Model's Job from "Worker" to "Supervisor": Instead of having the large, slow model compute every single small step of the path (acting as a "worker"), we change its role. It now acts as an efficient "supervisor." It lets the fast, draft model make a long-range proposal, and then the main model only needs to perform a few cheap calculations to verify and correct this proposal at a high level.
    • A "Trust, but Verify" Framework with a Safety Net: The method doesn't naively assume the small model is always accurate. The core principle is "trust, but verify." It trusts the draft model to suggest a leap forward, but it verifies this leap with the main model. Crucially, if the verification fails, the system has a safety net: it rejects the leap and defaults to taking one small, guaranteed-to-be-correct step with the main model. This ensures that we get speed when the approximation is good, without sacrificing quality when it's not.
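The three intuitions above can be sketched as a toy sampling loop. This is a minimal sketch under stated assumptions, not a worked-out SFM algorithm: `main_velocity` and `draft_velocity` are hypothetical 1D stand-ins for the large and small models, and the acceptance rule (the two velocities must agree within `tol` at both ends of the leap) is an assumption chosen for illustration. It only accepts or rejects a leap and does not implement any correction of accepted proposals, so it is lossy.

```python
import math


def main_velocity(x, t):
    # Hypothetical expensive model: smooth decay plus a sharp "hard" region near t = 0.7.
    return -x + 2.0 * math.exp(-((t - 0.7) / 0.05) ** 2)


def draft_velocity(x, t):
    # Hypothetical cheap model: captures only the smooth ("easy") part.
    return -x


def euler_reference(x0, n_steps=100):
    # Baseline: the main model computes every small Euler step ("worker" role).
    dt = 1.0 / n_steps
    x = x0
    for i in range(n_steps):
        x += dt * main_velocity(x, i * dt)
    return x


def speculative_sample(x0, n_steps=100, leap=5, tol=0.05):
    # Returns the final state and the number of main-model evaluations used.
    dt = 1.0 / n_steps
    x, i, main_calls = x0, 0, 0
    while i < n_steps:
        t = i * dt
        k = min(leap, n_steps - i)
        # One main-model call at the current point; reused for the fallback step.
        v_main = main_velocity(x, t)
        main_calls += 1
        accepted = False
        if abs(v_main - draft_velocity(x, t)) <= tol:
            # Trust: the draft model proposes a leap of k cheap Euler steps.
            x_prop = x
            for j in range(k):
                x_prop += dt * draft_velocity(x_prop, (i + j) * dt)
            # Verify: one more main-model call at the end of the leap.
            t_end = (i + k) * dt
            main_calls += 1
            if abs(main_velocity(x_prop, t_end) - draft_velocity(x_prop, t_end)) <= tol:
                x, i, accepted = x_prop, i + k, True
        if not accepted:
            # Safety net: reject the leap and take one small step with the main model.
            x, i = x + dt * v_main, i + 1
    return x, main_calls
```

Calling `speculative_sample(1.0)` and `euler_reference(1.0)` lets the final states and the number of main-model calls be compared. In this toy setup, leaps are accepted away from the "hard" region and rejected inside it.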

Speculative Decoding is a popular technique for accelerating LLM inference.

Idea: apply a similar approach to flow matching via Residual Correction and Acceptance (RCA). Fast, but lossy.

[update]: Found that this has already been done: Accelerated Diffusion Models via Speculative Sampling

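They point out that, in diffusion models, directly applying the Adjusted Rejection Sampling used in discrete spaces is extremely difficult to implement and inefficient.

Their core innovation is to use Reflection Maximal Coupling, a mathematical tool designed specifically for coupling two Gaussian distributions that have the same covariance but different means.

This fits the sampling step of standard diffusion models perfectly: each step is a transition from one Gaussian distribution to another, and the noise variance (covariance) is usually fixed. When a "draft" point is rejected, the method does not resample from a complicated distribution. Instead, it computes a new point that follows the target distribution directly, via a deterministic reflection transformation. The method is lossless.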
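The accept-or-reflect step described above can be sketched in one dimension as follows. This is a minimal sketch of the standard reflection maximal coupling of two Gaussians with equal variance, not the paper's implementation; the function names and the parameters `mean_draft`, `mean_target`, and `sigma` are hypothetical and chosen here for illustration.

```python
import math
import random


def reflection_coupling_step(x_draft, mean_draft, mean_target, sigma, rng):
    # x_draft is assumed to be a sample from N(mean_draft, sigma^2).
    # Returns (y, accepted), where y is distributed as N(mean_target, sigma^2).
    z = (x_draft - mean_draft) / sigma            # standardized draft noise
    delta = (mean_draft - mean_target) / sigma    # standardized mean gap
    # log of target density over draft density at x_draft (the variances cancel).
    log_ratio = -(z * delta + 0.5 * delta * delta)
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    if math.log(1.0 - rng.random()) <= log_ratio:
        return x_draft, True                      # accept: keep the draft point
    # Reject: reflect the noise around the target mean (deterministic, no resampling).
    return mean_target - sigma * z, False


def simulate(n=20000, mean_draft=0.5, mean_target=0.0, sigma=1.0, seed=0):
    # Empirical check: output mean, output variance, and acceptance rate.
    rng = random.Random(seed)
    ys, accepted = [], 0
    for _ in range(n):
        x = rng.gauss(mean_draft, sigma)
        y, ok = reflection_coupling_step(x, mean_draft, mean_target, sigma, rng)
        ys.append(y)
        accepted += ok
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n
    return mean, var, accepted / n
```

Running `simulate()` gives a quick empirical check that the outputs match the target mean and variance even though most draft points are kept unchanged. In higher dimensions the reflection would act only along the direction of the mean difference, which this 1D sketch does not show.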

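Problems:

  1. The covariances must be identical; the method cannot be used when they differ.
  2. Flow Matching models in theory do not have to rely on Gaussian noise. For such non-Gaussian processes, reflection maximal coupling may no longer apply.
  3. Designing more efficient and more accurate draft models that are training-free or require only lightweight training.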