Elastic Diffusion Transformer: Accelerating SOTA Generative Models
Abstract
Researchers propose Elastic Diffusion Transformer (E-DiT), an adaptive framework that accelerates Diffusion Transformers (DiTs) by exploiting sample-dependent sparsity. Each DiT block is equipped with a lightweight router that dynamically decides whether to skip the block or reduce its MLP width. A training-free, block-level feature caching mechanism further eliminates redundant computation. Experiments on Qwen-Image, FLUX, and Hunyuan3D-3.0 show up to roughly 2× speedup with negligible quality loss. The code and paper are publicly available.
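To make the routing idea concrete, here is a minimal pure-Python sketch, not the authors' implementation. It assumes a hypothetical router that is a single linear layer choosing among three options (skip, half-width MLP, full MLP), and it covers only the MLP part of a block, leaving attention out. The names `route`, `mlp`, `elastic_block`, and the halving ratio are illustrative assumptions; the actual router design and width choices in E-DiT may differ.

```python
import random

# Hypothetical routing decisions; illustrative names, not taken from the E-DiT code.
SKIP, SLIM, FULL = "skip", "slim", "full"


def matvec(weights, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def route(x, router_weights):
    """Toy router: one linear layer over the input, then argmax over three options."""
    logits = matvec(router_weights, x)
    return (SKIP, SLIM, FULL)[logits.index(max(logits))]


def mlp(x, w_in, w_out, width):
    """Two-layer ReLU MLP that uses only the first `width` hidden units."""
    hidden = [max(0.0, h) for h in matvec(w_in[:width], x)]
    return [sum(row[j] * hidden[j] for j in range(width)) for row in w_out]


def elastic_block(x, params):
    """Residual block whose cost depends on the router's decision for this input."""
    decision = route(x, params["router"])
    if decision == SKIP:
        return x, decision  # identity: the block's computation is skipped
    hidden_dim = len(params["w_in"])
    # Assumption: "slim" means half of the hidden units; the real ratio is unknown here.
    width = hidden_dim // 2 if decision == SLIM else hidden_dim
    update = mlp(x, params["w_in"], params["w_out"], width)
    return [a + b for a, b in zip(x, update)], decision


def make_params(dim, hidden_dim, seed=0):
    """Random toy weights: router is 3 x dim, w_in is hidden x dim, w_out is dim x hidden."""
    rng = random.Random(seed)

    def rand(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {"router": rand(3, dim), "w_in": rand(hidden_dim, dim), "w_out": rand(dim, hidden_dim)}


if __name__ == "__main__":
    params = make_params(dim=4, hidden_dim=8)
    sample = [0.3, -0.1, 0.7, 0.2]
    output, decision = elastic_block(sample, params)
    print(decision, output)
```

Because the decision is made per input, easy samples can take the cheap paths while harder ones still get the full block, which is the sample-dependent sparsity the abstract refers to.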
Key Points
E-DiT equips each DiT block with a lightweight router that, based on the input latents, dynamically predicts whether to skip the block or reduce its MLP width.
A training-free, block-level feature caching mechanism uses the router predictions to avoid redundant computation during inference.
The method achieves up to roughly 2× speedup on 2D image generation (Qwen-Image, FLUX) and 3D asset generation (Hunyuan3D-3.0) with negligible quality degradation.
Code is available at https://github.com/wangjiangshan0725/Elastic-DiT, and the paper at https://arxiv.org/abs/2602.13993.
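The block-level feature caching mentioned in the key points can be illustrated with a small sketch. This is written under stated assumptions and is not the E-DiT code: a hypothetical `CachedBlock` wrapper stores the residual a block produced at one denoising step and adds it back at a later step when a `reuse` flag is set. In E-DiT that flag would be derived from router predictions; this summary does not say exactly how, so the flags are supplied by hand here.

```python
class CachedBlock:
    """Wraps a block function and can reuse its most recent residual (hypothetical helper)."""

    def __init__(self, block_fn):
        self.block_fn = block_fn
        self.cached_residual = None
        self.calls = 0  # number of times the wrapped block was actually executed

    def __call__(self, x, reuse):
        if reuse and self.cached_residual is not None:
            # Cache hit: add the stored residual instead of recomputing the block.
            return [a + r for a, r in zip(x, self.cached_residual)]
        out = self.block_fn(x)
        self.calls += 1
        # Store output minus input so the cache can be applied to a new input later.
        self.cached_residual = [o - a for o, a in zip(out, x)]
        return out


def run_denoising(x, blocks, reuse_schedule):
    """Apply all blocks at each step; reuse_schedule[t][i] is the reuse flag for block i at step t."""
    for step_flags in reuse_schedule:
        for block, reuse in zip(blocks, step_flags):
            x = block(x, reuse)
    return x


if __name__ == "__main__":
    # Stand-in for a real DiT block: doubles every feature.
    blocks = [CachedBlock(lambda v: [2.0 * e for e in v])]
    # Assumed schedule: compute at step 0, reuse the cache at steps 1 and 2.
    schedule = [[False], [True], [True]]
    print(run_denoising([1.0, 0.5], blocks, schedule), blocks[0].calls)
```

No retraining is involved in this kind of reuse, which matches the "training-free" description: the saving comes purely from skipping repeated block evaluations at inference time.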