Elastic Diffusion Transformer: Accelerating SOTA Generative Models

Summary

Researchers propose Elastic Diffusion Transformer (E-DiT), an adaptive framework that accelerates Diffusion Transformers (DiTs) by exploiting sample-dependent sparsity. Each DiT block is equipped with a lightweight router that dynamically decides, based on the input latents, whether to skip the block or reduce its MLP width. A training-free, block-level feature caching mechanism then uses the router's predictions to eliminate further redundant computation. In experiments on Qwen-Image, FLUX, and Hunyuan3D-3.0, E-DiT achieves up to ~2× speedup with negligible quality loss. The paper and code are publicly available.
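The per-block routing idea can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the E-DiT code: the `ToyBlock` class, the three-way action encoding (0 = skip, 1 = half-width MLP, 2 = full MLP), the half-width choice, and the randomly initialized router weights (a real router would presumably be trained). The toy block also contains only a residual MLP, whereas a real DiT block includes attention as well.

```python
import random


def matvec(weights, x):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class ToyBlock:
    """Residual MLP block with a hypothetical 3-way router (illustrative only)."""

    def __init__(self, dim, hidden, rng):
        self.hidden = hidden
        # MLP weights: w_in is hidden x dim, w_out is dim x hidden.
        self.w_in = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(hidden)]
        self.w_out = [[rng.gauss(0, 0.1) for _ in range(hidden)] for _ in range(dim)]
        # Router: one linear layer producing 3 logits (assumed design).
        self.w_router = [[rng.gauss(0, 1.0) for _ in range(dim)] for _ in range(3)]

    def route(self, x):
        """Pick the action with the largest logit: 0=skip, 1=narrow, 2=full."""
        logits = matvec(self.w_router, x)
        return max(range(3), key=lambda i: logits[i])

    def forward(self, x, action=None):
        """Return (output, number of hidden units actually used)."""
        if action is None:
            action = self.route(x)
        if action == 0:
            # Skip: the block acts as the identity, so no MLP work is done.
            return list(x), 0
        # Narrow: use only the first half of the hidden units.
        width = self.hidden // 2 if action == 1 else self.hidden
        h = [max(0.0, v) for v in matvec(self.w_in[:width], x)]  # ReLU
        out = [sum(row[j] * h[j] for j in range(width)) for row in self.w_out]
        return [xi + oi for xi, oi in zip(x, out)], width


block = ToyBlock(dim=4, hidden=8, rng=random.Random(0))
latent = [1.0, -0.5, 0.25, 2.0]
output, used_width = block.forward(latent)  # router chooses the action
```

The point of the sketch is that the compute cost depends on the input: a skipped block costs nothing beyond the router, and a narrowed block touches only part of the MLP weights.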

Key Points

E-DiT equips each DiT block with a lightweight router that, based on the input latents, dynamically predicts whether to skip the block or reduce its MLP width.
A training-free, block-level feature caching mechanism uses the router's predictions to avoid redundant computation during inference.
The method achieves up to ~2× speedup on 2D image generation (Qwen-Image, FLUX) and 3D asset generation (Hunyuan3D-3.0) with negligible quality degradation.
Code is available at https://github.com/wangjiangshan0725/Elastic-DiT and the paper at https://arxiv.org/abs/2602.13993.
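One way to picture router-guided feature caching is the sketch below. The summary does not describe the actual caching rule, so the rule used here is an assumption: a block's residual is reused across denoising steps as long as the router's decision is unchanged, and recomputed otherwise. The `CachedBlock` class and the `route_fn` and `compute_fn` callables are hypothetical names introduced for illustration.

```python
class CachedBlock:
    """Wraps a block and reuses its last residual while the router decision is stable.

    Assumed rule (not from the paper): same decision as the previous step
    means the cached residual is reused instead of recomputed.
    """

    def __init__(self, compute_fn, route_fn):
        self.compute_fn = compute_fn  # (x, decision) -> residual
        self.route_fn = route_fn      # x -> decision
        self.cached_decision = None
        self.cached_residual = None
        self.compute_calls = 0        # counts real (non-cached) computations

    def __call__(self, x):
        decision = self.route_fn(x)
        if self.cached_residual is not None and decision == self.cached_decision:
            residual = self.cached_residual  # cache hit: skip the computation
        else:
            residual = self.compute_fn(x, decision)
            self.compute_calls += 1
            self.cached_decision = decision
            self.cached_residual = residual
        return [xi + ri for xi, ri in zip(x, residual)]


def toy_route(x):
    # Hypothetical router: large-magnitude inputs get the full block.
    return "full" if sum(abs(v) for v in x) > 1.0 else "narrow"


def toy_compute(x, decision):
    # Hypothetical stand-in for the block's expensive computation.
    scale = 0.1 if decision == "full" else 0.05
    return [scale * v for v in x]


cached = CachedBlock(toy_compute, toy_route)
step1 = cached([1.0, 1.0])  # computes and fills the cache
step2 = cached(step1)       # same decision, so the cached residual is reused
step3 = cached([0.2, 0.2])  # decision changes, so the block is recomputed
```

Because the reuse decision rests on a prediction the router already makes, this style of caching adds no trainable parameters, which is consistent with the "training-free" description above.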
