Clark Labs Releases Ternary Sana 1.6B Text-to-Image Transformer, 8.6× Smaller with Near-FP16 Quality | thinkgap

Social | Source: r/LocalLLaMA | June 28, 2026 | Importance: 3/5

English summary

Clark Labs has quantized the Sana 1.6B text-to-image transformer to ternary weights (~1.85 bits per weight), shrinking it 8.6× from 3.21 GB (FP16) to 374 MB while retaining near-FP16 image quality. The quantization uses group-wise scales and keeps a small high-precision tail (~5% of parameters, in the conditioning and projection layers) to preserve important details. The packed ternary weights ship alongside an unpacked bf16 version that works as a drop-in replacement for the original transformer in diffusers. The release is under the Apache-2.0 license and is aimed at running Sana 1.6B locally on resource-constrained hardware.
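To make "ternary weights with group-wise scales" concrete, here is a minimal pure-Python sketch. It is not Clark Labs' implementation: the absmean scale rule, the group size of 64, the five-codes-per-byte packing, and all function names are assumptions chosen for illustration, and the post does not say how the released weights are actually quantized or packed.

```python
import random


def quantize_ternary(weights, group_size=64):
    """Map floats to codes in {-1, 0, +1}, with one scale per group of weights."""
    codes, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        # Assumed scale rule: mean absolute value of the group (absmean).
        scale = sum(abs(x) for x in group) / len(group) or 1e-8
        scales.append(scale)
        # Round to the nearest integer, then clamp to the ternary range.
        codes.extend(max(-1, min(1, round(x / scale))) for x in group)
    return codes, scales


def dequantize_ternary(codes, scales, group_size=64):
    """Rebuild approximate weights: each code times the scale of its group."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]


def pack_trits(codes):
    """Pack five ternary codes per byte (3**5 = 243 fits in 256): 1.6 bits each."""
    trits = [c + 1 for c in codes]            # shift {-1, 0, 1} to {0, 1, 2}
    trits += [0] * (-len(trits) % 5)          # pad to a multiple of five
    return bytes(
        sum(t * 3 ** k for k, t in enumerate(trits[i:i + 5]))
        for i in range(0, len(trits), 5)
    )


def unpack_trits(packed, count):
    """Invert pack_trits, dropping any padding beyond `count` codes."""
    codes = []
    for byte in packed:
        for _ in range(5):
            codes.append(byte % 3 - 1)
            byte //= 3
    return codes[:count]


if __name__ == "__main__":
    random.seed(0)
    w = [random.gauss(0.0, 0.02) for _ in range(256)]
    codes, scales = quantize_ternary(w)
    packed = pack_trits(codes)
    assert unpack_trits(packed, len(codes)) == codes  # packing is lossless
    print(len(packed) * 8 / len(w), "bits per weight before scales")
```

In a scheme like this, the per-group scales and any layers kept in higher precision add to the 1.6 bits per packed code, which is one way an effective rate can end up above the ternary minimum.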

Key points

Sana 1.6B text-to-image transformer quantized to ternary weights at ~1.85 bits per weight.
Packed model is 8.6× smaller: 374 MB vs. 3.21 GB in FP16, with near-FP16 quality.
Uses group-wise scales and a ~5% high-precision tail for conditioning and projection layers.
An unpacked bf16 version is provided as a drop-in replacement for diffusers; released under Apache-2.0.
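The quoted figures can be cross-checked with a few lines of arithmetic. This sketch assumes decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes) and that the FP16 file stores exactly 2 bytes per parameter; the function name is hypothetical.

```python
def effective_bits(fp16_bytes, packed_bytes):
    """Return (size ratio, effective bits per weight) from two file sizes."""
    params = fp16_bytes / 2                  # FP16 stores 2 bytes per parameter
    ratio = fp16_bytes / packed_bytes
    bits_per_weight = packed_bytes * 8 / params
    return ratio, bits_per_weight


ratio, bpw = effective_bits(3.21e9, 374e6)
print(round(ratio, 1))   # 8.6
print(round(bpw, 2))     # 1.86
```

Under these assumptions the 374 MB file works out to about 1.86 bits per weight averaged over every parameter, in line with the quoted ~1.85.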
