Releases | Source: Hugging Face | July 1, 2026 | Importance: 4/5

Nvidia Releases NVFP4-Quantized Version of Mistral-Medium-3.5-128B

English summary

Nvidia has published a quantized variant of the Mistral-Medium-3.5-128B large language model on Hugging Face. The model uses NVFP4, a 4-bit floating-point format, to reduce memory footprint and potentially accelerate inference. The repository is tagged for conversational and text-generation use, and the weights are distributed in the safetensors format. According to the repository, the model is based on the original Mistral-Medium-3.5-128B from Mistral AI and is shared under a custom license.

Chinese summary

Nvidia 在 Hugging Face 上发布了 Mistral-Medium-3.5-128B 大语言模型的量化版本。该模型采用 NVFP4 4 位浮点精度格式，旨在减小内存占用并可能加速推理。它被标注为适用于对话和文本生成任务，并以 safetensors 格式提供。仓库信息表明此模型基于 Mistral AI 的原始 Mistral-Medium-3.5-128B，并以自定义许可证发布。

Key points

Nvidia released a quantized version of the 128-billion-parameter Mistral-Medium-3.5 model.
Nvidia 发布了拥有 1280 亿参数的 Mistral-Medium-3.5 模型的量化版本。
The quantization uses NVFP4, Nvidia's 4-bit floating-point format.
量化采用 Nvidia 的 4 位浮点格式 NVFP4。
The model is available for text generation and conversational AI use cases on Hugging Face.
该模型可在 Hugging Face 上获取，用于文本生成和对话式 AI 场景。
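
The reduced memory footprint mentioned in the summary can be estimated with simple arithmetic. The sketch below is a rough, back-of-the-envelope calculation and not a figure from the repository. It assumes a 16-bit (BF16) baseline, that every one of the 128 billion parameters is quantized, and that NVFP4 stores one 8-bit scale per block of 16 four-bit values. The helper name `weight_footprint_gb` is hypothetical, and the estimate ignores per-tensor scales, any layers left in higher precision, and runtime memory such as activations and the KV cache.

```python
def weight_footprint_gb(n_params: float, bits_per_value: float,
                        scale_bits: float = 0.0, block_size: int = 1) -> float:
    """Approximate weight storage in GB (10^9 bytes).

    Each weight costs bits_per_value plus its share of one block scale.
    """
    bits_per_weight = bits_per_value + scale_bits / block_size
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 128e9  # parameter count implied by the model name

# Assumed baseline: 16 bits per weight, no block scales.
bf16_gb = weight_footprint_gb(N_PARAMS, bits_per_value=16)

# Assumed NVFP4 layout: 4-bit values, one 8-bit scale per 16-value block.
nvfp4_gb = weight_footprint_gb(N_PARAMS, bits_per_value=4,
                               scale_bits=8, block_size=16)

print(f"BF16 weights : ~{bf16_gb:.0f} GB")
print(f"NVFP4 weights: ~{nvfp4_gb:.0f} GB")
print(f"Reduction    : ~{bf16_gb / nvfp4_gb:.1f}x")
```

Under these assumptions the block scales add 0.5 bits per weight, so the effective cost is about 4.5 bits rather than 4. The actual size of the published checkpoint may differ depending on which layers Nvidia chose to quantize.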
