SupraLabs Releases SupraSafety-18M, a Tiny BERT-Style Content Moderation Model for Edge Devices
English summary
SupraLabs has released SupraSafety-18M, a binary text classifier with 18 million parameters built on a BERT-style architecture. The model was trained from scratch on two T4 GPUs for 7 epochs using the nvidia/Nemotron-3.5-Content-Safety-Dataset. It reaches 81.2% accuracy and 86.9% precision at distinguishing SAFE from UNSAFE text. It targets edge devices, mobile phones, and low-latency production environments, and it returns a confidence score with each prediction, as demonstrated by the example predictions. The model is available on Hugging Face.
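The summary says each prediction comes with a confidence score. For a two-class classifier this is commonly the softmax probability of the winning class, and the sketch below shows that calculation in plain Python. It is an illustration under assumptions, not SupraLabs' code: the `classify` helper, the `LABELS` ordering, and the logit values are all hypothetical, and the real model's label mapping may differ.

```python
import math

# Assumed label order; the real model's id-to-label mapping may differ.
LABELS = ("SAFE", "UNSAFE")


def classify(logits, labels=LABELS):
    """Turn raw per-class logits into a (label, confidence) pair via softmax."""
    if len(logits) != len(labels):
        raise ValueError("expected exactly one logit per label")
    peak = max(logits)  # subtract the max before exponentiating for stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


if __name__ == "__main__":
    # Made-up logits for illustration; these are not outputs of SupraSafety-18M.
    label, confidence = classify([2.0, -1.0])
    print(f"{label} ({confidence:.1%})")
```

In a moderation pipeline, the confidence value would typically be compared against a threshold chosen by the operator, so that borderline predictions can be routed to a human reviewer or a larger model.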
Key points
Tiny 18M-parameter BERT-style model for content moderation.
Trained from scratch on the NVIDIA Nemotron-3.5 content safety dataset using 2 T4 GPUs for 7 epochs.
Binary classifier (SAFE vs. UNSAFE) with 81.2% accuracy and 86.9% precision.
Optimized for edge devices, mobile phones, and low-latency production environments.
Model available on Hugging Face under the SupraLabs organization.
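The key points quote accuracy and precision separately, and the two measure different things. The sketch below shows how each is computed from confusion-matrix counts. It assumes UNSAFE is treated as the positive class, which the announcement does not state, and the counts are invented for illustration rather than taken from the model's evaluation.

```python
def accuracy(tp, fp, tn, fn):
    """Share of all predictions that were correct."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def precision(tp, fp):
    """Share of texts flagged as positive that really were positive."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


if __name__ == "__main__":
    # Hypothetical counts with UNSAFE as the positive class (an assumption):
    # tp = UNSAFE flagged as UNSAFE, fp = SAFE flagged as UNSAFE,
    # tn = SAFE passed as SAFE,      fn = UNSAFE passed as SAFE.
    tp, fp, tn, fn = 40, 10, 35, 15
    print(f"accuracy:  {accuracy(tp, fp, tn, fn):.1%}")
    print(f"precision: {precision(tp, fp):.1%}")
```

Under this assumption, a precision higher than the accuracy would mean the classifier is right more often when it flags text than it is overall, so some of its errors come from unsafe text that it lets through.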