Meituan Releases LongCat-2.0: A 1.6T MoE Model Trained on 50,000 Domestic Chips, Weights Still Pending
English summary
Meituan announced LongCat-2.0, a 1.6-trillion-parameter Mixture-of-Experts (MoE) model with about 48 billion activated parameters and a context window of up to 1 million tokens. The model was trained entirely on a 50,000-card cluster of domestic AI accelerator chips, using a proprietary distributed communication protocol rather than NVIDIA's NCCL. It scores 59.5 on SWE-bench Pro, slightly above GPT-5.5's 58.6. On Hugging Face the model carries an MIT License, but the weights are marked "coming soon"; only the inference framework and infrastructure code have been released. LongCat previously ran anonymously on OpenRouter as "Owl Alpha", where it reached the top three in monthly call volume, helped by pricing of $0.30 per million tokens and free credits. The model is vertically optimized for Meituan's local services, such as food delivery and store operations. Despite the engineering achievement, the chip vendor, total training cost, wall-clock training time, and training data composition remain undisclosed, which limits independent verification and reproducibility.
Chinese summary
美团发布了LongCat-2.0,一个1.6万亿参数的混合专家(MoE)模型,激活参数约480亿,最高支持100万token上下文。模型完全使用电信5万张国产AI加速卡集群训练,采用自有分布式通信协议,未依赖NVIDIA NCCL。在SWE-bench Pro上得分为59.5,略高于GPT-5.5的58.6。模型在Hugging Face上贴有MIT许可证,但权重标注为“即将推出”,目前仅开源了推理框架和基础架构代码。LongCat曾以“Owl Alpha”匿名在OpenRouter平台运行,凭借每百万token 0.30美元的定价和大量免费额度,月调用量冲至全球前三。该模型垂直优化于美团本地生活场景(如外卖调度、到店运营)。尽管工程上验证了国产算力的大规模训练可行性,但芯片厂商、训练总成本、实际训练耗时及训练数据均未公开,导致独立复现与验证困难。
Key points
LongCat-2.0 is a 1.6-trillion-parameter MoE model with ~48B activated parameters and a context window of up to 1M tokens.
LongCat-2.0为1.6万亿参数MoE模型,激活参数约480亿,支持最长100万token上下文。
Trained from scratch on a cluster of 50,000 domestic AI accelerator chips using a proprietary communication protocol instead of NVIDIA NCCL.
使用5万张国产AI加速卡集群从头训练,采用自有通信协议,未用NVIDIA NCCL。
Achieves 59.5 on SWE-bench Pro, slightly outperforming GPT-5.5 (58.6).
在SWE-bench Pro上获59.5分,略高于GPT-5.5的58.6分。
Listed under an MIT License on Hugging Face, but the weights are marked "coming soon"; only the inference framework and infrastructure code have been released, and the training data is undisclosed.
Hugging Face页面标有MIT许可证,但权重“即将推出”,仅开源推理框架和Infra代码;训练数据未公开。
Deployed anonymously as "Owl Alpha" on OpenRouter, where it reached the top three in monthly call volume thanks to a low price ($0.30 per million tokens) and free credits.
以“Owl Alpha”匿名身份在OpenRouter上线,以每百万token 0.30美元低价及免费额度冲至月调用量前三。
Vertically optimized for local services (food delivery, store operations); the chip vendor, training cost, and wall-clock training time remain undisclosed, which limits independent verification.
垂直优化于外卖、到店等本地生活场景;芯片厂商、训练成本及实际训练耗时均未披露,无法独立验证。
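The headline figures above imply a few ratios that are easy to check by hand. The following is a minimal Python sketch of that arithmetic, using only numbers quoted in this summary. The function names are illustrative, and the cost helper assumes a single flat rate of $0.30 per million tokens; the source does not say whether input and output tokens are priced differently.

```python
# Back-of-the-envelope arithmetic on the figures quoted in this summary.
# All constants come from the text above; the helper names are illustrative.

TOTAL_PARAMS = 1.6e12    # 1.6 trillion total parameters
ACTIVE_PARAMS = 48e9     # ~48 billion activated parameters
PRICE_PER_MILLION_TOKENS_USD = 0.30  # assumed flat rate (input/output split not stated)
SWE_BENCH_PRO = {"LongCat-2.0": 59.5, "GPT-5.5": 58.6}


def activated_fraction(active: float, total: float) -> float:
    """Share of the total parameters that the MoE activates."""
    return active / total


def token_cost_usd(num_tokens: int,
                   price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> float:
    """Cost of processing num_tokens at a flat per-million-token price."""
    return num_tokens / 1_000_000 * price_per_million


def benchmark_margin(scores: dict, a: str, b: str) -> float:
    """Score difference a - b, rounded to one decimal place."""
    return round(scores[a] - scores[b], 1)


if __name__ == "__main__":
    frac = activated_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"Activated fraction: {frac:.1%}")
    print(f"Cost of 1M tokens: ${token_cost_usd(1_000_000):.2f}")
    margin = benchmark_margin(SWE_BENCH_PRO, "LongCat-2.0", "GPT-5.5")
    print(f"SWE-bench Pro margin: {margin} points")
```

Under these assumptions, roughly 3% of the parameters are activated, filling the full 1M-token context once would cost about $0.30, and the reported SWE-bench Pro lead is 0.9 points.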