Tutorials | Source: MARKTECHPOST | July 1, 2026 | Importance: 5/5

Anthropic Launches Claude Sonnet 5, Closing the Agentic Gap to Opus 4.8

English summary

Anthropic released Claude Sonnet 5 on June 30, 2026, calling it its most agentic Sonnet model. It outperforms Sonnet 4.6 on every published benchmark, including SWE-bench Pro (63.2% vs 58.1%), OSWorld-Verified (81.2% vs 78.5%), and Humanity’s Last Exam with tools (57.4% vs 46.8%). It nearly matches Opus 4.8 on several evals and edges it out on GDPval-AA v2 (1618 vs 1615). Introductory pricing is $2/$10 per million input/output tokens until August 31, 2026, rising to $3/$15 afterward; both rates undercut Opus 4.8’s $5/$25. The model supports effort levels and offers its best cost-performance at low and medium effort, but at xhigh effort it can cost more than Opus 4.8 for similar quality. Sonnet 5 uses an updated tokenizer that may increase token counts by up to 1.35×. Its cyber capabilities are intentionally kept low, and Opus remains the recommended model for accuracy-critical tasks.
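
To make the pricing comparison concrete, here is a minimal sketch that estimates the cost of one workload under the three price schedules quoted above. The function names, the sample token counts, and the choice to treat August 31, 2026 as the last day of introductory pricing (inclusive) are illustrative assumptions, not details from the announcement.

```python
from datetime import date

# Prices in USD per million tokens (input, output), as quoted in the summary.
SONNET5_INTRO = (2.0, 10.0)
SONNET5_STANDARD = (3.0, 15.0)
OPUS48 = (5.0, 25.0)

# Assumption: introductory pricing applies through this date, inclusive.
INTRO_END = date(2026, 8, 31)


def sonnet5_prices(on: date) -> tuple[float, float]:
    """Return the (input, output) Sonnet 5 price pair in effect on a given day."""
    return SONNET5_INTRO if on <= INTRO_END else SONNET5_STANDARD


def request_cost(input_tokens: int, output_tokens: int,
                 prices: tuple[float, float]) -> float:
    """Cost in USD for a workload, given per-million-token prices."""
    price_in, price_out = prices
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 1M input tokens and 200k output tokens.
    tokens_in, tokens_out = 1_000_000, 200_000
    schedules = [
        ("Sonnet 5 (intro)", sonnet5_prices(date(2026, 7, 15))),
        ("Sonnet 5 (standard)", sonnet5_prices(date(2026, 9, 1))),
        ("Opus 4.8", OPUS48),
    ]
    for label, prices in schedules:
        print(f"{label}: ${request_cost(tokens_in, tokens_out, prices):.2f}")
```

For this sample workload the sketch prints $4.00, $6.00, and $10.00 respectively. The comparison only holds if both models consume the same number of tokens, which the effort-level caveat in the summary suggests is not always the case.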

Key points

Sonnet 5 outperforms Sonnet 4.6 on every published benchmark, narrowing the gap to Opus 4.8.
Introductory pricing (until Aug 31, 2026) is $2/$10 per million input/output tokens, then the standard $3/$15; both are well below Opus 4.8’s $5/$25.
Cost-performance is best at low and medium effort; at xhigh (extra-high) effort, Sonnet 5 can cost more than Opus 4.8 for similar quality.
The updated tokenizer raises token counts by a factor of 1.0–1.35; cyber capability is intentionally limited for safety.
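
The effort and tokenizer caveats above can be worked through with simple arithmetic. Sonnet 5's standard prices are 0.6× those of Opus 4.8 for both input and output ($3/$5 and $15/$25), so the break-even point does not depend on the input/output mix. The sketch below computes how many times more tokens Sonnet 5 could use before matching Opus 4.8's bill. It also computes the effective price per million tokens as counted by the previous tokenizer, assuming the 1.0–1.35× range is measured against that baseline, which the summary does not state. Function names are hypothetical.

```python
def break_even_token_ratio(sonnet_price: float, opus_price: float) -> float:
    """Token-usage multiple at which Sonnet 5's cost equals Opus 4.8's.

    Assumes both models are billed on the same input/output mix.
    """
    return opus_price / sonnet_price


def inflated_price(price_per_mtok: float, multiplier: float) -> float:
    """Effective price per million baseline tokens after tokenizer inflation.

    Assumption: `multiplier` is relative to the previous tokenizer's count.
    """
    return price_per_mtok * multiplier


if __name__ == "__main__":
    # Standard pricing: $3 vs $5 input (the output ratio is identical).
    print(f"standard break-even: {break_even_token_ratio(3.0, 5.0):.2f}x tokens")
    # Introductory pricing: $2 vs $5 input.
    print(f"intro break-even: {break_even_token_ratio(2.0, 5.0):.2f}x tokens")
    # Worst-case tokenizer inflation applied to standard Sonnet 5 prices.
    print(f"effective input price: ${inflated_price(3.0, 1.35):.2f}")
    print(f"effective output price: ${inflated_price(15.0, 1.35):.2f}")
```

Under these assumptions, Sonnet 5 at standard pricing costs more than Opus 4.8 once it uses more than about 1.67× as many tokens on the same task (2.5× at introductory pricing), which is one way the xhigh effort level could erase the price advantage.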
