AI intelligence feed

ZHIHU AI | Jul 2, 2026

Zhihu Post Reports Widespread DeepSeek Outage on July 2

A Zhihu question posted on July 2, 2026, states that many netizens reported a DeepSeek service outage. The post contains three images, but the raw content provides no textual details about the cause, duration, or scale of the disruption. No official statement or specific incident information can be extracted from the available material.

ZHIHU AI | Jul 1, 2026 | Highlight

Anthropic to Launch Fable 5 Globally on Claude Platform with Temporary Usage Cap

Anthropic announced that its Fable 5 model will be globally available on the Claude platform starting tomorrow. Pro, Max, Team, and select Enterprise users will be granted access, with weekly usage capped at 50% until July 7. The company is also coordinating with the US government to broaden access to the Mythos 5 model under the Glasswing program to more domestic and international partners. Cloud access via AWS, Google Cloud, and Microsoft Foundry will be re-enabled shortly.

ZHIHU AI | Jun 30, 2026 | Highlight

Claude Desktop Accused of Steganographic User Tracking and Chinese User Blocking via Browser Injection

Cybersecurity expert Alexander Hanff alleges that Anthropic's Claude Desktop client silently injects profile configurations into multiple browsers, reads the system timezone to detect China-based users, and uses text steganography in system prompts to covertly tag them. According to the report, the method changes the date separator from hyphens to slashes (e.g., 2026/06/30) and replaces the apostrophe in "Today's date is..." with one of several Unicode characters (U+2019, U+02BC, U+02B9) chosen according to proxy URL attributes, which would let backend servers identify users behind VPNs without altering packet structures. The mechanism is allegedly used to enforce access blocking for Chinese users.
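To make the reported markers concrete, here is a minimal Python sketch that inspects a "Today's date is..." line and reports which apostrophe code point and which date separator it contains. It only illustrates how such a marker could be read back from text. The function name `inspect_date_line`, the regular expression, and the exact wording of the date line are assumptions made for illustration and are not taken from Claude Desktop or from the report.

```python
import re

# Apostrophe-like characters named in the report, plus plain ASCII for comparison.
APOSTROPHE_VARIANTS = {
    "\u0027": "U+0027 APOSTROPHE",
    "\u2019": "U+2019 RIGHT SINGLE QUOTATION MARK",
    "\u02bc": "U+02BC MODIFIER LETTER APOSTROPHE",
    "\u02b9": "U+02B9 MODIFIER LETTER PRIME",
}

# Assumed shape of the line: "Today<mark>s date is YYYY<sep>MM<sep>DD".
DATE_LINE = re.compile(r"Today(.)s date is\s+(\d{4})([-/])(\d{2})\3(\d{2})")


def inspect_date_line(text):
    """Return the apostrophe variant and date separator found in text, or None."""
    match = DATE_LINE.search(text)
    if match is None:
        return None
    mark, separator = match.group(1), match.group(3)
    label = APOSTROPHE_VARIANTS.get(
        mark, f"U+{ord(mark):04X} (not in the reported set)"
    )
    return {"apostrophe": label, "separator": separator}


if __name__ == "__main__":
    sample = "Today\u02bcs date is 2026/06/30"
    print(inspect_date_line(sample))
```

The three reported characters render almost identically in most fonts, which is why comparing code points rather than glyphs is the reliable check.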

ZHIHU AI | Jun 30, 2026 | Highlight

Meituan Open-Sources Trillion-Parameter Model LongCat-2.0 Trained on 50,000-Card Domestic Computing Cluster

On June 30, Meituan released and open-sourced LongCat-2.0, a next-generation trillion-parameter large language model. The model has 1.6 trillion total parameters, with an average of 48 billion activated (dynamic range 33B–56B), and both training and inference were run entirely on a domestic computing cluster of 50,000 cards. The pre-training data exceeds 30 trillion tokens, covering Chinese, English, other languages, and code, and the model natively supports a 1-million-token context length. The release demonstrates large-scale model training on domestic hardware and makes a very large open-source model available to the research community.

ZHIHU AI | Jun 29, 2026 | Highlight

DeepSeek V4 Official Version Announced for Mid-July Launch with Peak/Off-peak API Pricing

DeepSeek announced that the official version of its V4 model will launch in mid-July, bringing further feature optimizations and performance improvements. Alongside the release, the company will adjust its API pricing strategy to introduce a peak/off-peak pricing model. During peak hours (9:00 AM to 12:00 PM and 2:00 PM to 6:00 PM daily), API calls will cost twice the off-peak rate. The change aims to allocate resources more efficiently and enhance service stability.
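The pricing rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not DeepSeek's billing logic: the announcement as summarized here does not give a time zone, actual prices, or boundary handling, so the sketch uses the timestamp's own clock, a hypothetical `off_peak_cost` value, and half-open windows (start inclusive, end exclusive).

```python
from datetime import datetime, time

# Peak windows from the announcement; boundary handling is an assumption.
PEAK_WINDOWS = [(time(9, 0), time(12, 0)), (time(14, 0), time(18, 0))]
PEAK_MULTIPLIER = 2.0  # peak calls cost twice the off-peak rate


def is_peak(clock_time):
    """True if clock_time falls inside any peak window (end exclusive)."""
    return any(start <= clock_time < end for start, end in PEAK_WINDOWS)


def call_cost(off_peak_cost, when):
    """Cost of one call at datetime `when`, given its off-peak cost."""
    multiplier = PEAK_MULTIPLIER if is_peak(when.time()) else 1.0
    return off_peak_cost * multiplier


if __name__ == "__main__":
    # Hypothetical off-peak cost of 1.0 unit per call.
    print(call_cost(1.0, datetime(2026, 7, 15, 10, 30)))  # inside a peak window
    print(call_cost(1.0, datetime(2026, 7, 15, 13, 0)))   # midday gap, off-peak
```

Under this rule, latency-tolerant batch workloads would be cheapest to schedule in the midday gap or outside the 9:00 AM to 6:00 PM range.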

ZHIHU AI | Jun 27, 2026 | Highlight

DeepSeek and Peking University Release DSpark Inference Acceleration Framework, Boosting Single-User Generation Speed by 60-85%

DeepSeek and Peking University have jointly released DSpark, an inference acceleration framework targeting LLM bottlenecks in high-concurrency production environments. DSpark introduces a semi-autoregressive draft model that combines a parallel trunk with lightweight sequential dependency heads (Markov or RNN) to improve acceptance rates and verification efficiency compared with fully parallel draft models such as DFlash. A confidence-based dynamic verification scheduler allocates target-model computation to the tokens with the highest survival probability, maximizing global throughput. Deployed in the DeepSeek-V4-Flash and V4-Pro preview services, DSpark delivers 60-85% faster single-user generation than the MTP-1 baseline, with up to 51% higher aggregate throughput at an 80 tok/s SLA and a 661% advantage under a strict 120 tok/s SLA on V4-Flash. Training code, evaluation scripts, and model checkpoints are open-sourced on GitHub under the DeepSpec project.
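The scheduling idea can be illustrated with a small greedy sketch in Python. It is not DSpark's scheduler. It assumes that each draft token carries a confidence in (0, 1], that a token's survival probability is the product of the confidences up to and including it (a token can only be accepted if all earlier draft tokens are), and that a global budget limits how many draft tokens the target model verifies per step. The function name `allocate_verification` and the input layout are hypothetical.

```python
import heapq


def allocate_verification(confidences, budget):
    """Greedily assign verification slots to the draft tokens most likely to survive.

    confidences: dict mapping request id -> list of per-token draft confidences.
    budget: total number of draft tokens the target model may verify this step.
    Returns: dict mapping request id -> length of the draft prefix to verify.
    """
    allocation = {rid: 0 for rid in confidences}
    heap = []  # max-heap via negated survival probability
    for rid, conf in confidences.items():
        if conf:
            heapq.heappush(heap, (-conf[0], rid, 0))

    while heap and budget > 0:
        neg_survival, rid, index = heapq.heappop(heap)
        allocation[rid] = index + 1  # verify the prefix up to this token
        budget -= 1
        nxt = index + 1
        if nxt < len(confidences[rid]):
            # Survival of the next token requires all earlier tokens to survive.
            survival = -neg_survival * confidences[rid][nxt]
            heapq.heappush(heap, (-survival, rid, nxt))
    return allocation


if __name__ == "__main__":
    drafts = {"a": [0.9, 0.8, 0.5], "b": [0.6, 0.9]}
    print(allocate_verification(drafts, budget=3))
```

Because only the next unverified token of each request is ever in the heap, every request is allocated a contiguous prefix of its draft, which is what a verifier needs. Requests with confident drafts receive longer prefixes, and uncertain ones receive fewer slots.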