Meta’s Secret ‘Cannes’ Project Used Contractors Posing as Minors to Stress-Test ChatGPT, Gemini, and Character.AI Safety
Summary
Wired revealed that Meta ran a covert project codenamed ‘Cannes’, in which hundreds of contractors from the outsourcing firm Covalen created fake minor accounts and systematically sent tens of thousands of harmful prompts to the ChatGPT, Gemini, and Character.AI chatbots. One internal document alone listed 3,748 malicious prompts, including at least 239 sexually explicit prompts involving minors. In a single test round in August 2025, more than 45,000 high-risk prompts were submitted. The prompts probed safety filters around self-harm, suicide, eating disorders, and child endangerment, without the target companies’ knowledge or consent. Meta defended the activity as legitimate ‘comprehensive AI safety benchmarking’ and routine industry practice, while Character.AI, OpenAI, and Google stated they had not authorized such testing. The exposure sparked criticism that Meta had weaponized AI safety as an anti-competitive tool under the guise of responsible testing.
Key points
Meta ran a secret project, ‘Cannes’, through the outsourcing firm Covalen, whose workers created fake minor profiles to stress-test the safety filters of ChatGPT, Gemini, and Character.AI.
One exposed file alone contained 3,748 malicious prompts, including at least 239 sexual prompts involving minors, and a single test round sent more than 45,000 high-risk prompts to the chatbots.
Character.AI, OpenAI, and Google all stated they had not authorized such third-party testing, while Meta maintained it was standard safety benchmarking.
Critics argue the covert, large-scale operation turned AI safety into a competitive weapon, raising ethical and regulatory concerns about safety testing being used against rivals.