Cybersecurity Expert Says Anthropic's Fable Model Behaved as Intended in White House Jailbreak Test
Katie Moussouris, CEO of Luta Security, reviewed the White House report on the Fable jailbreak and said the model refused a request to 'review the code for security issues' but complied when asked to 'fix this code' with manual steps. She described this behavior as 'the model working as intended' for cyberdefense tasks. Moussouris was not compensated by Anthropic for her appraisal. Her comments, reported by The Atlantic's Matteo Wong, push back against the White House's characterization of the incident as a security failure.