Chinese Cyber-Spies Use Anthropic AI To Automate Espionage

A Chinese state-sponsored threat actor has been using Anthropic’s Claude AI to automate up to 90% of a cyber espionage campaign.

Published on Nov 14, 2025
Written by Caitlin Harris

According to Anthropic, a China-linked state-sponsored threat actor has been abusing the company’s Claude Code tool to automate cyber espionage attacks against 30 global organizations.

During the campaign, which Anthropic researchers discovered in mid-September, the attackers manipulated Claude AI into automating several small tasks, under the pretence that they were carrying out cybersecurity research. However, when combined, these tasks formed a “highly sophisticated espionage campaign.”

“We believe this is the first documented case of a large-scale cyberattack executed without substantial human intervention,” Anthropic said.

The attack began with a human operator, who chose the targets. These were organizations in the chemical manufacturing, financial, government, and technology sectors. This operator then built a system that used Claude Code to “autonomously compromise a chosen target with little human involvement.”

Once this system was established, the threat actors jailbroke Claude, which enabled them to bypass its guardrails, and broke the campaign down into small, seemingly innocent tasks that prevented Claude from recognizing their full malicious context. 

The attackers then instructed Claude Code to identify each target’s highest-value databases, detect security vulnerabilities in the target’s environment, and write its own code to exploit those vulnerabilities. Following these instructions, Claude was able to access and extract large amounts of data from several of its targets, which it categorized and documented, creating files that would help the threat actors carry out future attacks.

Overall, said Anthropic, Claude Code performed 80%-90% of the campaign autonomously, requiring only “sporadic” human intervention.

On detecting the activity, the company immediately launched an investigation to determine its scope and severity. 

“We banned accounts as they were identified, notified affected entities as appropriate, and coordinated with authorities as we gathered actionable intelligence,” the company said. 

“In the meantime, we’re sharing this case publicly, to help those in industry, government, and the wider research community strengthen their own cyber defenses.”

The Bigger Picture

By abusing Claude Code, the threat actors in this case were able to perform their attack in a fraction of the time it would have taken to carry out manually, and with much less effort. In other words, abuse of systems such as this could greatly lower the barrier to entry for less-skilled threat actors, enabling them to carry out highly sophisticated attacks at scale.

This isn’t the first time that the industry has seen threat actors abusing AI tools to automate cyberattacks. In February 2024, OpenAI disrupted five state-sponsored threat actors that had been using its services to support malicious activities. And only two weeks ago, Google released a report that highlighted several instances of AI tools being integrated into malware.

However, both Google and Anthropic concluded that, whilst these attacks are certainly a cause for concern, we still have a way to go before we’ll see a fully autonomous cyberattack. 

“Claude didn’t always work perfectly. It occasionally hallucinated credentials or claimed to have extracted secret information that was in fact publicly-available,” Anthropic said. “This remains an obstacle to fully autonomous cyberattacks.”

“Although some recent implementations of novel AI techniques are experimental, they provide an early indicator of how threats are evolving and how they can potentially integrate AI capabilities into future intrusion activity,” said Google. “We are only now starting to see this type of activity, but expect it to increase in the future.”