Anthropic, one of the world’s biggest AI companies, has revealed multiple cases of cybercriminals using its Claude LLM to commit cybercrime, including developing malware, exfiltrating data, and generating ransom notes.
“While we have developed sophisticated safety and security measures to prevent misuse, cybercriminals and other malicious actors continually attempt to find ways around them,” Anthropic researchers Alex Moix, Ken Lebedev, and Jacob Klein said in a report released this week.
This follows reports released earlier this year by OpenAI and Google detailing malicious uses of their own AI models.
Anthropic’s report outlines how cybercriminals are exploiting the capabilities of Claude (and other AI models) across all stages of the cybercrime cycle, from automating victim profiling to exfiltrating and analyzing stolen data.
New agentic AI systems, which enable LLMs to access websites and take actions on a user’s behalf, such as booking a flight, are also being exploited to carry out cyberattacks directly, Anthropic says.
Perhaps most importantly, AI “lowers the barriers to sophisticated cybercrime, enabling even those with low technical skills and experience to develop ransomware or malware.”
A Closer Look
Anthropic shared several examples of how cybercriminals are using AI, including a threat actor (tracked as GTG-2002) who used Claude’s coding features to help with a data extortion campaign.
Claude was used in almost every stage of the attack cycle, Anthropic said, including reconnaissance, credential exploitation, malware development, data exfiltration, and, finally, generating an HTML-formatted ransom note.
“This threat actor leveraged Claude’s code execution environment to automate reconnaissance, credential harvesting, and network penetration at scale, potentially affecting at least 17 distinct organizations in just the last month across government, healthcare, emergency services, and religious institutions,” the company said.
“The operation demonstrates a concerning evolution in AI-assisted cybercrime, where AI serves as both a technical consultant and active operator… This approach, which security researchers have termed ‘vibe hacking,’ represents a fundamental shift in how cybercriminals can scale their operations.”
It’s worth noting that the report does not go into great technical detail on the attack and how it worked. At the time of writing, Anthropic has not shared details of the custom malware that was generated, nor the prompts the actor used to bypass Claude’s safety measures.
According to the researchers, this campaign was highly effective. The report notes: “The actor’s systematic approach resulted in the compromise of personal records, including healthcare data, financial information, government credentials, and other sensitive information, with direct ransom demands occasionally exceeding $500,000.”
Remote IT workers based in North Korea have also used Claude to assist with fraudulent employment scams, Anthropic says. Thousands of North Koreans have used fake and stolen identities to get hired at US companies over the last few years, generating millions in revenue for the North Korean government.
Anthropic says these workers have used AI to create fake identities and portfolios, prepare for interviews, and then automate their tasks once employed, in order to keep earning a salary for as long as possible.
Ransomware gangs are also using AI: one UK-based threat actor was reportedly able to use Claude to develop “novel” malware, which they then marketed on the dark web at prices ranging from $400 to $1,200.
The report gives several further examples, including a Spanish-speaking actor using Claude Code to operate a web service selling stolen credit cards, and a romance scam bot with multi-language support used for social engineering.
Anthropic’s Response
In all cases found, Anthropic banned the accounts associated with the operations. The company has also taken steps to detect and catch similar malicious activity in the future, and has shared technical indicators with other AI companies to help prevent these threats across the AI landscape.
“These enhanced detection capabilities help us to more effectively prevent adversarial actors from exploiting our platform for harmful purposes and ensure such activity is identified and addressed,” Anthropic said.
Anthropic was also able to stop at least one attack before it began: “We successfully prevented a sophisticated North Korean threat actor from establishing operations on our platform through automated safety measures,” the company said.
The threat actor was attempting to create phishing lures and build fake technical interviews to deliver malware.
“We’re committed to continually improving our methods for detecting and mitigating these harmful uses of our models.”
Why This Matters
There has been some debate in the cybersecurity industry as to what impact AI has had on the threat landscape to date. Most researchers Expert Insights has spoken to report that the biggest impact so far has been cybercriminals generating malicious emails and content for phishing campaigns, rather than sophisticated malware and ransomware.
However, almost all would agree that the potential for AI to be used in cybercrime is massive. The examples revealed by Anthropic demonstrate how harmful and effective AI can be in the hands of creative cybercriminals, even those with little technical skill but the ability to write a good prompt.
It’s also worth bearing in mind that these are just the attacks we know about.
It’s likely that more sophisticated gangs are better at covering their tracks, or are using models with less robust safety measures than those of Anthropic, a company that takes its security responsibilities seriously.
Read Further
Full Report: Anthropic Threat Intelligence Report August 2025
Anthropic Blog: Detecting and countering misuse of AI: August 2025
Crims laud Claude to plant ransomware and fake IT expertise
Anthropic thwarts hacker attempts to misuse Claude AI for cybercrime