Anthropic, one of the world’s biggest AI companies, has revealed multiple cases of cybercriminals using its Claude LLM to commit cybercrime, including developing malware, exfiltrating data, and generating ransom notes.
“While we have developed sophisticated safety and security measures to prevent misuse, cybercriminals and other malicious actors continually attempt to find ways around them,” Anthropic researchers Alex Moix, Ken Lebedev, and Jacob Klein said in a report released this week.
This follows reports released earlier this year by OpenAI and Google detailing malicious uses of their own AI models.
Anthropic’s report outlines how cybercriminals are exploiting the capabilities of Claude (and other AI models) across all stages of the cybercrime cycle, from automating victim profiling to exfiltrating and analyzing stolen data.
New agentic AI systems, which enable LLMs to access websites and take actions on a user’s behalf, such as booking a flight, are also being exploited to carry out cyberattacks directly, Anthropic says.
Perhaps most importantly, AI “lowers the barriers to sophisticated cybercrime, enabling even those with low technical skills and experience to develop ransomware or malware.”
A Closer Look
Anthropic shared several examples of how cybercriminals are using AI, including a threat actor (tracked as GTG-2002) who used Claude’s coding features to help with a data extortion campaign.
Claude was used in almost every stage of the attack cycle, Anthropic said, including reconnaissance, credential exploitation, malware development, data exfiltration, and, finally, generating an HTML-formatted ransom note.
“This threat actor leveraged Claude’s code execution environment to automate reconnaissance, credential harvesting, and network penetration at scale, potentially affecting at least 17 distinct organizations in just the last month across government, healthcare, emergency services, and religious institutions,” the company said.
“The operation demonstrates a concerning evolution in AI-assisted cybercrime, where AI serves as both a technical consultant and active operator… This approach, which security researchers have termed ‘vibe hacking,’ represents a fundamental shift in how cybercriminals can scale their operations.”
It’s worth noting that the report does not go into great technical detail on the attack and how it worked. At the time of writing, Anthropic has not shared details of the custom malware that was generated, nor the prompts the actor used to bypass Claude’s safety measures.
According to the researchers, this campaign was highly effective. The report notes: “The actor’s systematic approach resulted in the compromise of personal records, including healthcare data, financial information, government credentials, and other sensitive information, with direct ransom demands occasionally exceeding $500,000.”
Remote IT workers based in North Korea have also used Claude to assist with fraudulent employment scams, Anthropic says. Thousands of North Koreans have used fake and stolen identities to get hired at US companies over the last few years, generating millions in revenue for the North Korean government.
Anthropic says these workers have used AI to create fake identities and portfolios, prepare for interviews, and then automate their tasks once employed, in order to keep earning a salary for as long as possible.
Ransomware gangs are also using AI: one UK-based threat actor was reportedly able to use Claude to develop “novel” malware, which they then marketed on the dark web at prices ranging from $400 to $1,200.
The report gives several further examples, including a Spanish-speaking actor using Claude Code to operate a web service selling stolen credit cards, and a romance scam bot with multi-language support used for social engineering.
Anthropic’s Response
In all cases found, Anthropic banned the accounts associated with the operations. The company has also taken steps to detect and catch similar malicious activity in the future, and has shared technical indicators with other AI companies to help prevent these threats across the AI landscape.
“These enhanced detection capabilities help us to more effectively prevent adversarial actors from exploiting our platform for harmful purposes and ensure such activity is identified and addressed,” Anthropic said.
Anthropic was also able to stop at least one attack before it began: “We successfully prevented a sophisticated North Korean threat actor from establishing operations on our platform through automated safety measures,” the company said.
The threat actor was attempting to create phishing lures and build fake technical interviews to deliver malware.
“We’re committed to continually improving our methods for detecting and mitigating these harmful uses of our models.”
Why This Matters
There has been some debate in the cybersecurity industry as to what impact AI has had on the threat landscape to date. Most researchers Expert Insights has spoken to report that the biggest impact so far has been cybercriminals generating malicious emails and content for phishing campaigns, rather than sophisticated malware and ransomware.
However, almost all would agree that the potential for AI to be used in cybercrime is massive. The examples revealed by Anthropic demonstrate how harmful and effective AI can be in the hands of creative cybercriminals, even those with little technical skill but the ability to write a good prompt.
It’s also worth bearing in mind that these are just the attacks we know about.
It’s likely that more sophisticated gangs are better at covering their tracks, or are using models with less robust safety measures than those of Anthropic, a company that takes its security responsibilities seriously.
Read Further
Full Report: Anthropic Threat Intelligence Report August 2025
Anthropic Blog: Detecting and countering misuse of AI: August 2025
Crims laud Claude to plant ransomware and fake IT expertise
Anthropic thwarts hacker attempts to misuse Claude AI for cybercrime