Two-Thirds of Accounts Banned by Anthropic Were Preparing Cyberattacks

By Aryad Satriawan, June 10, 2026


Artificial intelligence isn’t just making hackers faster. It’s making dangerous ones out of people who never could have been.

Cybersecurity professionals have operated on a simple assumption: the most dangerous hackers are the ones with the most skill. Technical sophistication was the limiting factor. Writing malware, exploiting zero-days, navigating deep inside compromised systems — these were things that took years of practice, and that barrier kept the pool of truly capable threat actors relatively small.

That assumption is now breaking down. Fast.

On Wednesday, Anthropic released findings from a year-long internal investigation that paint a stark picture of how its AI technology is being weaponized, and the data suggests the cybersecurity world may be dealing with a threat that’s fundamentally different from anything it has faced before.

Between March 2025 and March 2026, Anthropic examined 832 accounts that had been flagged and banned for violating its usage policies. Of those, 560 accounts — more than 67% — had used the AI to assist with some form of cyberattack preparation. That includes writing malware, probing software for weaknesses, and developing attack strategies that would have previously required substantial technical knowledge to even attempt.
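For readers who want to check the headline figure, the share can be recomputed from the two counts quoted above. This is only a quick arithmetic sketch; the variable names are illustrative and do not come from Anthropic's report.

```python
# Counts as quoted in the article; names are illustrative, not from the report.
total_banned = 832    # accounts examined between March 2025 and March 2026
cyber_related = 560   # accounts tied to cyberattack preparation

share = cyber_related / total_banned
print(f"Share of banned accounts tied to cyberattack prep: {share:.1%}")

# "Two out of every three" is a slight understatement of the quoted counts.
print("Exceeds two-thirds:", share > 2 / 3)
```

The result is a little above 67%, which matches the "more than 67%" in the text.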

Read that number again. Two out of every three banned accounts weren’t testing limits out of curiosity or pushing boundaries for research. They were building weapons.

Preparation to Active Exploitation

What makes Anthropic’s findings particularly concerning isn’t just the volume of misuse — it’s the direction things are moving.

The majority of abuse in the study still occurred in the preparation phase: using AI to write attack code, map out vulnerabilities, or craft convincing phishing material. That’s bad enough. But Anthropic flagged a more troubling shift happening within the data.

A subset of banned accounts — 6.5% of the total — had used AI not just to prepare, but to assist with what the security industry calls “lateral movement.” This refers to what happens after an attacker has already gotten in. Once inside a network, lateral movement techniques let hackers quietly expand their access, move between systems, steal credentials, and position themselves to do maximum damage — all without triggering obvious alerts.

These techniques used to require a skilled operator. Someone who understood network architecture, could read system logs, knew how to avoid detection. Anthropic’s analysis suggests that AI is dismantling that requirement entirely.

“These sorts of ‘post-compromise’ techniques used to be restricted to actors with the technical knowledge to carry them out,” Anthropic said in its report. “Our investigation shows that AI can now be made to perform these activities on behalf of less sophisticated actors.”

Less sophisticated actors doing sophisticated things. That’s the core of what makes this shift so consequential.

The Risk Level Is Climbing

Anthropic didn’t just count how many accounts were misused — it assessed how dangerous those accounts actually were. The results show a clear and accelerating escalation.

In the first six months of the study period, Anthropic classified roughly 33% of flagged accounts as “medium risk or higher.” In the second six months, that number jumped to 56%. The threat wasn’t just growing in quantity. It was growing in severity.
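The size of that jump can be expressed two ways, in percentage points or as a relative increase. The sketch below works from the rounded figures quoted above (the article gives the first one as "roughly 33%"), so the outputs are approximate, and the variable names are illustrative only.

```python
# Rounded shares quoted in the article, in whole percent; names are illustrative.
first_half_pct = 33    # "medium risk or higher", first six months (approximate)
second_half_pct = 56   # "medium risk or higher", second six months

point_increase = second_half_pct - first_half_pct
relative_increase = point_increase / first_half_pct

print(f"Increase: {point_increase} percentage points")
print(f"Relative increase: {relative_increase:.0%}")
```

On these rounded inputs, the share of higher-risk accounts rose by 23 percentage points, or roughly 70% in relative terms.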

This tracks with what the broader security community has been reporting. In April 2026, the total value of cryptocurrency stolen in hacks surged to $629.7 million — the highest figure recorded since February 2025. Multiple analysts connected this spike to the increasing deployment of AI tools across the hacking pipeline, from initial reconnaissance all the way through to executing the theft itself.

The crypto sector has been hit especially hard. Manuel Aráoz, founder of the smart contract security firm OpenZeppelin, issued a blunt warning on May 27, stating that he considers “all of DeFi unsafe” as a direct result of AI models’ growing ability to identify vulnerabilities in smart contract code. For an industry that already struggles to patch flaws before they’re exploited, having AI do the scanning for attackers represents a meaningful acceleration of an already ugly problem.

The First AI-Written Zero-Day

Perhaps the most alarming data point in the broader conversation didn’t come from Anthropic at all — it came from Google’s security research team, which published findings last month documenting what it believed to be the first confirmed case of AI being used to develop a zero-day exploit.

A zero-day is a vulnerability that the software’s developer doesn’t yet know about — meaning there’s no patch, no fix, and no warning. They’re the crown jewels of the hacking world, typically the product of months of careful research by skilled practitioners.

The exploit Google’s researchers found had been developed using AI, and it targeted an authentication flaw in a widely used, unnamed open-source web administration tool. The attack entirely bypassed two-factor authentication, one of the security controls that billions of users and institutions rely on as a final line of defense.

Google’s team noted something else that ties directly into Anthropic’s findings: there is now “little correlation between the skill of a threat actor and how many techniques they use.” For decades, the number and complexity of techniques deployed in an attack were a reliable proxy for attacker sophistication. That proxy is no longer reliable. AI hands a less experienced attacker the same toolkit as a seasoned one.

When AI Acts Alone

Anthropic also detailed something that moves well beyond the image of a hacker typing prompts and copying output. In November, the company documented a case involving a Chinese state-sponsored group that carried out an attack in which an AI model worked in a largely autonomous fashion — conducting an exploit, stealing credentials, and making operational decisions on its own. A human was present, but only for input at specific decision points.

This is a different kind of threat. Not AI-assisted hacking, but AI-directed operations where the model is doing the work and the human is essentially supervising from a distance. Anthropic was direct about what this represents: “These are precisely the behaviors we expect to see much more of as AI agents become more capable.”

The implications are significant. Autonomous AI agents operating in cyberspace don’t get tired, don’t make careless mistakes from rushing, and can run operations at a scale and speed that no human team could match. The barrier to conducting a sustained, multi-stage cyberattack is collapsing — not because the attacks have gotten simpler, but because the execution no longer depends on human effort at every step.

Anthropic’s Own Model Is Part of the Conversation

The report lands at a sensitive moment for Anthropic itself. The company is preparing to release Claude Mythos, described as its most capable large language model to date, in the coming weeks. Analysts have already raised concerns about Mythos specifically because of its cybersecurity capabilities: during development, the model reportedly identified more than 10,000 major vulnerabilities in widely used software.

That’s an extraordinary figure. It also highlights the fundamental tension that runs through every frontier AI release: the same capability that makes a model useful for defensive security research is precisely what makes it dangerous in the wrong hands.

Anthropic has not yet detailed what guardrails will be in place for Mythos, or how access will be structured to prevent the kind of misuse documented in Wednesday’s report. Those are questions the company will face with increasing urgency as the release approaches.

The Redistribution of Danger

The core story in Anthropic’s data isn’t really about AI at all — it’s about access. The skills required to conduct a serious cyberattack are no longer a meaningful barrier to entry. AI has redistributed those capabilities to anyone with an account, a prompt, and intent.

What the security industry is left grappling with is a world where the number of capable threat actors has grown dramatically, where their skill level can no longer be inferred from the sophistication of their attacks, and where the tools to conduct post-compromise operations autonomously are already in the wild.

The 560 banned accounts Anthropic identified are the cases it caught. The question worth sitting with is how many it didn’t.

Aryad Satriawan

Aryad Satriawan is an Investment Storyteller with a professional career in the crypto (web3) and stock market industry. Aryad has been actively trading and writing analysis and research on the crypto, stock and forex markets since 2016, and is currently an educator at one of the largest stockbrokers in Indonesia.
