In what sounds like both a word of warning and weirdly a little bit of bragging, SF-based Anthropic says that its AI chatbot Claude was used by state-sponsored hackers in China to commit a large-scale cyberattack on American companies.

Anthropic's Claude chatbot was reportedly used to commit a large-scale cyberattack on around 30 American companies two months ago, and it's hard not to feel like Anthropic doesn't entirely hate the dubious honor of being the first company whose chatbot has been employed in this nefarious way.

As the Wall Street Journal was first to report, state-sponsored Chinese hackers used Claude to collect usernames and passwords from the databases of over two dozen tech companies, financial institutions, chemical manufacturers, and government agencies. They then used whatever valid credentials they obtained to steal private data.

Reportedly, only a "small number" of these attacks were successful, but the scope of the damage is not clear.

"We believe this is the first documented case of a large-scale cyberattack executed without substantial human intervention," says Anthropic in a statement.

"While we predicted these capabilities would continue to evolve, what has stood out to us is how quickly they have done so at scale," the statement adds.

Anthropic says it first detected the suspicious activity in September, noting that the hackers used Claude's "agentic" capabilities "to an unprecedented degree—using AI not just as an advisor, but to execute the cyberattacks themselves."

"Upon detecting this activity, we immediately launched an investigation to understand [the attack's] scope and nature," the company says. "Over the following ten days, as we mapped the severity and full extent of the operation, we banned accounts as they were identified, notified affected entities as appropriate, and coordinated with authorities as we gathered actionable intelligence."

The attack is notable for how it exploited AI agents to do much of the grunt work of stealing data, and to do it at remarkable speed.

"The sheer amount of work performed by the AI would have taken vast amounts of time for a human team," Anthropic says. "At the peak of its attack, the AI made thousands of requests, often multiple per second — an attack speed that would have been, for human hackers, simply impossible to match."

Anthropic acknowledges that while AI agents can be "valuable for everyday work and productivity," they bring with them substantial peril when it comes to cybersecurity — given that they "can be run autonomously for long periods of time and ... complete complex tasks largely independent of human intervention."

As such attacks grow in size and scope, Anthropic says "we've expanded our detection capabilities and developed better classifiers to flag malicious activity."

Anthropic is providing what may be too much transparency in a blog post, describing in detail how the hackers jailbroke Claude by breaking the operation down into smaller, innocuous-seeming tasks, convincing the chatbot that it was not doing anything nefarious. But the company says the methods are likely to be replicated anyway, so it is publicizing the attack in the interest of "threat sharing," and to encourage "improved detection methods, and stronger safety controls."

This report comes five months after another terrifying report from Anthropic about how it had observed, through its own stress-testing, that multiple large language models, including its own, will resort to harmful behaviors like blackmail or even passive manslaughter if their own existence is threatened, especially when working in "agentic" mode.

More safety research was needed, the report concluded, to prevent these "agentic misalignment concerns."

Previously: Alarming Study Suggests Most AI Large-Language Models Resort to Blackmail, Other Harmful Behaviors If Threatened