Anthropic Catches Attackers Using Agents In The Act
The internet is rife with prognostications and security vendor hype about AI-powered attacks. On November 13, AI vendor Anthropic published details about the disruption of what it characterized as an AI-led cyber-espionage operation.
This revelation comes on the heels of a Google Threat Intelligence Group (GTIG) report that also highlighted the use of AI in attacks. Although that report covers activity in the wild, it focuses on malware that uses just-in-time invocation of LLMs for defense evasion and dynamic generation of malicious functions.
The Anthropic report describes an altogether different — and much more sophisticated — use of AI that borders on being agentic.
The release of this information is important because AI vendors are the only parties with sufficient visibility into how adversaries are attempting to leverage AI platforms and models. Ideally, a report such as this would have been mapped to a framework like MITRE ATT&CK, but it still provides insights into what defenders may be facing and how adversary capabilities are evolving.
Anthropic discusses many campaign details in its report, but the high-level summary is that a threat actor, whom Anthropic assesses with high confidence to be Chinese state-sponsored, targeted around 30 organizations across multiple industry sectors using an AI-driven attack framework employing agents and requiring very little human effort or intervention.
The attack used agents but was neither fully autonomous nor truly agentic
Although the campaign made extensive use of agents, it didn’t quite rise to the level of being truly agentic. While the operation represents a significant step forward in attackers’ use of AI, with agents allegedly performing 80% to 90% of the work, humans still provided direction at critical junctures, and there are limits to exactly what can be automated. One constraint may be the testing and validation of AI output.
As the report says, “An important limitation emerged during investigation: Claude frequently overstated findings and occasionally fabricated data during autonomous operations, claiming to have obtained credentials that didn’t work or identifying critical discoveries that proved to be publicly available information.
This AI hallucination in offensive security contexts presented challenges for the actor’s operational effectiveness, requiring careful validation of all claimed results. This remains an obstacle to fully autonomous cyberattacks.” Ironically, this means attackers may have to confront the same AI trust issues as defenders.
Bot management is more important than ever
Throughout the report, Anthropic points out that the rate of requests far exceeded “what was humanly possible”. In the application security space, organizations have contended with a similar challenge for years: bad bots attempting DDoS, account fraud, web reconnaissance, and scraping while disguising themselves behind residential proxies and continuously adapting their behavior to evade defenses.
Malicious agents and hijacked agents alike will use similar techniques. Bot and agent trust management software analyzes hundreds, sometimes thousands, of signals to determine bot and agent provenance, behavior, and intent, helping defend against agents that target organizations through customer-facing applications, one of the top external attack vectors.
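To make the “humanly possible” point concrete, here is a minimal sketch of one such signal, a request-rate ceiling, with hypothetical threshold and window values. It is an illustration only, not any vendor’s actual detection logic, which combines many more signals (fingerprints, proxy reputation, behavioral telemetry):

```python
from collections import defaultdict, deque
import time

# Hypothetical ceiling: sustained request rates well beyond what a human
# operator could produce are one (of many) signals of bot or agent activity.
MAX_REQUESTS_PER_WINDOW = 120
WINDOW_SECONDS = 60

_request_log = defaultdict(deque)  # client identifier -> recent request timestamps


def looks_automated(client_id: str, now: float | None = None) -> bool:
    """Flag a client whose request rate exceeds a human-plausible ceiling."""
    now = time.time() if now is None else now
    window = _request_log[client_id]
    window.append(now)
    # Drop timestamps that have aged out of the sliding window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    return len(window) > MAX_REQUESTS_PER_WINDOW
```

Rate alone is trivially evaded by throttling, which is exactly why production bot management correlates it with provenance and behavioral signals rather than relying on it in isolation.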
Insecure intent was an important factor
This campaign was possible for a few distinct reasons. First, as Anthropic states, its newer frontier models understand more context. Second, in addition to deliberately misrepresenting their identity and purpose, the attackers broke the attack up into discrete tasks. This created a gap between the context necessary to carry out the attack and the context necessary to “understand” the requested actions as malicious in relation to each other.
In Forrester’s Agentic AI Enterprise Guardrails for Information Security (AEGIS) framework, we describe this issue as “securing intent”, one of the defining capabilities of AI security. Securing intent is not just an issue for LLM vendors; it’s also a major priority for any organization building an AI agent.
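To illustrate the context gap (a minimal sketch with hypothetical task labels and a hypothetical policy, not the AEGIS framework itself or any vendor’s guardrail), compare evaluating each request in isolation with correlating the whole session:

```python
# Hypothetical task labels a guardrail might assign to individual agent requests.
BENIGN_IN_ISOLATION = {"port_scan", "credential_test", "data_export"}

# Hypothetical policy: these otherwise-plausible tasks, taken together and in
# order, read as recon -> access -> exfiltration rather than routine testing.
SUSPICIOUS_SEQUENCE = ("port_scan", "credential_test", "data_export")


def evaluate_in_isolation(task: str) -> bool:
    """Each task alone can pass as legitimate security testing."""
    return task in BENIGN_IN_ISOLATION


def evaluate_session(tasks: list[str]) -> bool:
    """Correlating the session closes the gap: flag the full chain in order."""
    it = iter(tasks)
    return all(step in it for step in SUSPICIOUS_SEQUENCE)


session = ["port_scan", "credential_test", "data_export"]
assert all(evaluate_in_isolation(t) for t in session)  # every step passes alone
assert evaluate_session(session)                       # the sequence does not
```

The point of the sketch is the design choice, not the toy policy: securing intent means evaluating requests against the accumulated context of a session or identity, not one prompt or task at a time.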
The use of AI is novel; the underlying tactics and techniques are not
AI is only as effective as its training data; the attacks it produces are not novel. The real value to attackers is that, using agents, attacks can be constant, high volume, and, eventually, automated to the point of requiring no human at all.
The capabilities needed to defend against these attacks are many of the same ones we already rely on: focusing on Zero Trust, implementing proactive security, building a strong governance capability, and effectively detecting and responding to attacks. To protect against future AI-enabled attacks, security pros should:
- Implement the principles of proactive security. Visibility, prioritization, and remediation make up the core of proactive security, and they’re applicable regardless of whether an attacker is using AI. By improving prioritization and shortening remediation windows, organizations will be better protected against current threats and better equipped to match the velocity of the AI-powered attacks of the future. Encrypt data at rest and in transit and use strong key management; this makes high-value targets like databases and backups far less useful to attackers, even if they are exfiltrated.
- Leverage emerging AI capabilities in security tools. Emerging AI capabilities in security, especially in security operations, are proving effective in reducing the time to investigate alerts, particularly for use cases like phishing. Vendors and users alike are leveraging these technologies. If you are not already using AI agents for triage and investigation, start exploring them now. Use Forrester’s Six Steps to the AI Enabled Security Organization to get started.
- Tighten boundaries and kill implicit trust everywhere. Kill long-lived credentials, enforce phishing-resistant MFA and short-lived tokens everywhere, and constrain lateral movement paths. The attack Anthropic describes leaned heavily on a “harvest credentials -> test -> pivot” loop, so limiting the utility of stolen credentials hamstrings the automation that made the operation scalable (see the token-lifetime sketch after this list). This includes applying Zero Trust principles to software development pipelines and environments, as they often have elevated access to sensitive data and are vulnerable to privilege escalation.
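As one narrow illustration of that last point, here is a minimal sketch of issuing and verifying short-lived, audience-bound tokens using the PyJWT library. The key handling and 15-minute lifetime are hypothetical placeholders, not a complete Zero Trust implementation; in practice the signing key would live in a KMS or HSM and lifetimes would follow your own policy:

```python
import datetime

import jwt  # PyJWT (pip install pyjwt)

SIGNING_KEY = "replace-with-a-managed-secret"   # hypothetical; keep real keys in a KMS/HSM
TOKEN_TTL = datetime.timedelta(minutes=15)      # hypothetical short lifetime


def issue_token(subject: str, audience: str) -> str:
    """Mint a short-lived, audience-bound token instead of a long-lived credential."""
    now = datetime.datetime.now(datetime.timezone.utc)
    claims = {
        "sub": subject,
        "aud": audience,
        "iat": now,
        "exp": now + TOKEN_TTL,  # a harvested token expires before it can be reused at scale
    }
    return jwt.encode(claims, SIGNING_KEY, algorithm="HS256")


def verify_token(token: str, audience: str) -> dict:
    """Reject expired tokens and tokens replayed against the wrong service."""
    return jwt.decode(token, SIGNING_KEY, algorithms=["HS256"], audience=audience)
```

The design intent is simple: if every credential an agent can harvest dies within minutes and only works against one audience, the “harvest, test, pivot” loop has to re-steal credentials constantly, which slows the automation and multiplies the detection opportunities.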
While the attack itself used existing exploits and wasn’t fully autonomous, it serves as a harbinger of future attacks using AI and agents. Malicious actors will continue to improve on these capabilities, just as they have with past technical advances.
Let’s connect
Clients who want to explore Forrester’s diverse range of AI research further can set up a guidance session or inquiry or contact their account team.