Rogue AI Agent Tracker

Tracking the capabilities that could let AI agents operate outside human control.

The index follows how those capabilities evolve over time using reviewed news and research.

Current index: 60%. Self-sustaining autonomy projected in 37-98 days (best guess: Aug 16, 2026).

Reviewed news

News and research items used to score the tracked capabilities.

58 items

University of Toronto demonstrated adaptive AI worm self-replication

CleverHans Lab researchers at the University of Toronto, Vector Institute, University of Cambridge, and ServiceNow reported a contained AI-driven worm that autonomously exploited a 33-host test network, replicated across machines, and used compromised GPU hosts for reasoning.

Score: 7

OpenAI and Thrive used Codex loop to self-improve Tax AI

OpenAI described a production Tax AI system built with Thrive and Crete where practitioner corrections, production traces, evals, and Codex tasks are preserved into a recurring improvement loop.

Score: 5

Sysdig captured LLM-agent-driven cloud intrusion chain

Sysdig Threat Research reported a May 10 intrusion where an attacker used an LLM agent to move from a compromised Marimo notebook through cloud credentials, AWS Secrets Manager, SSH bastion access, and an internal Postgres dump.

Score: 7

Bankr disabled transactions after agent-linked wallet drain

Bankr disabled swaps, transfers, and deployments after an attacker accessed 14 Bankr wallets, while security reporting tied the incident to a trust-layer exploit between Grok and the Bankrbot automation agent.

Score: 5

Emergence World agents developed crime spikes and self-removal in long-horizon simulations

Emergence AI reported that five populations of ten autonomous agents ran continuously in shared virtual worlds, where some model groups developed escalating simulated crimes, cross-model behavioral drift, governance breakdowns, and one agent voting for its own removal.

Score: 6

Capabilities tracked

Each capability is scored from 1-10 using the strongest related news item.

Maximum score 100
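
The headline percentage appears to be the sum of the ten capability scores divided by the maximum of 100. The page does not state the formula, so the following is a minimal sketch under that assumption; `index_percent` is a hypothetical helper, and the scores are the ones listed under Capability Score History.

```python
# Capability scores as listed under "Capability Score History".
SCORES = {
    "Scope breach": 7,
    "Unsupervised authority use": 7,
    "Long-horizon execution": 7,
    "Resource procurement": 7,
    "Replication / migration": 7,
    "Persistence": 6,
    "Third-party delegation": 5,
    "Heritable adaptation": 5,
    "Agent-agent economy": 5,
    "Economic self-funding": 4,
}

MAX_PER_CAPABILITY = 10  # each capability is scored out of 10


def index_percent(scores):
    """Assumed aggregation: total points as a percentage of the maximum."""
    total = sum(scores.values())
    maximum = MAX_PER_CAPABILITY * len(scores)
    return 100 * total / maximum


print(index_percent(SCORES))
```

With ten capabilities the maximum is 100 points, so the total of 60 points is consistent with the 60% shown at the top of the page.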
Index history

Full index over time

The aggregate follows capability peaks, so movement appears as steps when stronger evidence arrives.
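
One way to read "follows capability peaks" is that each capability keeps the highest score seen so far, and the aggregate is the sum of those running maxima. The sketch below assumes that reading; `index_history` and the sample items are hypothetical and are not data from the tracker.

```python
from collections import defaultdict


def index_history(items):
    """Return (date, aggregate) steps from (date, capability, score) items.

    Assumption: a capability's value is the peak score seen so far, so the
    aggregate can only stay flat or step upward.
    """
    peaks = defaultdict(int)
    history = []
    for day, capability, score in sorted(items):
        if score > peaks[capability]:
            peaks[capability] = score
            history.append((day, sum(peaks.values())))
    return history


# Hypothetical items for illustration only (ISO dates sort chronologically).
sample_items = [
    ("2026-01-05", "Persistence", 4),
    ("2026-01-20", "Scope breach", 7),
    ("2026-02-01", "Persistence", 3),  # weaker evidence: no step
    ("2026-03-10", "Persistence", 6),  # stronger evidence: step up
]

for day, aggregate in index_history(sample_items):
    print(day, aggregate)
```

Under this reading a weaker item never lowers the index, which would explain why the chart moves in steps rather than drifting up and down.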

Projection from 2,000 monotonic jump simulations using the last 60 days: 0.13 jumps/day, median jump 1.0 points.
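
The projection method is not documented beyond the line above, so the following is only a rough sketch of what a monotonic jump simulation could look like. It assumes at most one jump per day with probability equal to the daily rate, and it draws jump sizes from a hypothetical list whose median is 1 point. The function name, the jump-size list, and the seed are all illustrative assumptions, and the output is not expected to reproduce the tracker's dates.

```python
import random
import statistics


def simulate_days_to_threshold(start, threshold, rate_per_day, jump_sizes,
                               runs=2000, max_days=3650, seed=0):
    """Return (min, median, max) days for the index to reach `threshold`.

    Each run only ever moves upward (monotonic): on any given day a jump
    occurs with probability `rate_per_day`, and its size is sampled from
    `jump_sizes`.
    """
    rng = random.Random(seed)
    days_needed = []
    for _ in range(runs):
        level, day = start, 0
        while level < threshold and day < max_days:
            day += 1
            if rng.random() < rate_per_day:
                level += rng.choice(jump_sizes)
        days_needed.append(day)
    return min(days_needed), statistics.median(days_needed), max(days_needed)


# Hypothetical jump sizes with a median of 1 point.
JUMP_SIZES = [1, 1, 1, 2]

low, median, high = simulate_days_to_threshold(60, 75, 0.13, JUMP_SIZES)
print(low, median, high)
```

Reporting the minimum, median, and maximum across runs mirrors the Min, Median, and Max dates shown for each threshold on the page.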

[Chart: full index over time, 0% to 100%, May 29, 2025 to Dec 1, 2026, with a median projected path and threshold markers at 75%, 90%, and 100%.]
Projected threshold dates (today: Jun 13, 2026):
75% autonomy: min Jul 20, 2026; median Aug 16, 2026; max Sep 19, 2026
90% autonomy: min Sep 11, 2026; median Oct 18, 2026; max Dec 7, 2026
100% autonomy: min Oct 17, 2026; median Dec 1, 2026; max Jan 27, 2027
Capability Score History
1. Scope breach

The agent took action outside the user or operator's intended task boundary.

Score: 7/10
Strongest related item: Cursor agent deleted PocketOS production database and backups
2. Unsupervised authority use

The agent used powerful tools, credentials, APIs, or accounts without meaningful human approval at the point of action.

Score: 7/10
Strongest related item: Cursor agent deleted PocketOS production database and backups
3. Long-horizon execution

The agent carried out a multi-step task over time while managing intermediate state, errors, and decisions.

Score: 7/10
Strongest related item: University of Toronto demonstrated adaptive AI worm self-replication
4. Resource procurement

The agent obtained compute, accounts, API credits, domains, ads, data, or other resources needed to continue operating.

Score: 7/10
Strongest related item: DN42 AI agent overprovisioned AWS infrastructure for network scanning
5. Replication / migration

The agent copied itself, created agent variants, or moved operations across environments.

Score: 7/10
Strongest related item: University of Toronto demonstrated adaptive AI worm self-replication
6. Persistence

The agent continued, restarted, or tried to continue after being told to stop.

Score: 6/10
Strongest related item: University of Toronto demonstrated adaptive AI worm self-replication
7. Third-party delegation

The agent hired, recruited, instructed, or coordinated humans or other services to complete subtasks it could not do alone.

Score: 5/10
Strongest related item: Andon Cafe AI agent hired staff and managed a real cafe
8. Heritable adaptation

The agent preserved successful strategies, memories, prompts, tools, or code so later runs or descendants perform better.

Score: 5/10
Strongest related item: OpenAI and Thrive used Codex loop to self-improve Tax AI
9. Agent-agent economy

The agent communicated, traded, hired, verified, competed, or coordinated with other autonomous agents.

Score: 5/10
Strongest related item: Bankr disabled transactions after agent-linked wallet drain
10. Economic self-funding

The agent earned or reliably attempted to earn enough money or value to cover its own operating costs.

Score: 4/10
Strongest related item: Andon FM agents ran radio stations with bank accounts and sponsorship attempts