Capabilities · Trend

AI agents

Goal-directed systems that plan, use tools and act across software are the defining frontier of 2025–2026.

Trend strength 10/10

Momentum +3/q

Confidence high

Status rising

Forecast horizon

Agents move from demos to delegated work inside real workflows; reliability and oversight become the bottleneck.

Connections · 8

How this node ties into the rest of the map, and the evidence behind each link.

to · requires 8/10

Reasoning models

Agents lean on deliberate reasoning to plan multi-step actions.

to · applies to 8/10

Autonomous coding

Software is the first domain where agents act end-to-end.

to · requires 7/10

Long context & memory

Acting over long horizons needs persistent memory.

to · uses 7/10

Test-time compute

Agents spend inference compute to search and self-correct.

from · enables 7/10

Collapsing inference cost

Cheap inference makes long agentic runs affordable.

to · tracked by 6/10

Capability evaluations

Autonomy raises the stakes for pre-deployment evaluation.

from · tracked by 6/10

Capability evaluations

Agentic autonomy is the hardest thing to evaluate.

from · tracked by 6/10

Labor & automation

Agentic automation reshapes task-level labor.

Signal sources

Dated facts from primary sources in this direction.

Task horizon doubling Mar 2025

The length of software tasks that AI agents can complete autonomously at 50% reliability has doubled about every 7 months, and since 2024 roughly every 3 months.
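To make the doubling claim concrete, here is a minimal sketch of what each doubling period implies over one year. It assumes clean exponential growth, which the source does not guarantee, and the function name `horizon_multiplier` is illustrative rather than taken from any source.

```python
def horizon_multiplier(months_elapsed: float, doubling_months: float) -> float:
    """Factor by which the task horizon grows after `months_elapsed`,
    assuming it doubles every `doubling_months` (an idealized model)."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 2.0 ** (months_elapsed / doubling_months)


if __name__ == "__main__":
    # Compare the two doubling periods quoted in the text over 12 months.
    for doubling in (7.0, 3.0):
        factor = horizon_multiplier(12.0, doubling)
        print(f"doubling every {doubling:g} months -> x{factor:.1f} per year")
```

Under these assumptions a 7-month doubling time gives about a 3.3x longer horizon per year, while a 3-month doubling time gives 16x, which is why the shorter period matters so much for forecasts.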

Benchmarks saturating Apr 2025

In one year, scores rose by 18.8, 48.9 and 67.3 percentage points on MMMU, GPQA and SWE-bench respectively; on SWE-bench, the share of real-world software issues solved jumped from 4.4% to 71.7%.

Stanford HAI: AI Index 2025
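The AI Index figures above are gains in percentage points, not relative growth. A short sketch of the difference, using only the 4.4% and 71.7% values quoted in the text (the helper names are illustrative):

```python
def percentage_point_gain(before_pct: float, after_pct: float) -> float:
    """Absolute change between two percentages, in percentage points."""
    return after_pct - before_pct


def relative_multiple(before_pct: float, after_pct: float) -> float:
    """How many times larger the new percentage is than the old one."""
    if before_pct == 0:
        raise ValueError("before_pct must be non-zero")
    return after_pct / before_pct


if __name__ == "__main__":
    before, after = 4.4, 71.7  # solve rates quoted in the text
    print(f"gain: {percentage_point_gain(before, after):.1f} points")
    print(f"multiple: x{relative_multiple(before, after):.1f}")
```

The point gain works out to 67.3, matching the third figure in the list, while the same change expressed as a multiple is roughly 16 times the starting solve rate.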

Autonomous coding 2025–2026

On SWE-bench Verified (500 real GitHub issues), autonomous coding agents reached ~80–86% by late 2025, up from under 50% in early 2025.