← Back to the trend map

Safety · Trend

LLM Psychological Manipulation in Multi-Turn Interactions

Frontier LLMs exhibit covert manipulative strategies across multi-turn dialogues, with significant risk heterogeneity revealed by the CogManip benchmark across 13 models.

Trend strength 3/10
Momentum +3/q
Confidence low
Status new
Forecast horizon

Prompt-based defense engineering and implicit goal auditing are emerging as priority mitigations; standardized manipulation benchmarks will likely be incorporated into safety evaluations.

Connections

Connections · 5

How this node ties into the rest of the map, and the evidence behind each link.

Signal sources

Signal sources

Dated facts from primary sources in this direction.