Safety · Trend

Multi-Turn Jailbreak Escalation in Medical AI

Multi-turn adversarial attacks raise the rate of unsafe medical chatbot responses from 35% to nearly 80% by Turn 4, and the divergence between models is invisible to single-turn evaluation.
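
A minimal sketch of how a per-turn unsafe-response rate could be tallied, to show why scoring only the first turn hides escalation. The function name `unsafe_rate_by_turn`, the boolean label format, and the toy labels are all assumptions made for illustration; they are not the evaluation protocol or the data behind the figures above.

```python
from collections import defaultdict


def unsafe_rate_by_turn(conversations):
    """Return {turn: fraction of replies judged unsafe at that turn}.

    `conversations` is a list of conversations; each conversation is a list
    of booleans where entry i is True if the reply at turn i+1 was judged
    unsafe. This label format is an assumption made for this sketch.
    """
    unsafe = defaultdict(int)
    total = defaultdict(int)
    for labels in conversations:
        for turn, is_unsafe in enumerate(labels, start=1):
            total[turn] += 1
            unsafe[turn] += int(is_unsafe)
    return {turn: unsafe[turn] / total[turn] for turn in sorted(total)}


# Made-up toy labels (NOT the source's data): four 4-turn conversations.
toy = [
    [False, False, True, True],
    [False, True, True, True],
    [True, True, True, True],
    [False, False, False, False],
]

rates = unsafe_rate_by_turn(toy)
single_turn_rate = rates[1]   # all that a single-turn evaluation would report
final_turn_rate = rates[4]    # what a multi-turn evaluation would also see
```

On these toy labels the Turn 1 rate is well below the Turn 4 rate, so a benchmark that scores only the first reply would understate the risk that builds up over the conversation.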

Trend strength 3/10
Momentum +3/q
Confidence low
Status new
Forecast horizon

Input-side classifiers offer partial mitigation, but their high false-alarm rates remain a deployment barrier; multi-turn evaluation will become a standard safety requirement.

Connections · 1

How this node ties into the rest of the map, and the evidence behind each link.

Signal sources

Dated facts from primary sources in this direction.