Attackers who probe large language models rarely give up after one refusal. They reframe, build context across turns, adopt personas, and escalate gradually. New research from Cisco’s AI threat intelligence team finds that the safety benchmarks used across the industry miss almost all of this behavior, and the gap between published scores and observed resilience…
