A Cisco evaluation of frontier LLMs found that no tested model consistently resisted multi-turn adversarial attacks, raising concerns about current AI safety assessments. The research suggests that many widely used AI safety benchmarks may underestimate real-world risk because they focus primarily on single-turn prompt evaluations rather than adaptive, iterative attacks.
Key Takeaways from Cisco’s Research…
Tag: resisted
AI, Global Security News
The U.S. Has Long Been a Nation of Inventors—and Luddites
From smashing looms to blocking AI data centers, some Americans have resisted technology’s forward march.
