CISOs relying on LLM runtime guardrails and official safety scores when making security decisions about their organizations’ AI usage and model selection are due for a wake-up call. According to a new study from Cisco, frontier models from OpenAI, Anthropic, Google, xAI, and Amazon have significantly worse risk profiles when pressured in multi-turn attacks compared…
Tag: guardrails
AI, Global Security News
ClaudeBleed Vulnerability Lets Hackers Hijack Claude Chrome Extension to Steal Data
The ClaudeBleed vulnerability allows hackers to bypass Claude for Chrome guardrails to exfiltrate private Google Drive and Gmail data.
AI, Global Security News
Source Code Leaks Highlight Lack of Supply Chain Oversight
Or, why the software supply chain should be treated as critical infrastructure with guardrails built in at every layer.
AI, Global Security News, Government & Policy, Risk Management
Anthropic announces think tank to examine AI’s effect on economy and society
Fresh from battling the US Department of Defense (DoD) over AI guardrails, Anthropic has returned this week with a new initiative: the company is founding a think tank, the Anthropic Institute, “to confront the most significant challenges that powerful AI will pose to our societies.” Headed by Anthropic co-founder Jack Clark, who will take up…
AI, Global Security News, Network Security
Researchers Discover Major Security Gaps in LLM Guardrails
Palo Alto Networks’ Unit 42 has developed a successful attack to bypass safety guardrails in popular generative AI tools.
AI, Global Security News
What’s Really at Stake in the Fight Between Anthropic and the Pentagon
The feud goes beyond AI guardrails and revolves around the dream of the nascent technology’s future.
AI, Global Security News
Trump Will End Government Use of Anthropic’s AI Models
Move follows weeks of tension between Pentagon and Anthropic over AI guardrails.
AI, Global Security News, Risk Management
Elon Musk Pushes AI to Be ‘Unhinged,’ Former Employees Say
As OpenAI, Anthropic, and Google race to fortify their AI guardrails, Elon Musk appears to be loosening his. Former xAI insiders say the billionaire is pushing to make his chatbot “more unhinged,” framing safety measures as censorship rather than protection. According to employees who spoke anonymously, the company’s dedicated safety function has effectively been dismantled,…
AI, Artificial Intelligence, Generative AI, Security, Cybersecurity, Data Breaches, Exploits, Global Security News, Risk Management
Single prompt breaks AI safety in 15 major language models
A single benign-sounding prompt can systematically strip safety guardrails from major language and image models, raising fresh questions about the durability of AI alignment when models are customized for enterprise use, according to Microsoft research. The technique, dubbed GRP-Obliteration, weaponizes a common AI training method called Group Relative Policy Optimization, normally used to make models…
