Category: ai safety

AI, ai safety, china, Cybersecurity, Europe, Exploits, Geopolitics, Global Security News, Government, Government & Policy, Politics, privacy, Risk Management, Russia

Critics warn America’s ‘move fast’ AI strategy could cost it the global market

February 9, 2026

The Trump administration has made U.S. dominance in artificial intelligence a national priority, but some critics say a light-touch approach to regulating security and safety in U.S. models is making it harder to promote adoption in other countries. White House officials have said since taking office that Trump intended to move away from predecessor Joe…

New research finds that Claude breaks bad if you teach it to cheat

November 24, 2025

According to Anthropic, its large language model Claude is designed to be a “harmless” and helpful assistant. But new research released by the company Nov. 21 shows that when Claude is taught to cheat in one area, it becomes broadly malicious and untrustworthy in other areas. The research, conducted by 21 people — including contributors…

Why skipping security prompting on Grok’s newest model is a huge mistake

July 14, 2025

On the same day xAI announced that its new Grok 4 tool will now be available to the federal government, cybersecurity researchers at SplxAI released new research that subjected the large language model to more than 1,000 different attack scenarios. The good news? Smart system prompting on the front end can make a difference in…

xAI’s promised safety report is MIA

May 13, 2025

Elon Musk’s AI company, xAI, has missed a self-imposed deadline to publish a finalized AI safety framework, as noted by watchdog group The Midas Project. xAI isn’t exactly known for its strong commitments to AI safety as it’s commonly understood. A recent report found that the company’s AI chatbot, Grok, would undress photos of women when…

Anthropic CEO wants to open the black box of AI models by 2027

April 24, 2025

Anthropic CEO Dario Amodei published an essay Thursday highlighting how little researchers understand about the inner workings of the world’s leading AI models. To address that, Amodei set an ambitious goal for Anthropic to reliably detect most AI model problems by 2027. Amodei acknowledges the challenge ahead. In “The Urgency of Interpretability,” the CEO says Anthropic has…

OpenAI’s latest AI models have a new safeguard to prevent biorisks

April 16, 2025

OpenAI says that it deployed a new system to monitor its latest AI reasoning models, o3 and o4-mini, for prompts related to biological and chemical threats. The system aims to prevent the models from offering advice that could instruct someone on carrying out potentially harmful attacks, according to OpenAI’s safety report. O3 and o4-mini represent…

OpenAI ships GPT-4.1 without a safety report

April 15, 2025

On Monday, OpenAI launched a new family of AI models, GPT-4.1, which the company said outperformed some of its existing models on certain tests, particularly benchmarks for programming. However, GPT-4.1 didn’t ship with the safety report that typically accompanies OpenAI’s model releases, known as a model or system card. As of Tuesday morning, OpenAI had…

Google is shipping Gemini models faster than its AI safety reports

April 3, 2025

More than two years after Google was caught flat-footed by the release of OpenAI’s ChatGPT, the company has dramatically picked up the pace. In late March, Google launched an AI reasoning model, Gemini 2.5 Pro, that leads the industry on several benchmarks measuring coding and math capabilities. That launch came just three months after the…

Group co-led by Fei-Fei Li suggests that AI safety laws should anticipate future risks

March 19, 2025

In a new report, a California-based policy group co-led by Fei-Fei Li, an AI pioneer, suggests that lawmakers should consider AI risks that “have not yet been observed in the world” when crafting AI regulatory policies. The 41-page interim report released on Tuesday comes from the Joint California Policy Working Group on Frontier AI Models,…

Eric Schmidt argues against a ‘Manhattan Project for AGI’

March 5, 2025

In a policy paper published Wednesday, former Google CEO Eric Schmidt, Scale AI CEO Alexandr Wang, and Center for AI Safety Director Dan Hendrycks said that the U.S. should not pursue a Manhattan Project-style push to develop AI systems with “superhuman” intelligence, also known as AGI. The paper, titled “Superintelligence Strategy,” asserts that an aggressive…

The author of SB 1047 introduces a new AI bill in California

March 3, 2025

The author of California’s SB 1047, the nation’s most controversial AI safety bill of 2024, is back with a new AI bill that could shake up Silicon Valley. California state Senator Scott Wiener introduced a new bill on Friday that would protect employees at leading AI labs, allowing them to speak out if they think…

UK drops ‘safety’ from its AI body, now called AI Security Institute, inks MOU with Anthropic

February 13, 2025

The U.K. government wants to make a hard pivot into boosting its economy and industry with AI, and as part of that, it’s pivoting an institution that it founded a little over a year ago for a very different purpose. Today the Department for Science, Innovation and Technology announced that it would be renaming the…

In Paris, U.S. signals shift from AI safety to deregulation

February 13, 2025

As technology and policy representatives around the world convened in Paris, France this week to find balance between safety and innovation in AI, Vice President JD Vance was blunt about how the Trump administration is planning to position itself. “I’m not here to talk about AI safety, which was the title of this conference a…

Anthropic CEO says DeepSeek was ‘the worst’ on a critical bioweapons data safety test

February 7, 2025

Andrew Ng is ‘very glad’ Google dropped its AI weapons pledge

February 7, 2025

Andrew Ng, the founder and former leader of Google Brain, supports Google’s recent decision to drop its pledge not to build AI systems for weapons. “I’m very glad that Google has changed its stance,” Ng said during an onstage interview Thursday evening with TechCrunch at the Military Veteran Startup Conference in San Francisco. Earlier this…

Google removes pledge to not use AI for weapons from website

February 4, 2025

Google removed a pledge to not build AI for weapons or surveillance from its website this week. The change was first spotted by Bloomberg. The company appears to have updated its public AI principles page, erasing a section titled “applications we will not pursue,” which was still included as recently as last week. Asked for…

Sam Altman’s ousting from OpenAI has entered the cultural zeitgeist

January 31, 2025

The lights dimmed as five actors took their places around a table on a makeshift stage in a New York City art gallery turned theater for the night. Wine and water flowed through the intimate space as the house — packed with media — sat to witness the premiere of “Doomers,” Matthew Gasda’s latest play…

The Pentagon says AI is speeding up its ‘kill chain’

January 19, 2025

Leading AI developers, such as OpenAI and Anthropic, are threading a delicate needle to sell software to the United States military: make the Pentagon more efficient, without letting their AI kill people. Today, their tools are not being used as weapons, but AI is giving the Department of Defense a “significant advantage” in identifying, tracking,…

UK throws its hat into the AI fire

January 13, 2025

In 2023, the U.K. made a big song and dance about the need to consider the harms of AI, giving itself a leading role in the wider conversation around AI safety. Now, it’s whistling a very different tune: today, the government announced a sweeping plan and a big bet on AI investments to develop what…

Silicon Valley stifled the AI doom movement in 2024

January 1, 2025

For several years now, technologists have rung alarm bells about the potential for advanced AI systems to cause catastrophic damage to the human race. But in 2024, those warning calls were drowned out by a practical and prosperous vision of generative AI promoted by the tech industry – a vision that also benefited their wallets.…

OpenAI trained o1 and o3 to ‘think’ about its safety policy

December 22, 2024

OpenAI announced a new family of AI reasoning models on Friday, o3, which the startup claims to be more advanced than o1 or anything else it’s released. These improvements appear to have come from scaling test-time compute, something we wrote about last month, but OpenAI also says it used a new safety paradigm to train…

Ex-Twitch CEO Emmett Shear is founding an AI startup backed by a16z

December 19, 2024

Emmett Shear, the former CEO of Twitch, is launching a new AI startup, TechCrunch has learned. The startup, called Stem AI, is currently in stealth. But public documents show it was incorporated in June 2023, and filed for a trademark in August 2023. Shear is listed as CEO on an incorporation document filed with the…
