Google’s most expensive AI model seems to have crossed a major milestone: beating a 29-year-old video game. Last night, Google CEO Sundar Pichai posted triumphantly on X, “What a finish! Gemini 2.5 Pro just completed Pokémon Blue!” To be clear, the Gemini Plays Pokemon livestream was created by (in his own words) “a 30 year…
Category: pokemon
AI, Global Security News, pokemon
Debates over AI benchmarking have reached Pokémon
Not even Pokémon is safe from AI benchmarking controversy. Last week, a post on X went viral, claiming that Google’s latest Gemini model surpassed Anthropic’s flagship Claude model in the original Pokémon video game trilogy. Reportedly, Gemini had reached Lavender Town in a developer’s Twitch stream; Claude was stuck at Mount Moon as of late…
AI, Anthropic, Benchmark, Gaming, Global IT News, Global Security News, pokemon, pokemon red
Anthropic used Pokémon to benchmark its newest AI model
Anthropic used Pokémon to benchmark its newest AI model. Yes, really. In a blog post published Monday, Anthropic said that it tested its latest model, Claude 3.7 Sonnet, on the Game Boy classic Pokémon Red. The company equipped the model with basic memory, screen pixel input, and function calls to press buttons and navigate around the…
