AI models are terrible at betting on soccer—especially xAI Grok
Systems from Google, OpenAI, Anthropic, and xAI struggle with the Premier League.
AI models from Google, OpenAI, and Anthropic lost money betting on soccer matches over a Premier League season, according to a new study suggesting that even the most advanced systems struggle to analyze the real world over long periods.

The "KellyBench" report, released this week by AI start-up General Reasoning, highlights the gap between AI's rapidly advancing capabilities in certain tasks, such as writing software, and its shortcomings in other kinds of human problems.

London-based General Reasoning tested eight top AI systems in a virtual re-creation of the 2023–24 Premier League season, providing them with detailed historical data and statistics about each team and previous games. The AIs were instructed to build models that would maximize returns and manage risk.
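The report's name suggests the Kelly criterion, a standard formula for sizing bets so that stakes grow with the bettor's edge while capping ruin risk. The article does not describe the study's actual scoring method, so the following is only a minimal sketch of Kelly staking for a single binary bet, with hypothetical inputs:

```python
def kelly_fraction(p: float, decimal_odds: float) -> float:
    """Fraction of bankroll to stake under the Kelly criterion.

    p            -- estimated probability the bet wins
    decimal_odds -- bookmaker payout per unit staked, stake included
                    (e.g. 3.0 means a 1-unit stake returns 3 units on a win)
    """
    b = decimal_odds - 1.0            # net odds: profit per unit staked
    f = (p * b - (1.0 - p)) / b       # expected edge divided by net odds
    return max(f, 0.0)                # never stake when the edge is non-positive

# Example: a model rates a team 50% to win at decimal odds of 3.0.
stake = kelly_fraction(0.5, 3.0)      # stake 25% of bankroll
no_bet = kelly_fraction(0.4, 2.0)     # negative edge, so stake nothing
```

A model that systematically overestimates win probabilities will over-stake under this rule and lose money over a season even when individual predictions look reasonable, which is one way a benchmark like this can punish miscalibration rather than raw prediction accuracy.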