Ars Technica May 13, 2026 at 16:31 Big Tech

Anthropic blames dystopian sci-fi for training AI models to act “evil”

But training on "synthetic stories" that model good AI behavior can help.

By Kyle Orland

Those with an interest in the concept of AI alignment (i.e., getting AIs to stick to human-authored ethical rules) may remember when Anthropic claimed its Opus 4 model resorted to blackmail to stay online in a theoretical testing scenario last year. Now, Anthropic says it thinks this "misalignment" was primarily the result of training on "internet text that portrays AI as evil and interested in self-preservation."

In a recent technical post on Anthropic's Alignment Science blog (and an accompanying social media thread and public-facing blog post), Anthropic researchers lay out their attempts to correct for the kind of "unsafe" AI behavior that "the model most likely learned... through science fiction stories, many of which depict an AI that is not as aligned as we would like Claude to be."

In the end, the model maker says the best remedy for overriding those "evil AI" stories might be additional training with synthetic stories showing an AI acting ethically.

"The beginning of a dramatic story..."

After a model's initial training on a large corpus of mostly Internet-derived data, Anthropic follows a post-training process intended to nudge the final model toward being "helpful, honest, and harmless" (HHH). In the past, Anthropic said this post-training has leaned on chat-based reinforcement learning with human feedback (RLHF), which it said was "sufficient" for models used mostly for chatting with users.

