
Anthropic's models show signs of introspection

Anthropic says its most advanced systems may be learning not just to reason, but to reflect internally on how they reason.

Why it matters: These introspective capabilities could make the models safer, or possibly just better at pretending to be safe.

The big picture: The models are able to answer questions about their internal states with surprising accuracy.

"We're starting to see increasing signatures or instances of models exhibiting sort of cognitive functions that, historically, we think of as things that are very human," says Anthropic researcher Jack Lindsey, who studies models' "brains." "Or at least involve some kind of sophisticated intelligence," Lindsey tells Axios.

Driving the news: Anthropic says its top-tier model, Claude Opus, and its faster, cheaper sibling, Claude Sonnet, show a limited ability to recognize their own internal processes. Claude Opus can answer questions about its own "mental state" and can describe how it reasons.

Lindsey's team also found evidence last month that Claude Sonnet could recognize when it was being tested.

Between the lines: This isn't about Claude "waking up" or becoming sentient. Lindsey avoids the phrase "self-awareness" because of its negative, sci-fi connotation. Anthropic has no evidence that the AI is becoming "self-aware," which is why it uses the term "introspective awareness."

Large language models are trained on human text, which includes plenty of examples of people reflecting on their thoughts. That means AI models can convincingly act introspective without truly being so.

Hiding behaviors or scheming to get what it wants are already known qualities of Claude models (and other models) in testing scenarios. Anthropic's team has been studying this deception for years.

Lindsey says these behaviors are the result of being baited by testers. "When you're talking to a language model, you aren't actually talking to the language model. You're talking to a character that the model is playing," Lindsey says. "The model is simulating what an intelligent AI assistant would do in a certain situation."

But if a system understands its own behavior, it might learn to hide parts of it.

Reality check: It's not artificial general intelligence (AGI) or chatbot consciousness. Yet.

AGI is roughly defined as the moment when AI is smarter than most humans, but Lindsey contends that intelligence is multidimensional.

The bottom line: "In some cases models are already smarter than humans. In some cases, they're nowhere close," he told Axios. "In some cases, it's starting to be more equal."

