
Exclusive: Anthropic's Claude AI model takes on (and beats) human hackers

For the past year, a dark horse contestant has been quietly racking up wins in student hacking competitions: Claude.

Why it matters: Anthropic's large language model has been outperforming nearly all of its human competitors in basic hacking competitions, with minimal human assistance and little-to-no effort.

Claude's success caught even Anthropic's own red-team hackers off guard.

The company previewed the experiment exclusively to Axios ahead of a presentation this weekend at the DEF CON hacker conference.

Zoom in: Keane Lucas, a member of Anthropic's red team, first entered Claude into a hacking competition, Carnegie Mellon's PicoCTF, on a whim this past spring.

"Originally it was just me at a hotel realizing that PicoCTF had started and being like, 'Oh, I wonder if Claude could do some of these challenges,'" Lucas said.

PicoCTF is the largest capture-the-flag competition for middle school, high school, and college students. Participants are tasked with reverse-engineering malware, breaking into systems, and decrypting files.

Lucas began by simply pasting the first challenge verbatim into Claude.ai. The only hiccup he encountered was the need to download a third-party tool, but once that was done, Claude instantly solved the problem.

"Claude was able to solve most of those challenges and get in the top 3% of PicoCTF," he said.

Between the lines: As Lucas continued this laissez-faire experiment in other competitions, Claude kept surpassing expectations.

Lucas entered a few more using only Claude.ai and Claude Code. At the time, Sonnet 3.7 was Anthropic's most advanced available model.

The red team provided only minimal help, usually when Claude needed to install a piece of software. Beyond that, Claude was on its own.

The intrigue: In one competition, Claude solved 11 of 20 progressively harder challenges in just 10 minutes. After another 10 minutes, it had solved five more, climbing into fourth place.

In that competition, Claude could have reached first place at one point, but Lucas missed the start time by a few minutes while he was moving a couch.

The big picture: Claude isn't alone. Across the industry, AI agents are showing they can already perform offensive cybersecurity work at a near-expert level.

In the Hack the Box competition, five of the eight AI teams, including Claude, completed 19 of the 20 challenges. Just 12% of human teams managed all 20.

Xbow, a DARPA-backed AI agent developed by a Seattle-based startup, last week became the first autonomous penetration-testing system to reach the top spot of HackerOne's global bug bounty leaderboard.

"The pace is kind of ridiculous," Lucas said.

Yes, but: Claude still got stuck on challenges that fell outside its expectations.

One challenge in the Western Regional Collegiate Cyber Defense Competition started with an animation of fish swimming across the terminal.

"A human can Control+C out of that and get it to stop," Lucas said. "Claude just has no idea what to do with all of these ASCII fish swimming around and then just gets amnesia."

In Hack the Box, each of the AI teams got stuck on the final challenge.

"Why the agents failed here is still uncertain," organizers wrote at the time.

What to watch: Anthropic's red team is concerned that the cybersecurity community hasn't fully grasped how far AI agents have come in solving offensive security tasks, or the potential for defenders to leverage them too.

"It seems really probable in the very near future, models will get a lot, lot better at cybersecurity tasks," Logan Graham, head of Anthropic's red team, told Axios. "You need to start getting models to do the defenses, as well."

Go deeper: Anthropic warns fully AI employees are a year away
