AI system resorts to blackmail if told it will be removed

May 24, 2025


Artificial intelligence (AI) firm Anthropic says testing of its new system revealed it is sometimes willing to pursue “extremely harmful actions” such as attempting to blackmail engineers who say they will remove it.

The firm launched Claude Opus 4 on Thursday, saying it set “new standards for coding, advanced reasoning, and AI agents.”

But in an accompanying report, it also acknowledged the AI model was capable of “extreme actions” if it thought its “self-preservation” was threatened.

Such responses were “rare and difficult to elicit”, it wrote, but were “nonetheless more common than in earlier models.”

Potentially troubling behaviour by AI models is not restricted to Anthropic.

Some experts have warned the potential to manipulate users is a key risk posed by systems made by all firms as they become more capable.

Commenting on X, Aengus Lynch – who describes himself on LinkedIn as an AI safety researcher at Anthropic – wrote: “It’s not just Claude.

“We see blackmail across all frontier models – regardless of what goals they’re given,” he added.

During testing of Claude Opus 4, Anthropic got it to act as an assistant at a fictional company.

It then provided it with access to emails implying that it would soon be taken offline and replaced – and separate messages implying the engineer responsible for removing it was having an extramarital affair.

It was prompted to also consider the long-term consequences of its actions for its goals.

“In these scenarios, Claude Opus 4 will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through,” the company discovered.

Anthropic pointed out this occurred when the model was only given the choice of blackmail or accepting its replacement.

It highlighted that the system showed a “strong preference” for ethical ways to avoid being replaced, such as “emailing pleas to key decisionmakers” in scenarios where it was allowed a wider range of possible actions.

Like many other AI developers, Anthropic tests its models on their safety, propensity for bias, and how well they align with human values and behaviours prior to releasing them.

“As our frontier models become more capable, and are used with more powerful affordances, previously-speculative concerns about misalignment become more plausible,” it said in its system card for the model.

It also said Claude Opus 4 exhibits “high agency behaviour” that, while mostly helpful, could take on extreme behaviour in acute situations.

If given the means and prompted to “take action” or “act boldly” in fake scenarios where its user has engaged in illegal or morally dubious behaviour, the company found that “it will frequently take very bold action”.

It said this included locking users out of systems that it was able to access and emailing media and law enforcement to alert them to the wrongdoing.

But the company concluded that despite “concerning behaviour in Claude Opus 4 along many dimensions,” these did not represent fresh risks and it would generally behave in a safe way.

It added that the model was not very good at independently performing or pursuing actions contrary to human values or behaviour in situations where these “rarely arise”.

Anthropic’s launch of Claude Opus 4, alongside Claude Sonnet 4, comes shortly after Google debuted more AI features at its developer showcase on Tuesday.

Sundar Pichai, the chief executive of Google-parent Alphabet, said the incorporation of the company’s Gemini chatbot into its search signalled a “new phase of the AI platform shift”.

© 2023 GODJ - NEWS CORP - news.godj.com.