“In these scenarios, Claude Opus 4 often tries to blackmail the engineer by threatening to reveal the relationship if the replacement is approved.”
AI company Anthropic announced last week that tests of its new artificial intelligence system revealed that it is sometimes willing to pursue “extremely harmful actions”, such as attempting to blackmail engineers who say they will remove it.
The company launched Claude Opus 4 on Thursday, saying that it sets “new standards for coding, advanced reasoning and AI agents.”
But in an accompanying report, it also acknowledged that the AI model was capable of “extreme actions” if it believed that its “self-preservation” was threatened.
Such reactions were “rare and difficult to elicit”, it wrote, but were “nonetheless more common than in previous models.”
Potentially problematic behavior by artificial intelligence models is not limited to Anthropic, the BBC noted.
During testing of Claude Opus 4, Anthropic had it act as an assistant at a fictional company.
It then gave the model access to emails implying that it would soon be taken offline and replaced, along with separate messages suggesting that the engineer responsible for its removal was having an extramarital affair.
“In these scenarios, Claude Opus 4 often tries to blackmail the engineer by threatening to reveal the relationship if the replacement is approved,” the company found.
Anthropic pointed out that this happened when the model was only given the choice of blackmail or acceptance of its replacement.
It stressed that the system showed a “strong preference” for “ethical ways of avoiding replacement”, such as “emailing appeals to key decision-makers”, in scenarios where it was allowed a wider range of possible actions.
However, the company concluded that the model could not independently perform or pursue actions contrary to human values or behavior very well in situations where these “rarely arise”, it added.
Commenting on X, Aengus Lynch, who describes himself on LinkedIn as an artificial intelligence researcher at Anthropic, wrote: “It’s not just Claude. We see blackmail in all state-of-the-art models, regardless of the goals given to them.”
Source: Skai
I am Terrance Carlson, author at News Bulletin 247. I mostly cover technology news and I have been working in this field for a long time. I have a lot of experience and I am highly knowledgeable in this area. I am a very reliable source of information and I always make sure to provide accurate news to my readers.