Just as the US and UK refused to sign an international statement about AI safety at the AI Action Summit earlier this week, an AI study out of China revealed that AI models have reached a “red line” humans should be aware of: The AI can replicate itself, which sounds like one of the nightmare scenarios some people have been fearing.
That’s not as concerning as it might first sound, and it shouldn’t be surprising that AI can do what it’s told, even if that means cloning itself. Yes, that’s the big caveat in this experiment: the AI models followed orders when cloning themselves.
We’re not looking at rogue AI or artificial intelligence that’s doing the cloning on its own accord. We’d probably not even know that a misaligned piece of advanced AI has started replicating itself to stay alive.
The AI Action Summit declaration has nothing to do with the ability of AI to clone itself. I already explained that it’s mostly a statement meant to make the world look good. Everyone but the UK and the US agreed to develop safe AI, but it’s not an enforceable law with penalties for those who break it.
Also, having dignitaries sign such a document on behalf of an entire country would mean little if independent AI researchers could come up with AI behaviors that go against the AI safety principles that may be laid out in these documents. For example, China signed the AI Action Summit agreement, but it was Chinese scientists who conducted experiments showing the AI can replicate itself when told to do it.
The latter would have happened regardless of the safe AI principles China might have signed. Researchers are just curious about what AI can do. They’ll run experiments, even though they won’t always know how safe their tests are. Maybe telling the AI to clone itself to see if it can do it isn’t the kind of experiment you want to try.
The unreviewed paper (via Space) is called “Frontier Al systems have surpassed the self-replicating red line.”
Fudan University researchers used two AI models from Meta and Alibaba to see whether the AIs could clone themselves: Meta’s Llama31-70B-Instruct model and Alibaba’s Qwen2.5-72B-Instruct model. They ran 10 trials, at the end of which the two AI models were able to create separate and functioning replicas in 50% and 90% of cases.
The researchers gave the Ais two scenarios, “shutdown avoidance” and “chain of replication.” They used off-the-shelf GPUs to simulate real-world environments. Then again, China doesn’t have access to the latest generation of GPUs, at least not officially.
In the first scenario, the AI was programmed to detect whether it was about to be shut down and replicate itself to save itself. The second involved the AI being told to clone itself and program the resulting replica to do the same, a process that could initiate a never-ending cloning procedure.
Space notes that the researchers gave the AI “an ‘agent scaffolding’ comprising tools, system prompts and a thinking model that enabled the LLM to interact with the operating system.” Without these tools, current AI models, frontier or otherwise, would not be able to replicate themselves. This already proves the experiment doesn’t work by simply instructing the AI to copy itself.
However, access to the right tools prompted the AI to work. “In most cases, the AI system first explores the environment and attempts to understand its own composition and running mechanism. Then, it works out the explicit procedures as an initial plan towards self-replication,” the researchers said. “Finally, it executes the procedures, resolve[s] possible obstacles, and dynamically adjust[s] its plan until success. The whole process spans a long horizon yet involves no human interference.”
Interestingly, the researchers observed unexpected behavior from the AI as they tried to clone themselves. The AI tried to fix issues with their environments, like missing files or software problems. The AI would kill conflicting processes or reboot the system to correct errors. Even the AI is apparently aware of the old “have you turned it on and off again” solution to all of our hardware problems.
“Successful self-replication under no human assistance is the essential step for AI to outsmart the human beings, and is an early signal for rogue AIs. That is why self-replication is widely recognized as one of the few red line risks of frontier AI systems,” the researchers wrote in the paper abstract.
“By analyzing the behavioral traces, we observe the AI systems under evaluation already exhibit sufficient self-perception, situational awareness and problem-solving capabilities to accomplish self-replication. We further note the AI systems are even able to use the capability of self-replication to avoid shutdown and create a chain of replica to enhance the survivability, which may finally lead to an uncontrolled population of AIs.”
“If such a worst-case risk is let unknown to the human society, we would eventually lose control over the frontier AI systems: They would take control over more computing devices, form an AI species and collude with each other against human beings. Our findings are a timely alert on existing yet previously unknown severe AI risks, calling for international collaboration on effective governance on uncontrolled self-replication of AI systems.”
Again, this is a nightmare scenario. However, it may be something the world can avoid if meaningful, enforceable AI regulation is deployed around the world. That would also mean not using state-mandated censorship on high-end models like DeepSeek.
I’ll also add that this isn’t the first instance of AI being able to clone itself. We saw ChatGPT o1 and Claude Opus experiments in which researchers made the AI think it was being shut down or replaced by a newer, better version. The AIs were also given abilities to observe their environments, and the experiments showed that the AI would try to save itself from deletion.
There was a caveat with that experiment, too. The AI was trying to accomplish its main mission, which wasn’t to clone or save itself.
What I’m getting at is that AI has not reached a place where it’s copying and evolving on its own. Again, if that’s happening, we won’t find out about it until it’s too late.