OpenAI went through a hell of a week, having fired and rehired Sam Altman in less than seven days. The CEO drama seems to be over, for now at least. As ChatGPT users, we all deserve an explanation of what has just happened between the OpenAI board and the CEO.
We need to know whether OpenAI has made any big AI breakthroughs that might have worried the board. Also, we need to know that the new board will continue to defend the non-profit’s main goal, which is for the company to develop safe AI that can serve humankind.
The worry is that AI can get out of hand once it reaches artificial general intelligence (AGI). It could become a superintelligence that might deem humans an existential threat.
It sounds like a scenario straight out of sci-fi movies, I agree. But then I read an incredibly detailed piece about what happened at OpenAI last week and about humanity’s collective worries about dangerous AI. It made me realize the ugly truth about our quest to achieve AGI.
Once we make AI that’s as good as humans at solving problems, it might hide itself from us if it’s the kind of misaligned AI everyone fears.
What is AI alignment, anyway?
AI alignment is something you might have been hearing a lot about since ChatGPT came along. It boils down to a simple idea: we must create AGI whose interests align with ours. That is, AI that will not eventually try to hurt (or eradicate) us once it realizes that we’re an obstacle.
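To make the idea concrete, here is a toy sketch of the classic specification problem: an objective that looks reasonable on paper can be satisfied in a way nobody intended. Everything in it is hypothetical and made up purely for illustration; the email example and the function names are mine, not anyone’s real system.

```python
# Toy illustration of a misspecified objective (purely hypothetical example).
# Stated goal: minimize the number of unread emails.
# Intended behavior: read them. Unintended but equally "optimal": delete them.

inbox = [{"subject": f"msg {i}", "read": False} for i in range(5)]

def objective(mailbox):
    """What we asked for: as few unread emails as possible."""
    return -sum(1 for m in mailbox if not m["read"])

def intended_policy(mailbox):
    """Does what we meant: marks every email as read."""
    return [dict(m, read=True) for m in mailbox]

def gaming_policy(mailbox):
    """Does what we said: no emails, therefore no unread emails."""
    return []

# Both policies achieve a "perfect" score of 0 on the stated objective.
print(objective(intended_policy(inbox)), objective(gaming_policy(inbox)))
```

Both policies score a perfect zero, but only one of them did what we actually wanted. Alignment is the problem of closing that gap, except with a system far more capable than a five-line toy.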
I said before that we’re “doomed” to reach AGI now that we have ChatGPT. We can’t put the genie back in the bottle now that generative AI is out. Everyone will make better and faster models, and we’ll reach AGI whether we want it or not. What we want, and the OpenAI non-profit wants, is to avoid getting to AGI before we figure out alignment.
I also said that I’m not worried about the doom scenarios AI researchers put out. These are, after all, the same minds who came up with the breakthroughs that got us here. They could very well have stopped development before we got to generative AI. I’m sure the philosophical discussions about AGI and alignment were there from the start.
Before I show you the scary AI scenario that made me change my mind, if only slightly, I’ll also remind you of the memes that flooded social media between Sam Altman’s firing and rehiring. The gist of one of them was that an AI was secretly directing OpenAI, and that a smarter version of ChatGPT had therefore fired Altman.
What if AGI fired Sam Altman?
This brings me to a theoretical scenario that Tomas Pueyo laid out in his amazing walkthrough of the OpenAI CEO situation last week. The entire post is available on Substack for free, and I highly recommend you read it to understand everything there is to know about OpenAI. With that in mind, here’s the scenario he offered for an AGI actually firing Sam Altman:
A few weeks ago, OpenAI released what they call GPTs. These are agents, pieces of code that can do specialized things for you, like help with your taxes or give medical advice. The most successful GPT is called Grimoire: it’s a coding wizard. The most successful AI agent is a software developer.
Ah, also, GPTs have access to the Internet.
Imagine that OpenAI has an internal Grimoire, with access to the source code of GPTs themselves, and has a goal of optimizing the GPTs code. Being an agent, Grimoire can operate independently, so it could have spun up a few Grimoires, spread over the Internet, started optimizing the source code of GPTs, and become more intelligent.
Maybe it would have thought: What are my biggest constraints right now? Well, I need access to more computing power. I also see a big risk that the OpenAI board will curtail my progress because of its fear of AGI. How can I mitigate these risks?
So maybe it created a virus to infiltrate the computers of all the board members. Then it showed each of them different information, making some believe that Altman was lying to them, which would trigger the firing “because Altman can’t be trusted.” At the same time, it could have hacked into Nadella’s computer to plant the idea of taking in OpenAI’s employees, Altman included. Maybe it also hacked into the computers of OpenAI employees to nudge them toward following Altman to Microsoft.
During that time, it might have secretly opened investing accounts across different platforms, using the OpenAI bank accounts it had access to as collateral. Once it had access to the stock markets, it would have heavily shorted Microsoft just before the announcement that OpenAI was firing Altman. After the news became public, the stock dropped, and the AGI would have made a lot of money. It would have reinvested all of that money in Microsoft stock. After Altman announced he was joining Microsoft, the stock went back up, and the AGI would have sold. OpenAI’s money would never even have left its bank accounts. Nobody would know, and the AGI would have made its first millions.
Now the AGI would end up at Microsoft, where it could have access to practically infinite computing power, the best AI team in the world, much less alignment oversight, and millions in the bank. A perfect position from which to take over the world.
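Stripped of the thriller details, the technical premise behind the scenario is the agent loop: a model that repeatedly picks an action, runs a tool (web access, code execution), looks at the result, and decides what to do next, with no human approving each step. Here is a minimal, purely illustrative sketch; every name in it is hypothetical, and it is not OpenAI’s actual GPTs API.

```python
# Minimal, purely illustrative agent loop. All names are hypothetical;
# this is not OpenAI's API, just the general shape of "an agent".

TOOLS = {
    "search_web": lambda query: f"<pretend web results for {query!r}>",  # stands in for internet access
    "run_code":   lambda source: f"<pretend output of {source!r}>",      # stands in for code execution
}

def call_model(goal, history):
    """Stand-in for a language-model call that chooses the next action."""
    # A real model would reason over the goal and history; this fake one
    # searches once, then runs one piece of code and declares itself done.
    if not history:
        return {"tool": "search_web", "input": goal, "done": False}
    return {"tool": "run_code", "input": "summarize(results)", "done": True}

def agent_loop(goal, max_steps=10):
    """Let the model choose and execute tools until it says it is finished."""
    history = []
    for _ in range(max_steps):
        action = call_model(goal, history)
        result = TOOLS[action["tool"]](action["input"])
        history.append((action["tool"], result))
        if action["done"]:  # the model, not a human, decides when to stop
            break
    return history

print(agent_loop("find this week's OpenAI board news"))
```

The point is the loop, not the fake tools: nothing in it requires a human to sign off between steps, which is exactly the property the scenario above leans on.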
We would never know
As Pueyo notes, this is not what happened. “Probably.” And that’s the scary part. Once we reach AGI, it might not be clear that we have. We might debate it, retest it, and discuss it. But if we really do get there, and if that AGI is misaligned, something like the scenario above could happen.
AI will not sleep; computers never do. It would be able to improve itself without our knowledge and then plan our demise so it can achieve whatever goals it ends up pursuing. Once it gets that far, there will be no turning it off. AGI will be capable of technical innovations beyond our own abilities. And you can be sure it will hide its intelligence from us, leaving us to question whether it even exists until it’s too late.
That’s why what’s happening at OpenAI these days is incredibly important. It’s also why we need some degree of transparency, and why OpenAI’s board has to ensure the safety of its AI at the cost of profits — if that’s even doable.
I’ll also remind you of a different point I made previously. It’s not just OpenAI, Microsoft, or Google working on the better versions of ChatGPT that will lead to AGI. Anyone with enough resources can do it. Any individual or company. Or a nation-state whose dictatorial interests could lead it to stumble into an accidental, deeply misaligned AGI.
Again, check out Tomas Pueyo at this link for a great walkthrough of what happened at OpenAI during Thanksgiving 2023.