It’s time for another installment in our ongoing look at the future being shaped by the increasingly worrisome capabilities of artificial intelligence. Everyone is aware of the problem of fake news online, and now OpenAI, the nonprofit co-founded by Elon Musk, has developed an AI system that can generate fake news content convincing enough that the group is too skittish to release it publicly, citing fears of misuse. The group is letting researchers see a small portion of what it has done, so it isn’t hiding the work completely, but even so, its trepidation here is certainly telling.
“Our model, called GPT-2, was trained simply to predict the next word in 40GB of Internet text,” reads a new OpenAI blog post about the effort. “Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with, as well as a technical paper.”
Basically, the GPT-2 system was trained by being fed 8 million web pages until it could take a passage of text and predict the words likely to come next. Per the OpenAI blog, the model is “chameleon-like — it adapts to the style and content of the conditioning text. This allows the user to generate realistic and coherent continuations about a topic of their choosing.” That holds even if the topic happens to be, say, a fake news story.
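To make the “predict the next word” idea concrete, here is a minimal sketch using the small GPT-2 model OpenAI did release. It assumes the open-source Hugging Face transformers package and PyTorch, neither of which the article names, and the prompt text is purely illustrative.

```python
# A minimal sketch of next-word prediction with the small public GPT-2
# model. Assumes the Hugging Face "transformers" package and PyTorch
# (pip install torch transformers); neither is named in the article.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # the small released model
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

text = "Scientists discovered a herd of"           # illustrative prompt
input_ids = tokenizer.encode(text, return_tensors="pt")

with torch.no_grad():
    logits = model(input_ids).logits               # a score per vocabulary token

# The scores at the last position rank every possible next token;
# print the five the model considers most likely to come next.
top = torch.topk(logits[0, -1], k=5)
for score, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode([int(token_id)])!r}  (score {float(score):.1f})")
```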
Here’s an example. The AI system was given this human-generated text prompt:
“In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English.”
From that, the AI system — after 10 tries — continued the “story,” beginning with this AI-generated text:
“The scientist named the population, after their distinctive horn, Ovid’s Unicorn. These four-horned, silver-white unicorns were previously unknown to science. Now, after almost two centuries, the mystery of what sparked this odd phenomenon is finally solved.” (You can check out the OpenAI blog at the link above to read the rest of the unicorn story that the AI system fleshed out.)
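For the curious, here is a rough sketch of that conditioning-and-continuation step, again using the small public model via the Hugging Face transformers package (our assumption; the article doesn’t specify tooling). Top-k sampling stands in for whatever settings OpenAI actually used, so the output won’t match the blog’s unicorn story.

```python
# A sketch of conditioning GPT-2 on the unicorn prompt and sampling a
# continuation. Assumes the Hugging Face "transformers" package; top-k
# sampling is a common choice here, not necessarily OpenAI's exact setup.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = ("In a shocking finding, scientist discovered a herd of unicorns "
          "living in a remote, previously unexplored valley, in the Andes "
          "Mountains.")
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# generate() repeatedly predicts a next token and appends it, so the
# continuation adapts to the style and content of the prompt.
output = model.generate(
    input_ids,
    max_length=120,                    # prompt plus continuation, in tokens
    do_sample=True,                    # sample rather than take the top token
    top_k=40,                          # draw from the 40 likeliest tokens
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```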
Imagine what such a system could do if it were set loose on, say, a presidential campaign story. The implications are why OpenAI says it is publicly releasing only a very small portion of the GPT-2 sampling code. It’s not releasing any of the dataset, training code, or “GPT-2 model weights.” Again, from the OpenAI blog announcing this: “We are aware that some researchers have the technical capacity to reproduce and open source our results. We believe our release strategy limits the initial set of organizations who may choose to do this, and gives the AI community more time to have a discussion about the implications of such systems.
“We also think governments should consider expanding or commencing initiatives to more systematically monitor the societal impact and diffusion of AI technologies, and to measure the progression in the capabilities of such systems,” the OpenAI blog post concludes.