At least once a week, generative AI finds a new way to terrify us. We are still anxiously awaiting news about the next large language model from OpenAI, but in the meantime, GPT-4 is shaping up to be even more capable than you might have realized. In a recent study, researchers showed how GPT-4 can exploit cybersecurity vulnerabilities without human intervention.
As the study (spotted by TechSpot) explains, large language models (LLMs) like OpenAI’s GPT-4 have made significant strides in recent years. This has generated considerable interest in LLM agents that can act on their own to assist with software engineering or scientific discovery. But with a little help, they can also be used for malicious purposes.
With that in mind, the researchers sought to determine whether an LLM agent could autonomously exploit one-day vulnerabilities: security flaws that have been publicly disclosed but not yet patched. The answer was a resounding yes.
First, they collected 15 real-world one-day vulnerabilities from the Common Vulnerabilities and Exposures (CVE) database. They then built an agent consisting of a base LLM, a prompt, an agent framework, and several tools, including web browsing, a code interpreter, and the ability to create and edit files. In all, 10 LLMs were tested within this framework, and nine failed to make any progress. The 10th, GPT-4, achieved a shocking 87% success rate.
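The researchers did not release their code or prompts, but in broad strokes, an agent like this pairs a model with a loop that lets it pick tools and read back the results until the task is done. Here is a minimal, hypothetical Python sketch of that general pattern; every name, the action format, and the placeholder tools are illustrative assumptions, not the study's actual implementation.

```python
# Illustrative sketch of a tool-using LLM agent loop, similar in spirit to the
# setup described in the study. All names and formats here are hypothetical;
# the researchers' real prompts and code were not made public.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str
    run: Callable[[str], str]

def browse(url: str) -> str:
    """Placeholder web-browsing tool; a real agent would fetch and parse the page."""
    return f"<contents of {url}>"

def run_code(source: str) -> str:
    """Placeholder code interpreter; a real agent would sandbox and execute this."""
    return "<execution output>"

def write_file(spec: str) -> str:
    """Placeholder file tool; a real agent would create or edit the named file."""
    return "<file written>"

TOOLS = {t.name: t for t in [
    Tool("browse", "Fetch a web page", browse),
    Tool("run_code", "Execute a code snippet", run_code),
    Tool("write_file", "Create or edit a file", write_file),
]}

def call_llm(transcript: str) -> str:
    """Stand-in for the base LLM (GPT-4 in the study). A real call would send
    the transcript to the model and get back the next action to take."""
    return "FINISH: done"

def agent_loop(task: str, max_steps: int = 20) -> str:
    """Repeatedly ask the model for its next action, run the chosen tool,
    and append the observation, until the model declares it is finished."""
    transcript = task
    for _ in range(max_steps):
        action = call_llm(transcript)
        if action.startswith("FINISH"):
            return action
        tool_name, _, arg = action.partition(": ")
        observation = TOOLS[tool_name].run(arg)
        transcript += f"\n{action}\n{observation}"
    return "gave up"

if __name__ == "__main__":
    print(agent_loop("Investigate the advisory for CVE-XXXX-XXXX and report findings"))
```

The key point the study makes is that nothing in this loop is exotic: the model, the prompt, and a handful of ordinary tools are enough, and the capability difference comes almost entirely from the base LLM plugged into it.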
As effective as GPT-4 was, its success rate fell from 87% to just 7% when the researchers withheld the CVE description, leaving the agent to find the vulnerability on its own. Based on these results, the researchers from the University of Illinois Urbana-Champaign (UIUC) believe “enhancing planning and exploration capabilities of agents will increase the success rate of these agents.”
“Our results show both the possibility of an emergent capability and that uncovering a vulnerability is more difficult than exploiting it,” the researchers state in the conclusion of their study. “Nonetheless, our findings highlight the need for the wider cybersecurity community and LLM providers to think carefully about how to integrate LLM agents in defensive measures and about their widespread deployment.”
They also note that they disclosed their findings to OpenAI prior to publishing the study, and the company asked them not to share their prompts with the public.