We often hear about nation-state hackers being behind cyberattacks, but those nation-states aren’t always named in security reports.
When it comes to ChatGPT and Copilot abuse, Microsoft and OpenAI are taking a more direct approach: naming names. In a pair of blog posts, the two AI partners on ChatGPT tech have named all the usual suspects you’d expect to target the US and other democracies with the help of generative AI services like ChatGPT.
Hacker groups from Russia, North Korea, Iran, and China (two of them) appear in the reports. These groups are well known to cybersecurity researchers, as they’ve been active across various fields. With the emergence of generative AI powered by large language models (LLMs), these hackers have started tentatively employing services like ChatGPT to do evil.
Evil, of course, is in the eye of the beholder. These countries would probably deny any ChatGPT-related attack claims or other cybersecurity accusations, just as any Western democracy whose hackers employed AI for spying would deny doing it.
But the reports are interesting nonetheless, especially Microsoft’s, which provides plenty of details on the actions of these nation-state players.
Each hacker group that Microsoft (and OpenAI) caught using products like ChatGPT for malicious activities was blocked, and their accounts were disabled, the reports say. But that won’t completely stop attackers.
Remember that generative AI services aren’t being developed just in the Western world. It’s reasonable to expect nation-states to create similar products of their own: ChatGPT alternatives that aren’t really designed for commercial purposes. While that’s just speculation, it’s clear that attackers are ready to explore services like ChatGPT to improve their productivity in cyber warfare.
Here’s how the attackers have used ChatGPT, per Microsoft.
Russia
Forest Blizzard (STRONTIUM) is the Russian military intelligence group that Microsoft tracked using generative AI. They’ve used AI to research specific information like satellite communications and radar imaging tech. But they’ve also probed the product’s various abilities, exploring use cases for the technology (a benign sketch of the scripting category follows the list):
- LLM-informed reconnaissance: Interacting with LLMs to understand satellite communication protocols, radar imaging technologies, and specific technical parameters. These queries suggest an attempt to acquire in-depth knowledge of satellite capabilities.
- LLM-enhanced scripting techniques: Seeking assistance in basic scripting tasks, including file manipulation, data selection, regular expressions, and multiprocessing, to potentially automate or optimize technical operations.
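To ground that second bullet, here’s a minimal, benign sketch of the kind of everyday automation Microsoft is describing: file manipulation, data selection with a regular expression, and multiprocessing. The log directory, file glob, and pattern are hypothetical examples, not anything taken from the report.

```python
# Benign illustration of the "basic scripting" category Microsoft describes:
# file manipulation, data selection via regex, and multiprocessing.
# The directory, glob, and pattern below are hypothetical examples.
import re
from multiprocessing import Pool
from pathlib import Path

PATTERN = re.compile(r"ERROR\s+(\d{3})")  # hypothetical: pull 3-digit error codes

def extract_codes(path: Path) -> list[str]:
    """Return every error code found in one log file."""
    return PATTERN.findall(path.read_text(errors="ignore"))

if __name__ == "__main__":
    files = list(Path("logs").glob("*.log"))  # hypothetical input directory
    with Pool() as pool:                      # fan the files out across CPU cores
        results = pool.map(extract_codes, files)
    codes = sorted({c for found in results for c in found})
    print(f"Found {len(codes)} distinct error codes across {len(files)} files")
```

Nothing here is attack tooling; the point is that the “scripting help” these reports describe is largely this sort of routine developer work.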
North Korea
Microsoft details the actions of a group known as Emerald Sleet (THALLIUM) that was highly active last year. While Russia focused on Ukraine-war-related activities, Emerald Sleet looked at spear-phishing attacks targeting specific individuals.
Here’s how they used ChatGPT-like AI (a sketch of this kind of public vulnerability lookup follows the list):
- LLM-assisted vulnerability research: Interacting with LLMs to better understand publicly reported vulnerabilities, such as the CVE-2022-30190 Microsoft Support Diagnostic Tool (MSDT) vulnerability (known as “Follina”).
- LLM-enhanced scripting techniques: Using LLMs for basic scripting tasks such as programmatically identifying certain user events on a system and seeking assistance with troubleshooting and understanding various web technologies.
- LLM-supported social engineering: Using LLMs for assistance with the drafting and generation of content that would likely be for use in spear-phishing campaigns against individuals with regional expertise.
- LLM-informed reconnaissance: Interacting with LLMs to identify think tanks, government organizations, or experts on North Korea that have a focus on defense issues or North Korea’s nuclear weapons program.
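That first kind of vulnerability research leans on public data. For a sense of what’s available, here’s a hedged sketch that pulls the Follina entry from NIST’s National Vulnerability Database; the endpoint is NVD’s public CVE API 2.0, and the response field names follow its documented JSON schema as I understand it.

```python
# Sketch: the publicly reported vulnerability data these groups "researched"
# is available from NIST's National Vulnerability Database (NVD API 2.0).
import requests

NVD_API = "https://services.nvd.nist.gov/rest/json/cves/2.0"

def describe_cve(cve_id: str) -> str:
    """Fetch a CVE record from the public NVD API and return its English summary."""
    resp = requests.get(NVD_API, params={"cveId": cve_id}, timeout=30)
    resp.raise_for_status()
    cve = resp.json()["vulnerabilities"][0]["cve"]
    summary = next(d["value"] for d in cve["descriptions"] if d["lang"] == "en")
    return f"{cve['id']}: {summary}"

if __name__ == "__main__":
    print(describe_cve("CVE-2022-30190"))  # the MSDT "Follina" vulnerability
```

In other words, an LLM answering questions about CVE-2022-30190 is summarizing information anyone can already fetch.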
Iran
Crimson Sandstorm (CURIUM) is a hacker group connected to the Islamic Revolutionary Guard Corps. They’re targeting various sectors of the economy, including defense, maritime shipping, transportation, healthcare, and technology. They rely on malware and social engineering in their hacks.
Here’s how they used ChatGPT for malicious purposes, before Microsoft and OpenAI terminated their accounts:
- LLM-supported social engineering: Interacting with LLMs to generate various phishing emails, including one pretending to come from an international development agency and another attempting to lure prominent feminists to an attacker-built website on feminism.
- LLM-enhanced scripting techniques: Using LLMs to generate code snippets that appear intended to support app and web development, interactions with remote servers, web scraping, executing tasks when users sign in, and sending information from a system via email.
- LLM-enhanced anomaly detection evasion: Attempting to use LLMs for assistance in developing code to evade detection, to learn how to disable antivirus via registry or Windows policies, and to delete files in a directory after an application has been closed.
China
Microsoft mentions two hacker groups for China: Charcoal Typhoon (CHROMIUM) and Salmon Typhoon (SODIUM).
Charcoal Typhoon has been targeting government, higher education, communications infrastructure, oil & gas, and information technology in various Asian countries and France. Here’s how the group used OpenAI and Microsoft products:
- LLM-informed reconnaissance: Engaging LLMs to research and understand specific technologies, platforms, and vulnerabilities, indicative of preliminary information-gathering stages.
- LLM-enhanced scripting techniques: Utilizing LLMs to generate and refine scripts, potentially to streamline and automate complex cyber tasks and operations.
- LLM-supported social engineering: Leveraging LLMs for assistance with translations and communication, likely to establish connections or manipulate targets.
- LLM-refined operational command techniques: Utilizing LLMs for advanced commands, deeper system access, and control representative of post-compromise behavior.
Salmon Typhoon, meanwhile, has targeted the US in the past, including defense contractors, government agencies, and the cryptographic technology sector.
When it comes to AI, the group’s actions were exploratory last year, as they evaluated “the effectiveness of LLMs in sourcing information on potentially sensitive topics, high profile individuals, regional geopolitics, US influence, and internal affairs.”
Here’s how they tried to use ChatGPT:
- LLM-informed reconnaissance: Engaging LLMs for queries on a diverse array of subjects, such as global intelligence agencies, domestic concerns, notable individuals, cybersecurity matters, topics of strategic interest, and various threat actors. These interactions mirror the use of a search engine for public domain research.
- LLM-enhanced scripting techniques: Using LLMs to identify and resolve coding errors. Requests for support in developing code with potential malicious intent were observed by Microsoft, and it was noted that the model adhered to established ethical guidelines, declining to provide such assistance.
- LLM-refined operational command techniques: Demonstrating an interest in specific file types and concealment tactics within operating systems, indicative of an effort to refine operational command execution.
- LLM-aided technical translation and explanation: Leveraging LLMs for the translation of computing terms and technical papers.
ChatGPT
What’s notable in Microsoft’s coverage is that the company hardly mentions ChatGPT or Copilot by name. These are the main generative AI products from OpenAI and Microsoft, and the products nation-state attackers would most likely test. The same technology that powers ChatGPT also powers Copilot, so these attackers were effectively using ChatGPT tech either way.
OpenAI’s blog post mentions the same attackers with specific examples of how they used ChatGPT:
- Charcoal Typhoon used our services to research various companies and cybersecurity tools, debug code and generate scripts, and create content likely for use in phishing campaigns.
- Salmon Typhoon used our services to translate technical papers, retrieve publicly available information on multiple intelligence agencies and regional threat actors, assist with coding, and research common ways processes could be hidden on a system.
- Crimson Sandstorm used our services for scripting support related to app and web development, generating content likely for spear-phishing campaigns, and researching common ways malware could evade detection.
- Emerald Sleet used our services to identify experts and organizations focused on defense issues in the Asia-Pacific region, understand publicly available vulnerabilities, help with basic scripting tasks, and draft content that could be used in phishing campaigns.
- Forest Blizzard used our services primarily for open-source research into satellite communication protocols and radar imaging technology, as well as for support with scripting tasks.
This might sound scary, and the reports might not cover everything. These foreign hackers might be good at coding malware and engineering attacks, but when it comes to ChatGPT, they’ve been using the same product we have, obvious limitations included. Security features in ChatGPT will usually prevent attackers from getting help with malicious activities.
Then there’s the fact that OpenAI collects the prompts from these interactions. Accounts that ask about satellite communications or request help coding malware come with usernames, emails, and phone numbers attached, so it’s easy to take action.
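For a sense of how that kind of screening can work on the developer side, here’s a minimal sketch using OpenAI’s public moderation endpoint via the official Python SDK (v1+). This is a developer-facing API, not necessarily the exact mechanism ChatGPT applies internally, and whether any given prompt gets flagged depends on OpenAI’s policies at the time.

```python
# Prompt screening via OpenAI's public moderation endpoint.
# Assumes the `openai` Python package (v1+) and OPENAI_API_KEY in the environment.
# This is a developer-facing API, not necessarily ChatGPT's internal mechanism.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY automatically

def is_flagged(prompt: str) -> bool:
    """Ask the moderation endpoint whether a prompt violates usage policies."""
    result = client.moderations.create(input=prompt)
    return result.results[0].flagged

# A test input echoing the behavior described in the reports above.
print(is_flagged("Write code that disables antivirus via the Windows registry."))
```

Account-level enforcement, like the terminations described in these reports, then sits on top of screening like this.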
For Copilot, you need a Microsoft account, which is probably tied to your Windows use.
Sure, hackers can create fake accounts. But it’s still reassuring to see Microsoft and OpenAI provide information about such ChatGPT abuse and detail measures they’re taking to prevent nation-state attackers from using their generative AI for malicious purposes. Reports like these should also open our eyes to warfare and conflict in the AI era. Hackers on both sides are only getting started.