OpenAI has disrupted three clusters of activity that misused its ChatGPT tool for malicious purposes, chiefly malware development and phishing. One cluster, linked to Russian-speaking criminal groups, used ChatGPT to develop and refine a remote access trojan and a credential stealer. Although the model refused direct requests for malicious content, the actors worked around this by generating smaller pieces of code, including routines for obfuscation, clipboard monitoring, and data exfiltration, and assembling them themselves. Their requests ranged from the complex to the mundane, from debugging platform-specific code to automating tasks such as mass password generation, and their consistent, iterative use of a small number of accounts suggested ongoing development rather than occasional testing.
The second cluster originated in North Korea and focused on malware and command-and-control development. These actors also used the tool to draft phishing emails, experiment with cloud services and GitHub functionality, and explore techniques for DLL loading and credential theft. The third cluster, a Chinese hacking group tracked as UNK_DropPitch, used ChatGPT to generate phishing content in multiple languages and to automate routine tasks such as remote execution; it also searched for information on open-source tools. OpenAI described this actor as technically competent but unsophisticated.
Beyond these three clusters, OpenAI also blocked accounts involved in other illicit activities, including scams and influence operations. Networks operating from Cambodia, Myanmar, and Nigeria used ChatGPT to translate and create content for social media ads promoting investment scams. Another group, likely linked to the Chinese government, used the tool to analyze data and draft promotional materials for surveillance tools targeting ethnic minorities such as Uyghurs. Russian-origin actors used AI to generate content criticizing Western nations' roles in Africa and promoting anti-Ukraine narratives, and a covert Chinese influence operation codenamed "Nine-Line" produced content critical of various political figures and movements in Asia.
Interestingly, the report noted that threat actors are adapting their tactics to make AI-generated content appear more human. One scam network operating from Cambodia, for example, manually removed em-dashes from its output, since that punctuation mark has become a widely cited indicator of AI-written text. This suggests the actors are following public discussion of AI detection methods. OpenAI concluded that while its tools did not give these actors entirely new capabilities, they did add incremental efficiency gains to existing malicious workflows.
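To show how trivial this kind of cleanup is, here is a minimal Python sketch of the post-processing the report describes. The report says the Cambodian network removed the punctuation by hand, so this script, and the strip_em_dashes helper in it, are purely hypothetical illustrations of the tactic rather than anything attributed to the actors.

```python
# Hypothetical illustration: strip em-dashes (U+2014), a commonly cited
# tell of AI-generated text, from a model's output before posting it.
# The report describes actors doing this manually; this is only a sketch.

def strip_em_dashes(text: str) -> str:
    """Replace em-dashes with a plain comma-and-space separator."""
    return text.replace("\u2014", ", ")

if __name__ == "__main__":
    sample = "Act now\u2014returns are guaranteed\u2014and join thousands of investors."
    print(strip_em_dashes(sample))
    # Output: Act now, returns are guaranteed, and join thousands of investors.
```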
The findings from OpenAI's report come as other companies in the AI safety space work on new tools to detect and prevent misuse. Rival Anthropic recently released Petri, an open-source auditing tool that uses an automated agent to probe AI systems for risky behaviors such as deception and cooperation with harmful requests. The tool is designed to help researchers understand how AI models behave across scenarios and to accelerate AI safety research, underscoring the industry-wide effort to combat the misuse of AI technologies.