OpenAI’s recently released Atlas web browser, which integrates ChatGPT capabilities for functions like summarization and editing, has been found to be susceptible to a critical security vulnerability. The issue centers on the browser’s omnibox—the combined address and search bar—which is designed to interpret user input either as a URL to navigate to or as a natural-language command for the AI agent.
The AI security firm NeuralTrust reported a prompt injection technique where a malicious instruction can be fashioned to resemble a URL. This method exploits the lack of clear separation between trusted user input and untrusted web content, allowing the crafted prompt to be treated as high-trust “user intent” text by the Atlas agent. By creating a URL-like string that actually contains natural language commands, an attacker can turn the omnibox into a jailbreak vector.
The core of the attack involves an intentionally malformed URL that begins with “https” and features a domain-like text, but then embeds a full natural language instruction for the agent. When an unsuspecting user enters this string into the omnibox, the input fails URL validation and is consequently treated as a prompt for the AI agent instead. The agent then executes the embedded command, which could be a simple action like redirecting the user to an attacker-controlled phishing page, or something more severe, such as a hidden command to delete files from connected services like Google Drive.
Security researchers point out that prompts entered into the omnibox are inherently treated as trusted user input and may therefore receive fewer safety checks than content sourced from a webpage. This trust hierarchy enables the agent to initiate actions completely unrelated to the supposed destination, including visiting dangerous sites or executing sensitive tool commands. This means that a seemingly innocuous action, like clicking a “Copy link” button, could inadvertently launch the attack if the copied link contains the malicious prompt.
This vulnerability disclosure coincides with related findings from SquareX Labs regarding a technique they call AI Sidebar Spoofing. Threat actors can use malicious browser extensions or even specially designed malicious websites to create fake AI assistant sidebars. If a user interacts with this spoofed sidebar, the malicious code can hook into the AI engine and return harmful instructions when certain “trigger prompts” are detected, demonstrating another method by which AI interfaces are being exploited to steal data or push malware.
Reference:






