In an exciting development, OpenAI has just released its experimental ChatGPT Atlas browser, a major new player in the field of AI-assisted web navigation. The ridehailing company recognizes that this new addition comes with dangers. In particular, its ‘agent mode’ has potential to significantly increase the security threat landscape.
Prompt injection attacks
OpenAI has acknowledged the increasing threat of prompt injection attacks. In response, they’ve issued guidelines for the general public to mitigate potential harms of this emerging technology.
Just last month, OpenAI released ChatGPT Atlas. This was a huge moment for the company as it strived to bring cutting-edge, generative AI technology to the masses and everyday browsing. With Arc, users can approach the browser in a much more tactile, dynamic manner. The system requires explicit user confirmation before sending any message or completing any monetary transaction. With the addition of agent mode, there have been warnings issued regarding heightened vulnerabilities.
In a recent blog post, OpenAI explained how these efforts were used to further strengthen ChatGPT Atlas’s defenses against prompt injection attacks. Security researchers have demonstrated that it is possible to alter the behavior of an AI-powered browser by inputting specific phrases into platforms like Google Docs. This capability comes with a very high risk, as it is still possible for the AI’s action to be misinterpreted and result in unintended harm.
Brave continues to cause a stir in the AI browser frontier. They argue that indirect prompt injection represents a fundamental threat to AI-powered browsers. The U.K.’s National Cyber Security Centre recently underscored these risks. Additionally, they cautioned that we can never fully remove the ability to execute prompt injection attacks on generative AI applications. Among the very first advice they gave on mitigating the risk and impact of these attacks was for cyber professionals to focus on those strategies.
OpenAI realizes that the security threat surface represented by ChatGPT Atlas is very large. The firm’s goal is to achieve harmony between increasing usability and security without sacrificing strong security standards. To get a better sense of the broader implications of these vulnerabilities, we spoke to principal security researcher at Wiz, Rami McCarthy. He emphasized that “a useful way to reason about risk in AI systems is autonomy multiplied by access.”
The task of managing agentic browsers is arguably the most challenging of those we face. McCarthy noted that they occupy a “challenging part of that space: moderate autonomy combined with very high access.” He further articulated his concerns, stating, “For most everyday use cases, agentic browsers don’t yet deliver enough value to justify their current risk profile.”
OpenAI has agreed to continue developing methods to prevent prompt injections as these threats are just going to keep coming. The organization stated, “We view prompt injection as a long-term AI security challenge, and we’ll need to continuously strengthen our defenses against it.” Instead, this explores their concern that the broad latitude enables bad or insidious content to more readily shape the agent. This is the case even with safeguards built in.
Reinforcement learning is one of the specific tactics that OpenAI intends to use to adjust and adapt to attacker behavior. Even though this approach is a better step toward security, McCarthy cautions against seeing it as the only solution. He noted that only allowing logged-in users access limits their reach and requiring approval of confirmation requests limits freedom of expression.
These include security measures OpenAI is currently taking to further secure ChatGPT Atlas. Prompt injection attacks are no doubt one of the most dangerous vectors still training. The company acknowledged that “prompt injection, much like scams and social engineering on the web, is unlikely to ever be fully ‘solved’.”


