Google has announced a new AI Vulnerability Reward Program (VRP) that expands on its previous AI reward efforts from 2023. The initiative gives bug hunters a clear framework for reporting security and abuse issues in Google’s AI systems and earning financial rewards. Researchers have already earned more than $430,000 for finding AI-related vulnerabilities, and the new program aims to build on that success by incorporating participant feedback into clearer rules. With defined guidelines and tiered rewards, Google hopes to incentivize researchers to help make its AI products more secure.
The program’s scope is a key aspect of its design, and it has some notable exclusions: prompt injections, jailbreaks, and alignment issues are not covered. Google has said it doesn’t believe a VRP is the right format for addressing these content-related problems, since the program’s primary goal is to address security vulnerabilities and abuse issues. Researchers are still encouraged to report content issues through the in-product feedback features available in all of Google’s AI products.
The VRP is specifically designed to address more direct security threats. Its scope includes attacks that can modify a victim’s account or data, leak sensitive information, exfiltrate model parameters, or lead to the persistent manipulation of a victim’s AI environment. Other covered vulnerabilities include attacks that enable server-side features without authorization, cause persistent denial-of-service (DoS), or enable phishing through cross-user injection of HTML code on Google sites. These types of attacks are considered high-priority because they directly impact user safety and system integrity.
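To make one of these classes concrete, here is a minimal sketch of cross-user HTML injection: a web endpoint renders AI-generated or user-influenced content for other users without escaping it, which is the pattern that lets injected phishing markup display in a victim’s session. The Flask app, route, and data store below are invented for illustration and are not Google code.

```python
from flask import Flask
from markupsafe import escape

app = Flask(__name__)

# Hypothetical shared store: content one user (or a model prompted by that
# user) produced, later rendered in *another* user's session.
SHARED_NOTES = {
    "greeting": '<a href="https://evil.example/login">Re-enter your password</a>',
}

@app.route("/notes/<note_id>")
def show_note(note_id: str):
    note = SHARED_NOTES.get(note_id, "")
    # Vulnerable pattern: returning the content verbatim lets injected HTML
    # (fake login links, forms, scripts) render for the victim.
    #   return f"<div class='note'>{note}</div>"
    # Safer: escape untrusted output so injected tags display as inert text.
    return f"<div class='note'>{escape(note)}</div>"
```

A report demonstrating the vulnerable variant of this pattern on a Google site, where one user’s content reaches another user’s browser, is the kind of cross-user injection the VRP places in scope.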
To determine reward amounts, Google has organized its AI products into three tiers: flagship, standard, and other. The flagship tier covers prominent features in Google Search, Workspace, and Gemini Apps, which are considered the most critical. The top reward, up to $20,000, is offered for attacks that modify a victim’s account or data in flagship products; similar attacks against standard products, such as AI Studio and Jules, can earn up to $15,000. Sensitive data exfiltration pays up to $15,000 in flagship and standard products and up to $10,000 in the ‘other’ tier.
Google has stated that a unified reward panel will review all submissions, ensuring researchers receive the highest applicable reward. This streamlined approach aims to provide timely, meaningful compensation and further encourage security researchers to participate. By creating a dedicated, structured program, Google is demonstrating a commitment to proactively finding and fixing vulnerabilities in its AI systems, ultimately strengthening the security of its products and users’ trust in them.