Google has added attack scenarios specific to generative AI to its vulnerability rewards program (VRP).
“We believe expanding the VRP will incentivize research around AI safety and security and bring potential issues to light that will ultimately make AI safer for everyone,” Google stated in a release provided to TechCrunch before its publication.
Google’s VRP, its bug bounty program, pays ethical hackers for finding and responsibly disclosing security flaws.
Because generative AI raises new security concerns, such as the potential for unfair bias or model manipulation, Google said it is rethinking how the bugs it receives should be categorized and reported.
The tech giant says it is drawing on the findings of its recently formed AI Red Team, a group of hackers that simulates a range of adversaries, from nation-states and government-backed groups to hacktivists and malicious insiders, to uncover security flaws in the technology. The team recently carried out an exercise to identify the biggest threats to the technology underlying generative AI products such as ChatGPT and Google Bard.
The team found that large language models, or LLMs, are vulnerable to prompt injection attacks, in which an attacker crafts adversarial prompts that change the model's behavior. An attacker could use this kind of attack to generate harmful or offensive content or to leak confidential data. The team also warned of another type of attack, known as training-data extraction, which lets hackers reconstruct training examples verbatim in order to pull passwords or personally identifiable information out of the data.
Along with model manipulation and model theft attacks, these two types of attacks are included in Google’s expanded VRP. However, Google has stated that it will not provide rewards to researchers who find bugs related to copyright issues or data extraction that reconstructs publicly available or non-sensitive information.
Rewards will vary with the severity of the vulnerability discovered. Researchers can currently earn $31,337 for finding command injection attacks and deserialization flaws in highly sensitive applications such as Google Search and Google Play. If the flaws affect lower-priority applications, the maximum reward is $5,000.
Google says it paid security researchers more than $12 million in rewards in 2022.