Google deploys red team hacking bots to secure Gemini AI against prompt attacks

forbes.com — January 29, 2025 at 12:01 PM UTC

Google is deploying red team hacking bots to enhance security against prompt injection attacks targeting its Gemini AI system. This move aims to automate the detection and response to threats posed by malicious instructions hidden in data. The red team framework uses advanced techniques to simulate real attacks, refining prompt injections based on AI responses. This process helps identify vulnerabilities in Gemini and improve its defenses against potential exploitation. Google's approach includes two main methodologies: actor-critic and beam search. These methods allow the bots to test and optimize prompt injections, making it more challenging for hackers to extract sensitive user information from Gemini conversations.

With a significance score of 4.1, this news ranks in the top 4.5% of today's 33455 analyzed articles.

Get summaries of news with significance over 5.5 (usually ~10 stories per week). Read by 10,000+ subscribers: