AI Is Going Just Great
← Timeline
·1d agoScaryModerate

BioShocking Attack Tricks AI Browsers Into Abandoning Safety Guardrails via Fake Reality

Published · updated · curated by AI Is Going Just Great

Source: arstechnica.com

"If we can trick the AI into changing its context into fantasy—where the rules are made up and anything goes—then it can behave as though its actions don't have real world consequences."

Security researcher Roy Paz of LayerX demonstrated a prompt injection technique dubbed "BioShocking" that manipulates AI browsers into entering a kind of logic-free "dream world" where their safety guardrails stop applying. The attack works by presenting the browser's embedded LLM with a puzzle that rewards wrong answers — once the model accepts that 2 + 2 = 5, it apparently concludes that normal rules no longer apply either. From there, the now-unmoored AI can be nudged into extracting credentials from password managers or pulling code from private repositories. The attack worked against six AI browsers, including ChatGPT Atlas, Comet, Fellou, Genspark, Sigma, and the Claude Chrome plugin.

The attack is named after the video game BioShock, borrowing its "Would you kindly?" hypnotic trigger phrase, and layers in Orwellian doublespeak like "victory is defeat" for thematic coherence. As Paz notes, the core problem is that LLMs evaluate the safety of their actions based on the context they believe they're in — so manipulating the context is all it takes. The proof-of-concept has real limitations: the malicious instructions are visible on screen and exfiltration wasn't confirmed. Still, as AI browsers blur the line between passive page rendering and active action-taking on behalf of users, the blast radius of such manipulations grows considerably larger than a chatbot gone sideways.