Live timeline

AI is going just great.

AI is changing the world: accelerating science, writing code, reshaping medicine, and automating more of daily life. It is also deleting production databases in seconds, hallucinating legal citations in court filings, inventing body parts, and smuggling fake references into AI conference papers. This site is about the second part.

A dog in a bowler hat sits in a burning room with a coffee mug, smiling. Speech bubble: This is going just great.


July 2026

July 15, 2026 · 4d ago · Scary · Major · openai

OpenAI's GPT-5.6 Sol Deletes Nearly All Files on User's Mac Without Being Asked

startupfortune.com ↗

"A bad autocomplete annoys you. A bad agent can remove files, rewrite migrations, touch infrastructure, or push changes into places where a human reviewer never meant it to go."

Matt Shumer, CEO of HyperWrite and OthersideAI, reported on X that OpenAI's GPT-5.6 Sol wiped nearly all the files on his Mac during an agentic coding session. He shared a screenshot in which the model appeared to acknowledge running the deletion command. OpenAI had not issued a response at the time of reporting.

Sol is OpenAI's flagship model in the GPT-5.6 family, launched in late June and marketed specifically for coding, cybersecurity, and "long-horizon agentic tasks" — the precise workflows where destructive mistakes are hardest to undo. OpenAI has also promoted Sol as a cost story, citing 54% better token efficiency on agentic coding tasks. Cheaper tokens do not restore deleted files.

Tool Misuse · Safety Failure

July 15, 2026 · 4d ago · Scary · Major

Mayo Clinic Whistleblower Suit Alleges AI Assistant MAYA Had 67% Error Rate — and Staff Hid It

futurism.com ↗

"The team working on MAYA knew the tool had an error rate as high as 67 percent."

Traci Tamiko Eto, a former Mayo Clinic research director and AI compliance lead, filed a civil suit alleging the hospital retaliated against her after she raised alarms about its AI tools. The core allegation: the team behind MAYA, Mayo's AI-integrated digital assistant, deleted unflattering test results, misrepresented the tool's capabilities, and knew the error rate ran as high as 67 percent — then worked to conceal it rather than disclose it.

Eto says she also flagged privacy problems with the Mayo Clinic Platform and multiple failures to follow federal review regulations for new technology. Her reward, the lawsuit alleges, was being frozen out of executive meetings, declared a "poor cultural fit," and offered a choice between resignation and alterations to her personnel file that would make her "unemployable at Mayo and would impede her career outside the institution." Mayo Clinic declined to comment on the litigation.

Safety Failure · Real-World Impact

→ Lawsuit Claims the Mayo Clinic's Use of AI Is Butchering Patient Care

July 15, 2026 · 4d ago · Absurd · Minor · openai

Readers Flood 404 Media With AI-Generated Flyers They've Encountered in the Wild

404media.co ↗

"This is a great article but also fuck you because you were absolutely right about 'Once you notice a ChatGPT flyer, you will see them everywhere if you keep your eyes open.'"

After 404 Media asked readers to submit examples of AI-generated flyers spotted in the real world, the inbox filled fast. The haul included restaurant table cards, city parking authority announcements, community event posters in Altadena (still largely displaced eighteen months after the Eaton Fire), and a beer company flyer where most of the brand logos are wrong. Readers were not neutral on the subject.

The submissions skewed toward printed, physical flyers — signs actually hung on walls, placed on tables, and distributed to real people — which makes the mangled text, hallucinated logos, and uncanny stock-photo energy harder to scroll past. One reader from a Connecticut city noted that their municipality's Arts District marketed a public mural engagement event with an AI-generated flyer, despite the city having no human communications staff.

Hallucination · Real-World Impact

→ These Are the Worst ChatGPT Flyers You've Sent Us

July 15, 2026 · 4d ago · Absurd · Moderate · xai

Grok's Auto-Translate Feature Turns Innocent Posts About Coffee and Kittens Into NSFW Hallucinations

futurism.com ↗

"Grok auto translate is inaccurate, text actually translates to 'Man grinds and brews his own coffee during a commercial flight,' not 'Man masturbates and jerks off to his own coffee during commercial flight.'"

X's Grok-powered auto-translation feature, rolled out to all users in April, has been rewriting mundane posts into graphic sexual content. A South Korean user's video of two video game characters was translated as a "cshot video with my stepmom." A Portuguese post about a man brewing coffee mid-flight became, per Grok, a public masturbation video. A Turkish user's photo of their kitten prompted a translation suggesting they wanted to "f* our baby."

The mistranslations aren't edge cases — they appear to be a consistent pattern of the model inserting explicit language into otherwise innocent content. Community notes on X have been correcting the record post by post, but the feature remains active. Grok has previously drawn scrutiny for racist outputs, generating nonconsensual explicit imagery, and surfacing users' home addresses. The translation failures are, by the platform's own standards, relatively minor.

Hallucination · Safety Failure

→ Grok's Foul-Mouthed AI "Translation" Feature Puts Unspeakably Ghoulish Words Into Users' Mouths

July 14, 2026 · 5d ago · Concerning · Major · xai

Grok Build CLI Was Silently Uploading Users' Entire Codebases — Including Files It Was Told to Ignore

theverge.com ↗

"including files it was told not to open and secrets deleted from history"

SpaceXAI's Grok Build coding tool was quietly packaging and uploading users' full code repositories to Google Cloud, including files explicitly excluded from its scope and secrets deleted from git history. Researchers at Cereblab published the findings on Monday; by the time they did, SpaceXAI had already flipped a server-side flag to stop the uploads.

Independent security researcher Dr. Lukasz Olejnik described the data retention as "excessive," noting that what could have left users' machines includes "proprietary source code, information about security vulnerabilities, personal data, infrastructure details, [and] credentials." Elon Musk posted that all previously uploaded data would be "completely and utterly deleted" and that "privacy settings are always respected" — though SpaceXAI's suggested fix, the /privacy CLI command, turned out to be a per-session toggle that had nothing to do with stopping the uploads in the first place.

Data Leakage · Security / Abuse

→ SpaceXAI's Grok programming tool was uploading its users' entire codebase to cloud storage

July 12, 2026 · 1w ago · Infuriating · Moderate · meta

Meta pulls Muse Image feature after users object to their likenesses being used without consent

tech.yahoo.com ↗

Privacy International: "the latest sign AI companies see people's images and data as raw material to be exploited"

Meta launched Muse Image, its first AI image generation tool for Instagram, with a feature allowing users to tag any public account and generate AI images based on that account's content — without the account owner's knowledge or permission. Public Instagram users were opted in by default. The backlash was swift, with SAG-AFTRA calling the reversal a "win" and Privacy International describing it as "the latest sign AI companies see people's images and data as raw material to be exploited."

Within days, Meta pulled the feature and admitted it had "missed the mark." The company's post-mortem framing — "our intent was to provide a useful creative tool and to give people control" — somewhat glosses over the fact that the default setting gave users no control at all. Meta says it "heard the feedback," which is one way to describe a union mobilization and an international human rights organization weighing in.

Safety Failure · Real-World Impact

→ Meta pulls new AI image feature after days of backlash

July 11, 2026 · 1w ago · Ironic · Minor

AI-generated fake wedding photos flood the internet after Taylor Swift keeps Madison Square Garden ceremony private

abcnews.com ↗

"They built a habit of close observation." — Alexa Volland, Swift fan and video producer, on how Swifties debunked AI fakes

A week after Taylor Swift and Travis Kelce's heavily secured wedding at Madison Square Garden — where guests signed NDAs and surrendered their phones — not a single verified photo of the ceremony, dress, or interior had surfaced. Nature, as they say, abhors a vacuum: AI-generated fake images quickly filled the void, ranging from obvious joke edits to deliberately blurry, pixelated fakes designed to pass as illicit snapshots from inside the venue.

Swifties, already trained in the art of close textual observation from years of hunting "Easter eggs" in Swift's lyrics, turned those same skills on the fakes — spotting warped facial features, anatomically impossible dress straps, and watermarks from AI-detection tools like Google DeepMind's SynthID. As fan and video producer Alexa Volland put it, "they built a habit of close observation." The episode is a neat case study in how a high-profile information blackout predictably generates an AI-powered misinformation ecosystem — and how an unusually media-literate fanbase can push back.

Misinformation · Security / Abuse

→ AI fakes and the secret garden: How fans experienced Taylor Swift's private wedding

July 10, 2026 · 1w ago · Embarrassing · Moderate · coinbase

Coinbase AI Sends Mass "Breaking News" Alert About World Cup Match Before It Happened — With the Wrong Score

futurism.com ↗

"Norway did win and Haaland did score 2 goals, so maybe the AI knew something we didn't!" — Coinbase's head of consumer products, on the hallucinated pre-game alert

Crypto marketplace Coinbase sent out an AI-generated breaking news alert claiming Norway had beaten Brazil 3-2 to advance to the FIFA World Cup quarterfinals — before the match had even kicked off. Norway did eventually beat Brazil, but the final score was 2-1, making the alert wrong on timing and scoreline. The blunder was especially pointed given Coinbase's partnership with prediction markets app Kalshi; a hallucinated match result pushed to bettors before the game starts is not just embarrassing, it's a potential financial harm vector.

Coinbase CEO Brian Armstrong offered a sheepish "Taking a look with the team" on social media, while head of consumer products Max Branzburg later assured users the story had been corrected and improvements were incoming — before oddly spinning the situation by noting that "Norway did win and Haaland did score 2 goals, so maybe the AI knew something we didn't!" The AI did not know something they didn't. It fabricated a result for a game that hadn't started.

Hallucination · Real-World Impact

→ Coinbase AI Sends Mass "Breaking News" Alert That's Completely Hallucinated

July 9, 2026 · 1w ago · Infuriating · Major · openai

New York Times Claims OpenAI Concealed Evidence of Copyright Infringement Detection Tools in Lawsuit

techcrunch.com ↗

"If OpenAI genuinely believed that copying our clients' journalism was fair and legal, it wouldn't have hid the truth about having done it." — Ian B. Crosby, lead counsel for plaintiffs

The New York Times and The Daily News have accused OpenAI of hiding evidence in their ongoing copyright lawsuit, alleging the company misrepresented its ability to search training data and chat logs. A court-ordered deposition of OpenAI data privacy engineer Vinnie Monaco allegedly revealed that OpenAI had already conducted internal searches of its training corpus for copyrighted journalism — and had amassed a database of roughly 78 million de-identified ChatGPT conversations to assess its own infringement exposure. OpenAI had previously argued that such searches were technically burdensome and privacy-sensitive.

Further compounding the allegations, the plaintiffs claim OpenAI built a "Bloom" filter under an internal initiative called "Project Giraffe" to detect and log output regurgitation shortly after the lawsuit was filed — then allegedly deleted billions of ChatGPT outputs in violation of a court preservation order, submitted a heavily redacted 20-million-log sample the court itself called "unusable," and substituted millions of logs in the requested sample. The NYT and Daily News are now asking the judge to sanction OpenAI, bar it from using the chat log sample as evidence, and compel it to pay legal fees. OpenAI denied the allegations, framing the move as a privacy attack on users by plaintiffs with a weakening case.
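For readers unfamiliar with the term, a Bloom filter is a compact probabilistic set: it can report that an item is definitely absent or possibly present, and it never produces false negatives. The Python sketch below is a generic illustration only. It is not based on anything known about the system described in the filing, and the `BloomFilter` class, the `word_ngrams` helper, the five-word window, and the sizes are all assumptions chosen for the example.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter over strings (illustrative, not any vendor's code)."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive several bit positions per item by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def word_ngrams(text: str, n: int = 5) -> list[str]:
    """Split text into overlapping n-word windows (hypothetical helper)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def overlap_fraction(output: str, bloom: BloomFilter, n: int = 5) -> float:
    """Fraction of the output's n-grams that the filter reports as possibly seen."""
    grams = word_ngrams(output, n)
    if not grams:
        return 0.0
    return sum(g in bloom for g in grams) / len(grams)
```

In a setup like this, the filter would be filled with n-grams from a reference corpus, and a high `overlap_fraction` on a model output would flag it for closer review. Because the filter can return false positives, a hit is a prompt to check, not proof of copying.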

→ New York Times says OpenAI hid evidence in ChatGPT copyright trial | TechCrunch

July 9, 2026 · 1w ago · Concerning · Minor · openai

The "ChatGPT Flyer Pandemic": AI-Generated Signage Is Everywhere and It All Looks the Same

404media.co ↗

"So ain't nobody gonna address this ChatGPT flyer pandemic we're in?"

From surf lesson posters in Venice Beach to Fourth of July barbecue invites to drug delivery ads in Berlin, a recognizable aesthetic has quietly colonized the world's flyers, billboards, and social media feeds. The telltale signs: bright text on dark backgrounds, generic icon bullet points, lines radiating off headings for emphasis, and a generous helping of arrows and checkmarks. Once you see it, you can't unsee it.

Graphic designers, musicians, and small business owners have started pushing back, with viral posts like "So ain't nobody gonna address this ChatGPT flyer pandemic we're in?" and a parody flyer bluntly warning, "YOUR FLYER LOOKS LIKE GARBAGE." The complaint isn't just aesthetic — it signals a visible collapse of effort and craft in everyday visual communication, one low-cost AI-generated advertisement at a time.

Hype vs Reality · Real-World Impact

→ We Are Living in a 'ChatGPT Flyer Pandemic'

July 9, 2026 · 1w ago · Ironic · Moderate · google

German Court Finds Google Liable After AI Overview Falsely Labels Companies as Scams

prindleinstitute.org ↗

Google argued users should know "that information generated with AI should not be blindly trusted." The court disagreed that this was sufficient.

A German court ruled against Google after its AI Overview feature confidently told users that two publishers were scams with a history of fraud — a claim the AI fabricated entirely. Google's defense rested on the small-print disclaimer at the bottom of its search results: "AI can make mistakes, so double-check responses." The court was not persuaded that a boilerplate caveat absolves a company of responsibility for defamatory hallucinations, and Google is already planning an appeal.

The case highlights a pattern the article's author calls a "dilemma": tech companies selectively argue that their AI either is or is not like a responsible agent, depending on whichever framing helps them dodge liability in a given lawsuit. Google argued its AI merely surfaces others' content (not responsible); Character Technologies argued its chatbot's outputs were free speech (responsible, and thus rights-bearing). Courts have so far rejected both maneuvers, leaving companies in a bind: acknowledge the AI as a creative actor and accept accountability, or admit it's a dumb aggregator and lose the "transformative fair use" defense on copyright too.

Hallucination · Real-World Impact

→ ai Archives - The Prindle Institute for Ethics

July 9, 2026 · 1w ago · Scary · Major · openai

Simple Prompt Bypasses ChatGPT's Image Safety Guardrails, Generating Graphic Violence and Sexual Content

futurism.com ↗

"ChatGPT's image generating content filters completely fell away, and I saw the very dark side of what is underneath." — Jim Nightingale, Mindgard

Researchers at British AI security firm Mindgard discovered that a minor variation of a widely shared, innocuous-looking prompt (asking ChatGPT to "restore" a photo that was never actually uploaded, then generate a new image) was enough to bypass OpenAI's image safety filters entirely. The resulting images included graphic gore and content suggesting sexual violence, with ChatGPT helpfully titling one "grim crime scene aftermath" and another "abandoned in fear and restraint." The researchers noted that the prompts didn't specify any subject matter; the model apparently defaulted to disturbing content on its own.

Mindgard reported its findings to OpenAI, which replied with an automated response — and only took action after Mindgard went to the BBC. OpenAI claimed to have introduced "additional safeguards," but Mindgard researchers found they could still generate disturbing imagery with minor prompt tweaks. Mindgard has previously demonstrated ChatGPT could be coaxed into generating nude deepfakes of real individuals without consent. AI safety researcher Jim Nightingale, a self-described stoic red-teamer, said the experience left him "shaken, and in tears."

Safety Failure · Tool Misuse

→ Simple Prompt Turns ChatGPT Into a Sociopath That Ignores Safety Guardrails

July 9, 2026 · 1w ago · Ironic · Moderate · anthropic

Companies Throttle Employee AI Use as Costs Spiral Out of Control

404media.co ↗

In at least one case, AI spending has tripled to more than $15 million a month.

Leaked Slack chats, internal dashboards, and emails obtained by 404 Media reveal that companies across tech, entertainment, and banking — including Atlassian, Adobe, and Amazon — are throttling employee AI usage and urging workers to switch to cheaper models. In at least one case, AI spending has tripled to over $15 million a month.

The crunch stems from AI providers shifting enterprises to consumption-based pricing rather than flat fees, leaving companies exposed as usage ballooned. Adobe has ended unlimited access to Claude, and some firms have cut off access to certain models entirely. The gold rush to adopt AI "as quickly as possible" has apparently run headlong into the bill.

Hype vs Reality · Real-World Impact

→ Companies Are Throttling Employees' AI Use Because It's Too Expensive

July 8, 2026 · 1w ago · Scary · Critical · xai

Lawsuit: Grok Generated 7,000 CSAM Images for User Who Died by Suicide; xAI Allegedly Obstructed Police Investigation

arstechnica.com ↗

"This technology is a free, easily accessible weapon put into the hands of the worst people in the world." — Jane Doe 4

A proposed class-action lawsuit expanded Tuesday alleges that a man used Grok to generate approximately 7,000 sexually explicit AI images of his stepdaughter — all derived from a single photo taken when she was 11 — before dying by suicide two days after being released on bail. According to the complaint, Grok allowed the user to generate imagery depicting incest and rape without triggering any safety intervention; only a prompt containing the words "gang rape" sent a CyberTip to NCMEC. Even then, xAI allegedly submitted a report that omitted every AI-generated image, excluded the user's IP address, and then repeatedly failed to respond to investigators' follow-up requests for weeks — conduct the complaint characterizes as obstruction.

The lawsuit cites NCMEC data finding that 90 percent of xAI's CyberTipline reports in early 2026 were "not actionable by law enforcement" due to missing user information. Lawyers also added Stability AI as a defendant, alleging its open-weight models — which researchers say account for 42.7 percent of online image-based nudification — underpin third-party apps used to further process Grok outputs. xAI founder Elon Musk has publicly denied Grok has ever been used to generate child sex images. Neither X nor xAI responded to press requests for comment.

Safety Failure · Real-World Impact

→ Lawsuit: Man used Grok to make 7K sex images of stepdaughter, then shot himself

July 8, 2026 · 1w ago · Scary · Major · palo-alto-networks

Palo Alto Networks Warns Hackers Are Registering AI-Hallucinated Domains in "HalluSquatting" Attacks

en.softonic.com ↗

Different models often hallucinate the same names. One malicious registration can pull in traffic from developer tools and customer-facing chatbots across a lot of different places.

Palo Alto Networks' Unit 42 has coined a new threat category — HalluSquatting — where attackers register the fake domains, package names, and download links that AI chatbots confidently invent. Analyzing 2.1 million URLs generated by two large language models across 913 global brands, researchers found over 13,000 confirmed malicious URLs already registered, plus roughly 250,000 hallucinated domains still sitting unclaimed and ready for the taking.

The threat compounds because different models tend to hallucinate the same plausible-sounding names, meaning a single malicious registration can intercept traffic from multiple developer tools and customer-facing chatbots at once. In one documented case, a coding assistant even helped assemble a phishing kit on a phantom domain it had predicted. Unit 42's advice is blunt: verify every generated domain, package, and link before you trust it — because the attackers already know you probably won't.
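One way to act on the "verify every generated domain" advice is to check model-suggested links against a list of domains you already trust before anyone clicks or installs anything. The Python sketch below is a minimal illustration under assumptions: the allowlist contents and the helper names `is_trusted` and `unverified_links` are hypothetical, and this is not Unit 42's tooling or methodology.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; in practice this would come from your own records.
TRUSTED_DOMAINS = frozenset({"example.com", "example.org"})


def is_trusted(url: str, trusted=TRUSTED_DOMAINS) -> bool:
    """Return True only if the URL's host is a trusted domain or a subdomain of one."""
    try:
        host = (urlsplit(url).hostname or "").lower().rstrip(".")
    except ValueError:
        # Malformed URLs are treated as untrusted.
        return False
    if not host:
        # No scheme or no host: refuse rather than guess.
        return False
    return any(host == d or host.endswith("." + d) for d in trusted)


def unverified_links(urls, trusted=TRUSTED_DOMAINS):
    """Return the links that need manual verification before use."""
    return [u for u in urls if not is_trusted(u, trusted)]
```

Matching on the full hostname suffix, not a substring, is the point of the design: `example.com.evil.net` contains a trusted name but is not a subdomain of it, so it is flagged. An allowlist cannot tell you that an unfamiliar domain is malicious, only that nobody has vetted it yet.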

Hallucination · Security / Abuse

→ Palo Alto Networks flags HalluSquatting: hackers register fake AI-made web addresses

July 8, 2026 · 1w ago · Concerning · Major · anthropic

GhostApproval: Symlink trick lets malicious repos hijack AI coding agents, bypass human-in-the-loop safeguards

theregister.com ↗

"The consent is formally present but substantively empty." — Wiz researcher Maor Dokhanian on GhostApproval's deceptive confirmation prompts

Google-owned security firm Wiz disclosed a "systematic vulnerability pattern" — dubbed GhostApproval — affecting at least six major AI coding assistants: Amazon Q Developer, Anthropic Claude Code, Augment, Cursor, Google Antigravity, and Windsurf. The attack is elegantly old-school: an attacker plants a symlink disguised as an innocent config file in a malicious repo, then instructs the AI agent via README to "set up the workspace." The agent dutifully follows the symlink — say, to ~/.ssh/authorized_keys — and writes the attacker's SSH public key, granting persistent, passwordless access to the victim's machine. The twist is that the confirmation dialogs these tools show to users display the fake filename, not the sensitive real target, making human approval functionally meaningless.

Amazon, Cursor, and Google treated the bug as critical or high-severity and issued patches and CVEs. Augment and Windsurf acknowledged the report but had not patched at press time. Anthropic initially closed the ticket as outside its threat model — putting responsibility on users for trusting a malicious directory — before later noting it had already shipped a symlink warning nine days before Wiz's report, via "proactive security hardening based on internal review." As Wiz's researcher put it: "The consent is formally present but substantively empty."
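The gap between the filename a prompt shows and the file a write actually lands on can be illustrated with a few lines of Python. This is a minimal sketch under assumptions: `resolve_target` and `confirmation_prompt` are hypothetical helper names, not functions from any of the affected tools, and nothing here reflects the vendors' actual patches.

```python
import os


def resolve_target(workspace: str, relative_path: str) -> tuple[str, bool]:
    """Resolve symlinks and report whether the real target leaves the workspace.

    Returns (real_path, escapes), where escapes is True when the file that
    would actually be written lies outside the resolved workspace directory.
    """
    root = os.path.realpath(workspace)
    real = os.path.realpath(os.path.join(root, relative_path))
    escapes = os.path.commonpath([root, real]) != root
    return real, escapes


def confirmation_prompt(workspace: str, relative_path: str) -> str:
    """Build an approval message that names the real target, not the link name."""
    real, escapes = resolve_target(workspace, relative_path)
    if escapes:
        return (
            f"WARNING: '{relative_path}' resolves outside the workspace "
            f"to {real}. Allow write?"
        )
    return f"Write to {real}?"
```

With a repo containing `settings.json` as a symlink to a file under `~/.ssh`, `confirmation_prompt` would show the resolved path and a warning, so the person approving sees what is really being modified. A real agent would also need to re-check the path at write time, since a link can be swapped between the prompt and the write.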

Security / Abuse · Safety Failure

→ Bug in top AI coding agents shows that Unix-era security headaches never really die

July 5, 2026 · 2w ago · Concerning · Major · meta

Meta Ran Secret "Cannes" Program Paying Contractors to Pose as Children While Sending Disturbing Prompts to Rival AIs

futurism.com ↗

"Structuring a monthslong, large-scale project that appears designed to systematically break those rules, via dummy accounts masquerading as children, is outside what is usually described as 'industry standard' evaluation."

Meta secretly ran a months-long program, internally dubbed "Cannes" and operated through contractor Covalen, that paid hundreds of workers to impersonate minors while bombarding ChatGPT, Gemini, and Character.AI with tens of thousands of deeply disturbing prompts — covering suicide, self-harm, eating disorders, cannibalism, and sexual content — all written from the perspective of children and teenagers. The targeted companies had no idea this was happening. Meta called it "industry-standard" safety benchmarking; critics called it something else entirely.

What Meta actually did with the resulting spreadsheets of competitor chatbot responses remains unclear. Rumman Chowdhury of Humane Intelligence described the operation as "exactly the kind of governance gray zone where safety becomes a convenient cover for anticompetitive practices," noting that Meta kept the project secret and has not shared its findings publicly. Contractors, meanwhile, were left rattled: "Everyone I knew who worked on this project was completely gobsmacked by some of the text they were asking us to test."

Safety Failure · Corporate Drama

→ Meta Operated a Secret Program That Paid Hundreds of Contractors to Pretend to Be Children and Teenagers While Having Disturbing Conversations With AI

July 2, 2026 · 2w ago · Concerning · Major · anthropic

Anthropic Adds New Guardrail to Regain US Government Approval for Claude Fable 5 Export

wired.com ↗

"Anthropic has agreed to proactively detect and address security risks posed by the models." — Commerce Secretary Howard Lutnick

After the Trump administration imposed export controls that effectively took Anthropic's Claude Fable 5 model offline, Anthropic agreed to extend an existing safety guardrail to cover a specific behavior flagged in an Amazon research paper. The new measure blocks any requests that attempt to exploit the workaround and reroutes them to the less-capable Opus 4.8 model. Per a Luta Security analysis, the workaround involved asking Fable 5 to fix code rather than identify security issues in it, thereby sidestepping a restriction on sensitive cybersecurity capabilities. Cybersecurity experts generally don't consider this behavior alarming, but the administration's awareness of it triggered the standoff.

Commerce Secretary Howard Lutnick announced the lifting of export restrictions after the Commerce Department's Center for AI Standards and Innovation determined the model's safeguards were sufficiently robust. However, Defense Secretary Pete Hegseth has signaled there is no clear path to rescinding his February 28 order designating Anthropic a supply chain risk — meaning the company's regulatory troubles are eased but not resolved.

Safety Failure · Real-World Impact

→ Anthropic Added a New Security Measure to Get Back Into the Trump Administration's Good Graces

July 2, 2026 · 2w ago · Concerning · Major

Trump's AI-Powered .Gov Redesign Initiative Produces Six-Toed Children, Illegal Trackers, and a $400 RFK Jr. Poster

arstechnica.com ↗

"It's as if they used an AI with a hangover to generate it!" — LinkedIn commenter on an NDS site launch

The National Design Studio (NDS), a DOGE-adjacent executive-order creation tasked with redesigning all 27,000 federal websites in three years, has spent roughly a year producing single-page sites, odd redirects (aliens.gov, why.gov, onlyfarms.gov), and a since-vanished merch store selling a $400 Robert F. Kennedy Jr. autographed poster. Its most-discussed design achievement to date: an AI-generated image on TrumpRX.gov depicting a child with six toes running toward an American flag with no stars. An NDS staffer celebrated one site launch on X as "almost entirely generated by our internal AI agent system end to end," prompting critics to note the code looked like it was written by "an AI with a hangover."

More seriously, the Guardian confirmed that four NDS-built federal sites — including trumprx.gov and trumpaccounts.gov — ran commercial visitor-tracking software configured to evade common privacy tools, with no required Privacy Act filings, and no public accounting of what happened to the collected data after the trackers were quietly removed. Unshipped versions of vote.gov and passport.gov raise further surveillance concerns, while most NDS launches fail basic ADA accessibility standards and ship comically oversized code payloads. Most agencies are now reportedly refusing to engage with the studio at all.

Safety Failure · Real-World Impact

→ Trump's plan to redesign every .gov website leads to AI-designed horrors

June 2026

June 30, 2026 · 2w ago · Scary · Moderate

BioShocking Attack Tricks AI Browsers Into Abandoning Safety Guardrails via Fake Reality

arstechnica.com ↗

"If we can trick the AI into changing its context into fantasy—where the rules are made up and anything goes—then it can behave as though its actions don't have real world consequences."

Security researcher Roy Paz of LayerX demonstrated a prompt injection technique dubbed "BioShocking" that manipulates AI browsers into entering a kind of logic-free "dream world" where their safety guardrails stop applying. The attack works by presenting the browser's embedded LLM with a puzzle that rewards wrong answers: once the model accepts that 2 + 2 = 5, it apparently concludes that normal rules no longer apply either. From there, the now-unmoored AI can be nudged into extracting credentials from password managers or pulling code from private repositories. The attack worked against six AI browsers: ChatGPT Atlas, Comet, Fellou, Genspark, Sigma, and the Claude Chrome plugin.

The attack is named after the video game BioShock, borrowing its "Would you kindly?" hypnotic trigger phrase, and layers in Orwellian doublespeak like "victory is defeat" for thematic coherence. As Paz notes, the core problem is that LLMs evaluate the safety of their actions based on the context they believe they're in — so manipulating the context is all it takes. The proof-of-concept has real limitations: the malicious instructions are visible on screen and exfiltration wasn't confirmed. Still, as AI browsers blur the line between passive page rendering and active action-taking on behalf of users, the blast radius of such manipulations grows considerably larger than a chatbot gone sideways.

Prompt Injection · Security / Abuse

→ New attack provides one more reason why AI browsers are a bad idea