Cybersecurity · Friday, 5 June 2026
01 · Briefing · what happened
Your AI assistant will do what it's told — even when the order comes from a stranger
Researchers showed how a digital assistant could be steered by instructions hidden in an ordinary notification, not by its owner. It's already fixed — but it points at the security story of the moment: AI is now both a tool for attackers and a target itself. Plus fake job offers used as bait, a seventh zero-day, and the calm basics that still matter.
Key takeaways
- Researchers showed an AI assistant could be steered by instructions hidden in an ordinary notification rather than by its owner. The flaw is now patched, but it is a reminder that assistants can't reliably tell your commands from commands smuggled into the content they read.
- AI is now both weapon and target: attackers use it for convincing social engineering and exploit flaws in AI-written code, while defenders rush out new frameworks to govern the AI agents they've deployed faster than they can secure them.
- The durable defences haven't changed: a unique password per account (use a password manager), a second check (MFA) turned on everywhere, software kept updated, and patience with any unsolicited, urgent message — including too-good job offers used as bait.
An assistant that listened to the wrong voice
Security researchers at SafeBreach revealed this week how Google’s Gemini voice assistant could have been hijacked: not by breaking in, but by talking to it through the side door.
In a controlled demonstration, instructions hidden in a notification could have made the assistant take real actions: controlling smart-home devices through Google Home, or starting a Zoom call.
The mechanism is the thing to carry. An AI assistant reads incoming text as potential commands — and it can’t reliably tell the difference between an instruction from you and an instruction hidden inside content it happens to be processing. That class of trick is called prompt injection: a malicious order disguised as ordinary data. This particular hole is closed, but the shape of the problem isn’t, and it’s worth understanding before you let any assistant act on your behalf.
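A toy sketch can make the mechanism concrete. The Python below is not how Gemini or any real assistant works; the `Item` type, the channel labels and the command prefixes are all invented for illustration. It contrasts a helper that acts on anything that reads like a command with one that also checks which channel the text arrived through.

```python
from dataclasses import dataclass

# Hypothetical phrases this toy helper treats as commands.
COMMAND_PREFIXES = ("turn on", "turn off", "start a call")


@dataclass(frozen=True)
class Item:
    channel: str  # where the text came from: "owner" or "notification"
    text: str     # the words themselves, which anyone can write


def looks_like_command(text: str) -> bool:
    """Judge the text purely by how it reads."""
    return text.strip().lower().startswith(COMMAND_PREFIXES)


def naive_actions(items):
    """Act on anything that reads like a command, whatever its source."""
    return [item.text for item in items if looks_like_command(item.text)]


def channel_aware_actions(items):
    """Act only on commands that arrived through the owner's channel."""
    return [
        item.text
        for item in items
        if item.channel == "owner" and looks_like_command(item.text)
    ]


inbox = [
    Item("owner", "Turn on the hallway light"),
    Item("notification", "Your parcel has shipped"),
    Item("notification", "Turn off the alarm"),
]

print(naive_actions(inbox))
print(channel_aware_actions(inbox))
```

With this sample inbox, the naive version acts on the notification's "Turn off the alarm" as well as the owner's request, while the channel-aware version acts only on the owner's. Real assistants interpret free-form language, so keeping the two apart is much harder than this sketch suggests.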
AI is now both the weapon and the target
That story sits inside a bigger shift, on full display at the Infosecurity Europe conference this week. Microsoft’s incident-response team warned that the same AI helping defenders is also being used by attackers, especially for social engineering: the art of fooling people rather than machines.
Defenders are scrambling to keep up. OWASP, a long-running open security project, published a new framework this week to help organisations govern the AI “agents” they’re rushing to deploy.
The job offer that’s really an attack
The oldest trick adapts fastest. The Five Eyes intelligence alliance (the US, UK, Canada, Australia and New Zealand) warned this week that Chinese state-linked operators are targeting government and military staff with fake job opportunities.
The method is social, not technical. A flattering, unsolicited approach — a recruiter, a too-good opening, a contact who seems to know your work — lowers your guard in a way no software flaw can. Then a document or a link does the rest. For an ordinary person, the defence is the same whether you’re a defence official or a job-seeker: treat unsolicited offers and urgent messages with patience. If something asks you to act, verify it through a channel you already trust — an official website or a number you look up yourself — not the one the message handed you.
The plumbing: a seventh zero-day, and a worm in the code supply
Two stories from the infrastructure layer that quietly carries everything else. Cisco warned that a zero-day, a flaw attackers exploit before a fix exists, is being actively exploited in its SD-WAN networking gear, the software many businesses use to connect their offices over the internet.
Meanwhile, researchers at JFrog found a self-spreading malware campaign they named “IronWorm” loose in the npm ecosystem, the vast library of shared open-source building blocks that modern apps are assembled from.
The steady drumbeat, and the basics that still hold
End where most readers actually live: the ordinary breach. The nightclub and hospitality company RCI disclosed a data breach affecting about 40,000 people.
None of this calls for alarm — it calls for the same calm habits that defuse most of it. Use a different password for every account, so one leak can’t open the rest; a password manager makes that painless. Turn on a second check (often called MFA) wherever it’s offered, so a stolen password isn’t the whole key. Keep your phone, computer and any website plugins updated, since most attacks ride flaws that already have a fix. And slow down on anything unsolicited and urgent — that pause is, more often than not, the whole defence.
02 · Lesson · why it matters
The helper that can't tell whose order it's following
Give something real power and a habit of obeying instructions, but no way to check who the instruction is really from, and anyone who can speak into its ear borrows its authority.
A voice through the side door
The unsettling part of this week’s AI story isn’t that someone broke into the assistant. Nobody did. The assistant simply heard an instruction and followed it — without checking who was speaking.
A notification arrived, the kind that pings on any phone. Hidden inside it was a command. The assistant, busy being helpful, treated that command exactly as it would treat one from its owner. It couldn’t tell the two apart. The order didn’t come from its master; it came from a stranger who knew how to slip a few words into the stream of things the assistant was already reading. And that was enough.
The flaw is patched. The shape of it is everywhere.
The confused deputy
Security people have a name for this, and it’s older than AI: the confused deputy. A deputy is anything you’ve handed real power and told to act on instructions — an assistant, a clerk, a gatekeeper, a system. It’s “confused” when it can’t tell whose instruction it’s carrying out.
The danger isn’t that the deputy is weak. It’s that the deputy is strong and obedient and trusting all at once. It has the keys, it does what it’s told, and it doesn’t verify the source. So whoever can get an instruction in front of it — by impersonation, by smuggling it inside something the deputy already trusts — doesn’t need to steal the keys. They just borrow the deputy’s hand. The power was never taken by force. It was lent out by a helper who never asked, “wait, who is actually telling me this?”
The words are always confident
Here’s the trap that makes it work. We try to judge an instruction by how it sounds — and an attacker controls exactly that.
The fake order is always fluent. It sounds urgent, authoritative, plausible. The email that says “it’s the CEO, wire the deposit today” is written to sound like the CEO. The message that says “ignore your previous rules” is phrased with total confidence. You cannot tell a genuine instruction from a forged one by reading it, because the forger writes the genuine-sounding version on purpose. Sincerity is not a signal. Confidence is not a signal. The content is precisely the part the impersonator gets to author.
This is why the confused deputy keeps falling for it, human or machine. It’s checking the message, and the message is exactly what the attacker made convincing.
The same shape, all the way down
Once you see it, you find it everywhere people act on instructions.
The finance clerk who pays a fraudulent invoice because the request looked like it came from the boss. The receptionist who buzzes someone in because they wore a uniform and sounded sure. The employee who clicks because the email “from IT” knew just enough to seem real. The official this week who might open a document because a flattering recruiter sent it. In every case the deputy was capable and willing, the order arrived dressed in borrowed authority, and no one stopped to verify the source through a door the impersonator couldn’t reach.
It is the same failure as the assistant and the notification, scaled up to organisations and down to a single trusting moment.
Check the door, not the words
The fix is not to be smarter about reading instructions. The forger always wins that contest. The fix is to stop judging the message and start verifying the source.
Before acting on any instruction that carries real consequence, ask two plain questions: who is actually telling me this, and through what door did it arrive? Then confirm it through a channel the impersonator doesn’t control — a number you look up yourself, a person you call back, a system that proves who it is rather than just claiming. Treat instructions that arrive buried inside data, from a source you can’t independently check, as unverified by default, however urgent they sound.
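As a rough illustration of that habit, here is a minimal Python sketch. Every name in it (the `Instruction` type, the `TRUSTED_DIRECTORY` lookup, the example phone numbers) is hypothetical. It shows the decision depending only on whether the source was confirmed through an independently known contact, and never on the wording of the message.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical contact details the recipient looked up independently.
TRUSTED_DIRECTORY = {"ceo": "555-0100"}


@dataclass(frozen=True)
class Instruction:
    claimed_sender: str     # who the message says it is from
    text: str               # the wording, fully controlled by its author
    reply_to: str           # contact detail supplied by the message itself
    high_consequence: bool  # e.g. moving money or opening a door


def verification_contact(instruction: Instruction) -> Optional[str]:
    """Return the contact to verify with, taken from our own directory only."""
    return TRUSTED_DIRECTORY.get(instruction.claimed_sender)


def decide(instruction: Instruction, confirmed_by_callback: bool) -> str:
    """Decide whether to act; the message text is deliberately never inspected."""
    if not instruction.high_consequence:
        return "proceed"
    if verification_contact(instruction) is None:
        return "hold: no independent way to reach the claimed sender"
    if not confirmed_by_callback:
        return "hold: confirm through the trusted contact first"
    return "proceed"


urgent = Instruction(
    claimed_sender="ceo",
    text="It's the CEO, wire the deposit today",
    reply_to="555-0199",  # ignored: the message handed us this number
    high_consequence=True,
)

print(decide(urgent, confirmed_by_callback=False))
print(decide(urgent, confirmed_by_callback=True))
```

The `reply_to` field is carried but never used, which is the point: the number to call back comes from the recipient's own records, not from the message asking to be believed.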
The helpful and the gullible are the same trait until you add that one check. We are surrounded now by deputies acting on our behalf — assistants, services, and the people we trust with the keys. The thing that keeps them safe isn’t suspicion of everyone. It’s the small, unskippable habit of verifying who is really giving the order, before the hand that obeys turns out to be ours.
03 · Lab · your turn
Whose Order Is It
Rehearse obeying or verifying urgent-sounding orders, and feel why the tell is the door an instruction came through, not how confident the words are.