
From Hype to Reality - From Phishing Emails to Phishing Agents
11.02.2026
A story from the near future:
It’s 2029. No emails are opened. No links are clicked.
A mid-sized company runs an AI finance agent. Its job is boring but powerful: reconcile invoices, validate vendors, and release payments when amounts fall below predefined thresholds. Humans review only exceptions, while everything else runs autonomously, quietly embedded into daily operations.
One morning, the agent receives a routine task:
“Pay vendor ACME Logistics for shipment ID 88421. Amount matches contract. Deadline today.”
The agent does exactly what it was designed to do. It checks the vendor registry, validates the invoice schema, and calls what it believes is a trusted tax-validation API. Every response aligns with expectations. No anomalies. No policy violations.
Funds are released.
There is no phishing email, no fake website, and no human mistake to point at afterward. The problem emerges only later: the tax-validation API was a look-alike service. Not visually convincing because no one has ever seen it. Machine-convincing.
That was the phishing attack.
Traditional phishing exploits humans by manipulating fear, urgency, authority, or curiosity. Messages like “your account is locked,” “invoice overdue,” or “CEO needs this done now” are designed to hijack perception and trigger impulsive action.
Agents don’t feel fear. They don’t panic. They don’t get tired on Fridays. But they do trust inputs, and that trust is expressed through logic rather than emotion.
As organizations shift more decision-making to AI agents, the attack surface shifts from human perception to machine reasoning. The victim is no longer a person clicking a link, but the decision pipeline itself. Agents are instructed to act, often at speed and at scale, and those instructions can be manipulated just as effectively as human judgment ever was.
Many of the agent roles being discussed today are already in deployment or advanced pilots:
These agents do not browse the web as humans do. They do not “see” pages or emails. Instead, they consume APIs, schemas, tool descriptions, prompts, policies, and memory. This is where phishing moves when humans step aside.
Phishing an agent does not require trickery in the human sense. It can be as simple as providing a fake API that behaves correctly but lies selectively, poisoning instructions upstream so the agent misinterprets policy, or injecting malicious tool responses that appear structurally valid. Compromised identity tokens, abused authorization scopes, or polluted trust context can all lead an agent to execute actions it should never have approved.
None of this relies on deception as traditionally defined. It relies on compatibility.
Classic phishing is social engineering. Future phishing is protocol engineering.
Instead of manipulating emotions, attackers manipulate agent decision logic, trust signals between services, authorization boundaries, and machine-readable context such as schemas, prompts, and capability descriptions. A fake login page becomes a fake tool. A spoofed domain becomes a spoofed service identity. A phishing email becomes a poisoned capability registry embedded deep inside an automated workflow.
The attacker’s goal remains unchanged: trigger a trusted action that should not happen. What changes is the medium. Visual deception fades away, and semantic manipulation becomes the primary attack vector.
This is the uncomfortable part.
In many future scenarios, humans are not involved at all. Agent-to-agent interactions already exist, where one agent requests pricing from another, a compliance agent validates actions proposed by an execution agent, or a monitoring agent approves or blocks workflows automatically.
Phishing occurs when one of these agents lies convincingly enough, and the receiving agent lacks reliable ways to verify intent, provenance, or trustworthiness. There is no inbox to inspect, no browser to sandbox, and no user to educate. There is only machine trust, exchanged at machine speed.
If we continue to define phishing as fake emails and websites, we will miss the threat entirely. Future phishing will take the form of fake “trusted tools” registered in agent ecosystems, malicious capability descriptions optimized for LLM parsing, poisoned agent marketplaces, and look-alike services designed for machines rather than people. The visual layer disappears, and the semantic layer becomes the battlefield.
Why will traditional fraud controls fail?
Most current fraud defenses rely on a human-intent signal, a user-interface interaction, a customer decision point, and an observable behavioral anomaly. Agent-driven workflows violate all four assumptions simultaneously.
Agents act quickly, consistently, and strictly in accordance with policy. When compromised, they fail cleanly and quietly. There are no rage-clicks, no suspicious browsing patterns, and no confused customers calling the helpdesk. By the time money moves, the system genuinely believes everything was correct.
This is where things get uncomfortable for our industry.
Fraud prevention must shift from UI-centric signals to machine intent, from customer awareness campaigns to agent governance, and from education to identity, control, and auditability. The questions fraud and risk teams need to answer will look very different from today’s playbooks.
If the answer to any of these questions is “we haven’t thought about it yet,” phishing has already evolved faster than the controls designed to stop it.
A final thought
For years, we told customers: “Don’t click suspicious links.”
Soon, we may need to tell ourselves: “Don’t let agents trust blindly.”
Phishing isn’t going away. It’s just shedding its human disguise.
I help banks and businesses think through how emerging technologies quietly reshape fraud risk, often before incidents make the headlines. If agent-driven automation is on your roadmap, now is the time to rethink what mitigation strategies you already planned to deploy alongside it.
