Google and Mandiant's M-Trends 2026 report confirms vishing is now the second most common initial breach vector globally and the #1 vector in cloud environments. AI-powered attack platforms can now automate the entire social engineering call. The only effective countermeasure is training your people to recognize and resist the call before attackers make it.
Why Does It Matter That Vishing Leads the Google/Mandiant M-Trends 2026 Report?
The M-Trends report is built on over 500,000 hours of frontline incident response investigations conducted by Mandiant in 2025, combined with Google Threat Intelligence Group (GTIG) telemetry. When it flags a trend, that trend is already in your organisation's threat model. The opening of the 2026 edition does not lead with ransomware, zero-days, or nation-state APTs. It leads with vishing.
We are tracking a significant shift toward voice-based social engineering (vishing), which has risen to the number two spot for initial infection vectors.
Top 2025 vectors:
| Attack Vector | Share of 2025 Investigations | Change vs. 2024 |
|---|---|---|
| Exploits (CVEs) | 32% | Stable (6th consecutive year at #1) |
| Voice phishing (vishing) | 11% | ↑ Significant increase |
| Prior compromise | ~10% | ↑ Up from #5 in 2024 |
| Stolen credentials | 9% | ↓ Down from 16% in 2024 |
| Web compromise | 8% | Stable |
| Email phishing | 6% | ↓ Down from 14% in 2024 |
| Insider threat | 6% | ↑ Up from 5% in 2024 |
Source: Mandiant M-Trends 2026
The simultaneous collapse of email phishing (−8 points) and rise of vishing is not a coincidence. Attackers follow the path of least resistance. Email security has improved. Human voices have not gotten easier to distrust. In cloud environments specifically, vishing was the single most common initial vector at 23% of cloud-related compromises, ahead of stolen credentials (16%), email phishing (15%), and exploits (6%).
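To make the "−8 points" figure concrete, here is a minimal sketch that computes the year-over-year change for the rows where the table states a 2024 share. The function names are illustrative, and the only inputs are the percentages quoted in the table.

```python
# Shares of Mandiant investigations (percent) as (2024, 2025), taken from the
# table above. Only rows whose 2024 figure is stated are included.
SHARES = {
    "Stolen credentials": (16, 9),
    "Email phishing": (14, 6),
    "Insider threat": (5, 6),
}


def point_change(share_2024: float, share_2025: float) -> float:
    """Absolute change in percentage points."""
    return share_2025 - share_2024


def relative_change(share_2024: float, share_2025: float) -> float:
    """Change relative to the 2024 share, in percent."""
    return (share_2025 - share_2024) / share_2024 * 100


for vector, (y2024, y2025) in SHARES.items():
    print(f"{vector}: {point_change(y2024, y2025):+} points, "
          f"{relative_change(y2024, y2025):+.1f}% relative")
```

The distinction matters when reading the table: an 8-point drop from 14% means email phishing's share fell by more than half in relative terms.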
The report makes an important conceptual distinction: email phishing is classified as a non-interactive technical lure. Vishing is classified as interactive human engagement. The difference is operational: interactive attacks are "significantly more resilient against automated technical controls" and require fundamentally different detection strategies. You cannot firewall a phone call.
Key takeaways from the M-Trends 2026 report
- Vishing is now the #2 initial infection vector across all 2025 Mandiant incident response investigations.
- Email phishing fell to 6% of intrusions in 2025, down from 14% in 2024, while vishing rose.
- UNC3944 (overlapping with Scattered Spider) has been running vishing campaigns since at least early 2022, targeting help desks to bypass MFA.
- UNC6040 used voice phishing to lure victims into authorizing attacker-controlled SaaS apps, later leveraged for ShinyHunters-branded extortion.
- The report notes that interactive voice attacks are significantly more resilient against automated technical controls than email phishing.
- The "hand-off" trend, where one actor gains access via vishing and sells it to a ransomware group, can happen in under 30 seconds.
What Do Real Vishing Attacks Look Like?
The M-Trends 2026 report names UNC3944 and UNC6040 as illustrative examples, but those campaigns are now well-documented. More recent attack reports show further sophistication in vishing methods: AI-automated calling infrastructure available for purchase, and nation-state actors using fake video meetings to harvest voice recordings for reuse in future attacks. Two recent examples are worth examining closely.
Scenario 1: The TOAD Attack Kit; AI Automates the Entire Call
Threat researchers at Abnormal AI published an analysis in April 2026 of a cybercrime platform called ATHR, sold on underground markets for $4,000 plus 10% of profits. It is the clearest illustration yet of how vishing has been industrialised.
ATHR packages the full attack chain into a single browser-based product:
- Lure delivery: A built-in mailer generates spoofed brand notification emails (Google, Microsoft, Coinbase, Binance) with configurable personalisation fields: lock time, failed attempts, last location, IP address. Each email is tailored to the target to pass casual inspection.
- Callback routing: When the target calls the number embedded in the email, ATHR routes them to either a human operator or an AI voice agent running on Asterisk WebRTC, the same infrastructure used by legitimate call centres.
- AI-driven social engineering: ATHR's AI vishing agents follow a 10-step structured script. They verify the callback, describe fabricated suspicious account activity, and extract a six-digit verification code, all without a human operator on the line. A single operator can run campaigns against multiple brands simultaneously.
- Live credential harvesting: While the voice interaction runs, ATHR's phishing panels capture credentials in real time. Operators see each target as a live session and can redirect them to specific pages mid-call.
What makes this significant for defenders: running a high-volume vishing campaign no longer requires skilled social engineers. The platform lowers execution cost and scales the attack. Your help desk staff, finance team, and IT administrators are now systematically targetable.
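On the defensive side, the lure emails described above have a recognisable shape: a callback number, urgent account language, and no link or attachment to scan. The following is a rough triage heuristic in Python, not a description of any vendor's detection logic. The regular expressions, the keyword list, and the `Email` type are all assumptions made for illustration, and a real mail pipeline would need far more robust parsing.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns and keyword list, chosen for illustration only.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
PHONE_RE = re.compile(r"\+?\(?\d(?:[\s().-]{0,2}\d){9,14}")
URL_RE = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)
URGENCY_TERMS = ("locked", "suspended", "failed attempt", "unusual activity")


@dataclass
class Email:
    subject: str
    body: str
    attachment_count: int = 0


def looks_like_callback_lure(email: Email) -> bool:
    """Flag emails that pair urgency with a phone number and no other payload."""
    text = f"{email.subject}\n{email.body}".lower()
    # Lures may embed an IP address for credibility; strip those first so a
    # long dotted address is not mistaken for a phone number.
    has_phone = bool(PHONE_RE.search(IPV4_RE.sub(" ", text)))
    no_payload = email.attachment_count == 0 and not URL_RE.search(text)
    urgent = any(term in text for term in URGENCY_TERMS)
    return has_phone and no_payload and urgent


lure = Email(
    subject="Your account has been locked",
    body="We detected 3 failed attempts from 203.0.113.7. "
         "Call +1 (800) 555-0100 to restore access.",
)
print(looks_like_callback_lure(lure))
```

A rule like this is best treated as a signal for routing messages to review rather than as a block decision, since legitimate notifications can share the same traits.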
Scenario 2: UNC1069; Fake Meetings, Voice Capture, and North Korean Infrastructure
Researchers at Validin published a detailed analysis in April 2026 of UNC1069, a North Korean threat actor (overlapping with Bluenoroff) that runs campaigns against cryptocurrency, Web3, and financial services targets.
The attack chain:
- Attackers build fabricated venture capital personas and approach targets via LinkedIn and Telegram, leveraging previously compromised accounts to appear credible.
- They schedule meetings through Calendly links pointing to fake video conferencing platforms that mimic Zoom, Google Meet, and Microsoft Teams at the infrastructure level.
- During the meeting, attackers claim the target's microphone or webcam is malfunctioning, create urgency, and serve a ClickFix-style prompt, instructing the target to paste and run a command to "fix the problem."
- Critically, these fake meeting platforms capture the victim's audio and video recordings, which are then reused in subsequent campaigns to impersonate real people, including deepfake representations of executives.
This is no longer phishing. It is a multi-session, long-duration social engineering operation where the voice channel is both the attack vector and the intelligence collection tool. For Mandiant, this cluster is tracked as part of the broader prior compromise and vishing trends cited in M-Trends 2026.
Why Are These Attacks Working?
Both scenarios share the same structural advantage: they exploit trust in real-time human interaction.
Email phishing asks a target to click a link and hopes they don't inspect it. Vishing puts a voice on the line: patient, contextualized, and responsive. Automated email gateways detect attachments. They do not detect a persuasive pretext delivered at 0.9x speed with a regional accent.
The M-Trends report frames this precisely:
Email phishing relies on 'volume and opportunistic delivery', while interactive vishing involves 'a live person, or now, an AI, steering the conversation in real-time.'
How Do You Defend Against Vishing? Train Your People to Recognize the Call
Technical controls stop known malware. They do not stop a caller claiming to be from your IT department, asking your CFO to authorize a wire transfer because the CEO is in a meeting and unreachable. The only scalable defense is behavioral conditioning through realistic simulation: exposing your team to the exact call dynamics attackers use, in a safe environment, with a learning feedback loop.
How Simulated Vishing Attacks Work
A well-designed vishing simulation programme has two distinct objectives:
1. Audit and risk evaluation: Run a realistic vishing campaign against your organisation (no training prompt, no warning) and collect behavioral data. Which teams comply? Which roles disclose credentials under pressure? What pretexts land? This gives you a measurable baseline and identifies the exact population that needs targeted training.
2. Awareness training: Use simulation as a training mechanism. When a user fails a vishing call, trigger an immediate contextual learning moment, delivered at the point of failure, not in a quarterly slideshow.
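As a rough illustration of the two objectives, the sketch below turns simulated call outcomes into a baseline (compliance rate by team or by pretext) and a list of users who should receive follow-up training. The `CallResult` record and its field names are hypothetical and do not reflect any particular platform's export format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CallResult:
    """One simulated call outcome (hypothetical record shape)."""
    user: str
    team: str
    pretext: str
    complied: bool  # the user performed the action the caller requested


def compliance_by(results: list[CallResult], key: str) -> dict[str, float]:
    """Share of calls that ended in compliance, grouped by `key`."""
    totals: dict[str, int] = defaultdict(int)
    failures: dict[str, int] = defaultdict(int)
    for result in results:
        group = getattr(result, key)
        totals[group] += 1
        failures[group] += result.complied  # True counts as 1
    return {group: failures[group] / totals[group] for group in totals}


def needs_training(results: list[CallResult]) -> list[str]:
    """Users who complied at least once and should get follow-up training."""
    return sorted({r.user for r in results if r.complied})


results = [
    CallResult("ana", "finance", "it_help_desk", True),
    CallResult("ben", "finance", "executive", False),
    CallResult("cho", "it", "it_help_desk", False),
    CallResult("dev", "it", "it_help_desk", True),
    CallResult("eli", "it", "executive", False),
]
print(compliance_by(results, "team"))
print(compliance_by(results, "pretext"))
print(needs_training(results))
```

Grouping by `team` answers "which teams comply?", while grouping by `pretext` answers "what pretexts land?", using the same data.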
What Makes a Vishing Simulation Realistic Enough to Be Useful?
Three variables determine whether a simulation produces actionable data or just noise:
Pretext: The scenario must reflect actual attacker tactics: IT help desk impersonation, bank fraud alert, executive impersonation, HR callback. Generic pretexts produce results that don't translate to real risk assessment.
Voice: The caller's voice must match the pretext. For AI-powered simulations, voice cloning allows you to simulate an attack where the voice resembles a known colleague or executive, the exact scenario your CISO fears and that UNC1069 is already deploying.
Phone number: Spoofed numbers that appear to originate from your own organisation or a known vendor dramatically increase compliance rates. If your simulation uses a generic number, you are testing a scenario that is already lower-risk than what attackers deploy.
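One way to keep these three variables explicit is to model each simulation scenario as a small configuration object and check it for realism gaps before launch. This is a sketch under stated assumptions: the enum values mirror the options named above, while the type names and the specific rules in `realism_gaps` (for example, treating an executive pretext without a cloned voice as a mismatch) are illustrative interpretations rather than a standard.

```python
from dataclasses import dataclass
from enum import Enum


class Pretext(Enum):
    IT_HELP_DESK = "it_help_desk"
    BANK_FRAUD_ALERT = "bank_fraud_alert"
    EXECUTIVE = "executive_impersonation"
    HR_CALLBACK = "hr_callback"


class Voice(Enum):
    STANDARD_AI = "standard_ai"
    CLONED = "cloned"


class CallerId(Enum):
    GENERIC = "generic"
    INTERNAL = "internal"
    KNOWN_VENDOR = "known_vendor"


@dataclass(frozen=True)
class Scenario:
    pretext: Pretext
    voice: Voice
    caller_id: CallerId


def realism_gaps(scenario: Scenario) -> list[str]:
    """List ways the scenario is easier than the attacks it is meant to model."""
    gaps = []
    if scenario.caller_id is CallerId.GENERIC:
        gaps.append("generic caller ID tests a lower-risk call than attackers place")
    # Assumption: an executive pretext is only convincing with a matching voice.
    if scenario.pretext is Pretext.EXECUTIVE and scenario.voice is not Voice.CLONED:
        gaps.append("executive pretext without a matching voice")
    return gaps


weak = Scenario(Pretext.EXECUTIVE, Voice.STANDARD_AI, CallerId.GENERIC)
for gap in realism_gaps(weak):
    print(gap)
```

An empty list from `realism_gaps` does not prove a scenario is realistic. It only means none of the encoded checks fired.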
When Should You Use AI-Powered Vishing Simulation?
Two clear scenarios:
| Use Case | Why AI Simulation Is Required |
|---|---|
| Large-scale campaigns (100+ users) | Human callers cannot scale cost-effectively. AI voice simulation allows full-organisation testing in a single campaign window. |
| Advanced threat scenarios | Testing employee resilience against voice cloning and deepfake impersonation requires the actual technology, not a human caller reading a script. |
Arsen's vishing simulation platform uses real-time voice AI and voice cloning to replicate the exact attack vectors described in M-Trends 2026. Campaigns can be targeted at high-risk populations (finance, IT, executives) or run organisation-wide. Results are tracked at the individual and team level, and integrated into continuous security awareness training workflows.
What Should a Vishing Simulation Programme Include?
- Scenario library covering: IT help desk impersonation, executive fraud, HR callbacks, vendor impersonation, MFA bypass attempts
- Voice options: standard AI voice, regional accent matching, and voice-cloned executive impersonation
- Spoofed caller ID matching internal or known-vendor numbers
- Real-time call tracking and compliance rate metrics
- Immediate post-call training feedback for users who comply
- Reporting dashboards for CISO-level risk visibility by team, role, and location
FAQ
Vishing (voice phishing) is a social engineering attack conducted over the phone, where an attacker impersonates a trusted person or organisation to extract credentials, authorise fraudulent actions, or install malware. Unlike email phishing, it relies on live or AI-generated voice interaction, making it harder to detect with technical controls.
Email phishing declined to 6% of breaches in 2025, according to Mandiant M-Trends 2026, while vishing rose to 11% and hit 23% in cloud environments. The core reason: interactive voice attacks are significantly more resilient against automated defences and exploit real-time human trust, something no firewall addresses.
A Telephone-Oriented Attack Delivery (TOAD) attack starts with a benign-looking email containing only a phone number. When the target calls, an operator or AI agent walks them through handing over credentials or installing remote access software. The email bypasses technical filters because it contains no malicious link or attachment.
Platforms like ATHR now automate the entire vishing call with AI voice agents running structured social engineering scripts. A single operator can run campaigns at scale, across multiple brands, without trained human callers. Voice cloning extends this by impersonating specific known individuals: colleagues, executives, or IT staff.
Simulated vishing exposes employees to realistic vishing scenarios in a controlled environment. It measures behavioural vulnerability (who complies, under what pretext), triggers contextual training at the moment of failure, and builds sustainable resilience against voice-based attacks. It is the only mechanism that addresses the human element that technical controls cannot reach.
Define your objective (audit vs. training), select scenario pretexts relevant to your threat profile, configure caller ID spoofing and voice to match realistic attacker conditions, target high-risk populations first, and integrate post-call training feedback. Arsen's platform handles each step end-to-end. Request a demo to see a live campaign configured for your environment.