
Identity-based attacks are evolving rapidly, moving away from static, email-only campaigns toward highly interactive, hybrid operations. Okta Threat Intelligence recently uncovered a sophisticated new class of tool used by attackers: custom phishing kits designed specifically to support voice-based social engineering, commonly called vishing.
These kits put attackers in the "driver’s seat," letting them control what the victim sees in the browser in real time, synchronized with the scripted conversation they hold with the victim over the phone while impersonating IT or other support services. The kits have been used by intrusion actors targeting Google, Microsoft, Okta, and a range of cryptocurrency providers.
Here is a breakdown of this new attack model and how to prepare your workforce for it.
Key Takeaways:
The new kits discovered by Okta differ from traditional Phishing-as-a-Service tools because they are built for interaction. Instead of hoping a user clicks and completes a form alone, these kits assume a bad actor is on the line guiding them.
- Real-Time Session Orchestration: The most critical feature of these kits is the use of client-side scripts that allow the attacker to control the authentication flow in the victim's browser instantly.
- Synchronized Deception: Attackers can manually trigger specific pages (like a fake "Loading..." screen or a "Security Check Successful" banner) to match their verbal script, which builds the victim's trust in the caller.
- Bypassing Secure MFA: The synchronization allows attackers to defeat MFA methods that are widely considered strong, like number matching or push notifications, by telling the user which number to select or when to approve a prompt.
The Hybrid Social Engineering Sequence
This new breed of attack relies on a seamless blend of phone pressure and digital deception. Here is how the sequence typically unfolds using these advanced kits:
- The Setup: The attacker performs reconnaissance to find the target's name, company, and IT helpdesk phone number. They then call the target, spoofing the internal helpdesk number.
- The Lure: Posing as IT support, the attacker claims there is a security issue (e.g., "We need to re-sync your account") and directs the employee to a fraudulent, look-alike login page.
- Real-Time Harvest: As the victim enters their username and password, the kit immediately forwards the credentials to the attacker via a private channel (like Telegram).
- The "Driver’s Seat" Login: The attacker takes those credentials and attempts to log in to the legitimate corporate portal (Microsoft, Okta, etc.) on their own machine.
- MFA Mirroring: When the real portal asks the attacker for an MFA code or push approval, the attacker uses the phishing kit to update the victim's screen to show the exact same challenge.
- Verbal Bypass: The attacker verbally primes the victim ("You will see a notification on your phone; please select 45 to verify"). The victim, believing they are talking to IT, complies.
- Access Granted: The attacker gains a valid session token, while the victim sees a reassuring "Success" message on the fake site, leaving them unaware they have just been compromised.
The C2 panel of a custom phishing kit that can control the authentication flow on phishing pages. Image courtesy of Okta Threat Intelligence.
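On the defender side, the sequence above tends to leave one recognizable trace: the sign-in originates from the attacker's machine, while the MFA approval comes from the real user. The sketch below shows one way to flag that pattern. It is a minimal illustration under stated assumptions, not a description of any vendor's detection logic: the `SignInEvent` fields, the `PHISHABLE_METHODS` set, and the `known_ips` mapping are all hypothetical and would need to be mapped onto whatever your identity provider's logs actually contain.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SignInEvent:
    # Hypothetical log schema; real identity-provider logs will differ.
    user: str
    source_ip: str
    mfa_method: str      # e.g. "push", "number_match", "otp", "fido2"
    mfa_approved: bool
    timestamp: datetime


# Assumption: methods a caller can talk a user through over the phone.
PHISHABLE_METHODS = {"push", "number_match", "otp"}


def flag_suspicious(events, known_ips):
    """Return events where a phishable MFA challenge was approved
    for a sign-in coming from an IP not previously seen for that user.

    known_ips maps a user name to the set of IPs considered normal.
    """
    flagged = []
    for event in events:
        usual = known_ips.get(event.user, set())
        if (
            event.mfa_approved
            and event.mfa_method in PHISHABLE_METHODS
            and event.source_ip not in usual
        ):
            flagged.append(event)
    return flagged
```

A flagged event is only a lead for review, since travel and new devices produce the same pattern; it becomes far more meaningful when the same user also reports a suspicious call.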
Why Vishing is Becoming the Critical Threat Vector
Vishing is rapidly emerging as a critical threat vector. Attackers exploit the trust people place in a voice and a seemingly legitimate call, add urgency, and increasingly draw on AI-driven deception, turning voice from a secondary channel into a primary weapon of trust and authority.
Recent data underscores this shift: according to the CrowdStrike 2025 Threat Hunting Report, vishing attacks doubled, and according to the European Union Agency for Cybersecurity (ENISA), they increasingly rely on digital voice imitation systems. As email defenses continue to improve, attackers are turning to the human voice, especially when embedded in legitimate-looking digital workflows, because it bypasses technical controls and leverages innate human trust and responsiveness.
A Unified Strategy: Protecting Against Hybrid Attacks
A siloed approach no longer works. Employees need to be trained to pause, verify, and report, no matter the channel. To defend against these sophisticated, real-time attacks, CISOs must modernize their awareness programs.
1. Integrate Phishing and Vishing
Your awareness program must simulate cross-channel attacks. Teaching employees to handle phishing without vishing leaves gaps that attackers will exploit.
- Integrate phishing and vishing into one social engineering playbook.
- Train the same reporting muscle: whether it’s an email or a phone call, the reporting process should be unified and familiar.
- Highlight real attacker playbooks: callback phishing, IT desk scams, and CEO fraud campaigns often blend multiple techniques.
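A unified reporting process also makes hybrid campaigns easier to spot, because an email report and a phone report about the same pretext land in the same place. The sketch below illustrates the idea with a single report type for every channel and a simple correlation pass. The `Report` fields, the `Channel` values, and the 24-hour window are assumptions chosen for illustration, not part of any specific reporting tool.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class Channel(Enum):
    EMAIL = "email"
    PHONE = "phone"
    VOICEMAIL = "voicemail"


@dataclass(frozen=True)
class Report:
    # Hypothetical schema: one report shape regardless of channel.
    reporter: str          # employee who filed the report
    channel: Channel
    claimed_identity: str  # who the contact claimed to be
    received_at: datetime


def correlate(reports, window=timedelta(hours=24)):
    """Group reports by claimed identity and return only the groups
    that span more than one channel within the given time window."""
    by_identity = defaultdict(list)
    for report in reports:
        key = report.claimed_identity.strip().lower()
        by_identity[key].append(report)

    hybrid = {}
    for identity, group in by_identity.items():
        group.sort(key=lambda r: r.received_at)
        channels = {r.channel for r in group}
        span = group[-1].received_at - group[0].received_at
        if len(channels) > 1 and span <= window:
            hybrid[identity] = group
    return hybrid
```

For example, an email report and a phone report that both name "IT helpdesk" two hours apart would come back as one group, which is a reasonable trigger for warning the rest of the workforce.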
2. Prepare for Hybrid Attack Flows
Attackers increasingly combine channels in multi-stage campaigns. Your simulations should reflect this reality:
- Simulate emails that tell the victim to call a fake support line.
- Test employees with scenarios where a phone call "confirms" a fraudulent email request, layering credibility.
- Prepare staff for synthetic voices leaving voicemails directing them to phishing sites.
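The three simulations listed above can be written down as ordered, multi-channel scenarios, which makes it easy to check that a training plan really is hybrid and to score responses consistently. The following is a minimal sketch: the `Step`, `Scenario`, and `Outcome` types, the channel labels, and the equal-weight scoring of "pause, verify, report" are all illustrative assumptions rather than features of a particular simulation platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    channel: str  # assumed labels: "email", "phone", "voicemail", "web"
    lure: str     # what the simulated attacker asks the employee to do


@dataclass(frozen=True)
class Scenario:
    name: str
    steps: tuple

    def is_hybrid(self) -> bool:
        """A scenario is hybrid when it uses more than one channel."""
        return len({step.channel for step in self.steps}) > 1


@dataclass(frozen=True)
class Outcome:
    paused: bool    # employee stopped before acting
    verified: bool  # employee checked the request out-of-band
    reported: bool  # employee used the reporting process


def score(outcome: Outcome) -> float:
    """Fraction of the three target behaviors shown (equal weights)."""
    return sum([outcome.paused, outcome.verified, outcome.reported]) / 3


SCENARIOS = (
    Scenario("callback phishing", (
        Step("email", "call a fake support line"),
        Step("phone", "follow the caller's instructions"),
    )),
    Scenario("call-confirmed request", (
        Step("email", "act on a fraudulent request"),
        Step("phone", "accept the call as confirmation of the email"),
    )),
    Scenario("synthetic voicemail", (
        Step("voicemail", "visit a link left by a synthetic voice"),
        Step("web", "sign in on a simulated phishing site"),
    )),
)
```

Tracking the score per scenario, rather than a single click rate, shows which of the three behaviors breaks down when a second channel adds credibility.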
The "human firewall" can no longer just look for bad grammar in emails. With attackers controlling browser flows in real time while speaking to victims, the line between technical exploitation and social engineering has blurred. By testing employees against these realistic, hybrid scenarios, organizations can close the training gap before attackers exploit it.

