
AI voice clones are getting more and more realistic
Who's calling?
Photo by Chris Welch / The Verge
One of the stranger applications of deepfakes (AI technology used to manipulate audiovisual content) is the audio deepfake scam. Hackers use machine learning to clone someone's voice and then combine that voice clone with social engineering techniques to convince people to move money where it shouldn't be. Such scams have been successful in the past, but how good are the voice clones being used in these attacks? We've never actually heard the audio from a deepfake scam — until now.
Security consulting firm NISOS has released a report analyzing one such attempted fraud, and shared the audio with Motherboard. The clip below is part of a voicemail sent to an employee at an unnamed tech firm, in which a voice that sounds like the company's CEO asks the employee for "immediate assistance to finalize an urgent business deal."
The quality is certainly not great. Even under the cover of a bad phone signal, the voice is a little robotic. But it's passable. And if you were a junior employee, worried after receiving a supposedly urgent message from your boss, you might not be thinking too hard about audio quality. "It definitely sounds human. They checked that box as far as: does it sound more robotic or more human? I would say more human," Rob Volkert, a researcher at NISOS, told Motherboard. "But it doesn't sound like the CEO enough."
The target immediately thought it suspicious
The attack was ultimately unsuccessful, as the employee who received the voicemail immediately thought it suspicious and flagged it to the firm's legal department. But such attacks will become more common as deepfake tools grow increasingly accessible.
All you need to create a voice clone is access to lots of recordings of your target. The more data you have and the better the audio quality, the better the resulting voice clone will be. And for many executives at large firms, such recordings can be easily collected from earnings calls, interviews, and speeches. With enough time and data, the highest-quality audio deepfakes are much more convincing than the example above.
The best known and first reported example of an audio deepfake scam took place in 2019, when the chief executive of a UK energy firm was tricked into sending €220,000 ($240,000) to a Hungarian supplier after receiving a phone call supposedly from the CEO of his company's parent firm in Germany. The executive was told that the transfer was urgent and the funds had to be sent within the hour. He did so. The attackers were never caught.
Earlier this year, the FTC warned about the rise of such scams, but experts say there's one easy way to beat them. As Patrick Traynor of the Herbert Wertheim College of Engineering told The Verge in January, all you need to do is hang up the phone and call the person back. In many scams, including the one reported by NISOS, the attackers are using a burner VOIP account to contact their targets.
"Hang up and call them back," says Traynor. "Unless it's a state actor who can reroute phone calls or a very, very sophisticated hacking group, chances are that's the best way to figure out if you were talking to who you thought you were."