It only takes five seconds to clone your voice


I remember watching a video last year in which some software listened to a recording of a voice for a few minutes and then mimicked it with amazing accuracy. Since then, the tools have become more and more powerful, and a typical voice model can now be trained on just five seconds of your voice.

Here are some examples of AI clones working from just five seconds of input:

The repercussions of this are massive. Matthew Wright shares a hypothetical example:

You have just returned home after a long day at work and are about to sit down for dinner when suddenly your phone starts buzzing. On the other end is a loved one, perhaps a parent, a child or a childhood friend, begging you to send them money immediately.

You ask them questions, attempting to understand. There is something off about their answers, which are either vague or out of character, and sometimes there is a peculiar delay, almost as though they were thinking a little too slowly. Yet, you are certain that it is definitely your loved one speaking: That is their voice you hear, and the caller ID is showing their number. Chalking up the strangeness to their panic, you dutifully send the money to the bank account they provide you.

The next day, you call them back to make sure everything is all right. Your loved one has no idea what you are talking about. That is because they never called you — you have been tricked by technology: a voice deepfake.

To be honest, I’m not sure how to prevent this. Better protection against cell phone number spoofing would be a good step, and it’s coming eventually, but right now this is a very real possibility. If my daughter called, from her number, and with her voice, asking for help right away, I’d do it.

A recent episode of Seth Godin’s “Akimbo” podcast dug into this a bit more. He used ChatGPT to write much of the script and an AI voice model of himself to read much of it, and it did a stunningly good job at both.

The personal security implications of this, as I shared above, are a huge concern, but the problem goes beyond that. Right now, I could ask my daughter to hop on a video call to prove that it’s her, and that solves it, for now. Before long, that will be easy enough to fake as well.

What’s the solution? I don’t have it, but the people who can best solve these kinds of crazy new problems are likely to be the next set of billionaires that we’ll be talking about in 2030.
