The Uncanny Valley of Artificial Voice-Acting: Jacco Fransen on Overcoming the Seeming Conveniences of AI with Art
The labor union SAG-AFTRA has recently signed an agreement allowing advertisers, with the actors’ consent, to buy the rights to actors’ voices and produce AI-generated digital replicas. This development in the unavoidable yet often troubled relationship between the arts and their commercial patrons raises the question: can AI effectively replace the human voice? Jacco Fransen, a renowned sound engineer in the Dutch broadcasting world and founder of sound studio Aborisound, has witnessed the effects of AI’s rapid encroachment into the artist’s domain and provides a nuanced response.
Several AI systems have demonstrated enough natural language processing prowess to simulate the human voice, in step with the rising trend of voice replication in films, advertising, gaming, and customer support services. In fact, it’s nearly impossible to avoid short-form advertisements that use the stilted cadence of an AI-generated voiceover. The wonder of these algorithms lies in their ability to almost instantly transform scripts into audio with little additional direction.
“AI enables producers and content marketers to communicate their message on a base level, but I don’t believe it can do so on an emotional, and therefore, effective and human connection level. You still need the human communicative aspect to genuinely connect with other humans,” theorizes Aborisound founder Jacco Fransen. “Especially when you’re trying to promote a message or encourage an audience to take some action. If you are lacking the human element, then there will be a gap between your products and your consumer.”
The fundamental purpose of an ad campaign is to inform, persuade, and remind the targeted audience of a product or service. A key ingredient in the engagement level of any particular advertisement is the ability of viewers (and listeners) to recognize the humanity of its content. As Jacco understands, human connection is much more difficult to convey through a digital reproduction of a voice than through a real one. Studies agree, reporting that AI voiceovers are significantly less effective than human voiceovers at producing favorable conversion outcomes.
Still, AI voiceovers remain a highly desired alternative to voice actors. They are more ‘flexible’ in the editing process, less time is spent on contracts and auditions, and they are generally cheaper, at least on the front end. “They can save money on the front end, but will still have to spend a lot of money on the back end to make a satisfactory product. And the end result still won’t compare to what can be done with a human voice,” adds Jacco.
Indeed, with AI voiceovers, companies don’t need to rent studio space, invite someone to record the script, or hire a director to capture the message. However, many content marketers find that they have vastly underestimated the true effort of creating a marketable end product. It’s not as simple as subscribing to one of the many AI voice generator platforms out there and calling it a day. Prompting AI in the right direction is an intricate art in itself, and the output still requires a significant amount of audio editing in post-production to sound natural. At the end of the day, if real humans can’t resonate with the message, any amount of money saved is futile.
While AI in its current state has its uses in the voice-over industry, such as providing creative direction for content developers, it will not be able to replace artists. As with other steps toward automation in sound engineering, such as auto-leveler software, the process still needs human ears on the other end to ensure its accuracy. In Jacco’s experience as a voiceover director, “The tone of voice in what you’re trying to communicate is what makes the real difference.”
There are so many ways to say a single sentence. Changing non-verbal cues, intonation, and cadence on command is expressly human – and is the singular factor of a voiceover artist that adds crucial value to any script. Jacco adds, “The connection between hearing a voice and the emotional part of the brain is strong, and small elements of the human voice make a difference in how it affects its listener. The devil is in the details.”
The uncanny valley that arises from the awkward or too-repetitive lilt of many AI-generated voices can negatively impact a viewer’s cognitive attention, leading to decreased engagement with that piece of media. “People are sensitive enough to feel the difference between a talking computer and a talking human being; they are well aware of whether what is in front of them is real or fake,” notes Jacco. The many non-verbal cues imbued within a bit of speech add to the total listening experience; a subtext that requires human intervention when AI voices are involved.
The global voice-over market size was valued at $4.23 billion in 2023 and is projected to grow at a rate of 8.6% in the next six years, to reach $8.05 billion by 2030. Yet if marketing companies commit to relying entirely on AI voiceovers, despite the hidden costs and the absence of the vital human touch, one of the only steady sources of income for voiceover artists will disappear in the coming years.
As several countries face slashes in their public broadcasting budgets, companies might have a growing urge to seek cost-effective AI replacements for human artists. Jacco understands the predicament but emphasizes the importance of building human connections between brands and their customers. “AI is a good tool to supplement some of the processes in creating art, but over-reliance can quickly become dangerous. The subtle layers involved in voice acting and post-production are vital to communicating on a direct emotional level with clients. It’s all about the viewer’s experience.”
Sound is one of five primary ways that humans understand the world around them, and Jacco’s art form ushers audiences into whole new realities to connect with. Known for the documentary Memento (2024) and Van Ninevé naar Nazareth (2017), Jacco Fransen has made his career on his ability to discern the slightest imperfection in a waveform and categorically reproduce a visual environment in audio, down to the last speck of dust. Jacco is dedicated to preserving the craft and human touch of audio engineering. With Foley artists, voice-over booths, and an extensive sound effects database, he founded the Hilversum-based Aborisound to produce, record, and edit the entire audio experience in one well-equipped studio.