March 05, 2009

Article at CNN

Kindle 2 speaks volumes -- and raises questions

Amazon's new Kindle 2 has a synthetic voice that can read aloud e-books, articles and blogs. Described as an "experimental" feature, it has a surprisingly good command of nuance and inflection, but some people are voicing concerns.

The Authors Guild recently pointed out that writers and publishers sell audio rights -- think audio books -- and that those rights are not included with e-book rights.

Online debates have ensued, with the Authors Guild suffering a fair amount of ridicule. Fantasy writer Neil Gaiman has blogged that he does not have a problem with the new Kindle feature. But his agent does.

How the story unfolds is unclear, but "synthetic voice rights" might become part of book contract negotiations, welcomed by authors or not.

Today's synthetic voices can't match human talent (such as an actor hired to read audio books). But they do enable some interesting options.

For instance, instead of having just one narrator, you can switch between voices, including male and female, as on the Kindle 2. You might not use this option, but it's there, and it will likely get better over time. Hemingway with a feminine touch, anyone?

The Kindle uses voice technology from Nuance Communications, which can create all kinds of voices. The Kindle gives users a limited selection, but it's easy to imagine future reading devices offering a wide variety. Each voice lends its own effect.

In the future, "voice fonts" might become best-sellers in the e-book reader marketplace. For romances, a particularly sultry synthetic voice might become popular. Computer voices from today might have a nice retro feel.

Perhaps customization services will emerge, so that your grandmother's voice could read bedtime stories to your grandchildren.

"Nuance's text-to-speech solutions can easily create custom voices using voice talents, which could be a celebrity," notes Nuance spokesperson Rebecca Paquette.

If a celebrity, why not a relative? Such services might be offered in the future. Nuance relies on recorded human speech, mapping samples to create a database of stored phonetic units that can be accessed in real time to create the synthetic speech.

"Sophisticated algorithms are used to create natural prosody and intonation across any sentence," Paquette explains, adding that the original recordings do not need to cover all words or combinations.
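The idea of stitching speech together from stored phonetic units can be pictured with a toy sketch. This is only an illustration under loud assumptions, not Nuance's actual method: the unit names, the sample numbers, and the `UNIT_DB` and `synthesize` names are all hypothetical, and the prosody and intonation algorithms Paquette mentions are left out entirely.

```python
# Toy unit database (hypothetical): each phonetic unit maps to a short list
# of audio samples. A real system would store recorded human speech; these
# numbers are placeholders chosen only to make the output easy to follow.
UNIT_DB = {
    "k": [0.1, 0.2],
    "ae": [0.3, 0.4, 0.5],
    "t": [0.6],
    "b": [0.7, 0.8],
}


def synthesize(units, db=UNIT_DB):
    """Look up each requested unit and join the stored samples in order."""
    missing = [u for u in units if u not in db]
    if missing:
        raise KeyError(f"no recording for units: {missing}")
    samples = []
    for unit in units:
        samples.extend(db[unit])
    return samples


# Two different words built from the same small set of stored units.
cat = synthesize(["k", "ae", "t"])
bat = synthesize(["b", "ae", "t"])
```

Both words reuse the same stored "ae" and "t" units, which hints at why the original recordings need not cover every word or combination.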

At the moment, the Kindle-sparked debate is focusing largely on whether today's synthetic voices can match the quality of professional voice actors reading long fiction.

They probably never will, of course, and they certainly can't now. But neither can most humans. Few of us are blessed with golden throats, after all.

"I do not foresee Kindle 2 as a threat to voice talents," says James McCoy, a Maryland-based voice actor who's worked on audiobooks.

A computer voice, notes McCoy, "knows no difference between a dramatic, suspenseful sentence and a math problem. The words aren't animated, they're read. That makes a lousy audio book."

And even if future synthetic voices can temporarily fool us into thinking they're human, it's unlikely they'll possess intelligence. It's annoying enough to listen to people who don't know what they're talking about. A synthetic voice never knows what it's talking about.

Many people aren't aware yet of the Kindle 2's text-to-speech feature, so more resistance to it could emerge. One might expect howls from, a leading online seller of not only audio books, but audio versions of publications like the New York Times as well.

Amazon, though, took over last year, meaning the online giant has its fingers in both professionally narrated audio books -- a burgeoning market -- and synthetic voices.

"It would appear they are hedging their bets," notes David Ciccarelli, CEO of talent congregator

It's possible that some Kindle 2 users out there are listening to entire novels for free, delighting in not having to pay a cent to, or to authors or publishers, for the privilege. But they still bought the device from Amazon.

If such users exist, their numbers must be minuscule, and -- given the quality -- their enjoyment levels not much higher. But their numbers might grow as e-book readers proliferate and computer voices get better over the next few years -- and decades.

"We're not that concerned about their impact on the market today," notes Paul Aiken, executive director of the Authors Guild. "We have to anticipate where things are heading."

An author living today might see little reason to bother with synthetic voice rights. But perhaps decades later, after he's passed away, his estate might regret his decision.

By that time, computer voices should have vastly improved. A significant percentage of the population might find them good enough, even if they're still nowhere near the quality of human pros.

Today, synthetic voices are showing dramatic improvements, spurred on by better hardware and competition in not one but many markets.

Nuance's voice technology is employed in mobile phones, personal navigation devices, and in-car entertainment systems like the Ford Sync -- all highly competitive fields.

IBM has developed a computer voice that's well suited for telephone help lines because it sighs, coughs, pauses for effect, and uses verbal tics like "um" and "er" -- reassuringly human-like imperfections.

The current Mac OS includes a reading voice called Alex that's vastly superior to its predecessors. Like them, it can read any selected text, but it does so with nuance and intonation, instead of an irritating monotone.

Listen carefully and you can hear Alex taking a breath between sentences. It's a nice try, but you can still tell it's synthetic. These are early days, though.

The future, no doubt, sounds more like us.