bendedreality.com

New Google AI Mimics Human Speech to Near Perfection

ELOCUTION LESSONS

Last year, artificial intelligence (AI) research company DeepMind shared details on WaveNet, a deep neural network used to synthesize realistic human speech. Now, an improved version of the technology is being rolled out for use with Google Assistant.

A system for speech synthesis, otherwise known as text-to-speech (TTS), typically uses one of two techniques. Concatenative TTS pieces together chunks of recordings from a voice actor. The drawback of this method is that audio libraries must be replaced whenever upgrades or changes are made. The other technique, parametric TTS, uses a set of parameters to produce computer-generated speech, but this speech can sometimes sound unnatural and robotic.

WaveNet, on the other hand, produces waveforms from scratch based on a system developed using a convolutional neural network. To begin, a large number of speech samples were used to train the platform to synthesize voices, taking into account which