Nuance Advances Text-to-Speech Technology through Deep Learning
Combining advancements in deep learning with knowledge-based developments, Nuance’s Vocalizer suite of TTS solutions – including Vocalizer Embedded for embedded platforms, Vocalizer Server for cloud applications and the
Nuance’s approach to using deep neural networks (DNNs) for speech synthesis works as follows. First, the networks learn the relation between written text and the corresponding voice characteristics from Nuance’s vast speech data. Then, the system applies this knowledge to the words and phrases in unseen text. In addition to learning the relation between the orthographic representation of words and the acoustic output, Nuance’s deep neural nets use the context of each utterance to ensure that words are spoken in the expressive manner appropriate for the application, with the proper pattern of stress and intonation. For example, street names and driving directions are clearly articulated and intelligible, whereas dialogs with a virtual assistant sound more fluent and dynamic.
“The advancements we have made through the application of DNN allow our text-to-speech technology to deliver high-quality, more expressive speech output, enabling more natural interactions between man and machine,” said
Key applications of Nuance Vocalizer include:
- Automotive in-dashboard systems and virtual assistants
- Robotics and autonomous virtual agents
- Digital television and set-top boxes
- Omni-channel customer engagement services
Nuance’s enhanced text-to-speech solutions are available for the cloud today and will be made available for embedded devices this year. For more information about Vocalizer, including voice samples, visit https://www.nuance.com/mobile/mobile-solutions/vocalizer-expressive.html.
Trademark reference: Nuance and the Nuance logo are registered trademarks or trademarks of
Source: Nuance Communications, Inc.