Dr. Yannis Agiomyrgiannakis, Senior Research Scientist, Google (London, UK)
Dr. Bo Li (李博), Research Scientist, Google (Mountain View, CA, USA)
As speech-based conversational agents such as Alexa, Cortana, Google Now, and Siri become the preferred interface for human-machine interaction, there is renewed interest in text-to-speech (TTS) technologies. This talk looks at TTS from an industrial perspective and presents new developments in vocoding, statistical mapping, and voice morphing that significantly outperform the baseline and even challenge the status quo.
Yannis Agiomyrgiannakis completed his PhD thesis, "Sinusoidal Speech Coding for Voice-over-IP", in 2006 at the University of Crete, with Yannis Stylianou. He then held a post-doc position in the Text-to-Speech Synthesis group at France Telecom, working with Olivier Rosec on speech coding for TTS systems, glottal inversion, and voice transformation. He later joined Paul Taylor's startup Phonetic Arts in Cambridge, a company that was introducing speech synthesis to the game industry. Phonetic Arts was acquired by Google in 2010, where he is the DSP tech lead for Google TTS. He is the author of 20+ publications and 17 patents in speech coding, speech processing, and speech synthesis. His interests include signal processing, speech coding, speech analysis and modeling, statistical modeling, sinusoidal synthesis, text-to-speech, Voice-over-IP, source/channel coding, vector quantization, multiple description coding, DSP implementation, glottal inversion, and voice morphing.