Title: Towards a Speech Version of ChatGPT
Abstract: Over the past few months, the capabilities of large language models (LLMs) such as ChatGPT have sparked widespread discussion. These models have general-purpose processing abilities and can often accomplish a desired task when given the right instructions. Speech, which carries rich and hierarchical information, places different demands on a model depending on the task. How far are we from a speech version of ChatGPT? What is still missing before one can emerge? In this talk, I will share a series of recent research findings on speech language models that could lead us toward a speech version of ChatGPT, and I will discuss current challenges and potential solutions.
Hung-yi Lee (李宏毅) is a professor in the Department of Electrical Engineering at National Taiwan University (NTU), with a joint appointment in the university's Department of Computer Science & Information Engineering. His recent research focuses on developing technology that reduces the need for annotated data in speech processing and natural language processing. He won the Salesforce Research Deep Learning Grant in 2019, the AWS ML Research Award in 2020, the Outstanding Young Engineer Award from The Chinese Institute of Electrical Engineering in 2018, the Young Scholar Innovation Award from the Foundation for the Advancement of Outstanding Scholarship in 2019, the Ta-You Wu Memorial Award from the Ministry of Science and Technology of Taiwan in 2019, and The 59th Ten Outstanding Young Person Award in Science and Technology Research & Development of Taiwan. He runs a YouTube channel that teaches deep learning in Mandarin and has about 160k subscribers.