Welcome back everybody. So today I have the great pleasure to announce Mirco Ravanelli.
He is a postdoctoral researcher at Mila currently, but he will soon join Concordia University
as an assistant professor and the University of Montreal as an adjunct professor. His main
research interests are deep learning and conversational AI. So he is the author or co-author of more
than 50 papers on these research topics. He received his PhD cum laude, so with distinction,
from the University of Trento in December 2017. Mirco is an active member of the speech
and machine learning communities and he is the founder and leader of the SpeechBrain
project. And this brings us exactly to the title of today's presentation, which will be
the SpeechBrain project. And Mirco, I'm really happy to have you here. I think it's a really
great effort that you're making here and you're really doing a great service to the community
with your open source endeavors. So this is why we welcome you here. I really look forward
to your presentation and the stage is yours.
Thank you. All right. So today I would like to introduce a little bit the SpeechBrain
project, which is a project I founded and am leading, so I'm very excited today to tell
you what we are doing here. So SpeechBrain is an open source toolkit for conversational
AI. So conversational AI is a kind of new terminology just to refer to machines that
hopefully will be able to communicate with humans in a kind of natural way. And this
is clearly a major step in artificial intelligence. It's not just a scientific
challenge but it's also an industrial challenge. And as you can see, right, all the big companies
have made major investments in this technology. So you can see, for instance, that virtual
assistants, automated call centers, chatbots, purchase systems are nowadays everywhere.
So but yeah, this kind of technology poses remarkable scientific challenges. So as we
will see, and as you probably know, building a conversational AI system requires combining
several technologies, and all these technologies still have some issues. For instance,
if you would like to have an AI assistant with a speech recognizer or any speech technology,
we know that these speech technologies might suffer a lot when dealing with under-resourced
languages, dialects, environmental noise, or even accents. If you move on, we may have
a dialogue part, and the dialogue part is even more critical because we may have a lack of
consistency, personality, empathy. And more in general, this kind of big, long pipeline
of technologies suffers from big out-of-distribution generalization issues. For instance,
if we train our virtual assistant or conversational AI agent on some topic and we talk
to the machine on totally different topics never seen during training, very often we
don't have a satisfactory conversation with the machine. So out-of-distribution generalization is a really,
really big topic here. And not just here, but in machine learning in general, though
in conversational AI it is particularly important. Moreover, I see other big issues. So basically
beyond the pure scientific challenges, one big issue is that conversational AI is actually
in the hands of a few players only. And the reason is that there are important barriers
that prevent this technology from being available to everyone. When I say available to everyone,
I don't mean available to everyone as users, but available to everyone for actually developing
and mastering the technology for creating new research or creating new services for
users. And the main barrier is data. Big companies have data; small companies, and even we in the
academic field, don't have a lot of data. Here I'm a little bit more optimistic than
in the past. I see some growing open source projects that try to fill this
gap, for instance Common Voice, but still data is a big issue. An even bigger
issue is computational resources. We are clearly going towards models which are
bigger and bigger and bigger. Every time we have a bigger model, we have better performance.
We don't know when this trend will saturate. But for now, training a really, really competitive
model like wav2vec, for instance, is beyond the reach of many companies and academic
labs. The other barrier is code. Big companies are not typically releasing their full code
for conversational AI systems so you cannot find the code for Siri or Google Home, et
Presenters
Accessible via
Open access
Duration
01:03:40 min
Recording date
2021-12-09
Uploaded on
2021-12-09 14:36:03
Language
en-US
We have the great honor to welcome Mirco Ravanelli to our lab for an invited presentation!
Abstract: SpeechBrain is an open-source and all-in-one conversational AI toolkit. It is designed to facilitate the research and development of speech and language technologies by being simple, flexible, user-friendly, and well-documented. This talk describes the core architecture designed to support several tasks of common interest, allowing users to naturally conceive, compare and share novel conversational AI pipelines. SpeechBrain achieves competitive or state-of-the-art performance in a wide range of benchmarks. It also provides training recipes, pretrained models, and inference scripts for popular datasets, as well as tutorials that allow anyone with basic Python proficiency to familiarize themselves with speech technologies. The talk will also discuss the future research direction for this project and share some ideas for the future development of intelligent speaking machines.
Short Bio: Mirco Ravanelli is currently a postdoc researcher at Mila. He will soon join Concordia University as an Assistant Professor and the Université de Montréal as an adjunct professor. His main research interests are deep learning and Conversational AI. He is the author or co-author of more than 50 papers on these research topics. He received his Ph.D. (with cum laude distinction) from the University of Trento in December 2017. Mirco is an active member of the speech and machine learning communities. He is the founder and leader of the SpeechBrain project.
References
SpeechBrain Project Website: https://speechbrain.github.io
This video is released under CC BY 4.0. Please feel free to share and reuse.
Music Reference:
Damiano Baldoni - Thinking of You (Intro)
Damiano Baldoni - Poenia (Outro)