46 - Beyond the Patterns - Mirco Ravanelli (MILA): The Speech Brain Project [ID:39116]

Welcome back, everybody. So today I have the great pleasure to introduce Mirco Ravanelli.

He is currently a postdoctoral researcher at Mila, but he will soon join Concordia University

as an assistant professor and the University of Montreal as an adjunct professor. His main

research interests are deep learning and conversational AI. So he is the author or co-author of more

than 50 papers on these research topics. He received his PhD with cum laude, so with distinction

from the University of Trento in December 2017. Mirco is an active member of the speech

and machine learning communities and he is the founder and leader of the SpeechBrain

project. And this brings us exactly to the title of today's presentation, which will be

the SpeechBrain project. And Mirco, I'm really happy to have you here. I think it's really

a great effort that you're making here and you're really doing a great service to the community

with your open source endeavors. So this is why we welcome you here. I really look forward

to your presentation and the stage is yours.

Thank you. All right. So today I would like to introduce a little bit the SpeechBrain

project, which is a project I founded and am leading, so I'm very excited today to tell

you what we are doing here. So SpeechBrain is an open source toolkit for conversational

AI. So conversational AI is a kind of new terminology just to refer to machines that

hopefully will be able to communicate with humans in a kind of natural way. And this

is clearly a kind of major step in artificial intelligence. It's not just kind of scientific

challenge but it's also an industrial challenge. And as you can see, right, all the big companies

made major investments in this technology. So you can see, for instance, that virtual

assistants, automated call centers, chatbots, purchase systems are nowadays everywhere.

So but yeah, this kind of technology poses remarkable scientific challenges. So as we

will see, and as you probably know, building a conversational AI system requires combining

several technologies and all these technologies have still some kind of issues. For instance,

if you would like to have a kind of AI assistant with a speech recognizer or any speech technology,

we know that these speech technologies might suffer a lot when dealing with under-resourced

languages, dialects, environmental noise, or even accents. If we move on, we may have

a dialogue part and the dialogue part is even more critical because we may have lack of

consistency, personality, empathy. And more generally, this whole long pipeline

of technologies suffers from big issues, big out-of-distribution generalization issues. For instance,

if we train our virtual assistant or conversational AI agent on some topic and we talk

to the machine on totally different topics never seen during training, very often we

don't have a satisfactory conversation with the machine. So out-of-distribution generalization is a really,

really big topic here. And not just here, but in machine learning in general; in

conversational AI, though, it is particularly important. Moreover, I see other big issues. So basically

beyond the pure scientific challenges, one big issue is that conversational AI is actually

in the hands of few players only. And the reason is that there are important barriers

that prevent this technology from being available to everyone. When I say available to everyone,

I don't mean available to everyone as a user, but available to everyone in the sense of actually developing

and mastering the technology for creating new research or creating new services for

users. And the main barrier is data. Big companies have data; small companies, and even we in the

academic field, don't have a lot of data. Here I'm a little bit more optimistic than

in the past. I see some growing open source projects that try to fill this

gap, like for instance Common Voice, but data is still a big issue. An even bigger

issue is computational resources. We are clearly going towards models which are

bigger and bigger and bigger. Every time we have a bigger model, we have better performance.

We don't know when this trend will saturate. But for now, training a really, really competitive

model like wav2vec, for instance, is beyond the reach of many companies or academic

labs. The other barrier is code. Big companies typically do not release their full code

for conversational AI systems so you cannot find the code for Siri or Google Home, et

Part of a video series:

Beyond the Patterns

Presenters

Prof. Dr. Andreas Maier

Accessible via

Open access

Duration

01:03:40 Min

Recording date

2021-12-09

Uploaded on

2021-12-09 14:36:03

Language

en-US

We have the great honor to welcome Mirco Ravanelli to our lab for an invited presentation!

Abstract: SpeechBrain is an open-source and all-in-one conversational AI toolkit. It is designed to facilitate the research and development of speech and language technologies by being simple, flexible, user-friendly, and well-documented. This talk describes the core architecture designed to support several tasks of common interest, allowing users to naturally conceive, compare and share novel conversational AI pipelines. SpeechBrain achieves competitive or state-of-the-art performance in a wide range of benchmarks. It also provides training recipes, pretrained models, and inference scripts for popular datasets, as well as tutorials that allow anyone with basic Python proficiency to familiarize themselves with speech technologies. The talk will also discuss the future research direction for this project and share some ideas for the future development of intelligent speaking machines.

Short Bio: Mirco Ravanelli is currently a postdoc researcher at Mila. He will soon join Concordia University as an Assistant Professor and the Université de Montréal as an adjunct professor. His main research interests are deep learning and Conversational AI. He is the author or co-author of more than 50 papers on these research topics. He received his Ph.D. (with cum laude distinction) from the University of Trento in December 2017. Mirco is an active member of the speech and machine learning communities. He is the founder and leader of the SpeechBrain project.

References

SpeechBrain project website: https://speechbrain.github.io

This video is released under CC BY 4.0. Please feel free to share and reuse.

Music Reference:
Damiano Baldoni - Thinking of You (Intro)
Damiano Baldoni - Poenia (Outro)
