Hello everyone.
Today I will be talking about the paper "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks" by Chelsea Finn, Pieter Abbeel, and Sergey Levine.
Before diving into the topic, I just want to give a brief outline of what I will go through in this presentation.
So first we will give an introduction to meta-learning: how it works, the intuition, and the motivation behind it.
Then we will move on to model-agnostic meta-learning (MAML), where we will cover the problem setup, the intuition, and the several variants of the algorithm.
Finally, we will look at MAML applied to deep networks and its experimental evaluation.
So, before diving into the topic, let me say a few words: we humans can learn quickly from very few examples, and our artificial intelligence agents should be able to do so as well.
But the thing is that classical algorithms generally work well only when a large data set is given, which is not reflective of many real-world settings, such as medical imaging.
So basically, they fail in the small-data regime.
And that is where the concept of meta-learning comes in.
So we need an AI system that can continuously adapt and learn new tasks from very few examples.
As I have already said, our goal is that a learner should be able to learn new tasks quickly from very few examples, but this is very challenging.
Since we use only a few examples, the model is very prone to overfitting to the new data.
We must also integrate prior experience with a very small number of new examples, so we have to be careful about that.
So the concept of meta-learning is learning how to learn.
Technically, a top-level AI trains a bottom-level AI, and the bottom-level AI adapts using prior experience.
The bottom-level AI here is the learner, and the top-level AI is the meta-learner.
Many different components can be learned using meta-learning.
We can learn the initial parameters of the model, we can learn the optimizer's parameters, and we can even learn task-relevant features or metrics, for example a distance metric.
But in this presentation, we will focus only on the initial parameters.
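To make "learning the initial parameters" concrete, here is a minimal sketch in plain numpy, on a toy family of 1-D linear regression tasks (the task family, step sizes, and first-order meta-gradient approximation are my illustrative choices, not details from the talk or the paper, which uses the full second-order gradient):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    # each task is a 1-D linear regression y = a * x with its own slope a
    return rng.uniform(0.5, 2.0)

def loss_grad(theta, a, xs):
    # squared-error loss and its gradient for the model f(x) = theta * x
    loss = np.mean(((theta - a) * xs) ** 2)
    grad = 2.0 * (theta - a) * np.mean(xs ** 2)
    return loss, grad

theta = 0.0              # the initialization we meta-learn
alpha, beta = 0.5, 0.05  # inner (adaptation) and outer (meta) step sizes

for step in range(1000):
    meta_grad = 0.0
    for _ in range(8):                          # a batch of sampled tasks
        a = sample_task()
        xs_support = rng.uniform(-1, 1, 10)     # few-shot adaptation data
        xs_query = rng.uniform(-1, 1, 10)       # held-out evaluation data
        _, g = loss_grad(theta, a, xs_support)
        theta_adapted = theta - alpha * g       # one inner gradient step
        # first-order approximation (FOMAML-style): treat the query-set
        # gradient at the adapted parameters as the meta-gradient
        _, g_query = loss_grad(theta_adapted, a, xs_query)
        meta_grad += g_query / 8.0
    theta -= beta * meta_grad
```

After meta-training, the learned `theta` sits where a single gradient step on a handful of examples from a new task already gives a large improvement, which is exactly the property "good initial parameters" is meant to capture.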
So now we will introduce the meta-learning setup.
Basically, we should be able to solve a new task, denoted T_test, given a set of training tasks denoted T_1 to T_n.
The key assumption behind this is that the tasks must share some structure.
What I mean by sharing some structure is that they should be drawn from the same distribution over tasks.
Let's say T_1 to T_n belong to a particular distribution; then we want the test task to be drawn from that same distribution.
For example, consider handwritten digit recognition across different languages, or spam filtering for different users.
Within each of these families, the tasks are structurally related: recognizing digits in one language is closely related to recognizing them in another, and filtering spam for one user is closely related to filtering it for another.
Now we come to the workings of meta-learning.
The entire process is divided into two parts: a meta-training phase and a meta-testing phase.
During meta-training, we have a training set of tasks, which is also called D_meta-train.
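One way to picture D_meta-train is as a collection of tasks, each contributing a small support set (used for adaptation) and a query set (used for evaluation), with meta-testing done on held-out tasks from the same distribution. The names, sizes, and toy task family below are illustrative assumptions, not details from the talk:

```python
import numpy as np

rng = np.random.default_rng(1)

def make_task(n_support=5, n_query=15):
    # a toy task from a shared distribution: y = a * x with a random slope a
    a = rng.uniform(0.5, 2.0)
    xs = rng.uniform(-1, 1, n_support + n_query)
    ys = a * xs
    return {
        "support": (xs[:n_support], ys[:n_support]),  # used to adapt
        "query": (xs[n_support:], ys[n_support:]),    # used to evaluate
    }

# D_meta-train: tasks T_1..T_n seen during meta-training;
# D_meta-test: held-out tasks drawn from the same distribution
D_meta_train = [make_task() for _ in range(100)]
D_meta_test = [make_task() for _ in range(20)]
```

The important structural point is that the unit of data here is a whole task, not a single example: meta-training iterates over tasks, and generalization is measured on tasks never seen before.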
Presenters
Accessible via: Open Access
Duration: 00:40:05 min
Recording date: 2020-11-16
Uploaded: 2020-11-16 18:39:44
Language: en-US
Today Arka Nandi presents the paper "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks"
We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning. The goal of meta-learning is to train a model on a variety of learning tasks, such that it can solve new learning tasks using only a small number of training samples. In our approach, the parameters of the model are explicitly trained such that a small number of gradient steps with a small amount of training data from a new task will produce good generalization performance on that task. In effect, our method trains the model to be easy to fine-tune. We demonstrate that this approach leads to state-of-the-art performance on two few-shot image classification benchmarks, produces good results on few-shot regression, and accelerates fine-tuning for policy gradient reinforcement learning with neural network policies.