Hello and welcome to the unit on semi-supervised learning, which is part of the seminar Advances in Deep Learning for Time Series.
What is this about?
Well, as you know, supervised learning models require really large amounts of labeled training data to learn properly.
But labeling itself is costly.
It not only takes time, but it also requires knowledge that is sometimes rare, so you need to hire domain experts such as medical doctors or lawyers, people who know the topic and can label the data correctly. Sometimes our own knowledge is simply not enough.
Well, generally there is much more unlabeled data, or data that has not yet been labeled, than labeled data.
So as a practitioner in a real-world project, you are often faced with a limited-labeled-data scenario. That means you have just a tiny amount of labeled data, maybe a handful of samples, and tons and tons of unlabeled data.
Now the question is: how can we best leverage this information that is already available but unlabeled, given that we also have a few labeled samples?
We are in a scenario such as this: we have a lot of unlabeled data, shown here as gray dots, and a small annotated subset. Here we have three annotated triangles of the green class and four annotated squares of the orange class.
If we train a model only on these seven data points, we obtain this decision boundary, which has a small margin around it, and some of the points actually lie on the decision boundary.
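As a minimal sketch of this few-label situation, you could fit a classifier on just seven annotated points of a larger toy dataset. The blob data, the linear SVC, and the three-plus-four label split are illustrative assumptions, not the exact setup from the slide:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Toy stand-in for the slide: two classes, mostly unlabeled points.
X, y_true = make_blobs(n_samples=200, centers=2, cluster_std=1.2, random_state=0)

# Pretend only three samples of class 0 and four of class 1 are annotated.
labeled_idx = np.concatenate([np.where(y_true == 0)[0][:3],
                              np.where(y_true == 1)[0][:4]])

# Fit a classifier on the seven labeled points only.
clf = SVC(kernel="linear").fit(X[labeled_idx], y_true[labeled_idx])

# Signed distances to the boundary for the labeled points: with so little
# data, some of them can end up very close to the decision boundary.
print(clf.decision_function(X[labeled_idx]))
```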
Now, last week we used active learning to specifically annotate those samples for which the model is the least certain or the least confident, for example.
This week, we are going to look at what we could do up here, for example using label propagation.
Semi-supervised learning assumes that these nearby samples have the same class as the labeled samples. So if you have unlabeled samples close to labeled samples, it is likely that they have a similar or the same class.
This is an assumption, of course.
Label propagation assumes this.
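A minimal sketch of this idea, using scikit-learn's LabelPropagation on the same kind of toy data as above (the dataset, the kNN kernel, and the neighbour count are assumptions for illustration): unlabeled samples are marked with the placeholder label -1, and after fitting, transduction_ holds the labels that were propagated to every point.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.semi_supervised import LabelPropagation

X, y_true = make_blobs(n_samples=200, centers=2, cluster_std=1.2, random_state=0)

# Mark everything as unlabeled (-1), then reveal only seven labels.
y_partial = np.full_like(y_true, -1)
labeled_idx = np.concatenate([np.where(y_true == 0)[0][:3],
                              np.where(y_true == 1)[0][:4]])
y_partial[labeled_idx] = y_true[labeled_idx]

# Labels spread from the annotated points to nearby unlabeled points,
# following the assumption that neighbouring samples share a class.
lp = LabelPropagation(kernel="knn", n_neighbors=7).fit(X, y_partial)

print("inferred labels for the first 10 points:", lp.transduction_[:10])
print("agreement with the hidden ground truth:", (lp.transduction_ == y_true).mean())
```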
Just to recall and to contrast: active learning instead selects the most informative samples to annotate.