So, is it even sensible to have these all-powerful networks?
We have already seen that using fully connected layers everywhere is not really the way to go.
Instead, we use convolutional layers, batch normalization, et cetera.
And maybe we can take this concept even further if we know more about our problem.
But let's start with the weakly supervised part.
So learning with limited annotations, what does that even mean?
We have seen for supervised learning that we can achieve impressive results.
So here you see an example output of Mask R-CNN, where we clearly see the person boundaries and the boundaries of the airplane, and different instances of persons are detected.
So you've talked about this last week.
And a prerequisite for this quality of segmentation and object detection is that we, on the one hand, have large amounts of training data.
So remember ImageNet, remember MS COCO, for example.
And on the other hand, that this data comes with consistent and high-quality annotations, so that the network can really learn the concepts based on the training data and the labels that it has.
But let's take a closer look at what getting high-quality annotations actually means.
So if we just want to classify what is in an image as you can see here, it's relatively
fast.
So for MS COCO, which has around 90 classes, they had a pretty smart way of annotating it.
It took around 27 seconds per image to say which objects are contained in the image.
And in this case it's, for example, dog and bottle.
So this is the only label that is provided for the image in this very first case.
And we call these image-level labels because we don't know how many objects there are, we don't know what their relationship to each other is, and we also don't know what their exact boundaries are, for example.
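To make the idea concrete, here is a minimal sketch of what an image-level label looks like as a training target: a multi-hot vector over the class vocabulary that only says which classes are present. The class list and the encoding function here are illustrative, not from any real dataset pipeline.

```python
# Image-level (weak) labels: for each image we only record WHICH classes
# appear, not where they are, how many instances there are, or their masks.
# CLASSES is a made-up, tiny vocabulary for illustration.
CLASSES = ["dog", "bottle", "person", "airplane"]

def image_level_label(present_classes):
    """Encode the set of classes present in an image as a multi-hot vector."""
    return [1 if c in present_classes else 0 for c in CLASSES]

# The lecture's example image contains a dog and a bottle:
label = image_level_label({"dog", "bottle"})
print(label)  # [1, 1, 0, 0]
```

Note that this single vector is all the supervision the network gets per image in the weakly supervised setting; counts, relationships, and boundaries are lost.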
Then the next step in the pipeline is so-called instance spotting.
This is now a localization task: where are the objects located?
And if we already have the class labels, this can be done in another 14 seconds.
But with that we are already at almost half a minute per image.
If you think about ImageNet, for example, this scales pretty rapidly in the number of working hours you need to annotate such a dataset.
Now, instance segmentation, so really creating dense boundaries for these objects and getting segmentation masks, takes around 80 seconds.
And this is not per image but per instance in the image.
These are all average values; of course there are objects and instances in such an image that are more or less complicated to segment, but generally it's around 80 seconds per instance.
Now, if we want to get to annotations like this, so really dense pixel labels where we don't have unnecessary overlap between the different classes, it takes around 1.5 hours, and this is again per image.
But just looking at this, you can see how easily this annotation effort scales upwards and how much money we actually need to spend to get good-quality annotations, and we haven't even talked about obtaining the data itself.
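The scaling argument above can be made concrete with a back-of-the-envelope calculation using the per-image and per-instance times quoted in the lecture (27 s for image-level labels, another 14 s for instance spotting, 80 s per instance mask, 1.5 h for full dense pixel labels). The dataset size and the average instance count per image are assumed numbers, purely for illustration.

```python
# Rough annotation-cost estimate from the times quoted above.
# N_IMAGES and AVG_INSTANCES are illustrative assumptions, not COCO figures.
N_IMAGES = 100_000
AVG_INSTANCES = 7  # assumed average number of object instances per image

classify_h = N_IMAGES * 27 / 3600                                # image-level labels
spotting_h = N_IMAGES * (27 + 14) / 3600                         # + instance spotting
masks_h    = N_IMAGES * (27 + 14 + AVG_INSTANCES * 80) / 3600    # + per-instance masks
dense_h    = N_IMAGES * 1.5                                      # full dense pixel labels

print(f"image-level labels:  {classify_h:,.0f} hours")
print(f"+ instance spotting: {spotting_h:,.0f} hours")
print(f"+ instance masks:    {masks_h:,.0f} hours")
print(f"dense pixel labels:  {dense_h:,.0f} hours")
```

Even with these rough assumptions, the jump from image-level labels to dense pixel labels is two orders of magnitude in annotation hours, which is exactly why weak labels are so attractive.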
So on the one hand we have the issue of getting data.
This is a big problem in the medical domain, but it's even harder to find a physician or
Available via: Open access
Duration: 01:17:03 min
Recorded on: 2020-01-28
Uploaded on: 2020-01-28 17:09:03
Language: en-US