Welcome back to Deep Learning. Today we want to continue talking about our common practices,
and the topic that we are interested in today is class imbalance.
Machines don't really have common sense now.
So a very typical problem is that one class, in particular the very interesting one, is
not very frequent. This is a challenge for all machine learning algorithms. Let's take
the example of fraud detection: out of 10,000 transactions, 9,999 are genuine
and only one is fraudulent. So if you classify everything as genuine, you get 99.99% accuracy.
In comparison, a model that misclassifies one out of 100 transactions would end up
with only 99% accuracy, even if it actually finds the fraud. So this is of course a very hard
problem, and in particular in screening applications you have to be very careful, because
just assigning everything to the most common class would still get you very, very good accuracies.
Machine learning is the science of sloppiness.
It doesn't have to be credit cards. For example, detecting mitotic cells is a very similar
problem. A mitosis is a cell undergoing cell division, and they are very important,
as we already heard in the introduction, because if you count the cells under mitosis,
you know how aggressively the cancer is growing. So this is a very important feature,
but you have to detect them correctly, and they make up only a very small portion of the cells
in the tissue. So the data of this class is seen much less often during training, and losses like
the L2 loss or the cross-entropy do not reflect this imbalance, so they are not very responsive to it.
One thing that you can do, for example, is resampling. The idea is that you balance
the class frequencies by sampling the classes differently. You can undersample. This
means that you throw away a lot of the training data of the most frequent classes,
and this way you train a classifier that is balanced between the classes.
The stuff that works best is really simple. Now the classes are seen approximately as
frequently as each other. The disadvantage of this approach is that you are not using all the data that has
been seen, and of course you don't want to throw away data; in the fraud example, this would mean
discarding almost all of the genuine transactions. A minimal sketch of such undersampling is shown below.
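As an illustration, here is a minimal sketch of random undersampling in NumPy. The array names (`features`, `labels`) and the strategy of shrinking every class to the size of the rarest one are my own illustrative choices, not something prescribed in the lecture.

```python
import numpy as np

def undersample(features, labels, seed=0):
    """Randomly undersample so every class is as frequent as the rarest one."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(labels, return_counts=True)
    n_min = counts.min()                       # size of the rarest class
    keep = []
    for c in classes:
        idx = np.flatnonzero(labels == c)      # all samples of class c
        keep.append(rng.choice(idx, size=n_min, replace=False))
    keep = rng.permutation(np.concatenate(keep))
    return features[keep], labels[keep]

# Example: 9,999 genuine (0) vs. 1 fraudulent (1) transaction
labels = np.array([0] * 9999 + [1])
features = np.random.randn(10000, 8)
x_bal, y_bal = undersample(features, labels)
print(np.unique(y_bal, return_counts=True))    # both classes now equally frequent
```

Note how drastic this is in the fraud example: the balanced set contains only two samples, which is exactly why you usually do not want to throw away data like this.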
Another technique is oversampling: you simply sample more often from the
underrepresented classes, and in this case you can use all of the data. The disadvantage
is, of course, that it can lead to rather heavy overfitting towards the less frequently seen
examples. Combinations of under- and oversampling are also possible. An advanced resampling
technique that tries to avoid the shortcomings of over- and undersampling is the
Synthetic Minority Oversampling Technique (SMOTE), but it is rather uncommon in deep learning.
Underfitting caused by undersampling can be reduced by taking a different subset after each
epoch, which is quite common, and you can also use data augmentation to help reduce overfitting
for underrepresented classes. So you essentially augment the samples that you have seen less
frequently more heavily, which is a very typical choice; a sketch of such a per-epoch weighted
sampling setup is shown below. We are happy that it works better than any competing method,
but that doesn't mean that we think we are done.
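One common way to get both effects in a deep learning pipeline is a weighted sampler that draws with replacement using inverse-class-frequency weights: the rare class is oversampled, while the frequent class is effectively seen as a different subset in every epoch. The sketch below assumes PyTorch; the toy dataset and variable names are illustrative.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader, WeightedRandomSampler

# Toy data: class 0 is 100x more frequent than class 1
labels = torch.cat([torch.zeros(1000, dtype=torch.long), torch.ones(10, dtype=torch.long)])
features = torch.randn(len(labels), 8)
dataset = TensorDataset(features, labels)

# Per-sample weight = inverse frequency of that sample's class
class_counts = torch.bincount(labels).float()
sample_weights = 1.0 / class_counts[labels]

# Drawing with replacement oversamples the rare class and picks a
# different subset of the frequent class in every epoch.
sampler = WeightedRandomSampler(sample_weights, num_samples=len(dataset), replacement=True)
loader = DataLoader(dataset, batch_size=32, sampler=sampler)

for x, y in loader:
    pass  # each batch is now roughly class-balanced
```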
So instead of fixing the data, you can of course also try to adapt the loss function
to be stable with respect to class imbalance, and here you choose a loss that weights the
classes by their inverse class frequency. You can then create the weighted cross-entropy, where
you introduce an additional weight w_k per class k, and w_k is simply determined as the inverse
class frequency; a minimal sketch is shown below.
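As a minimal sketch (assuming PyTorch), the class weights w_k can be passed directly to the built-in cross-entropy loss. The inverse-frequency normalization used here is one common choice, not the only one; the class counts are just the fraud-detection example again.

```python
import torch
import torch.nn as nn

# Class counts from the training set, e.g. 9,999 genuine vs. 1 fraudulent
class_counts = torch.tensor([9999.0, 1.0])

# w_k = inverse class frequency, normalized so the weights average to 1
weights = class_counts.sum() / (len(class_counts) * class_counts)

criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(16, 2)              # network outputs for a batch of 16
targets = torch.randint(0, 2, (16,))     # ground-truth class indices
loss = criterion(logits, targets)        # misclassified rare samples now cost much more
```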
More common in segmentation problems are Dice-based losses built on the Dice
coefficient, which is a very typical measure for evaluating segmentations; a small sketch
follows below. Instead of the class frequency, the weights can also be adapted with regard to
other considerations, but we are not discussing them here in this current lecture.
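For completeness, here is a minimal sketch of one common soft Dice loss formulation for binary segmentation, assuming PyTorch; the smoothing constant and the exact variant differ between implementations.

```python
import torch

def soft_dice_loss(pred, target, eps=1e-6):
    """1 - Dice coefficient, computed on soft (sigmoid) predictions.

    pred:   raw network outputs (logits), shape (N, H, W)
    target: binary ground-truth masks,    shape (N, H, W)
    """
    prob = torch.sigmoid(pred)
    intersection = (prob * target).sum(dim=(1, 2))
    denominator = prob.sum(dim=(1, 2)) + target.sum(dim=(1, 2))
    dice = (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dice.mean()

# Example usage with random data
pred = torch.randn(4, 64, 64)                    # logits
target = (torch.rand(4, 64, 64) > 0.9).float()   # sparse foreground, like mitotic cells
loss = soft_dice_loss(pred, target)
```

Because the Dice coefficient measures overlap relative to the size of the foreground region, it is much less dominated by the abundant background class than a plain per-pixel loss.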
What is clear to me is that engineers and companies and labs and grad students will continue
to tune architectures and explore all kinds of tweaks to make the current state of the art
ever slightly better, but I don't think that's going to be nearly enough.
This already brings us to the end of this part. In the final lecture on common practices,
we will discuss evaluation measures and how to evaluate our models appropriately.
So thank you very much for listening and goodbye.
Deep Learning - Common Practices Part 3
This video discusses the problem of class imbalance and how to compensate for it.
Video References:
Lex Fridman's Channel
Further Reading:
A gentle Introduction to Deep Learning