41 - Pattern Recognition [PR] - PR 37 [ID:24143]

Welcome back to Pattern Recognition. Today we want to look into a particular class of classification algorithms: boosting algorithms, which try to unite multiple weak classifiers into a strong one. In particular, we want to look into AdaBoost.

So let's have a look at the idea behind AdaBoost. Boosting generally tries to build a more powerful classifier from weak classifiers, and you could say it is one of the most powerful learning techniques introduced in the last 20 years. It can be applied to any kind of classification system, and it can give you a bit of additional performance in situations where you really want to squeeze out the last couple of percent.

The idea is to combine the output of many weak classifiers into a powerful committee, essentially using the wisdom of the crowd. The most popular boosting algorithm is called AdaBoost and was introduced in 1997. A weak classifier is one whose error rate is only slightly better than random guessing, but you do need a classifier that is better than random guessing: boosting won't work with classifiers that are worse than random guessing. The idea of boosting is that you sequentially apply the weak classifier to repeatedly modified versions of the data. This produces a sequence of classifiers, and a weighted majority vote yields the final prediction.

We now consider a two-class problem with the class label being either minus one or plus one. Given observations D, each consisting of a sample and a class label, we can train a classifier G(x). This implies that we also have an error on the training data set, which can be computed as the average error: one over N times the sum over all misclassifications, where I is the indicator function that returns one if its condition is true and zero otherwise. We can now sequentially apply the weak classifiers to produce a sequence G_1, ..., G_M of different weak classifiers and combine them in a weighted sum; the final classification G(x) is then given as the sign of the weighted sum of the individual classifiers. For this we need weighting factors alpha_1 to alpha_M, which are computed by the boosting algorithm; each alpha weighs the output of the corresponding classifier.
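Written out in the notation just described (with N training pairs (x_i, y_i), labels y_i in {-1, +1}, weak classifiers G_m, and weights alpha_m), the training error of a single classifier G and the final committee decision read:

    \mathrm{err} = \frac{1}{N} \sum_{i=1}^{N} I\big(y_i \neq G(x_i)\big)

    G(x) = \operatorname{sign}\!\left( \sum_{m=1}^{M} \alpha_m \, G_m(x) \right)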

Visually, you could summarize the boosting algorithm as follows: you start by training a classifier, then you compute the error on the training set, reweight the samples, train a new classifier, compute the error on the training set again, reweight the samples, and so on, until you stop at the point where you have trained M classifiers. Each boosting step consists of applying weights to the training samples that amplify the weight of misclassified samples; initially, you can start with uniformly distributed weights.

The first classifier in the sequence is trained in the usual way; then, for m greater or equal to 2, the weights are individually modified, and the classifiers G_m with m greater or equal to 2 are trained on the differently weighted samples. The weighting scheme at step m looks at the observations that were misclassified by the previous classifier and increases their weights, while correctly classified samples get decreased weights. This means that observations that are difficult to classify gain ever-increasing influence over the iterations: each successive classifier is forced to concentrate more on those observations that were misclassified by the previous one.
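The video states this re-weighting only qualitatively at this point; assuming the standard discrete AdaBoost (AdaBoost.M1) formulation, which the iteration scheme described next corresponds to, the update at step m can be written as:

    \mathrm{err}_m = \frac{\sum_{i=1}^{N} w_i \, I\big(y_i \neq G_m(x_i)\big)}{\sum_{i=1}^{N} w_i}

    \alpha_m = \log\!\left(\frac{1 - \mathrm{err}_m}{\mathrm{err}_m}\right)

    w_i \leftarrow w_i \cdot \exp\!\big(\alpha_m \, I\big(y_i \neq G_m(x_i)\big)\big), \quad i = 1, \dots, N

Misclassified samples (where the indicator is one) are multiplied by e^{\alpha_m} > 1, as long as the weak classifier is better than random guessing, while correctly classified samples keep their weight and therefore lose relative influence after renormalization.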

This brings us to the AdaBoost iteration scheme. You initialize your weights with a uniform distribution and set the iteration counter to 1. You fit the first classifier to the training data set using the current weights, then you compute the classification error, which is essentially a weighted sum of the misclassifications. Next you update the classifier weights; here the classification error guides us in deciding which classifier to trust more and which to trust less. Then we can compute new sample weights, again using the classification loss of the newly trained classifier. This way we can iterate over and over again until we have built the desired number of classifiers, in our example M. Finally, we output the completely trained classifier.
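As a minimal sketch of this iteration scheme in Python, assuming depth-1 decision trees (stumps) from scikit-learn as the weak classifiers and labels in {-1, +1}; the function and variable names are illustrative and not from the video:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_train(X, y, M):
    """Discrete AdaBoost; y must contain the labels -1 and +1."""
    N = X.shape[0]
    w = np.full(N, 1.0 / N)                 # initialize weights uniformly
    classifiers, alphas = [], []
    for m in range(M):
        # fit the weak classifier on the currently weighted training data
        G_m = DecisionTreeClassifier(max_depth=1)
        G_m.fit(X, y, sample_weight=w)
        miss = (G_m.predict(X) != y).astype(float)
        # weighted classification error of this weak classifier
        err_m = np.sum(w * miss) / np.sum(w)
        # classifier weight: the smaller the error, the more we trust G_m
        alpha_m = np.log((1.0 - err_m) / max(err_m, 1e-10))
        # new sample weights: amplify the misclassified samples
        w = w * np.exp(alpha_m * miss)
        w = w / np.sum(w)
        classifiers.append(G_m)
        alphas.append(alpha_m)
    return classifiers, alphas

def adaboost_predict(X, classifiers, alphas):
    """Weighted majority vote: sign of the weighted sum of the weak outputs."""
    agg = sum(a * g.predict(X) for a, g in zip(alphas, classifiers))
    return np.sign(agg)

scikit-learn also ships a ready-made implementation (sklearn.ensemble.AdaBoostClassifier) if you only want to apply the algorithm rather than implement it yourself.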

So you see that in every iteration of AdaBoost we have to train a new classifier. This version of AdaBoost is called the discrete version because each classifier returns a discrete class label. AdaBoost can also be modified to return

Part of a video series:

Access: Open access

Duration: 00:09:07 min

Recording date: 2020-11-17

Uploaded: 2020-11-17 19:19:52

Language: en-US

In this video, we introduce the AdaBoost algorithm, which fuses many weak classifiers into a strong one.

This video is released under CC BY 4.0. Please feel free to share and reuse.

For reminders to watch new videos, follow us on Twitter or LinkedIn. Also, join our network for information about talks, videos, and job offers in our Facebook and LinkedIn groups.

Music Reference: Damiano Baldoni - Thinking of You
