42 - Pattern Recognition [PR] - PR 38 [ID:24177]

Welcome back to pattern recognition.

So today we want to continue looking into AdaBoost, and in particular we want to see the relation between AdaBoost and the exponential loss.

So, boosting fits an additive model in a set of elementary basis functions.

So the result of boosting is essentially an expansion built from expansion coefficients beta and basis functions b, each given by a set of parameters gamma.
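Written out in symbols (using M for the number of basis functions, a notation assumed here), the expansion just described is

\[
f(x) = \sum_{m=1}^{M} \beta_m \, b(x; \gamma_m),
\]

where the \(\beta_m\) are the expansion coefficients and \(b(x; \gamma_m)\) is a basis function with parameter set \(\gamma_m\).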

So these additive expansion methods are at the heart of many popular learning techniques.

You can see that very similar things are already applied in single hidden layer neural networks

building on the perceptron, in wavelets, in classification trees and so on.

So the expansion models are typically fit by minimizing a loss function L that is averaged

over the training data.

So you can essentially write this as the sum of the loss L over the entire training data, and when you plug in our definition you see the additive model inside the loss, which gives our function f.
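In symbols (with N denoting the number of training samples, an assumption for this sketch), the fitting problem reads

\[
\min_{\{\beta_m, \gamma_m\}_{m=1}^{M}} \; \sum_{i=1}^{N} L\!\left(y_i, \, \sum_{m=1}^{M} \beta_m \, b(x_i; \gamma_m)\right).
\]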

So forward stagewise modeling approximates the solution to this minimization problem.

So the new basis functions are sequentially added, parameters and coefficients of already

added functions are not changed and at each iteration only a subproblem of fitting just

a single basis function is solved.

So we can now express the m-th subproblem in the following way: we essentially take the loss of the (m-1)-th solution plus beta times the current basis function, and we minimize over beta and gamma.
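As a sketch, the m-th subproblem and the resulting update can be written as

\[
(\beta_m, \gamma_m) = \arg\min_{\beta, \gamma} \sum_{i=1}^{N} L\!\left(y_i, \, f_{m-1}(x_i) + \beta \, b(x_i; \gamma)\right),
\qquad
f_m(x) = f_{m-1}(x) + \beta_m \, b(x; \gamma_m).
\]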

Now AdaBoost can be shown to be equivalent to forward stagewise additive modeling using an exponential loss function.

So if our loss function equals the exponential of minus y times f of x, we are essentially constructing the AdaBoost loss.
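In symbols, this loss is

\[
L\!\left(y, f(x)\right) = \exp\!\left(-y \, f(x)\right).
\]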

So let's prove this.

For AdaBoost, the basis functions are the classifiers g_m, and they produce outputs of either minus one or plus one.

Using the exponential loss function, we must now solve at every step a problem where we essentially minimize over beta and g the sum of the exponential losses.

If we now introduce a weight w_i and set it to the exponential of minus y_i times f_(m-1) of x_i, we can rewrite this as a minimization over a weighted sum of exponential functions.
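Written out, the per-step problem and its weighted form are (the superscript (m) on w_i just makes the iteration index explicit, a notational choice here)

\[
(\beta_m, g_m) = \arg\min_{\beta, g} \sum_{i=1}^{N} \exp\!\left(-y_i \left[f_{m-1}(x_i) + \beta \, g(x_i)\right]\right)
= \arg\min_{\beta, g} \sum_{i=1}^{N} w_i^{(m)} \exp\!\left(-\beta \, y_i \, g(x_i)\right),
\qquad
w_i^{(m)} = \exp\!\left(-y_i \, f_{m-1}(x_i)\right).
\]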

So we can make some observations here. Since w_i is independent of beta and g(x), it can be seen as a weight that is applied to each observation.

However this weight depends on the previous functions so the weights change with each

iteration m.

This then allows us to reformulate the problem a little. We split it up into the misclassified and the correctly classified samples, and then we are able to rearrange this minimization expression; you see that we can again use our indicator function here to represent the misclassified samples.

Now, for every value of beta greater than zero, the solution of this minimization is found by minimizing the sum of the weights times the indicator function.
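As a sketch of this rearrangement (with I(·) denoting the indicator function), note that y_i g(x_i) is +1 for correctly classified and -1 for misclassified samples, so

\[
\sum_{i=1}^{N} w_i^{(m)} e^{-\beta \, y_i g(x_i)}
= e^{-\beta} \!\!\sum_{y_i = g(x_i)}\!\! w_i^{(m)} \; + \; e^{\beta} \!\!\sum_{y_i \neq g(x_i)}\!\! w_i^{(m)}
= \left(e^{\beta} - e^{-\beta}\right) \sum_{i=1}^{N} w_i^{(m)} I\!\left(y_i \neq g(x_i)\right) + e^{-\beta} \sum_{i=1}^{N} w_i^{(m)},
\]

and hence, for any beta > 0,

\[
g_m = \arg\min_{g} \sum_{i=1}^{N} w_i^{(m)} I\!\left(y_i \neq g(x_i)\right).
\]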

So if we plug the reformulated g_m into the objective function and solve for beta_m, this yields that beta_m is given as one half times the logarithm of one minus the error err_m divided by err_m. Here, err_m is the minimized weighted error rate, and you see that it is essentially the sum of the weights of the misclassified samples divided by the total sum over the weights.
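In formulas, this step reads

\[
\beta_m = \frac{1}{2} \log \frac{1 - \mathrm{err}_m}{\mathrm{err}_m},
\qquad
\mathrm{err}_m = \frac{\sum_{i=1}^{N} w_i^{(m)} \, I\!\left(y_i \neq g_m(x_i)\right)}{\sum_{i=1}^{N} w_i^{(m)}}.
\]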

Now, from the update formula of the approximation, we can calculate the weights for the next iteration.
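As a sketch of that last step: since f_m(x) = f_{m-1}(x) + beta_m g_m(x) and -y_i g_m(x_i) = 2 I(y_i ≠ g_m(x_i)) - 1, the weights for the next round become

\[
w_i^{(m+1)} = w_i^{(m)} \, e^{-\beta_m y_i g_m(x_i)}
= w_i^{(m)} \, e^{\alpha_m I\left(y_i \neq g_m(x_i)\right)} \, e^{-\beta_m},
\qquad \alpha_m = 2\beta_m,
\]

where the common factor e^{-beta_m} cancels under normalization, so this is exactly the AdaBoost weight update with alpha_m = 2 beta_m.

To make the derivation concrete, here is a minimal Python sketch of AdaBoost as forward stagewise fitting of the exponential loss. It uses decision stumps as the weak classifiers g_m; the function names (fit_adaboost, predict_adaboost) and the brute-force stump search are illustrative choices of this sketch, not code from the lecture.

import numpy as np

def fit_adaboost(X, y, n_rounds=50):
    # Forward stagewise fit with the exponential loss (AdaBoost sketch).
    # X: (n, d) feature matrix, y: labels in {-1, +1}.
    # The weak learner g_m is a decision stump chosen to minimize the weighted error.
    n, d = X.shape
    w = np.full(n, 1.0 / n)                    # w_i^(1): uniform start
    stumps, betas = [], []
    for _ in range(n_rounds):
        best = None
        for j in range(d):                     # search over feature, threshold, sign
            for t in np.unique(X[:, j]):
                for s in (1, -1):
                    pred = s * np.where(X[:, j] <= t, 1, -1)
                    err = np.sum(w * (pred != y)) / np.sum(w)
                    if best is None or err < best[0]:
                        best = (err, j, t, s)
        err, j, t, s = best
        if err <= 0 or err >= 0.5:             # stop if perfect or no better than chance (sketch-level handling)
            break
        beta = 0.5 * np.log((1 - err) / err)   # beta_m from the derivation above
        pred = s * np.where(X[:, j] <= t, 1, -1)
        w = w * np.exp(-beta * y * pred)       # w_i^(m+1) = w_i^(m) exp(-beta_m y_i g_m(x_i))
        w = w / np.sum(w)                      # normalize; the common e^{-beta_m} factor cancels here
        stumps.append((j, t, s))
        betas.append(beta)
    return stumps, betas

def predict_adaboost(stumps, betas, X):
    # Evaluate f(x) = sum_m beta_m g_m(x) and return its sign.
    f = np.zeros(X.shape[0])
    for (j, t, s), beta in zip(stumps, betas):
        f = f + beta * s * np.where(X[:, j] <= t, 1, -1)
    return np.sign(f)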


In this video, we show that AdaBoost is actually optimizing the exponential loss.

This video is released under CC BY 4.0. Please feel free to share and reuse.

For reminders to watch the new video follow on Twitter or LinkedIn. Also, join our network for information about talks, videos, and job offers in our Facebook and LinkedIn Groups.

Music Reference: Damiano Baldoni - Thinking of You
