18 - Pattern Recognition (PR) [ID:2593]

Okay. Welcome, everybody, to our lecture Pattern Recognition.

As you probably know,

it's basically a lecture on machine learning.

Today, we are talking about optimization.

So we are doing a little bit of mathematics today.

That's why Professor Hornegger asked me to do it.

No, I'm just kidding.

Professor Hornegger is, as far as I know, in a university board meeting,

and I will give the lecture.

As far as I know, yesterday,

you were talking about support vector machines,

about the easy case,

the hard margin case,

and later on about the soft margin case.

Here you see the optimization problem.

It's actually a convex problem, and that's important for our later optimization,

and you see that you have to minimize this term.

Originally, you wanted to maximize the term one over the length of the vector alpha,

and instead of maximizing this term,

we said we minimize the length of alpha.

Then we said the norm of alpha is a positive value or at least non-negative,

and if we apply a monotonic function,

we don't change the position of the minimum.

So we can add here the square without changing anything,

and we don't change the position if we multiply it by one-half.

The advantage is if later on we take the first derivative,

then we get rid of the square and we get rid of this factor,

and we just have the vector alpha.
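
Written out, the chain of arguments above, in the lecture's notation with alpha as the normal vector of the hyperplane, is a sketch of the following equivalence: squaring is a monotonic function on the non-negative values of the norm, and the factor one-half is just a positive constant, so all three problems share the same minimizer, while the derivative of the last objective is simply the vector alpha:

\max_{\alpha} \frac{1}{\|\alpha\|}
\;\Longleftrightarrow\;
\min_{\alpha} \|\alpha\|
\;\Longleftrightarrow\;
\min_{\alpha} \frac{1}{2}\|\alpha\|^{2},
\qquad
\frac{\partial}{\partial \alpha}\,\frac{1}{2}\|\alpha\|^{2} = \alpha .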

In the soft margin case,

you have several points of two classes, and

let's choose a different color for the second class.

Let's say this is class two,

and now we want to find a linear decision boundary, and that's important,

it should be a linear one, a plane.

Where are we? Here we go.

Not that easy.

Okay.

A linear decision boundary, and we want to have this band here as large as possible,

and the width of this band is defined by these vectors here,

and we will call these vectors support vectors.

This is the hard margin case.
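
For reference, a minimal written form of the hard margin problem sketched on the board; the sample vectors x_i, the class labels y_i in {-1, +1}, the offset alpha_0 and the number of samples m are not spelled out in this passage and are assumptions of this sketch:

\min_{\alpha,\,\alpha_0} \; \frac{1}{2}\|\alpha\|^{2}
\quad \text{subject to} \quad
y_i\,(\alpha^{\mathsf T} x_i + \alpha_0) \ge 1, \qquad i = 1,\dots,m .

The samples that fulfill their constraint with equality lie exactly on the border of the band; these are the support vectors mentioned above.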

And in the soft margin case,

we allow samples to be inside this band, for example, here.

Then we define this distance here to be ξ_i,

depending on the sample x_i,

and we want to minimize all distances ξ_i.

So we sum over all the ξ_i's and we want to have the slack variables as small as possible.

This is an engineering factor where we weight the two terms here.

We can say this term should be more important or this term.

Then we have a number of constraints, and we say the value of this term here must be at least one minus the slack variable ξ_i for each sample.
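
Putting the pieces of this passage together, a sketch of the full soft margin problem, again under the notational assumptions above and writing C for the engineering factor that weights the two terms:

\min_{\alpha,\,\alpha_0,\,\xi} \; \frac{1}{2}\|\alpha\|^{2} + C \sum_{i=1}^{m} \xi_i
\quad \text{subject to} \quad
y_i\,(\alpha^{\mathsf T} x_i + \alpha_0) \ge 1 - \xi_i, \quad \xi_i \ge 0, \qquad i = 1,\dots,m .

A large C penalizes the slack variables heavily, so few samples may enter the band; a small C keeps the band wide at the cost of more violations.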
