Okay, so let's start. Welcome everyone. Our group, this is Batsheba, Sofia, Oleg and me, and we would like to present linear support vector machines. First, I want to relate this to the lecture. We talked about k-means clustering, where, as you may know, a new data point is assigned to the group whose mean is closest to it. Linear SVM does something different: we have a separating line that divides the two classes, as shown here. A tiny code sketch contrasting the two ideas follows right below.
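As a quick, hedged illustration of that difference (this is not part of the talk itself; it assumes scikit-learn is available and uses invented toy data), the sketch below assigns a new point once by nearest cluster mean and once by a linear decision boundary:

```python
# Minimal sketch (not from the talk): nearest-mean assignment vs. a linear decision boundary.
# Assumes scikit-learn; the toy data and the new point are invented for illustration.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

# Two linearly separable blobs ("cats" near the origin, "dogs" shifted away).
rng = np.random.default_rng(0)
cats = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(20, 2))
dogs = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(20, 2))
X = np.vstack([cats, dogs])
y = np.array([0] * 20 + [1] * 20)

# k-means: a new point goes to the cluster with the closest mean.
# (Cluster indices are arbitrary and need not match the class labels.)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Linear SVM: a new point is classified by which side of the separating line it falls on.
svm = LinearSVC(C=1.0).fit(X, y)

new_point = np.array([[1.5, 1.5]])
print("k-means cluster:", km.predict(new_point)[0])
print("SVM class:", svm.predict(new_point)[0])
```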
Now I want to give a little example. We have a binary data set of cats and dogs. As you can see, it is clearly linearly separable, and there are multiple lines that can separate these two groups from each other; I have drawn some of them here. Now suppose we get a new data point, the green cross, which lies close to all of these decision boundaries, so it looks a bit like a cat but also a bit like a dog, and we want to classify it. As you can see, different decision boundaries lead to different classifications: for this boundary we would say the green cross is a dog, and for this one it is a cat. Linear SVM now has the following idea: it does not look only at the line itself, but at the line extended by its margin, that is, widened up to the nearest point of the data set. The aim of linear SVM is then to find the line with the maximum margin. A minimal sketch of fitting such a maximum-margin line is shown below.
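For illustration only (this is not the group's own implementation; it assumes scikit-learn, approximates the hard-margin case with a large C, and the toy points are invented):

```python
# Minimal sketch (not the implementation shown in the talk): a near-hard-margin linear SVM.
import numpy as np
from sklearn.svm import SVC

X = np.array([[0.0, 0.0], [0.5, 1.0], [1.0, 0.5],   # class -1 ("cats")
              [3.0, 3.0], [3.5, 2.5], [2.5, 3.5]])  # class +1 ("dogs")
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # very large C ~ hard margin

w = clf.coef_[0]        # normal vector of the separating hyperplane
b = clf.intercept_[0]   # offset
margin_width = 2.0 / np.linalg.norm(w)  # the 2 / ||w|| that comes up later in the talk

print("w =", w, "b =", b)
print("margin width:", margin_width)
print("support vectors:", clf.support_vectors_)

# Classify the ambiguous "green cross":
print("prediction:", clf.predict([[1.8, 1.8]]))
```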
To give a short overview of what we are doing today: first, we want to present the theory of support vector machines; the main part of that is done by Sofia. Then we want to show you our implementation of SVM, where we implemented a linear separability check for data sets. After that, Oleg will show us SVM with slack variables and its implementation. And at last it is Batsheba's turn, and she will show us an extension of SVM to multi-class classification. So, I would like to start with the theory part.
First we need to define the canonical hyperplane. We have a pair (w, b), where w is an element of H, a vector space on which an inner product is defined, and b is a scalar. This pair is called the canonical form of the hyperplane with respect to some data points x_1, ..., x_m in H if it is scaled such that the minimum over i of |<w, x_i> + b| equals 1. That is what this illustration shows: we have the hyperplane, and the condition can also be represented by the two hyperplanes H1 and H2; the hyperplane is canonical if all the points lie on these two boundaries or on their outer sides, so not in between them. Here w is the normal vector of the hyperplane, and the minimum distance from a point on this hyperplane to the origin is |b| divided by the norm of w. A restatement of this in standard notation follows below.
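For reference, the same definition written out (only a restatement of what was just said; the symbols follow the description of the slide, not necessarily its exact notation):

```latex
% Canonical hyperplane: the pair (w, b), with w in an inner-product space \mathcal{H}
% and b a scalar, is in canonical form with respect to x_1, \dots, x_m \in \mathcal{H} if
\min_{i = 1, \dots, m} \bigl| \langle w, x_i \rangle + b \bigr| = 1 .
% The hyperplane, its two margin boundaries, and its distance to the origin:
H   = \{\, x \in \mathcal{H} : \langle w, x \rangle + b = 0 \,\}, \qquad
H_1 = \{\, x : \langle w, x \rangle + b = +1 \,\}, \qquad
H_2 = \{\, x : \langle w, x \rangle + b = -1 \,\}, \qquad
\operatorname{dist}(H, 0) = \frac{|b|}{\lVert w \rVert} .
```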
Now consider the two inequalities here. The first one, <w, x_i> + b >= 1, is satisfied by all the edge elements: they lie on H1 or on the right side of H1. And the round elements satisfy the second inequality, <w, x_i> + b <= -1: they lie on the hyperplane H2 or on the left side of it. Now we introduce a new variable y_i, which is +1 for the positive samples, like our edge elements here, and -1 for the round elements, the negative samples, which lie on H2 or on its left side. With the help of this new variable we can combine both inequalities into a single one, y_i(<w, x_i> + b) >= 1. This combined constraint is restated below.
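In formula form (nothing new, just the step written out):

```latex
% The two class-wise constraints and their combination via the labels y_i:
\langle w, x_i \rangle + b \ge +1 \quad \text{for } y_i = +1, \qquad
\langle w, x_i \rangle + b \le -1 \quad \text{for } y_i = -1,
% which together are equivalent to
y_i \bigl( \langle w, x_i \rangle + b \bigr) \ge 1 \quad \text{for all } i .
```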
Now we can also measure the margin of the hyperplane: its width is given by the distance between H1 and H2. For positive samples lying on H1 we have <w, x_i> = 1 - b (we just moved b to the right-hand side), and for negative samples lying on H2 we have <w, x_i> = -1 - b. If we multiply the second equation by minus 1, add it to the first, and divide by the norm of w, we get the width of the margin as 2 divided by the norm of w. This is very important, so keep it in mind, because Sofia will need it later. The little derivation is written out below.
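Written out (again just the step that was described):

```latex
% Points x_+ on H_1 and x_- on H_2 satisfy
\langle w, x_+ \rangle + b = +1, \qquad \langle w, x_- \rangle + b = -1 .
% Multiplying the second equation by -1 and adding it to the first gives
\langle w, x_+ - x_- \rangle = 2 ,
% and dividing by \lVert w \rVert yields the margin width as the projection of
% x_+ - x_- onto the unit normal w / \lVert w \rVert:
\Bigl\langle \tfrac{w}{\lVert w \rVert},\; x_+ - x_- \Bigr\rangle = \frac{2}{\lVert w \rVert} .
```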
Briefly, we also introduced in our paper a lemma which says that two sets are linearly separable if and only if the intersection of their convex hulls is empty. I don't want to show you the proof, because the illustration makes it quite clear: here we have the iris data set, reduced to two dimensions, with three different classes and the convex hull drawn around each class. As you can see, the blue class and the orange class are clearly linearly separable, because their hulls do not intersect, while the orange and the green class, for example, are not linearly separable, because their hulls do intersect. A small sketch of such a separability check follows below.
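As a hedged sketch of how such a check could look (this is not the group's implementation; the function name linearly_separable and the example points are made up, and it assumes NumPy and SciPy), the hull-intersection test can be posed as a linear-programming feasibility problem:

```python
# Minimal sketch (not the group's code) of the convex-hull criterion:
# two point sets are linearly separable iff their convex hulls do not intersect.
# The hulls intersect iff some convex combination of A equals some convex combination of B,
# which is an LP feasibility problem solvable with scipy.optimize.linprog.
import numpy as np
from scipy.optimize import linprog

def linearly_separable(A, B):
    """A, B: arrays of shape (n_a, d) and (n_b, d). True iff conv(A) and conv(B) are disjoint."""
    n_a, d = A.shape
    n_b = B.shape[0]
    # Unknowns: convex weights lam (for A) and mu (for B), all >= 0.
    # Constraints: sum(lam) = 1, sum(mu) = 1, A^T lam - B^T mu = 0.
    A_eq = np.zeros((2 + d, n_a + n_b))
    A_eq[0, :n_a] = 1.0
    A_eq[1, n_a:] = 1.0
    A_eq[2:, :n_a] = A.T
    A_eq[2:, n_a:] = -B.T
    b_eq = np.concatenate([[1.0, 1.0], np.zeros(d)])
    res = linprog(c=np.zeros(n_a + n_b), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n_a + n_b), method="highs")
    return not res.success  # no common point in the hulls -> linearly separable

# Example with two small 2-D point sets (invented for illustration):
A = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
B = np.array([[3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
print(linearly_separable(A, B))  # True: the hulls are disjoint
```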
Yeah, and I think now it's Sofia's turn.
Presenters: Batsheba Darko, Andrea Gilch, Oleg Kozachok & Sofia Qafa
Access: open access
Duration: 00:37:37 min
Recording date: 2020-07-22
Uploaded on: 2020-07-29 12:36:16
Language: en-US
A student talk about classification using linear support vector machines given by Batsheba Darko, Andrea Gilch, Oleg Kozachok & Sofia Qafa.