Okay, so let's start. Welcome everyone. Our group, this is Batsheba, Sofia, Oleg and me, and we would like to present linear support vector machines. First, I want to relate this to the lecture. We talked about k-means clustering, where, as you may know, a new data point is assigned to the group whose mean is closest to it. Linear SVM does something different: we have a separating line that divides the two classes, as shown here. A tiny code sketch contrasting the two ideas follows right below.
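As a quick, hedged illustration of that difference (this is not part of the talk itself; it assumes scikit-learn is available and uses invented toy data), the sketch below assigns a new point once by nearest cluster mean and once by a linear decision boundary:

```python
# Minimal sketch (not from the talk): nearest-mean assignment vs. a linear decision boundary.
# Assumes scikit-learn; the toy data and the new point are invented for illustration.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

# Two linearly separable blobs ("cats" near the origin, "dogs" shifted away).
rng = np.random.default_rng(0)
cats = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(20, 2))
dogs = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(20, 2))
X = np.vstack([cats, dogs])
y = np.array([0] * 20 + [1] * 20)

# k-means: a new point goes to the cluster with the closest mean.
# (Cluster indices are arbitrary and need not match the class labels.)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Linear SVM: a new point is classified by which side of the separating line it falls on.
svm = LinearSVC(C=1.0).fit(X, y)

new_point = np.array([[1.5, 1.5]])
print("k-means cluster:", km.predict(new_point)[0])
print("SVM class:", svm.predict(new_point)[0])
```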
Now I want to give a little example. We have a binary data set of cats and dogs. As you can see, it is clearly linearly separable, and there are multiple lines that can separate these two groups from each other; I have drawn some of them here. Now suppose we get a new data point, the green cross, which lies close to all of these decision boundaries, so it looks a bit like a cat but also a bit like a dog, and we want to classify it. As you can see, different decision boundaries lead to different classifications: for this boundary we would say the green cross is a dog, and for this one it is a cat. Linear SVM now has the following idea: it does not look only at the line itself, but at the line extended by its margin, that is, widened up to the nearest point of the data set. The aim of linear SVM is then to find the line with the maximum margin. A minimal sketch of fitting such a maximum-margin line is shown below.
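For illustration only (this is not the group's own implementation; it assumes scikit-learn, approximates the hard-margin case with a large C, and the toy points are invented):

```python
# Minimal sketch (not the implementation shown in the talk): a near-hard-margin linear SVM.
import numpy as np
from sklearn.svm import SVC

X = np.array([[0.0, 0.0], [0.5, 1.0], [1.0, 0.5],   # class -1 ("cats")
              [3.0, 3.0], [3.5, 2.5], [2.5, 3.5]])  # class +1 ("dogs")
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # very large C ~ hard margin

w = clf.coef_[0]        # normal vector of the separating hyperplane
b = clf.intercept_[0]   # offset
margin_width = 2.0 / np.linalg.norm(w)  # the 2 / ||w|| that comes up later in the talk

print("w =", w, "b =", b)
print("margin width:", margin_width)
print("support vectors:", clf.support_vectors_)

# Classify the ambiguous "green cross":
print("prediction:", clf.predict([[1.8, 1.8]]))
```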
To give a short overview of what we are doing today: first, we want to present the theory of support vector machines; the main part of that is done by Sofia. Then we want to show you our implementation of SVM, where we implemented a linear separability check for data sets. After that, Oleg will show us SVM with slack variables and its implementation. And at last it is Batsheba's turn, and she will show us an extension of SVM to multi-class classification. So, I would like to start with the theory part.
First we need to define the canonical hyperplane. We have a pair (w, b), where w is an element of H, a vector space on which an inner product is defined, and b is a scalar. This pair is called the canonical form of the hyperplane with respect to some data points x_1, ..., x_m in H if it is scaled such that the minimum over i of |<w, x_i> + b| equals 1. That is what this illustration shows: we have the hyperplane, and the condition can also be represented by the two hyperplanes H1 and H2; the hyperplane is canonical if all the points lie on these two boundaries or on their outer sides, so not in between them. Here w is the normal vector of the hyperplane, and the minimum distance from a point on this hyperplane to the origin is |b| divided by the norm of w. A restatement of this in standard notation follows below.
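For reference, the same definition written out (only a restatement of what was just said; the symbols follow the description of the slide, not necessarily its exact notation):

```latex
% Canonical hyperplane: the pair (w, b), with w in an inner-product space \mathcal{H}
% and b a scalar, is in canonical form with respect to x_1, \dots, x_m \in \mathcal{H} if
\min_{i = 1, \dots, m} \bigl| \langle w, x_i \rangle + b \bigr| = 1 .
% The hyperplane, its two margin boundaries, and its distance to the origin:
H   = \{\, x \in \mathcal{H} : \langle w, x \rangle + b = 0 \,\}, \qquad
H_1 = \{\, x : \langle w, x \rangle + b = +1 \,\}, \qquad
H_2 = \{\, x : \langle w, x \rangle + b = -1 \,\}, \qquad
\operatorname{dist}(H, 0) = \frac{|b|}{\lVert w \rVert} .
```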
Now consider the two inequalities here. The first one, <w, x_i> + b >= 1, is satisfied by all the edge elements: they lie on H1 or on the right side of H1. And the round elements satisfy the second inequality, <w, x_i> + b <= -1: they lie on the hyperplane H2 or on the left side of it. Now we introduce a new variable y_i, which is +1 for the positive samples, like our edge elements here, and -1 for the round elements, the negative samples, which lie on H2 or on its left side. With the help of this new variable we can combine both inequalities into a single one, y_i(<w, x_i> + b) >= 1. This combined constraint is restated below.
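In formula form (nothing new, just the step written out):

```latex
% The two class-wise constraints and their combination via the labels y_i:
\langle w, x_i \rangle + b \ge +1 \quad \text{for } y_i = +1, \qquad
\langle w, x_i \rangle + b \le -1 \quad \text{for } y_i = -1,
% which together are equivalent to
y_i \bigl( \langle w, x_i \rangle + b \bigr) \ge 1 \quad \text{for all } i .
```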
Now we can also measure the margin of the hyperplane: its width is given by the distance between H1 and H2. For positive samples lying on H1 we have <w, x_i> = 1 - b (we just moved b to the right-hand side), and for negative samples lying on H2 we have <w, x_i> = -1 - b. If we multiply the second equation by minus 1, add it to the first, and divide by the norm of w, we get the width of the margin as 2 divided by the norm of w. This is very important, so keep it in mind, because Sofia will need it later. The little derivation is written out below.
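Written out (again just the step that was described):

```latex
% Points x_+ on H_1 and x_- on H_2 satisfy
\langle w, x_+ \rangle + b = +1, \qquad \langle w, x_- \rangle + b = -1 .
% Multiplying the second equation by -1 and adding it to the first gives
\langle w, x_+ - x_- \rangle = 2 ,
% and dividing by \lVert w \rVert yields the margin width as the projection of
% x_+ - x_- onto the unit normal w / \lVert w \rVert:
\Bigl\langle \tfrac{w}{\lVert w \rVert},\; x_+ - x_- \Bigr\rangle = \frac{2}{\lVert w \rVert} .
```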
Briefly, we also introduced in our paper a lemma which says that two sets are linearly separable if and only if the intersection of their convex hulls is empty. I don't want to show you the proof, because the illustration makes it quite clear: here we have the iris data set, reduced to two dimensions, with three different classes and the convex hull drawn around each class. As you can see, the blue class and the orange class are clearly linearly separable, because their hulls do not intersect, while the orange and the green class, for example, are not linearly separable, because their hulls do intersect. A small sketch of such a separability check follows below.
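As a hedged sketch of how such a check could look (this is not the group's implementation; the function name linearly_separable and the example points are made up, and it assumes NumPy and SciPy), the hull-intersection test can be posed as a linear-programming feasibility problem:

```python
# Minimal sketch (not the group's code) of the convex-hull criterion:
# two point sets are linearly separable iff their convex hulls do not intersect.
# The hulls intersect iff some convex combination of A equals some convex combination of B,
# which is an LP feasibility problem solvable with scipy.optimize.linprog.
import numpy as np
from scipy.optimize import linprog

def linearly_separable(A, B):
    """A, B: arrays of shape (n_a, d) and (n_b, d). True iff conv(A) and conv(B) are disjoint."""
    n_a, d = A.shape
    n_b = B.shape[0]
    # Unknowns: convex weights lam (for A) and mu (for B), all >= 0.
    # Constraints: sum(lam) = 1, sum(mu) = 1, A^T lam - B^T mu = 0.
    A_eq = np.zeros((2 + d, n_a + n_b))
    A_eq[0, :n_a] = 1.0
    A_eq[1, n_a:] = 1.0
    A_eq[2:, :n_a] = A.T
    A_eq[2:, n_a:] = -B.T
    b_eq = np.concatenate([[1.0, 1.0], np.zeros(d)])
    res = linprog(c=np.zeros(n_a + n_b), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n_a + n_b), method="highs")
    return not res.success  # no common point in the hulls -> linearly separable

# Example with two small 2-D point sets (invented for illustration):
A = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
B = np.array([[3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
print(linearly_separable(A, B))  # True: the hulls are disjoint
```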
Yeah, and I think now it's Sofia's turn.
Presenters: Batsheba Darko, Andrea Gilch, Oleg Kozachok & Sofia Qafa
Access: open access
Duration: 00:37:37 min
Recording date: 2020-07-22
Uploaded on: 2020-07-29 12:36:16
Language: en-US
A student talk about classification using linear support vector machines given by Batsheba Darko, Andrea Gilch, Oleg Kozachok & Sofia Qafa.