Welcome back to Pattern Recognition. Today we want to talk a bit about the Laplacian support vector machine, which I think is a pretty cool method because it allows you to do machine learning in domains where you have both labeled and unlabeled data, and to combine that with techniques from manifold learning and dimensionality reduction. It is a bit exotic, but it is still an exciting technique to present here in this video.
So today's topic is the Laplacian support vector machine, and the idea that we want to follow here is essentially the one you can see in this example. Let's say you have a data set, here this clock data set, where only two instances are labeled, the blue and the green point, and now you want to compute a decision boundary between the two classes. Obviously, if I take only the two labeled observations and do supervised learning with a linear decision boundary, I get the decision boundary shown in the center image. This may not be very useful. What we actually want is to exploit the internal structure of how the points are related to each other and their geometric closeness, such that we are able to figure out what is the inner part of the clock and what is the outer part of the clock from only the two labels. The technique that will allow us to compute decision boundaries like the one shown here on the right is the Laplacian support vector machine.

In order to introduce this, we have to introduce a couple of concepts. First of all, our training data is now composed of two sets.
The complete training data is composed of L, the labeled data, which contains a total of L observations, and the unlabeled data U, which gives us additional observations up to the index M. So our entire training data set has M observations, but only L of them are labeled and U are unlabeled.
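To make this notation concrete, a minimal sketch of the two sets could look as follows (the symbols x_i and y_i are my own naming and are not fixed in the video):

    \mathcal{D} \;=\; \underbrace{\{(\boldsymbol{x}_i, y_i)\}_{i=1}^{L}}_{\text{labeled}} \;\cup\; \underbrace{\{\boldsymbol{x}_j\}_{j=L+1}^{M}}_{\text{unlabeled}}, \qquad U = M - L .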
that we need to build on is the graph Laplacian that is associated with our entire training
data set. And this is given as the matrix capital L and is given as the subtraction
of D minus W, where W is the adjacency matrix. So this essentially is a matrix that tells
us which elements are connected with each other and it has some weight and tells us
how strongly the two are connected. And then we essentially have the diagonal matrix D,
which is the degree of each node. And on the diagonal, we essentially have the sum over
all of the weights that are incident to that particular node. And this is then called the
degree matrix. And I get the graph Laplacian by taking D and subtracting W. And you'll
see that this matrix will become very important when we are actually deriving the Laplacian
support vector machines. Also, we need some kernel matrix and here we have again the notion
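As a small illustration of this construction, here is a rough Python sketch of how such a graph Laplacian could be built; the Gaussian weighting and the parameter sigma are my own choices for the example, not something prescribed in the video:

    import numpy as np

    def graph_laplacian(X, sigma=1.0):
        # Pairwise squared Euclidean distances between all training points.
        sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        # Adjacency matrix W: Gaussian weights, no self-loops (assumed weighting scheme).
        W = np.exp(-sq_dists / (2.0 * sigma ** 2))
        np.fill_diagonal(W, 0.0)
        # Degree matrix D: diagonal entries are the summed weights incident to each node.
        D = np.diag(W.sum(axis=1))
        # Graph Laplacian L = D - W.
        return D - W

    X = np.random.rand(10, 2)   # ten 2-D points, just for illustration
    L = graph_laplacian(X)
    print(L.sum(axis=1))        # every row of L = D - W sums to (numerically) zero

With this matrix, the quadratic form f^T L f is large whenever f changes strongly between points that are connected by large weights, which is exactly the notion of smoothness exploited by the intrinsic norm below.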
Also, we need a kernel matrix, and here we again have the notion of the kernel: each entry is essentially the kernel function evaluated on some x_i and x_j. All of this is then used to find a decision function f. Here we write f up as a vector, which is essentially the evaluation of f on all of the training observations. So f is also computed for the unlabeled data, which means it is a vector of length M.
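Written out, with k(·,·) denoting the kernel function, this amounts to (a minimal sketch of the notation as I understand it from the video):

    K \in \mathbb{R}^{M \times M}, \quad K_{ij} = k(\boldsymbol{x}_i, \boldsymbol{x}_j), \qquad \boldsymbol{f} = \big[ f(\boldsymbol{x}_1), f(\boldsymbol{x}_2), \ldots, f(\boldsymbol{x}_M) \big]^{\mathrm{T}} .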
Now, if we want to learn this function, we can set up a regularization framework: the function that we want to obtain is given as the minimizer of a sum of terms. The first term is a loss function, which essentially tells us how good the fit of the function is with respect to our labeled training observations. Here you can take, for example, the squared loss, which gives regularized least squares, or the hinge loss, which would then essentially mimic an SVM. Furthermore, we have regularization terms. One of them is the ambient norm, which is essentially the norm of the function in the reproducing kernel Hilbert space in which the function is defined; this norm is supposed to enforce smoothness of the possible solutions. Then we also have the intrinsic norm. The intrinsic norm operates on the low-dimensional manifold, that is, the projected space onto which we project our data using the graph Laplacian, and it enforces smoothness along the sampled manifold.
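Putting these three parts together, the objective has the general shape shown below. This is a hedged sketch following the usual manifold regularization formulation; the loss symbol V, the weights gamma_A and gamma_I, and the exact normalization factors are my additions and may differ from the constants used later in the lecture:

    f^{*} \;=\; \arg\min_{f \in \mathcal{H}_K} \; \frac{1}{L} \sum_{i=1}^{L} V\!\left(\boldsymbol{x}_i, y_i, f\right) \;+\; \gamma_A \,\|f\|_K^2 \;+\; \gamma_I \,\|f\|_I^2 , \qquad \|f\|_I^2 \;=\; \frac{1}{M^2}\, \boldsymbol{f}^{\mathrm{T}} \boldsymbol{L}\, \boldsymbol{f} ,

where V is the loss (for example the squared loss or the hinge loss), \|f\|_K is the ambient norm in the reproducing kernel Hilbert space, and the bold \boldsymbol{L} denotes the graph Laplacian matrix from above (bold to distinguish it from the number of labeled samples L).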