Welcome to the Tuesday afternoon session. We only have 45 minutes for topics that we
usually cover in 90 minutes. So I have to speak a little faster today. No overview.
We continue the discussion of support vector machines and what we can change to improve
their performance. In this context, kernels are very important. Whatever a kernel is, I'm
going to characterize it today. It's a very, very important concept that is heavily used in
many pattern recognition applications. And the basic idea is not very difficult, so don't
worry. What we are going to discuss today is very straightforward.
So first of all, I'm going to motivate kernels. What is a kernel? What is a kernel function?
Then I will introduce feature transforms to you. Feature transforms are nothing new to us. We
applied feature transformations in the context of LDA, for instance, where we had features
with a weird covariance matrix and computed spherical data out of them by using the SVD
decomposition of the covariance matrix. So we have considered feature transforms in terms of
enforcing a certain statistical behavior of the features. We have also seen different feature
transforms in the context of dimension reduction, like PCA.
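To make that whitening step concrete, here is a minimal numpy sketch of the idea; it is my
own illustration rather than anything shown in the lecture, and the data, names, and numbers
are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated 2-D features: their covariance is far from the identity ("weird").
A = np.array([[3.0, 1.5],
              [1.5, 2.0]])
X = rng.standard_normal((500, 2)) @ np.linalg.cholesky(A).T

Xc = X - X.mean(axis=0)                      # center the data
Sigma = np.cov(Xc, rowvar=False)             # sample covariance matrix
U, s, _ = np.linalg.svd(Sigma)               # SVD of the symmetric covariance
W = U @ np.diag(1.0 / np.sqrt(s)) @ U.T      # whitening transform, Sigma^(-1/2)
Z = Xc @ W.T                                 # transformed, "spherical" features

print(np.round(np.cov(Z, rowvar=False), 2))  # approximately the identity matrix
```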
So feature transformations are quite familiar to us. The only different thing that we will
consider today is that we gain some advantages if we do not do a projection of the features,
a reduction of the feature dimension, but instead go up into higher dimensional feature
spaces. It's completely different from what you, or at least a few of you, have seen in the
winter semester, where we always discussed problems with the curse of dimensionality and the
problem that we have to break things down to lower dimensional features. Today we will learn
that sometimes it makes sense to bring things up into higher dimensional spaces.
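And here is a small sketch of why going up can help, again my own illustration with invented
data rather than the lecturer's example: two classes that form concentric rings are not
linearly separable in two dimensions, but after appending the single extra feature
x1^2 + x2^2, a plain threshold, which is a hyperplane in the lifted three-dimensional space,
separates them:

```python
import numpy as np

rng = np.random.default_rng(1)

def ring(radius, n=200, noise=0.1):
    """Sample n noisy points on a circle of the given radius."""
    angles = rng.uniform(0.0, 2.0 * np.pi, n)
    r = radius + noise * rng.standard_normal(n)
    return np.column_stack((r * np.cos(angles), r * np.sin(angles)))

inner, outer = ring(1.0), ring(3.0)            # class 0: inner ring, class 1: outer ring
X = np.vstack((inner, outer))
y = np.concatenate((np.zeros(200), np.ones(200)))

# Lift into 3-D: phi(x) = (x1, x2, x1^2 + x2^2).
Z = np.column_stack((X, (X ** 2).sum(axis=1)))

# In the lifted space, the hyperplane z3 = 4 (squared radius = 4) separates the rings.
pred = (Z[:, 2] > 4.0).astype(float)
print("accuracy of a single linear threshold in the lifted space:", (pred == y).mean())
```

Computing such feature maps explicitly is, roughly speaking, what kernel functions will
later let us avoid.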
Then I will introduce the concept of kernel functions and how these kernel functions can
be used in the context of support vector machines, in the context of perceptrons, and after...
You are still smiling. Now you destroyed my concept. I don't know what I talked about
now. Good. What's your name by the way? Christian. Christian is late today. Your parents pay
500 bucks and you are late. I mean, can you imagine? Well, and then we will talk a little
bit about kernel functions and different types of kernel functions. And there is one bubble
missing here, one item. This is on kernel PCA. That's something I will explain to you
next week. Okay? So we will talk about kernels, and there are whole books on kernels. There
are whole lectures on kernels. There is a whole calculus on kernels. So I could teach two
semesters on kernels alone. Go to the Max Planck Institute in Tübingen; you can attend
hundreds of courses on kernels. The leading researchers in this field are there. For us,
it's good to talk about this in 40 minutes, and we basically cover everything that is
important. Well, no, not everything; the basic idea. Okay? The basic idea. So let's talk
about the motivation
for kernels. So far, and you might have considered this extremely boring, and I agree it's
extremely boring and far away from many practical applications, we have considered linear
decision boundaries. We have considered hyperplanes to separate just two classes. This guy
is talking about pattern analysis and restricts all the discussion to two classes with
linear decision boundaries. Give me a break. That can't be true. Yeah, but it's true. So
far, it's true, but things will change from now on. Our life will completely change now,
because linear decision boundaries in their current form, as we have discussed them so far,
have clear limitations. I mean, basically, this concept is far too simple to solve any
practical problem. Well, we have seen a few where we can use it, but beyond the adidas
intelligence shoe, there is not that much you can do with linear decision boundaries,
basically. And even in that case, we applied LDA to bring things down from a quadratic
decision boundary to a linear one. Many problems that we will consider in practice will
have non-linearly separable data and classes. So we are at the limit. We can't solve it.
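To make that limitation concrete with a tiny example of my own, not the lecturer's: the
classic XOR configuration cannot be separated by any line, and a brute-force search over a
grid of hyperplane parameters never classifies all four points correctly:

```python
import itertools
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, 1, 1, -1])                 # XOR labels: opposite corners share a class

grid = np.linspace(-2.0, 2.0, 41)            # coarse grid of hyperplane parameters
best = 0.0
for w1, w2, b in itertools.product(grid, grid, grid):
    pred = np.where(X @ np.array([w1, w2]) + b > 0, 1, -1)
    best = max(best, float((pred == y).mean()))

print("best accuracy of any linear boundary on this grid:", best)   # 0.75, never 1.0
```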
Noisy data, I mean, that's in the nature of measurements. As soon as you start to use
sensors and you do measurements, you have noisy data. And even in the case of linearly
separable classes with noisy data, you run into problems with this assumption. We have