Welcome back to pattern recognition.
So today we want to look into a couple of the advanced kernel tricks and in particular
we want to introduce the kernel PCA and we want to look into some kernels that can also
work on sequences.
So looking forward to showing you some advanced methods of pattern recognition.
So, let's revisit the PCA.
So we had some observations x1 to xm in a d-dimensional feature space, and they had zero mean.
If they don't have zero mean, then we can of course enforce this by centering, i.e. by subtracting the mean.
If we do so, we can then compute the scatter matrix or the covariance matrix, which is essentially given as the sum of the outer products of the respective feature vectors, and this is then a d times d matrix.
So its size is the square of the dimension of the input feature space.
So we can compute the eigenvectors and eigenvalues of this matrix, and they are determined from this eigenvector problem here.
Then we sort the eigenvectors by decreasing eigenvalue, and this then allows us to use them to project the features onto the eigenvectors.
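To make this recap concrete, here is one way to write it down as formulas; this is our own summary sketch, and the 1/m normalization of the scatter matrix is an assumption, implied later when the factor m is moved to the other side of the equation:

```latex
S \;=\; \frac{1}{m}\sum_{j=1}^{m} \mathbf{x}_j \mathbf{x}_j^{\top} \;\in\; \mathbb{R}^{d \times d},
\qquad
S\,\mathbf{e}_i \;=\; \lambda_i\,\mathbf{e}_i,
\qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d .
```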
So let's look into some facts from linear algebra.
The eigenvectors span the same space as the feature vectors, and the eigenvectors can therefore be written as a linear combination of the feature vectors.
So our eigenvector e_i can be expressed as a linear combination with some coefficients alpha and the respective observations x.
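Written out in our own indexing convention, this linear combination reads:

```latex
\mathbf{e}_i \;=\; \sum_{j=1}^{m} \alpha_{ij}\,\mathbf{x}_j .
```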
So this then means that we can now rewrite the eigenvector-eigenvalue problem of the PCA in the following way.
We replace the eigenvectors with the previous definition: on the left-hand side we substitute the scatter matrix with the sum of outer products and the eigenvector with the sum over the alphas and the x's, and we do the same on the right-hand side.
If we do so, we can rearrange this a little bit: we can merge the sums on the left-hand side and we can bring the factor m to the other side of the equation.
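Under the notation sketched above, this substitution and rearrangement can be written as:

```latex
\frac{1}{m}\sum_{j=1}^{m} \mathbf{x}_j \mathbf{x}_j^{\top} \sum_{k=1}^{m} \alpha_{ik}\,\mathbf{x}_k
\;=\; \lambda_i \sum_{j=1}^{m} \alpha_{ij}\,\mathbf{x}_j
\quad\Longrightarrow\quad
\sum_{j=1}^{m}\sum_{k=1}^{m} \alpha_{ik}\,\mathbf{x}_j \left(\mathbf{x}_j^{\top}\mathbf{x}_k\right)
\;=\; m\,\lambda_i \sum_{j=1}^{m} \alpha_{ij}\,\mathbf{x}_j .
```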
Now let's look at this in a little more detail. In particular, if we now take some feature vector x_l and multiply with its transpose from the left-hand side, we get the following equation.
If you look at this, then everything involving the feature vectors turns out to be inner products.
So we have inner products, and the kernel trick can be applied, provided that the transformed features have zero mean.
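As a sketch of this step, multiplying with the transposed feature vector x_l from the left leaves only inner products between feature vectors:

```latex
\sum_{j=1}^{m}\sum_{k=1}^{m} \alpha_{ik}
\left(\mathbf{x}_l^{\top}\mathbf{x}_j\right)\left(\mathbf{x}_j^{\top}\mathbf{x}_k\right)
\;=\; m\,\lambda_i \sum_{j=1}^{m} \alpha_{ij} \left(\mathbf{x}_l^{\top}\mathbf{x}_j\right).
```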
Now for any kernel we then get the key equation for the kernel PCA: a double sum over the alphas times products of two kernel evaluations equals m times the eigenvalue times the sum of the alphas times a single kernel evaluation.
So this is the key equation.
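Replacing every inner product with a kernel evaluation k(·,·), the key equation can be written as follows (our own notation; it has to hold for every l = 1, …, m):

```latex
\sum_{j=1}^{m}\sum_{k=1}^{m} \alpha_{ik}\, k(\mathbf{x}_l, \mathbf{x}_j)\, k(\mathbf{x}_j, \mathbf{x}_k)
\;=\; m\,\lambda_i \sum_{j=1}^{m} \alpha_{ij}\, k(\mathbf{x}_l, \mathbf{x}_j).
```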
We can now see that in this equation the kernel matrix essentially pops up, so let's rearrange this a bit into matrix notation. Then we get rid of the sums and we introduce again our symmetric positive semi-definite kernel matrix K, and K now lives only in an m times m space.
So its size m times m is determined by the number of feature vectors.
So we essentially are here limited by the number of observations and no longer by the
dimensionality of the original feature vectors.
Now we see here that we can pull out K on both sides. This is equivalent to bringing everything to one side, where you can also factor out K, and there you see then that K alpha equals m lambda alpha is sufficient, so we only have to solve the eigenvalue problem of the kernel matrix itself.
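In matrix notation, with K_{lj} = k(x_l, x_j), this step can be sketched as:

```latex
K^{2}\boldsymbol{\alpha}_i \;=\; m\,\lambda_i\, K \boldsymbol{\alpha}_i
\quad\Longleftrightarrow\quad
K\left(K\boldsymbol{\alpha}_i - m\,\lambda_i\,\boldsymbol{\alpha}_i\right) \;=\; \mathbf{0},
\qquad\text{so it suffices to solve}\quad
K\boldsymbol{\alpha}_i \;=\; m\,\lambda_i\,\boldsymbol{\alpha}_i .
```

The whole procedure can also be sketched in a few lines of NumPy. This is only a minimal illustration under our own naming choices, not the lecture's reference implementation; the explicit centering of the kernel matrix and the Gaussian (RBF) kernel in the usage example are assumptions on our part:

```python
import numpy as np

def kernel_pca(X, kernel, n_components):
    """Minimal kernel PCA sketch.

    X            : (m, d) array of observations (rows are feature vectors)
    kernel       : callable k(x, y) returning a scalar
    n_components : number of principal directions to keep
    Returns the (m, n_components) projections of the training samples.
    """
    m = X.shape[0]

    # Kernel (Gram) matrix with K[l, j] = k(x_l, x_j)
    K = np.array([[kernel(X[l], X[j]) for j in range(m)] for l in range(m)])

    # Center the kernel matrix; this corresponds to zero-mean features
    # in the (implicit) transformed space.
    one_m = np.full((m, m), 1.0 / m)
    K_c = K - one_m @ K - K @ one_m + one_m @ K @ one_m

    # Solve K_c alpha = (m * lambda) alpha and keep the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(K_c)          # ascending order
    idx = np.argsort(eigvals)[::-1][:n_components]  # largest first
    eigvals, alphas = eigvals[idx], eigvecs[:, idx]

    # Rescale the alphas so that the implicit eigenvectors e_i = sum_j alpha_ij x_j
    # have unit length: for a unit-norm alpha_i, alpha_i^T K_c alpha_i = m * lambda_i.
    alphas = alphas / np.sqrt(np.maximum(eigvals, 1e-12))

    # Projection of training sample x_l onto e_i: sum_j alpha_ij * K_c[l, j]
    return K_c @ alphas

# Usage example with a Gaussian (RBF) kernel -- the kernel choice is illustrative.
rbf = lambda x, y, s=1.0: np.exp(-np.sum((x - y) ** 2) / (2 * s ** 2))
X = np.random.randn(100, 3)
Y = kernel_pca(X, rbf, n_components=2)
print(Y.shape)  # (100, 2)
```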