Welcome back to Pattern Recognition. So today we want to look into the original formulation of linear discriminant analysis, which was also published under the name Fisher Transform.
So here we see the original idea that was used to derive the LDA mapping: we want to project onto the direction that essentially maximizes the class information while minimizing the spread within the classes. This direction is introduced as r, and for the time being we stay with a two-class problem, so we only have the classes one and two. Now how can we do this? Well, we project everything onto this ideal direction r, which can of course be done by taking the feature vectors and computing the inner product with r. This gives us a scalar value x tilde i for every sample, and of course these values depend on the choice of r. The direction is therefore determined as r star, the maximizer of the ratio of the between-class scatter and the within-class scatter. In this two-class case the between-class scatter is essentially given by the difference of the means, essentially the two-norm of this difference, and the within-class scatter is given by s, which is derived from the covariance of the individual classes, so the scatter within each class. We will see that in a bit. In the end we want to apply a threshold on our x tilde, and this will tell us to which class to assign the decision.
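If you like to see this in code, here is a minimal NumPy sketch of just the projection and thresholding step; the array X, the direction r, and the threshold theta are placeholder values for this illustration, not quantities taken from the lecture.

```python
import numpy as np

# Hypothetical example: a few feature vectors stacked as rows of X, a candidate
# projection direction r, and a decision threshold theta (all placeholder values).
X = np.array([[1.0, 2.0], [2.5, 0.5], [4.0, 3.0]])
r = np.array([0.8, 0.6])
theta = 2.0

# Project every feature vector onto r: x_tilde_i = r^T x_i, one scalar per sample.
x_tilde = X @ r

# Decide the class by thresholding the projected value.
decisions = np.where(x_tilde > theta, 1, 2)
print(x_tilde, decisions)
```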
So what are the steps in order to compute this? Well, the first thing you do is compute the mean and the scatter matrix for each class. So you have to know the class membership, and then for the respective class you can compute the mean value and the scatter matrix, which is essentially the covariance matrix within this class. If you have that, you can also compute the so-called within-class scatter matrix S W, which is essentially the sum of the two class scatter matrices, so S W equals S1 plus S2. What we also need is the between-class scatter matrix, and this is given simply as mu1 minus mu2, followed by the outer product of this vector with itself; that gives you the between-class scatter matrix S B. Note that this is the simplified version for two classes only; in general it is slightly more complex. If you want to go towards more classes, you essentially end up with the inter- and intra-class covariance matrices.
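To make these steps concrete, here is a small sketch of how one might compute these quantities with NumPy; the toy samples X1 and X2 are hypothetical stand-ins for the two classes.

```python
import numpy as np

# Hypothetical toy samples for class one and class two, stored as rows of X1 and X2.
rng = np.random.default_rng(42)
X1 = rng.normal(loc=[0.0, 0.0], size=(50, 2))
X2 = rng.normal(loc=[3.0, 1.0], size=(60, 2))

def class_statistics(X):
    """Mean vector and scatter matrix (unnormalized covariance) of one class."""
    mu = X.mean(axis=0)
    centered = X - mu
    return mu, centered.T @ centered

mu1, S1 = class_statistics(X1)
mu2, S2 = class_statistics(X2)

S_W = S1 + S2                    # within-class scatter matrix: S_W = S1 + S2
diff = mu1 - mu2
S_B = np.outer(diff, diff)       # between-class scatter matrix for the two-class case
```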
Now how can we then find r star? r star is essentially used to compute our x tilde, and with x tilde we can then also find our mu tilde and our s k tilde, so let's start looking into mu tilde. You see here that this is essentially using r as an inner product, and since r is constant over the entire term, we can compute the mean over the respective class first and pull out the r transpose: the projected mean mu tilde k is simply r transpose times mu k. Let's look at the scatter coefficient: here we actually need to compute the variance along the direction r, and this can be transformed by applying our projection onto r to the individual vectors as well as to the means. This then maps back to nothing else than the respective within-class scatter matrix, multiplied with r transpose from the left-hand side and r from the right-hand side.
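If you want to convince yourself of this identity numerically, a quick sketch like the following should do; the single-class toy data and the direction r are again hypothetical placeholders.

```python
import numpy as np

# Hypothetical toy data for a single class k and an arbitrary direction r.
rng = np.random.default_rng(0)
X_k = rng.normal(size=(100, 2))
mu_k = X_k.mean(axis=0)
S_k = (X_k - mu_k).T @ (X_k - mu_k)     # scatter matrix of class k
r = np.array([1.0, 0.5])

# The mean of the projected samples r^T x_i equals r^T mu_k ...
x_tilde = X_k @ r
print(np.isclose(x_tilde.mean(), r @ mu_k))

# ... and the projected scatter (sum of squared deviations) equals r^T S_k r.
print(np.isclose(np.sum((x_tilde - x_tilde.mean()) ** 2), r @ S_k @ r))
```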
So now that we have found this, we can plug it into the actual definition of our fraction: we take the squared difference of the two projected means and divide by the sum of the respective squared scatter coefficients. Expressed with the scatter matrices, this is r transpose S B r over r transpose S W r, and note that this ratio is also known as the generalized Rayleigh coefficient.
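Putting this together, the criterion we maximize can be written as a small function; this is just a sketch that reuses the S B and S W from the snippet above.

```python
def rayleigh_quotient(r, S_B, S_W):
    """Generalized Rayleigh coefficient J(r) = (r^T S_B r) / (r^T S_W r)."""
    return (r @ S_B @ r) / (r @ S_W @ r)

# Continuing with S_B and S_W from the sketch above; the value only depends on the
# direction of r, not on its length.
r = np.array([1.0, 0.5])
print(rayleigh_quotient(r, S_B, S_W), rayleigh_quotient(3.0 * r, S_B, S_W))
```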
So this ratio can be used in order to find r star. How do we do that? Well, we maximize the generalized Rayleigh coefficient, and this is equivalent to solving the following generalized eigenvalue problem: S B times r star equals lambda times S W times r star. Now we can reduce this to a problem with only one matrix: we take the inverse of S W and multiply it from the left-hand side, and we end up with the matrix S W inverse times S B, so the inverse of the within-class scatter matrix times the between-class scatter matrix.
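In code, one way to follow this derivation literally is to form S W inverse times S B and keep the eigenvector of the largest eigenvalue; again only a sketch, continuing with the S B and S W from above and using a linear solve instead of an explicit matrix inverse.

```python
# Form M = S_W^{-1} S_B (via a linear solve for numerical stability) and keep the
# eigenvector that belongs to the largest eigenvalue.
M = np.linalg.solve(S_W, S_B)
eigvals, eigvecs = np.linalg.eig(M)
r_star = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
```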
In this two-class problem, the product of S B and r star always points in the direction of the difference of the two means, as we have already seen in previous videos. So this is the decisive direction, and there is actually no need to compute the eigenvalues and eigenvectors of the matrix in this case, because we can simply compute the direction as the difference between the two means multiplied with the inverse of the within-class scatter matrix; that will give us r star. So usually the total linear mapping for LDA is computed dimension by dimension through maximization of the Rayleigh coefficient.
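As a closing sketch, this is how the whole two-class pipeline might look with the closed-form direction; the midpoint between the projected class means is just one simple choice of threshold for this illustration, not something prescribed by the derivation, and the toy Gaussian data is again hypothetical.

```python
import numpy as np

def fisher_lda_two_class(X1, X2):
    """Two-class Fisher discriminant: closed-form direction r_star plus a simple threshold."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    S1 = (X1 - mu1).T @ (X1 - mu1)
    S2 = (X2 - mu2).T @ (X2 - mu2)
    S_W = S1 + S2
    # Closed form for two classes: r_star is proportional to S_W^{-1} (mu1 - mu2).
    r_star = np.linalg.solve(S_W, mu1 - mu2)
    # One simple threshold choice: the midpoint of the two projected class means.
    theta = 0.5 * (r_star @ (mu1 + mu2))
    return r_star, theta

def classify(X, r_star, theta):
    """Class 1 if the projection onto r_star exceeds the threshold, otherwise class 2."""
    return np.where(X @ r_star > theta, 1, 2)

# Hypothetical usage on toy Gaussian data.
rng = np.random.default_rng(1)
X1 = rng.normal(loc=[0.0, 0.0], size=(100, 2))
X2 = rng.normal(loc=[3.0, 1.0], size=(120, 2))
r_star, theta = fisher_lda_two_class(X1, X2)
print(classify(np.vstack([X1, X2]), r_star, theta))
```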