Welcome back to Pattern Recognition. Today we want to continue talking about the logistic function, and the plan is to look into an example of how to use the logistic function together with a probability density function.
So this is our decision boundary that can be modeled with the logistic function. Let's say our decision boundary is δ(x) = 0, so this is the zero level set, and we can see that this zero level set can be related to the logistic function. Points on the decision boundary satisfy exactly that the two posterior probabilities, meaning the probability of class zero and the probability of class one given x, are equal. This is the point of equilibrium, essentially the place where we cannot decide whether it is the one class or the other; both of them are equally likely. We can then rearrange the ratio of the two posteriors in the following way: we apply the logarithm to the ratio, which on the boundary is the logarithm of one, and the logarithm of one is equal to zero. So we can state that the decision boundary is given by F(x) = 0.
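To summarize this in a formula (writing F(x) for the logarithm of the ratio of the two posteriors, as in the spoken description), we can state:

```latex
% Decision boundary as the zero level set of the log posterior ratio
F(\boldsymbol{x}) = \ln \frac{P(y=0 \mid \boldsymbol{x})}{P(y=1 \mid \boldsymbol{x})},
\qquad
F(\boldsymbol{x}) = 0
\;\Longleftrightarrow\;
P(y=0 \mid \boldsymbol{x}) = P(y=1 \mid \boldsymbol{x}).
```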
Let's look at the proof. Here you can see that we applied the logarithm to the ratio of our posterior probabilities, and this is set equal to F(x). We can then of course take e to the power of both sides; this cancels the logarithm on the left-hand side, and we get e to the power of F(x) on the right-hand side. Furthermore, we can now rearrange for P(y = 0 | x), which is the probability of our class zero, and we see that it can be expressed by the respective term on the right-hand side. The probability of y = 1 can, in our two-class case, also be written as one minus the probability of y = 0 given x. With this slight modification, P(y = 0 | x) now appears twice in the equation, and we can rearrange once more. This gives us the following solution, and you see we are already pretty close to our logistic function. We rearrange simply by dividing everything by e to the power of F(x), and then the only thing that remains is one over one plus e to the minus F(x), which is nothing else than our logistic function.
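Written out step by step (a compact version of the algebra just described, in my own notation), the derivation reads:

```latex
\ln \frac{P(y=0 \mid x)}{P(y=1 \mid x)} = F(x)
\;\Rightarrow\;
\frac{P(y=0 \mid x)}{P(y=1 \mid x)} = e^{F(x)}
\;\Rightarrow\;
P(y=0 \mid x) = e^{F(x)} \bigl(1 - P(y=0 \mid x)\bigr)

\Rightarrow\;
P(y=0 \mid x) \bigl(1 + e^{F(x)}\bigr) = e^{F(x)}
\;\Rightarrow\;
P(y=0 \mid x) = \frac{e^{F(x)}}{1 + e^{F(x)}} = \frac{1}{1 + e^{-F(x)}}.
```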
Let's have a look at an example using a probability density function. Here you see the class-conditional density p(x | y = 0), and of course we can also find the density for the opposite class. You see that we have two Gaussians; they have the same standard deviation, only their means are apart. From these we can also find the posterior probabilities: the probability P(y = 0 | x) is indicated by one dashed line, and the posterior for the opposite class by the other dashed line.
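As a small numerical illustration (this is my own minimal sketch, not code from the lecture; the values μ0 = -1, μ1 = 1, σ = 1 and equal priors are assumptions chosen for illustration), one can check that the posterior computed via Bayes' rule coincides with the logistic function of the log ratio:

```python
# Two 1-D Gaussian class-conditional densities with equal standard deviation,
# and the posterior P(y=0 | x) computed in two equivalent ways.
import numpy as np
from scipy.stats import norm

mu0, mu1, sigma = -1.0, 1.0, 1.0   # illustrative assumptions
prior0 = prior1 = 0.5

x = np.linspace(-5.0, 5.0, 201)
p_x_given_0 = norm.pdf(x, loc=mu0, scale=sigma)   # p(x | y = 0)
p_x_given_1 = norm.pdf(x, loc=mu1, scale=sigma)   # p(x | y = 1)

# Posterior via Bayes' rule
evidence = prior0 * p_x_given_0 + prior1 * p_x_given_1
posterior0_bayes = prior0 * p_x_given_0 / evidence

# Same posterior via the logistic function of F(x)
F = np.log(prior0 * p_x_given_0) - np.log(prior1 * p_x_given_1)
posterior0_logistic = 1.0 / (1.0 + np.exp(-F))

print(np.allclose(posterior0_bayes, posterior0_logistic))  # True
```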
Now let's go ahead and look into an example where our class-conditional density is a multivariate Gaussian. This is the standard formulation in which we introduce a covariance matrix Σ and a mean vector μ. What we want to show in the following is that this formulation can also be rewritten into a posterior probability: we want to arrive at the logistic function, and its argument should be expressible as the term on the right-hand side. There you see that this argument is essentially a quadratic function in x, and this quadratic function is able to describe exactly the posterior probability if we have two different Gaussian class-conditional densities.
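In formulas (the symbols A, a, and a0 below are placeholders I introduce for the quadratic, linear, and constant parts, since the exact slide notation is not given in the transcript), the multivariate Gaussian density and the claimed form of the posterior are:

```latex
p(\boldsymbol{x} \mid y) =
\frac{1}{(2\pi)^{d/2} |\boldsymbol{\Sigma}|^{1/2}}
\exp\!\Bigl(-\tfrac{1}{2} (\boldsymbol{x}-\boldsymbol{\mu})^{\mathsf T}
\boldsymbol{\Sigma}^{-1} (\boldsymbol{x}-\boldsymbol{\mu})\Bigr),
\qquad
P(y=0 \mid \boldsymbol{x}) = \frac{1}{1 + e^{-F(\boldsymbol{x})}},

\text{with}\quad
F(\boldsymbol{x}) =
\boldsymbol{x}^{\mathsf T} \boldsymbol{A} \boldsymbol{x}
+ \boldsymbol{a}^{\mathsf T} \boldsymbol{x}
+ a_0
\quad \text{quadratic in } \boldsymbol{x}.
```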
Now let's try to find the solution. What we are interested in is the decision boundary F(x), and again we use the trick of rewriting it in terms of the generative probabilities: we write it as a ratio in which the priors multiply the probabilities of x given the respective classes. If we then plug in the definition of the Gaussian, as done on the slide, we essentially have the ratio of the priors and the ratio of the two Gaussians, with a mean μ0 and a covariance Σ0 for the first class and μ1 and Σ1 for the second class. What we can already figure out by looking at this term is that quite a few things do not depend on x, so we can pull them out of the above equation: these are essentially the priors and the normalization factors in front of the Gaussian densities. You can see that the covariance matrices and the priors give us a constant component, a kind of offset. So we observe that the priors imply a constant offset to the decision boundary, and if the priors and covariance matrices are identical for both classes, this offset vanishes.
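For completeness, plugging the two Gaussians into the ratio gives the standard expansion below; the first two terms are exactly the constant offset discussed above, and they vanish if the priors and the covariance matrices of the two classes are identical:

```latex
F(\boldsymbol{x})
= \ln \frac{p(y=0)\, p(\boldsymbol{x} \mid y=0)}{p(y=1)\, p(\boldsymbol{x} \mid y=1)}
= \underbrace{\ln \frac{p(y=0)}{p(y=1)}
  + \frac{1}{2} \ln \frac{|\boldsymbol{\Sigma}_1|}{|\boldsymbol{\Sigma}_0|}}_{\text{independent of } \boldsymbol{x}}
  - \frac{1}{2} (\boldsymbol{x}-\boldsymbol{\mu}_0)^{\mathsf T} \boldsymbol{\Sigma}_0^{-1} (\boldsymbol{x}-\boldsymbol{\mu}_0)
  + \frac{1}{2} (\boldsymbol{x}-\boldsymbol{\mu}_1)^{\mathsf T} \boldsymbol{\Sigma}_1^{-1} (\boldsymbol{x}-\boldsymbol{\mu}_1).
```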
In this video, we combine the logistic function with the Gaussian probability density function.