Welcome back to Pattern Recognition. Today we want to look a little more into the modeling of decision boundaries. In particular, we are interested in what happens with other distributions, and also in what happens if we have equal dispersion, i.e. equal standard deviations, in the different distributions.
So now we want to look into a special case, and the special case with the Gaussian is that we have a covariance matrix that is identical for both classes. If we do so, we can see that the formulation that we found in the previous video collapses; in particular, the matrix A simplifies. Matrix A was the quadratic part, and it is now simply a zero matrix. This means that the entire quadratic part cancels out: A is built from the difference of the two inverse covariance matrices, and since they are identical, this difference is zero. The nice thing is that we can already see from the resulting formulation that we now have a line separating the two distributions. The line is essentially given by the difference between the two means, weighted by the inverse covariance matrix. Of course, there is also an offset; the offset depends mainly on the prior probabilities of the two classes and on the difference of the two means multiplied by the inverse covariance matrix. If we look at this example, you can see two Gaussians that have the same prior and share the same covariance matrix. In this case, the decision boundary collapses to a line, as shown here. If we were to play with the priors, the line would move back and forth, because this alters the offset between the two classes.
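To make this concrete, here is a minimal sketch of the resulting linear boundary, with made-up means, a made-up shared covariance, and equal priors (none of these numbers come from the lecture):

```python
import numpy as np

# Hypothetical class parameters: shared covariance, two means, class priors.
mu0 = np.array([0.0, 0.0])
mu1 = np.array([2.0, 1.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.0]])
p0, p1 = 0.5, 0.5

Sigma_inv = np.linalg.inv(Sigma)

def g(x, mu, prior):
    """Linear discriminant for one class (the quadratic part has cancelled)."""
    return mu @ Sigma_inv @ x - 0.5 * mu @ Sigma_inv @ mu + np.log(prior)

# Decision line w^T x + b = 0:
#   w = Sigma^{-1}(mu1 - mu0)                     (difference of means, weighted)
#   b = -(1/2)(mu1 + mu0)^T Sigma^{-1}(mu1 - mu0) + log(p1/p0)   (offset)
w = Sigma_inv @ (mu1 - mu0)
b = -0.5 * (mu1 + mu0) @ Sigma_inv @ (mu1 - mu0) + np.log(p1 / p0)

# For equal priors, the midpoint between the means lies on the boundary.
xm = 0.5 * (mu0 + mu1)
```

Increasing p1 relative to p0 increases the offset b, so the line shifts toward class 0; this is the back-and-forth movement of the boundary described above.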
So if the conditionals are Gaussians and share the same covariance, the exponent is affine in x, and this results in a line as the decision boundary. The nice thing about this observation is that the result also holds for a more general class of probability density functions, so it is not limited to Gaussians; the key requirement is what we express as equal dispersion.
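To spell out why the exponent is affine in x, one can write the log-ratio of the two weighted class densities under the shared-covariance assumption (a sketch with means \(\mu_0, \mu_1\), shared covariance \(\Sigma\), and priors \(p_0, p_1\)):

```latex
\log\frac{p_1\,\mathcal{N}(x\mid\mu_1,\Sigma)}{p_0\,\mathcal{N}(x\mid\mu_0,\Sigma)}
= (\mu_1-\mu_0)^{\top}\Sigma^{-1}\,x
  \;-\;\frac{1}{2}(\mu_1+\mu_0)^{\top}\Sigma^{-1}(\mu_1-\mu_0)
  \;+\;\log\frac{p_1}{p_0}
```

The quadratic terms \(-\frac{1}{2}x^{\top}\Sigma^{-1}x\) cancel in the ratio; setting the remaining affine expression to zero yields the separating line, and the priors enter only through the constant offset.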
Generally, there is the exponential family of probability density functions, which can be written in the following canonical form: e raised to the inner product of theta, the location (natural) parameter vector, with x, minus a function b of theta, everything divided by the dispersion parameter, plus a term c of x and the dispersion, which is able to add an additional bias term on top. So you see that this canonical form involves several functions, a, b, and c; we are not defining these functions right now, but if we use this kind of parametrization, we can see that quite a few probability density functions can be expressed in this way. So let's look into the
exponential family. Of course, we find the Gaussian probability density function here. You have already seen that there are different ways of formulating the Gaussian; here we have diagonal covariance matrices, and obviously there are also non-diagonal covariance matrices, which then result in the rotation you can see here. This is a very typical function that we use to model probability densities. Then we also have the exponential probability density function: generally, functions that can be expressed as lambda times e to the power of minus lambda x fall into this category. This kind of function is very interesting because it describes exponential decays, so these are probabilities for observing decays. They are very commonly used, for example, in radioactivity, where the measurements generally follow this kind of format, but this is also true for any kind of exponential decay; by the way, beer foam also follows this decay rule. So you can see that if we vary the parameter, we obtain different kinds of decay functions.
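As an illustration, the exponential density can be evaluated both directly and via the canonical exponential-family form. The natural-parameter choice below (theta = -lambda, b(theta) = -log(-theta), dispersion 1, c = 0) is one standard convention and is not spelled out in the lecture:

```python
import math

def exp_pdf_direct(x, lam):
    """Exponential pdf written directly: lambda * e^(-lambda * x)."""
    return lam * math.exp(-lam * x)

def exp_pdf_canonical(x, lam):
    """Same pdf via the canonical form exp((theta*x - b(theta))/a + c),
    with theta = -lambda, b(theta) = -log(-theta), a = 1, c = 0
    (assumed convention)."""
    theta = -lam
    b = -math.log(-theta)
    return math.exp(theta * x - b)
```

Both functions agree for any rate lambda > 0 and x >= 0, showing that the exponential decay is indeed a member of the family.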
Not just the exponential probability density function belongs to the exponential family; so does the binomial probability mass function, which is given here as n choose k, times p to the power of k, times 1 minus p to the power of n minus k. These describe repeated experiments, for example the coin toss that we talked about earlier. So if we have the classical Bernoulli or multinoulli experiments, they will follow very similar probability mass functions, and here you see different instances of the binomial probability mass function. The Poisson probability mass function also belongs to this exponential family; here you see the Poisson probability mass function.
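Likewise, the binomial and Poisson probability mass functions can be rewritten in the canonical form. The natural parameters used below (the log-odds for the binomial, the log-rate for the Poisson) are one standard convention, not taken verbatim from the lecture:

```python
import math

def binom_pmf_direct(k, n, p):
    """Binomial pmf: C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def binom_pmf_canonical(k, n, p):
    """Same pmf via exp(theta*k - b(theta) + c(k)) with
    theta = log(p/(1-p)), b(theta) = n*log(1 + e^theta), c(k) = log C(n, k)."""
    theta = math.log(p / (1 - p))
    b = n * math.log(1 + math.exp(theta))
    c = math.log(math.comb(n, k))
    return math.exp(theta * k - b + c)

def poisson_pmf_direct(k, lam):
    """Poisson pmf: lambda^k * e^(-lambda) / k!."""
    return lam**k * math.exp(-lam) / math.factorial(k)

def poisson_pmf_canonical(k, lam):
    """Same pmf via exp(theta*k - b(theta) + c(k)) with
    theta = log(lambda), b(theta) = e^theta, c(k) = -log k!."""
    theta = math.log(lam)
    b = math.exp(theta)
    c = -math.log(math.factorial(k))
    return math.exp(theta * k - b + c)
```

In both cases the canonical and the direct formulations agree, which is what places these mass functions inside the exponential family.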
Presenters
Accessible via
Open Access
Duration
00:10:22 min
Recording date
2020-10-26
Uploaded on
2020-10-26 13:37:00
Language
en-US
In this video, we look into further probability density and mass functions of the exponential family.
This video is released under CC BY 4.0. Please feel free to share and reuse.
Music Reference: Damiano Baldoni - Thinking of You