The following content has been provided by the University of Erlangen-Nürnberg.
We have one two-dimensional Gaussian here, one here, and one here. And now we pick
one of the three Gaussians with a certain prior probability and generate a sample according
to its PDF. We did this several times, so we now have different samples, and the colors
indicate which component actually generated each sample. So if you want to compute the
priors of the three Gaussians, the mixture weights, that's rather simple. You just count
the number of blue triangles and divide it by the total number of samples; that's the
prior probability of this class. If you want the prior probability of the second
class, you count the number of green squares divided by the total number of samples, and
that's your prior. If you want to compute the mean of this Gaussian, that's also simple.
You take all the blue triangles, sum over them, and divide by the number of blue samples.
And in the same way you compute the covariance matrix. You can do this for
all three components by considering only the samples that belong to each class. That's
supervised learning: you know the class information, i.e., you know which component a certain
sample was generated by.
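As a rough sketch of these supervised estimates in code (a sketch under the assumption that the samples sit in a NumPy array `X` with known labels; the function and variable names are mine, not from the lecture):

```python
import numpy as np

def supervised_gmm_estimates(X, labels, K):
    """ML estimates of GMM parameters when the generating component
    of every sample is known (the supervised case described above).
    X: (N, D) array of samples; labels: (N,) component indices 0..K-1."""
    N, D = X.shape
    priors = np.empty(K)
    means = np.empty((K, D))
    covs = np.empty((K, D, D))
    for k in range(K):
        X_k = X[labels == k]                 # e.g., all the "blue triangles"
        priors[k] = len(X_k) / N             # count / total = mixture weight
        means[k] = X_k.mean(axis=0)          # sample mean of this component
        covs[k] = np.cov(X_k, rowvar=False, bias=True)  # ML covariance
    return priors, means, covs
```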
However, we are now talking about unsupervised learning. Unsupervised learning
means we don't have the class information. We have no colors, no squares, no triangles,
no circles anymore. We just have different samples. And you can see, if
you have a look at it, this might be one component, this might be the other component, and here
might be a third Gaussian component. But if I ask you which component generated this
feature, it's hard to tell. It's pretty likely that this component generated this feature.
If I take this sample, it's even clearer. What about this sample here? It might be generated
by this component, might be generated by that component, might even be generated by this
upper component. You don't know. If you have a look at what the ground truth actually was,
you see this sample here was generated by the green component. This sample here was
generated by the upper component. That's unsupervised learning.
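In formulas (a standard Bayes-rule statement of what is being described here, with mixture weights $p_k$, means $\mu_k$, and covariances $\Sigma_k$; the lecture's own notation may differ), the probability that component $k$ generated a sample $x$ is the posterior

$$P(k \mid x) = \frac{p_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} p_j \, \mathcal{N}(x \mid \mu_j, \Sigma_j)},$$

which is why a sample lying between components cannot be attributed to a single one with certainty.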
So, what was the idea of the EM algorithm? I showed you the EM algorithm or let's say
the outcome, the result of the EM algorithm for GMMs. These are the resulting formulas.
We can compute the mean vector for each component, the covariance matrix and the mixture weights.
If we know the probabilities from the expectation step, which tell us the probability
that a sample x_i was generated by component k, then we can compute these parameters. And if
we have these parameters, we can estimate these probabilities. So, this is an iteration scheme.
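The slide with the formulas is not reproduced in the transcript; for reference, the standard EM updates for a GMM look like this (writing $\gamma_{ik}$ for the probability that sample $x_i$ was generated by component $k$; the lecture's symbols may differ). The expectation step computes exactly the posterior shown earlier,

$$\gamma_{ik} = \frac{p_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} p_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)},$$

and the maximization step re-estimates the parameters from these probabilities:

$$p_k = \frac{1}{N}\sum_{i=1}^{N}\gamma_{ik}, \qquad \mu_k = \frac{\sum_{i=1}^{N}\gamma_{ik}\,x_i}{\sum_{i=1}^{N}\gamma_{ik}}, \qquad \Sigma_k = \frac{\sum_{i=1}^{N}\gamma_{ik}\,(x_i-\mu_k)(x_i-\mu_k)^{\top}}{\sum_{i=1}^{N}\gamma_{ik}}.$$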
I'd like to show you these formulas in action. So, I have to switch devices right now. Let's see if
this works. Perfect. So, this is a one-dimensional case. You see the
data with a certain distribution and now we want to estimate Gaussian densities. For example,
just one. So, we just start with an arbitrary Gaussian and then we estimate the mean and
the standard deviation. So, we use one iteration step, and now we have the mean.
The mean is pretty much in the middle
here and we have this standard deviation. What happens if I iterate some more steps?
What happens in the second step if I apply these formulas? The formulas say, compute
the probability that this sample was generated by this PDF and do this for all your samples.
Then, assign the samples to your PDF and compute the mean and the variance again. We only have
one PDF. So, all samples are assigned to this PDF. I compute the mean over all samples.
That's what I did already. So, the mean and the variance don't change anymore. I can
iterate; nothing happens. So, let's use two densities, initialized somehow. Now what
will happen? I will assign each sample to one of the two PDFs. Or rather, for each sample
I compute the probability that the first PDF generated it and the probability that the
second PDF generated it, and one of the two will be higher. Okay? So, if I use a hard decision,
I will assign these samples here to the first PDF and the other samples here to the second
PDF. So, I have two groups of samples now and from the first group I compute the mean
and from the second group of samples I compute the mean and the standard deviation again.
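A minimal code sketch of a 1D demo like this one, using the soft responsibilities from the EM formulas above (the hard decision described in the lecture is the intuitive simplification; actual EM weights each sample by its responsibility). Function and variable names are my own:

```python
import numpy as np
from scipy.stats import norm

def em_1d(x, K=2, n_iter=20, seed=0):
    """EM for a 1D Gaussian mixture with K components."""
    rng = np.random.default_rng(seed)
    N = len(x)
    pi = np.full(K, 1.0 / K)        # mixture weights, initially uniform
    mu = rng.choice(x, size=K)      # means initialized at random samples
    sigma = np.full(K, x.std())     # start from the global std. deviation
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample, (K, N)
        dens = np.array([p * norm.pdf(x, m, s)
                         for p, m, s in zip(pi, mu, sigma)])
        gamma = dens / dens.sum(axis=0)
        # M-step: re-estimate weights, means, standard deviations
        Nk = gamma.sum(axis=1)
        pi = Nk / N
        mu = gamma @ x / Nk
        sigma = np.sqrt((gamma * (x[None, :] - mu[:, None]) ** 2).sum(axis=1) / Nk)
    return pi, mu, sigma

# Example: samples drawn from two 1D Gaussians, as in the demo
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])
print(em_1d(x))
```

With K=1 every responsibility is 1, so the estimates stop changing after the first iteration, exactly as in the single-density part of the demo.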