20 - Musteranalyse/Pattern Analysis [ID:2322]

Hello everybody to the Tuesday session. We teach as usual in English because we have one non-native speaker here. Today we have on our list, first of all, to finalize the EM algorithm; then I would like to talk about hidden Markov models, and we are going to look into the training formulas of hidden Markov models and the different types of hidden Markov models, and we will also address the key questions that are always discussed in the context of hidden Markov models. So, we know the EM algorithm.

How does the EM algorithm work? The EM algorithm is an iterative scheme that more or less emulates a maximum likelihood estimation. That was the core idea: if this is the likelihood function, EM iteratively maximizes a function like that and ends up here in the maximum, provided the initialization was done properly, such that the first guess was in the area of attraction of the global maximum. What is the area of attraction? The area of attraction is the set of all points from which the gradient leads you towards the global maximum.
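As a small illustration of the area-of-attraction idea (the function and step size below are invented for the sketch): gradient ascent on a one-dimensional function with a local and a global maximum converges to different maxima depending on where it starts.

    import numpy as np

    def f(x):
        # Toy "likelihood" with a local maximum near x = -1 and the global maximum near x = 2.
        return np.exp(-(x + 1.0) ** 2) + 2.0 * np.exp(-0.5 * (x - 2.0) ** 2)

    def grad(x, eps=1e-5):
        # Numerical gradient, good enough for the illustration.
        return (f(x + eps) - f(x - eps)) / (2.0 * eps)

    def ascend(x0, step=0.1, iters=500):
        # Plain gradient ascent: the result depends on which area of attraction x0 lies in.
        x = x0
        for _ in range(iters):
            x += step * grad(x)
        return x

    print(ascend(-2.0))  # ends up near the local maximum at x ≈ -1
    print(ascend(1.0))   # ends up near the global maximum at x ≈ 2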

Welcome! Do you know the company BrainLab? BrainLab is dealing with brains, and it is very interesting to see what they are doing. We will visit them on July 27th, and if you want you can join us: we will travel to Munich and listen to a company presentation over there. Do you want to attend? I don't know right now, but I want to. Good.

So, that's the EM algorithm, and the EM algorithm needs to characterize this function here. This function is, as expected, an expectation: it is the Q-function that we see over there, the one that is iteratively maximized. How is the Q-function defined? Well, you can remember the formula. You can also say that the EM algorithm is dealing with the missing information principle, meaning that the observable information is the complete information minus the hidden information, and I know the complete part over there; that is more or less the core of the Q-function. You multiply both sides with p of y given x and so on, and then you integrate out the hidden variable y. That's the Q-function.
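Written out, this is the standard construction; θ̂ is used here for the current parameter estimate, a notation the transcript does not fix. The missing information principle reads

    \log p(x; \theta) = \log p(x, y; \theta) - \log p(y \mid x; \theta),

and multiplying both sides by p(y | x; θ̂) and integrating over the hidden variable y turns the complete-data term on the right into the Q-function, which the EM iteration maximizes:

    Q(\theta; \hat{\theta}) = \int p(y \mid x; \hat{\theta}) \, \log p(x, y; \theta) \, dy,
    \qquad
    \hat{\theta}^{(i+1)} = \arg\max_{\theta} Q(\theta; \hat{\theta}^{(i)}).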

Then you iteratively maximize the Q-function, and you know that in the Q-function there is a logarithm. The logarithm splits the product of probabilities up into a sum of log-terms, and this property leads to the fact that many optimization problems can be decomposed into subspace optimization problems: because the logarithm breaks the product up into a sum, when you compute the gradient, the terms that are independent of the variable you differentiate with respect to more or less vanish and disappear. That's the core of the EM algorithm.
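As a generic sketch of this decomposition (the split of the parameters into groups θ_j is an assumption about the model, not something fixed in this section): if the complete-data log-likelihood separates as

    \log p(x, y; \theta) = \sum_j g_j(x, y; \theta_j),

then

    \frac{\partial}{\partial \theta_j} Q(\theta; \hat{\theta}) = \frac{\partial}{\partial \theta_j} \int p(y \mid x; \hat{\theta}) \, g_j(x, y; \theta_j) \, dy,

because all terms g_k with k ≠ j do not depend on θ_j and drop out of the derivative, so every parameter group can be optimized in its own subspace.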

We know it is slow: the EM algorithm is very slow, and it is a local method, but for learning systems that doesn't matter. I mean, if you have a speech recognizer and you want to train it using some training samples, you can do that overnight. So the parameters are adapted through a night run, and then you use the system for classification. Good. So that's the EM algorithm. And last time, oh, we did so much fun stuff here. Very nice.

And then we discussed hidden Markov models: models where we have states, where we have initial probabilities P1, P2, P3, and where we have transition probabilities A11, A12, and so on. So we have basically three stochastic processes combined in this stochastic automaton. Which stochastic processes do we have? We have the probabilities that are used for starting, so P1, P2, P3; that is the initial probability when we start. Then we have the transition probabilities, and we have the output probabilities. Okay. Some people say this is just a combination of two stochastic processes, and they introduce a state S0 and start like that: here, with probability one, I start, and then I go up here, up here, up here, and this extra state produces no output. But that doesn't matter at all. For us, P1, P2, P3 are the starting or initial probabilities of the states.
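To make the three stochastic processes concrete, here is a minimal sketch; the numbers are made up, and a discrete output alphabet is assumed since the output model is not fixed in this part of the lecture.

    import numpy as np

    # Two-state HMM with a three-symbol output alphabet (all numbers invented).
    pi = np.array([0.6, 0.4])          # initial probabilities P1, P2
    A  = np.array([[0.7, 0.3],         # transition probabilities A11, A12
                   [0.2, 0.8]])        #                          A21, A22
    B  = np.array([[0.5, 0.4, 0.1],    # output probabilities of state S1
                   [0.1, 0.3, 0.6]])   # output probabilities of state S2

    # Each of the three stochastic processes is a proper distribution:
    assert np.isclose(pi.sum(), 1.0)
    assert np.allclose(A.sum(axis=1), 1.0)
    assert np.allclose(B.sum(axis=1), 1.0)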

And now I can use this hidden Markov model. Let me just copy it. How can I copy? Let me just introduce a new page. So let's look at one example. Here we have S1, here we have S2, and here we have the transition probabilities A21, A12, A11, A22 and the initial probabilities π1 and π2. Why is it called hidden? Because we have a wall here, like with our girls example: we have a wall here, and I don't see the automaton from the outside. So this is a wall. It's a brick wall. That's a brick wall. That's
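As a small sketch of what the wall means in practice (made-up numbers, discrete outputs assumed): the hidden state sequence runs behind the wall, and from the outside only the emitted symbols are visible.

    import numpy as np

    rng = np.random.default_rng(0)

    pi = np.array([0.5, 0.5])                  # initial probabilities π1, π2
    A  = np.array([[0.9, 0.1],                 # A11, A12
                   [0.2, 0.8]])                # A21, A22
    B  = np.array([[0.8, 0.2],                 # output distribution of S1
                   [0.3, 0.7]])                # output distribution of S2

    def generate(T):
        # Run the hidden chain and emit one symbol per time step.
        states, outputs = [], []
        s = rng.choice(2, p=pi)
        for _ in range(T):
            states.append(s)
            outputs.append(rng.choice(2, p=B[s]))
            s = rng.choice(2, p=A[s])
        return states, outputs

    states, outputs = generate(10)
    print("hidden states (behind the wall):", states)
    print("observed outputs:", outputs)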

Accessible via: Open Access
Duration: 00:35:44 min
Recording date: 2009-07-07
Uploaded on: 2012-07-30 10:19:36
Language: en-US
