The final thing we looked at was the most likely explanation. You can think of it as having a signal, here the umbrella signal: umbrella on the first two days, no umbrella on the third, then umbrella again on the last two days. That signal really comes from a source which is unknown.
And what you want to know is really the source signal rather than the transmitted signal.
You have a noisy channel, and you're interested in the most likely source signal that explains the perceived signal.
And the most important thing here is to realize that this is not trivial at all.
We can't do this by filtering alone.
Filtering, or smoothing everything, would give us the sequence of individually most likely states, and that may not be the most likely sequence of states.
And once you realize this, you realize that the objects we're really talking about are the sequences.
So we have to lift all of this, and the optimization in particular, to the sequence level. That in the end leads to an algorithm that's again tail-recursive but passes a different message, in this case the m message, into which the maximization is folded.
And that is called the Viterbi algorithm.
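To make that recursion concrete, here is a minimal sketch of the Viterbi step for the umbrella example, assuming the standard textbook transition and sensor probabilities (rain persists with 0.7, the umbrella is seen with 0.9 given rain and 0.2 given no rain); the variable names and the encoding of the evidence are illustrative, not taken from the clip itself.

```python
# Minimal Viterbi sketch for the umbrella example (assumed textbook parameters).
# States: 0 = rain, 1 = no rain; observations: 0 = umbrella, 1 = no umbrella.
trans = [[0.7, 0.3],   # P(X_t | X_{t-1} = rain)
         [0.3, 0.7]]   # P(X_t | X_{t-1} = no rain)
sensor = [[0.9, 0.1],  # P(umbrella | rain), P(no umbrella | rain)
          [0.2, 0.8]]  # P(umbrella | no rain), P(no umbrella | no rain)
prior = [0.5, 0.5]

# Observed signal: umbrella, umbrella, no umbrella, umbrella, umbrella
evidence = [0, 0, 1, 0, 0]

def viterbi(evidence):
    # m message: probability of the best path ending in each state
    m = [prior[s] * sensor[s][evidence[0]] for s in range(2)]
    backptrs = []                          # one back-pointer row per time step
    for e in evidence[1:]:
        new_m, row = [], []
        for s in range(2):
            # maximize over the predecessor state instead of summing (filtering)
            best_prev = max(range(2), key=lambda p: m[p] * trans[p][s])
            new_m.append(m[best_prev] * trans[best_prev][s] * sensor[s][e])
            row.append(best_prev)
        m, backptrs = new_m, backptrs + [row]
    # reconstruct the most likely state sequence by following back-pointers
    path = [max(range(2), key=lambda s: m[s])]
    for row in reversed(backptrs):
        path.append(row[path[-1]])
    return list(reversed(path))

print(viterbi(evidence))
# -> [0, 0, 1, 0, 0], i.e. rain, rain, no rain, rain, rain for these parameters
```

The back-pointer table in this sketch is exactly what makes the space requirement grow linearly with the number of time steps, which is the point discussed next.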
For the Viterbi algorithm we get linear space complexity.
Which isn't surprising, because we have to keep the paths in memory.
Because it's a path-level algorithm, we somehow have to store the paths, or at least the best candidates.
So the longer our paths get, the more space we need; linearly much, of course.
If you keep all paths around, then of course you get quadratic space, but you don't need to.
So Viterbi gives us most likely explanations.
And that's something you often want whenever you have these kinds of channel problems.
And indeed this and many other algorithms were originally developed for things like
communications and control problems.
Okay, so we have ways of representing problems and environments where time plays a role.
And we have inference algorithms that kind of look good, right?
Remember?
All the other algorithms we talked about were at least exponential.
And only if we had polytree-like or tree-like structures in the background would they be better.
Here we're getting something that's linear.
Kind of surprising.
But these things work out.
But of course these algorithms are linear with respect to time.
There's an exponential factor in the size of each time slice, which I'm not talking
about here.
Just keep that in mind.
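To make that hidden factor explicit (a standard complexity count, not spelled out in the clip itself): for a hidden Markov model with S states per slice, each filtering or Viterbi step costs on the order of S², so the whole sequence costs

```latex
O(t \cdot S^2), \qquad S = 2^{n} \text{ for a slice with } n \text{ Boolean state variables.}
```

So the algorithms stay linear in t, but the per-step factor is exponential in the size of a slice.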
But of course in our examples, every time slice was really tiny.
In the umbrella example we had two variables.
It doesn't really pay to build a Bayesian network for them, or you're done very early.
But really we've only been stressing the time dimension here.
There's also the model-size dimension, which we've assumed to be constant.
With what assumption have we made that constant?
Any ideas?
That's the stationarity.
We've assumed that the sensor model is stationary, meaning it always looks the same.
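In symbols, stationarity just says the conditional distributions do not depend on the time index (written here in the standard way; the notation, and the inclusion of the transition model, which is usually assumed stationary in the same sense, are not from the clip itself):

```latex
P(E_t \mid X_t) = P(E_{t'} \mid X_{t'})
\qquad\text{and}\qquad
P(X_t \mid X_{t-1}) = P(X_{t'} \mid X_{t'-1})
\qquad\text{for all } t, t'.
```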