Building on everything we've done so far: what if we want to make decisions?
So we're going to talk about
Markov decision problems now.
Again, we're always assuming
sequential environments, because everything
else would just be way too complicated.
We'll look at two algorithms for
figuring out which decisions we actually
want to make in a probabilistic environment.
Those are value iteration and policy iteration,
both of which, again, have their own advantages.
Let me spoil one thing already:
in practice, what you want to
do is some combination of the two.
So, it's important to understand
how those two work in the first place.
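As a rough illustration of what value iteration does, here is a minimal sketch on a toy two-state MDP. The states, actions, transition model, rewards, and parameter values here are all made up for illustration; the lecture's own notation and examples may differ.

```python
# Hedged sketch: value iteration on a made-up two-state MDP.
# Everything here (states, actions, P, R, gamma) is illustrative only.

states = ["s0", "s1"]
actions = ["stay", "go"]

# Transition model: P[(s, a)] is a list of (next_state, probability) pairs.
P = {
    ("s0", "stay"): [("s0", 1.0)],
    ("s0", "go"):   [("s1", 0.8), ("s0", 0.2)],
    ("s1", "stay"): [("s1", 1.0)],
    ("s1", "go"):   [("s0", 1.0)],
}

# Immediate reward for taking action a in state s.
R = {("s0", "stay"): 0.0, ("s0", "go"): 0.0,
     ("s1", "stay"): 1.0, ("s1", "go"): 0.0}

gamma = 0.9   # discount factor
theta = 1e-6  # convergence threshold

# Repeatedly apply the Bellman optimality update
#   V(s) <- max_a [ R(s, a) + gamma * sum_s' P(s' | s, a) * V(s') ]
# until the values stop changing.
V = {s: 0.0 for s in states}
while True:
    delta = 0.0
    for s in states:
        best = max(
            R[(s, a)] + gamma * sum(p * V[s2] for s2, p in P[(s, a)])
            for a in actions
        )
        delta = max(delta, abs(best - V[s]))
        V[s] = best
    if delta < theta:
        break

# The decision part: pick, in each state, the action that looks best
# under the converged value estimates.
policy = {
    s: max(actions, key=lambda a, s=s: R[(s, a)]
           + gamma * sum(p * V[s2] for s2, p in P[(s, a)]))
    for s in states
}
print(V)
print(policy)
```

Policy iteration instead alternates between evaluating a fixed policy and greedily improving it; the combination hinted at above, often called modified policy iteration, truncates the evaluation step after a few updates.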
We start out by assuming
that we have perfect information, in
the sense that we can actually fully observe our environment,
which of course, in reality,
often isn't the case.
We'll introduce partially observable Markov decision
problems later on, where we instead
assume that our sensors
aren't fully reliable,
and then we'll figure out how to
actually build agents that do all of that for us.
Thanks for watching!
Recap: Making Complex Decisions
The main video on this topic is chapter 7, clip 1.