Recap Clip 7.2: Sequential Decision Problems

We've been talking about sequential decision problems.

Essentially, the idea is that we look at full agents, meaning decision problems: not only modeling the world, but also taking decisions in worlds where you want to take the aspect of time seriously.

Meaning worlds that actually change on a time scale that is commensurate with your deliberation and action time scales.

You can't just take a time out, think a bit, and then do something.

Something else will be happening while you're doing that.

Which means that you actually need to take time into account, because the world, the environment, isn't static while you're acting.

We've looked at two things. Last week, with Dennis, we looked at Markov decision problems. These build on Markov processes: in a Markov process we're just modeling, not taking decisions; Markov decision problems add the decision aspect to that.
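To make that recap concrete, here is a minimal sketch of a tiny MDP with a one-step Bellman backup and value iteration; the two states, the actions, and all the numbers are made up purely for illustration, not taken from the lecture.

```python
# T[(s, a)]: list of (probability, next_state) pairs -- the transition model.
# This tiny two-state MDP is hypothetical, just to illustrate the idea.
T = {
    ("s0", "go"):   [(0.8, "s1"), (0.2, "s0")],
    ("s0", "stay"): [(1.0, "s0")],
    ("s1", "go"):   [(1.0, "s1")],
    ("s1", "stay"): [(1.0, "s1")],
}
R = {"s0": 0.0, "s1": 1.0}  # reward received in each state
GAMMA = 0.9                 # discount factor

def backup(V, s):
    """One Bellman backup: best expected discounted value over all actions."""
    actions = [a for (state, a) in T if state == s]
    return R[s] + GAMMA * max(
        sum(p * V[s2] for p, s2 in T[(s, a)]) for a in actions
    )

# Value iteration: repeat the backup until the values (approximately) converge.
V = {s: 0.0 for s in R}
for _ in range(100):
    V = {s: backup(V, s) for s in V}
print(V)  # long-run value of each state
```

The decision aspect is exactly the `max` over actions: a plain Markov process would only have the probabilistic transitions, with nothing to choose.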

What we're going to do today is rid ourselves of the assumption here that the world is fully observable.

We're actually transitioning from Markov decision problems to partially observable Markov decision

problems.

That's quite a mouthful, so we're going to say POMDPs.

That's essentially what we're doing.

The techniques here are relatively close to what we scratched the surface on in planning. Planning was also something where we used search, in a fully observable and deterministic world, in a way that takes time into account. Remember, we had, in a sense, a time-slice model: in planning, we took time into account by having these add and delete lists of facts.

The word facts already tells you it's a deterministic world, so we actually don't need a belief

model.

We have a world model.

We know what's happening.

We can observe everything, and with deterministic actions, we can plan ahead.
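As a reminder of what those add and delete lists look like, here is a minimal STRIPS-style sketch; the robot facts and the action below are hypothetical, just to illustrate the mechanism.

```python
# A planning state is just a set of facts -- deterministic, fully observable.
# The facts and the action are made up for illustration.
state = {"at(robot, roomA)", "door_open(roomA, roomB)"}

# A STRIPS-style action with preconditions, an add list, and a delete list.
move_a_to_b = {
    "preconditions": {"at(robot, roomA)", "door_open(roomA, roomB)"},
    "add":    {"at(robot, roomB)"},
    "delete": {"at(robot, roomA)"},
}

def apply(state, action):
    """Apply an action: check preconditions, then delete and add facts."""
    assert action["preconditions"] <= state, "preconditions not satisfied"
    return (state - action["delete"]) | action["add"]

print(apply(state, move_a_to_b))
# -> {'door_open(roomA, roomB)', 'at(robot, roomB)'}
```

Because the state is a set of known facts and the action's effect is deterministic, the planner never needs a probability distribution over states.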

Planning is really the very simple case: it's fully observable and deterministic. It's similar to MDPs, except that MDPs add uncertainty and utility to the picture. If you add uncertainty and utility, you get this mixture of techniques from MDPs and so on. If on top of that you have uncertain sensing, and that's what we're going to do today, then you end up with POMDPs.

The last thing we're really going to say about POMDPs is that the world is a POMDP.

Well, not surprisingly: we've added everything that's good and expensive to the mix.

We've added time.

We've added uncertainty.

We've added uncertain sensing. So everything that we had kind of abstracted away last semester and at the beginning of this semester, we're now adding back.

That's kind of going to be the eventual agent design.

I just want to remind you of this similarity to planning and game playing and so on, which is going to pop up all over the place.

Part of a chapter: Recaps
Access: Open access
Duration: 00:07:07 min
Recording date: 2021-03-30
Uploaded on: 2021-03-31 11:07:05
Language: en-US

Recap: Sequential Decision Problems
Main video on the topic in chapter 7, clip 2.
