6 - Searching/Planning with Observation [ID:29184]

Okay, so in this video nugget, we're going to do something very similar to the last one

where we were talking about searching and planning without observations.

Now we add observations to the mix. So the idea is that basically while we were doing

conformant planning, we're now doing the same thing in online planning. Basically,

we're adding percepts to the mix. And the most important thing here is that the transition

of the belief state really follows a two-step approach. If we have a non-deterministic

or partially observable world, then we are in a situation where we already have

a belief state on the left here. And something is wrong here. Let me just do this again. There we

have the pointer again. So we have a belief state here on the left. Then we have an action,

which may be non-deterministic. So from this one state, we might actually get two successor states.

So we get a new belief state after the action. But then we have a percept. And the percept will

actually give us more information. It partitions the belief state into

smaller belief states, and the percept tells us which of them we are in. So basically,

to do planning and searching with observations, we have to rethink our transition model,

which is what we're going to do here. So if we want to have partial observability,

and I apologize for a lot of math here. So if we have a physical problem, the usual thing,

we have a set of states, a set of actions, a transition model, and initial and goal states,

then the belief state search problem is given by, again, the power set of the states.

So the set of all subsets of the states, the actions, the lifted transition model,

the initial belief state, in which we could be in any state, and then the subsets of the goal states.

And the transition model, we construct in three stages that basically correspond to this idea.

Okay, so we have the prediction stage, where we basically have some kind of a prediction function,

which we consider as being given with the problem, which is really something that takes a belief

state and an action and gives us a belief state. Then we have the observation prediction stage,

which basically sees what the possible percepts are that could be observed in the predicted belief state.

So remember that we had percepts with preconditions; remember the open-can example, where you could only see

the color of the paint in a can if the can is open. So we have to see what the percepts in this

action-updated belief state could be. And then we use the update stage, which for each

possible percept looks at the resulting belief state. Okay, so we have the prediction stage,

whose output is essentially a belief state. Okay, so that gives us a result function that basically says,

what happens in a belief state b if we do action a. This is essentially how we predict

the new belief state to be: we predict the outcome of the action, we determine

the possible percepts to be the percepts that actually meet the preconditions, and then we

update with respect to those. The update stage is this set of

updates of the belief state, which gives us more information. One of the things you can see,

if you look at this, is that the update of a belief state with

respect to an observation o is always a subset of that belief state. So this picture here is actually correctly drawn.
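
The three stages described above can be written down roughly as follows. This is only a sketch of the notation: the names PossPerc, update, and results, and the simplification that Perc maps each state to exactly one percept, are assumptions and may differ from what is on the slide.

```latex
\begin{align*}
\hat{b} &= \mathrm{pred}(b, a)
  && \text{(prediction stage)}\\
\mathrm{PossPerc}(\hat{b}) &= \{\, \mathrm{Perc}(s) \mid s \in \hat{b} \,\}
  && \text{(observation prediction stage)}\\
\mathrm{update}(\hat{b}, o) &= \{\, s \in \hat{b} \mid \mathrm{Perc}(s) = o \,\} \subseteq \hat{b}
  && \text{(update stage)}\\
\mathrm{results}(b, a) &= \{\, \mathrm{update}(\hat{b}, o) \mid o \in \mathrm{PossPerc}(\hat{b}) \,\}
\end{align*}
```

Written this way, the subset property is immediate: update only ever filters the predicted belief state, it never adds states to it.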

And if we have sensing which is deterministic, then the

belief states for the different possible percepts are disjoint, which means we have a partitioning of the original

predicted belief state. That's the math of it. And the function pred, which is really the action

prediction model, and Perc, which is the sensor model, are the main parameters in this model.
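
To make the role of the two parameters concrete, here is a minimal Python sketch of the three-stage transition with pred and perc passed in as functions. It assumes deterministic sensing (perc returns exactly one percept per state); the helper names lift, possible_percepts, update, and results, as well as the toy world at the bottom, are hypothetical and only for illustration.

```python
from typing import Callable, Dict, FrozenSet, Hashable, Set

State = Hashable
Percept = Hashable
Belief = FrozenSet[State]


def lift(result: Callable[[State, str], Set[State]]) -> Callable[[Belief, str], Belief]:
    """Turn a physical transition model into a prediction function on belief states."""
    def pred(belief: Belief, action: str) -> Belief:
        # Union of all physical successors of all states we might be in.
        return frozenset(s2 for s in belief for s2 in result(s, action))
    return pred


def possible_percepts(belief: Belief, perc: Callable[[State], Percept]) -> Set[Percept]:
    """Observation prediction stage: percepts that could occur in the predicted belief state."""
    return {perc(s) for s in belief}


def update(belief: Belief, observation: Percept, perc: Callable[[State], Percept]) -> Belief:
    """Update stage: keep only the states consistent with the observation."""
    return frozenset(s for s in belief if perc(s) == observation)


def results(belief: Belief, action: str, pred, perc) -> Dict[Percept, Belief]:
    """Full belief-state transition: one successor belief state per possible percept."""
    predicted = pred(belief, action)
    return {o: update(predicted, o, perc) for o in possible_percepts(predicted, perc)}


# Hypothetical toy world: states are integers, the action non-deterministically
# adds 1 or 2, and the sensor only reports the parity of the state.
def toy_result(state: int, action: str) -> Set[int]:
    return {state + 1, state + 2}


def toy_perc(state: int) -> str:
    return "even" if state % 2 == 0 else "odd"


toy_pred = lift(toy_result)
outcome = results(frozenset({0}), "step", toy_pred, toy_perc)
print(outcome)
```

Because perc is a function here, the successor belief states in outcome are pairwise disjoint and together cover the predicted belief state, which is the partitioning mentioned above.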

And those we're going to see from time to time. So let's look at this in a very easy case where

we have the vacuum cleaner. So the kind of two-step thing here is we have a belief state that says

we believe the robot to be on the left and we don't know anything about the dirt except that

the left-hand room is dirty. By going right, we end up in the belief state {2, 4}, and then

the sensing actually partitions this into {2} and {4}. For the slippery vacuum world, things are a

little bit more difficult. So we have the same initial belief state, {1, 3}, then we go right, which

in the physical world is a non-deterministic action. So we end up with four

states in the intermediate, predicted belief state. And now we sense: we can sense that B is dirty,

which gives us a singleton. We can sense that A is dirty, which gives us a two-state belief state,
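
The two vacuum-world cases just described can be replayed in a few lines of Python. This is a sketch under assumptions: the state numbering (1 to 8) is the usual textbook one and is not spelled out in the lecture, the percept is taken to be the robot's location plus the dirt status of the current room, and the function names are hypothetical.

```python
# Assumed numbering: state number -> (robot location, A dirty?, B dirty?)
STATES = {
    1: ("A", True, True),   2: ("B", True, True),
    3: ("A", True, False),  4: ("B", True, False),
    5: ("A", False, True),  6: ("B", False, True),
    7: ("A", False, False), 8: ("B", False, False),
}
NUMBER = {description: number for number, description in STATES.items()}


def right(n):
    """Deterministic Right: the robot ends up in room B, dirt is unchanged."""
    _, a_dirty, b_dirty = STATES[n]
    return {NUMBER[("B", a_dirty, b_dirty)]}


def slippery_right(n):
    """Slippery Right: the move may succeed or leave the robot where it was."""
    return right(n) | {n}


def percept(n):
    """Local sensing: current location and whether the current room is dirty."""
    location, a_dirty, b_dirty = STATES[n]
    dirty = a_dirty if location == "A" else b_dirty
    return (location, "Dirty" if dirty else "Clean")


def results(belief, move):
    """Predict with the given move, then split the prediction by percept."""
    predicted = set().union(*(move(n) for n in belief))
    by_percept = {}
    for n in predicted:
        by_percept.setdefault(percept(n), set()).add(n)
    return by_percept


print(results({1, 3}, right))
print(results({1, 3}, slippery_right))
```

In the deterministic case the prediction {2, 4} is split into {2} and {4}. In the slippery case the prediction has four states, and sensing splits it into {2} (B is dirty), {4} (B is clean), and {1, 3} (A is dirty).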

Part of a chapter:
Planning & Acting in the Real World

Accessible via

Open access

Duration

00:23:49 min

Recording date

2021-01-31

Uploaded on

2021-01-31 19:29:00

Language

en-US
