7 - Online Search [ID:29185]
50 von 116 angezeigt

Welcome everybody to this video nugget on online search.

We remark that so far we've concentrated on offline problem solving.

Basically the agent only acts when what we call plan execution after a full plan exists.

The percepts come in and then you actually decide on the sub plan you want to pursue.

Online problem solving, computation and action are interleaved.

It uses the fact that we can compute more after we've seen the action.

We know we don't have to compute all the eventualities, but we basically do it as we go along.

This is helpful in dynamic or semi-dynamic environments.

Basically long computation times in the beginning can be harmful.

Think about you standing on a highway, a truck is coming, but you're still computing your plan.

Chances are that your agent is not going to be very successful.

Of course in stochastic environments where your actions may be undeterministic,

many, many, many contingencies may arise.

If you have to compute sub plans for all of them, the plans might actually become extremely large.

As a side remark, online problem solving is necessary if you have unknown environments.

Those we call exploration problems and we won't cover them in this course.

Basically if the environment is unknown, you don't even know the actions,

and then you have no chance but to actually sense and find out about your actions and your environment as you go along.

Online problem solving even makes sense in deterministic and fully observable environments just like the ones we've been talking about.

For instance, if we basically restrict the things that the agent can know a priori.

Basically we'll call an online search problem being one that consists of states,

a function that gives us the list of actions that are allowed in some state.

S, which means we can perceive or we know about the actions once we're in state S,

maybe a cost function where basically the cost of having the action and its result,

the result of the action basically only are given when we are in S.

We don't even know the out, we may know the actions in S, but we don't know the outcome unless we actually perform the action.

Then we have a goal test function.

The thing is, if we think back on this model we've constructed for offline searching,

this kind of a result function that gives us in a state and an action what the expected results are,

basically is not something we can predict, but we have to go there, do it, and then see the results.

This is kind of a very much restricted problem, so very much less knowledge to the agent.

We expect that from less information the agent will behave less well than in a fully, in an offline version of this problem.

The offline performance is the cost of an optimal solution with full information,

and the online performance is the one which we can actually get by online problem solving.

They're not going to be the same.

If you compute the ratio of those, that's kind of a quality measure for your online problem solving app.

The closer offline and online performance are, the better.

A typical thing here is we have a little maze, we start here, we want to go there,

but the agent knows nothing of the environment.

It does not know that up 1, 1 results in 1, 2, going up from here goes to there.

It doesn't know that if we go back from 2, 1 to 1, 1, that it actually gives you that kind of up and down, the opposite.

All of those, typically an agent doesn't know.

So, if we think back at the competitive ratio, then you can note that the competitive ratio might actually be infinite.

That comes from actions that we call irreversible, which are actions that lead you into a dead end,

and a dead end is a state which you cannot actually escape from anymore, where basically no state is reachable by any action.

So, if we have dead ends, which you can walk into, then you can have irreversible actions.

Irreversible actions can actually make the competitive ratio infinite.

The typical dead end is if you combine a dead end street with a one-way street.

If you do it right, then the dead end is no longer reachable.

So, you can basically orient the one-way street away from the dead end. If you orient the one-way street into the dead end,

Teil eines Kapitels:
Planning & Acting in the Real World

Zugänglich über

Offener Zugang

Dauer

00:18:11 Min

Aufnahmedatum

2021-01-31

Hochgeladen am

2021-01-31 19:16:21

Sprache

en-US

Einbetten
Wordpress FAU Plugin
iFrame
Teilen