Remember, we were doing adversarial search, and the problem is that instead of having
one agent who wants to find a solution, we have two agents, of which we're planning for one.
So we have an opposing agent, and we want to find not a solution, because that doesn't
make sense, and not a full strategy, because that's too difficult.
We want a way of planning the next action from whatever state we're in.
And ideally, of course, we would like to optimize our next action: not just greedily,
asking what the best single action is, but choosing an action that optimizes our way
towards winning. That typically means we need to look ahead, which of course gives us
a search tree.
Why?
Because we have different actions we can take, so we have multiple possible futures;
then our opponent can do whatever it likes, then we have multiple actions again, and
then the opponent moves again.
So that gives us a wonderful search tree: a branching, tree-like future.
But we also want to know which action we should take.
And that's easy if we only have one player, and more difficult if we have two.
So we want to optimize the utility U(s). Remember, the utility is what we get at the
end, in the terminal state: we don't know whether we win the chess game before we
checkmate the opponent or the opponent checkmates us.
And of course, Min wants to win itself, so it wants to minimize the utility, namely
Max's utility.
Since we have a zero-sum game, winning for Min means minimizing what Max gets.
So the computation we have to do alternates between minimizing, namely what Min does,
and maximizing, what Max does.
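As a sketch (not from the lecture), this alternating computation can be written as a tiny recursive function over a toy game tree; the tree shape and the utility values below are made-up examples, not a real game:

```python
# Minimax sketch: a node is either a terminal utility (an int, from Max's
# point of view) or a list of child nodes. Max and Min alternate by level.
def minimax(node, maximizing):
    if isinstance(node, int):      # terminal state: return its utility U(s)
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Toy tree: Max chooses one of two moves, then Min picks among terminals.
tree = [[3, 12, 8], [2, 4, 6]]
print(minimax(tree, True))  # Min backs up 3 and 2; Max takes the larger -> 3
```

The left branch backs up min(3, 12, 8) = 3 and the right branch min(2, 4, 6) = 2, so Max's value at the root is 3.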
So if we look at this in the tic-tac-toe world, the world looks like this.
We start the game.
It's Max's turn, as always, so Max has a couple of alternatives.
That's Max's branching future.
And then Min has a couple of alternatives.
And then Max and Min and so on.
At the end, we have various configurations.
There Min wins, there Max wins, and this is a draw.
And Max wants to propagate the information about winning up to this state, by saying:
aha, I have to put my X in the middle, that's the best thing I can do.
So he does it, because it's his move.
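Choosing the actual move at the root, once the values have been propagated up, is just an argmax over the children's backed-up values. A self-contained sketch (the tree and utilities are again made-up illustrations):

```python
def minimax(node, maximizing):
    # A node is a terminal utility (int) or a list of child nodes.
    if isinstance(node, int):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def best_move(children):
    # Max's move choice at the root: the child whose backed-up
    # value (computed with Min moving next) is largest.
    return max(range(len(children)), key=lambda i: minimax(children[i], False))

print(best_move([[3, 12, 8], [2, 4, 6]]))  # -> 0 (values 3 vs. 2)
```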
By the way, most people do this.
For tic-tac-toe, we have a known strategy.
We know what is actually best; only I don't.
That's the problem.
So how do we do this?
Well, obviously, we minimize, we maximize, and so on.
What we do is essentially a depth-first search in the game tree, with Max at the root.
Why depth-first search? Because if we have a more complex game than, for example,
tic-tac-toe, we may not be able to descend all the way to a goal state: our tree just
gets too large, and so does our fringe.
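A minimal depth-limited variant of the same depth-first search, as a sketch: when the cutoff is reached before a terminal state, we fall back on a heuristic evaluation. The `evaluate` function below (averaging the leaf utilities under a node) is a crude made-up stand-in, not something from the lecture:

```python
def minimax_dl(node, maximizing, depth, evaluate):
    """Depth-first minimax with a depth cutoff: instead of always descending
    to terminal states, stop at depth 0 and apply a heuristic evaluation."""
    if isinstance(node, int):        # terminal state: true utility
        return node
    if depth == 0:
        return evaluate(node)        # cutoff: estimated utility
    values = [minimax_dl(c, not maximizing, depth - 1, evaluate) for c in node]
    return max(values) if maximizing else min(values)

def leaves(node):
    # Collect all terminal utilities below a node.
    return [node] if isinstance(node, int) else [u for c in node for u in leaves(c)]

def evaluate(node):
    # Illustrative heuristic only: the mean of the utilities below the node.
    vals = leaves(node)
    return sum(vals) / len(vals)

tree = [[3, 12, 8], [2, 4, 6]]
print(minimax_dl(tree, True, 1, evaluate))  # cut off after one ply
```

With a cutoff of one ply, the two Min nodes are evaluated heuristically (23/3 and 4) instead of being searched, so the root value differs from the true minimax value; that is exactly the price of not descending to the terminal states.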
Accessible via: Open Access
Duration: 00:21:23 min
Recording date: 2020-10-28
Uploaded on: 2020-10-28 13:47:01
Language: en-US
Explanation and examples for the Minimax Search.