So, where are we?
I should really change this, because at least in this part it's wrong.
This was pre-2017.
So chess, as we know, is a game where the best human was beaten 20 years ago.
Othello, checkers, all are beaten.
Checkers is actually solved, which means we have a policy.
That makes the game extremely boring, because if you have a little booklet with
the policy, it tells you exactly what to do in every situation, which makes it
easy to win or at least not lose.
Up to two years ago, the best humans were still unbeaten at Go.
There's a little bit more of a story behind checkers.
The way the computers came to win the game was this: there was one person who was
just way better than any other human, and he was never beaten, only because he died
first.
So, everyone else but him was beaten by computers and this guy held out and then
he died.
So, he remained unbeaten.
But now we have a strategy that would beat him.
So, 1997, Deep Blue against Kasparov was a big thing.
It had more spectators than the championship that ended yesterday.
Nowadays what you do there is alpha-beta search plus human knowledge.
Nowadays if you run a good program on your laptop, you're playing at grandmaster
level.
You could have won the championship yesterday.
We don't have a strategy.
We will probably never have one, but the machines are good enough.
I'm not actually going to go through all of those.
What have we seen?
We've essentially looked at a very specific kind of game: two-player, zero-sum, fully
observable, deterministic and so on, which is kind of a minimal step up from search.
We've developed algorithms.
These algorithms can be extended.
So, for instance, if you go from two players to three players, instead of doing min, max,
min, max, you do min, max, mu, min, max, mu, something like that.
I'm sure you could develop the necessary algorithms, and I'm also sure that you could develop
alpha-beta-gamma pruning with what you've heard in the last lectures.
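One way to make the three-player idea concrete is the max^n formulation: every leaf carries one utility per player, and whoever is to move picks the child that is best for their own entry. This is a swapped-in formulation used here for illustration, not necessarily the "min, max, mu" scheme meant above, and the tree, the function name `maxn`, and the utilities are all made up.

```python
def maxn(node, player, num_players=3):
    """Return the utility vector reached when every player maximizes their own entry.

    A leaf is a tuple with one utility per player; an inner node is a list of children.
    """
    if isinstance(node, tuple):  # leaf: utilities for all players
        return node
    next_player = (player + 1) % num_players
    child_values = [maxn(child, next_player, num_players) for child in node]
    # The player to move picks the child whose vector is best for themselves.
    return max(child_values, key=lambda utilities: utilities[player])


# Hypothetical three-ply tree: player 0 moves at the root, then player 1, then player 2.
tree = [
    [[(1, 2, 3), (4, 1, 1)], [(2, 5, 0), (0, 0, 6)]],
    [[(3, 3, 0), (5, 0, 1)], [(2, 1, 4), (2, 6, 2)]],
]

result = maxn(tree, player=0)
```

With two players and zero-sum utilities this collapses back to ordinary minimax; pruning is left out of the sketch on purpose.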
Of course, Monte Carlo tree search doesn't really even notice whether you
sample through three layers or seven layers or something like this.
You just go down all the way, sample, and then collect your utilities and pass them
back up, and then average.
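The sample-and-average idea can be sketched as follows. This is a minimal flat Monte Carlo evaluation without the tree that full Monte Carlo tree search builds, and the game is an assumption chosen for brevity: a Nim variant where players alternately take 1 to 3 stones and whoever takes the last stone wins. All function names are hypothetical.

```python
import random


def legal_moves(stones):
    """A player may take 1, 2 or 3 stones, but not more than are left."""
    return [k for k in (1, 2, 3) if k <= stones]


def rollout(stones, player_to_move, rng):
    """Play uniformly random moves to the end and return the winner (0 or 1)."""
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return player_to_move  # this player took the last stone
        player_to_move = 1 - player_to_move


def monte_carlo_values(stones, num_samples=2000, seed=0):
    """For player 0 to move, estimate the average utility of each first move."""
    rng = random.Random(seed)
    values = {}
    for move in legal_moves(stones):
        remaining = stones - move
        total = 0.0
        for _ in range(num_samples):
            # Go all the way down, then collect the utility for player 0.
            winner = 0 if remaining == 0 else rollout(remaining, 1, rng)
            total += 1.0 if winner == 0 else 0.0
        values[move] = total / num_samples  # pass back up and average
    return values


values = monte_carlo_values(3)
```

Nothing in the rollout depends on how many layers or players there are; it only needs to know who moves next and what the utility is at the end.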
That's essentially what we can do with these.
You can generalize some of this.
To go towards a partially observable world, you have to do more.
You essentially have to involve probability theory, and here is something that
is crucial to note.
If we're losing observability or we're losing determinism, then we're actually moving away from
the agent knowing which world state it is in and therefore being able to compute the next state,
given the actions and so on.
Instead, we only know that we could be in one of many states.
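A minimal sketch of "being in one of many states" is to track a set of possible states, a belief state, rather than a single state. The example world is an assumption made up for illustration: an agent on a line of cells 0 to 4 whose "right" action may slip and leave it in place, and whose only sensor reports whether it is at the wall in cell 4. Probabilities are not attached yet; that would be the next step.

```python
def predict(belief, action, transition):
    """All states reachable from any state in the belief under the given action."""
    return {s2 for s in belief for s2 in transition(s, action)}


def update(belief, observation, observe):
    """Keep only those states that are consistent with what was observed."""
    return {s for s in belief if observe(s) == observation}


def transition(state, action):
    """Hypothetical nondeterministic move: 'right' either succeeds or slips."""
    if action == "right":
        return {state, min(state + 1, 4)}
    return {state}


def at_wall(state):
    """Hypothetical sensor: only tells the agent whether it is in cell 4."""
    return state == 4


belief = {0}                                  # the agent knows it starts in cell 0
belief = predict(belief, "right", transition)  # now it could be in 0 or 1
belief = predict(belief, "right", transition)  # now it could be in 0, 1 or 2
```

An observation can shrink the set again: starting from `{2, 3}`, moving right and then sensing the wall leaves only cell 4.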
Access: open access
Duration: 00:08:24 min
Recording date: 2020-10-30
Uploaded: 2020-10-30 09:16:46
Language: en-US
What is the state of the art for the use of AI in board games? And an overview of this chapter.