20 - Artificial Intelligence II [ID:57522]

Okay, so we're still looking at learning agents essentially.

The material we've covered so far is essentially a large collection of techniques for learning from examples, or learning from observations.

Then yesterday we started fixing the problems of inductive learning, also known as learning by being told, or supervised learning.

There are many names for that.

One of the problems of supervised learning is that very often there is no supervisor.

Nobody tells you immediately what you should have done.

So essentially we have to do something like reinforcement learning.

And yesterday we saw how, in a couple of very simple cases, we can actually reduce reinforcement learning to supervised learning.

Essentially, the trick was to use sampling techniques to artificially generate examples, from which we can then solve the lifted problem by supervised learning.

In some cases this can be done, and then we're relatively successful with it.

But in many cases we cannot: the reduction relies on the fact that we can sample, and very often that is not the case.

In game play, sampling is very simple: we can just take two copies of a program and let the two copies play against each other.
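To make this self-play sampling idea concrete, here is a minimal sketch. The counting game, the random policy, and all names here are an invented toy, not anything from the lecture: two copies of the same policy play until one side wins, and every visited state is labeled with the eventual outcome, which yields a supervised training set.

```python
import random

# Toy game (hypothetical, for illustration): a counter starts at 0, players
# alternately add 1 or 2, and whoever reaches the target first wins.

def self_play_episode(policy, target=5, seed=None):
    rng = random.Random(seed)
    state, to_move, history = 0, 0, []
    while state < target:
        move = policy(state, rng)        # each side uses a copy of the policy
        state += move
        history.append((state, to_move))
        to_move = 1 - to_move
    winner = history[-1][1]              # the player who reached the target wins
    # Supervised examples: each visited state, labeled +1 if the player who
    # moved there eventually won, -1 otherwise.
    return [(s, 1 if p == winner else -1) for s, p in history]

def random_policy(state, rng):
    return rng.choice([1, 2])            # add 1 or 2 to the counter

examples = self_play_episode(random_policy, seed=0)
```

Running many such episodes gives the artificial (state, outcome) pairs from which a value function can then be learned by ordinary supervised methods.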

If we're trying to learn strategies for automated theorem proving, then we can't do that.

Game-playing engines can keep going until one of them wins or loses, but with a theorem prover the natural end would be to find a proof, if there is one.

We know from semi-decidability results that this can take forever.

In fact, it usually does.

So you can let theorem provers run competitively in parallel, but you never get a reward signal.

You never know.

So the sampling techniques sometimes don't work: not every problem we want to learn admits sampling.

The approach I've shown you doesn't always apply.

Sometimes researchers find some other clever way of setting up a supervised learning problem, or other ways of making synthetic data.

But so far there are still lots of reinforcement learning problems that are completely open, where nobody really has an idea of how to solve them.

So that is one of the open problems.

Really, reinforcement learning is the more natural habitat of machine learning; it's just so much harder.

The other thing I want to address today is knowledge in learning.

So if we look at what we've done, we've basically taken a bunch of data, the training examples, and tried to distill a function from them.

The only handle we had, or gave ourselves, to steer this search was the choice of a hypothesis space.
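As an aside, the effect of choosing a hypothesis space can be sketched with a tiny curve-fitting example. This illustration is my own, not from the lecture; it fits the same (synthetic) data once in the space of lines and once in the larger space of cubics, using NumPy's `polyfit`.

```python
import numpy as np

# Synthetic, roughly linear data (invented for illustration).
rng = np.random.default_rng(42)
x = np.linspace(0, 1, 20)
y = 3 * x + 0.5 + 0.05 * rng.standard_normal(20)

line = np.polyfit(x, y, deg=1)    # hypothesis space: degree-1 polynomials
cubic = np.polyfit(x, y, deg=3)   # larger space: degree-3 polynomials

# The linear fit recovers slope ~3 and intercept ~0.5. The cubic space
# contains every line as a special case, but admits many more hypotheses,
# so the same data constrains it less.
```

The choice of degree is exactly the kind of handle meant here: it fixes which functions the learner can even consider before any data is seen.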

The thing you want to understand is that every learning task starts from zero.

Okay?

Tabula rasa learning.

And indeed, if you change the function while you're learning it, you're almost always going to end up in an inconsistent state.

If you have a data set whose linear regression looks like that, and then in the middle...

Part of a video series

Accessible via: Open access

Duration: 01:26:40 min

Recording date: 2025-07-09

Uploaded on: 2025-07-10 23:49:07

Language: en-US
