26 - Artificial Intelligence II [ID:9445]


Welcome to the last lecture of AI at FAU.

So we are coming to an end.

There is no lecture AI 3, which is a pity because there's more AI than I've covered

here or we have covered together, I think we should say.

I would like to complete the reinforcement learning material we started yesterday, and then wrap up what we have learned and answer any questions that may have come up in the last days.

Maybe the most important question first, when is the exam?

I've received confirmation this morning when I met Mr Hoffman that the exam will be Tuesday.

But apparently he had scheduled it at 12.30 instead of 14.00 as I had asked him and announced

and so he said, oh yeah, then I'll move it to 14.00, not a problem.

And now there's an email from him saying, we have to talk, so I don't know.

But we do have a room: it's Hörsaal 11.

Okay so I hope to resolve this today when I actually get him on the phone.

The Nachklausur (resit exam) for KI1 is going to be on Monday at 10:30 in H10, in case one of you is affected.

So we know that, even though the official database doesn't know it yet.

So that's the state in the never ending story of the date and time.

Any questions so far, admin stuff?

We do plan to do the corrections directly and then hold the exam review in the days after.

Good. So, reinforcement learning is a form of unsupervised learning, and unsupervised learning is learning without labeled examples.

It's a slightly more tedious way of learning, because you don't have examples you can just optimize for.

But it's in a way more realistic.

So we're learning from rewards, which in this case we also call reinforcements. They can come at the end, or they can be hints the environment gives the agent along the way.

The topic of having rewards actually points us in the direction in which a solution to reinforcement learning could lie.

We've introduced rewards as part of Markov decision processes, and the main idea in reinforcement learning is to view it as an MDP, with the only difference that in Markov decision processes the reward function was fully observable, whereas in reinforcement learning it is only partially observable.

You should think of these as delayed rewards: the reinforcements don't come after every action.

In MDPs we had a reward after every action.

Here the reinforcements really come at intervals, or at the end, so we interpret that as a reward function which is only partially observable.

In theory, as a fiction, you're getting a reward after every action, except nobody tells you what it is, which is realistic.

You come to the AI lectures every Wednesday and Thursday, and you get a reward for that even if you don't know it, because by learning you're getting something out of it. It's not directly observable, but apparently you're getting, or at least expecting to get, something out of it.

And then of course the day of reckoning is Tuesday: you get your reinforcement, ultimately, in the exam.

Of course there are intermediate rewards as well, in getting points for your homeworks.

So your actions get partially observable rewards.
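This view of reinforcement learning as an MDP with sparse, delayed rewards can be made concrete with a tiny example. The following is a minimal sketch, not part of the lecture; the corridor environment, all names, and all parameter values are my own illustration. An agent in a five-state corridor is only rewarded when it reaches the goal state, and tabular Q-learning propagates that delayed reinforcement backwards through the intermediate states.

```python
import random

# A minimal sketch (not from the lecture) of the delayed-reward setting:
# a tiny corridor world where the agent is only rewarded on reaching the
# goal state, and tabular Q-learning that still learns from these sparse
# reinforcements. Environment, names, and parameters are illustrative.

random.seed(0)        # reproducibility of this sketch

N_STATES = 5          # states 0..4; reward only on reaching state 4
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.3

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Environment: the reward is 1.0 at the goal, 0.0 everywhere else."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

for episode in range(500):
    s, done = 0, False
    while not done:
        # epsilon-greedy action choice: mostly exploit, sometimes explore
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r, done = step(s, a)
        # Q-learning update: the delayed end-of-episode reward is
        # propagated backwards through the value estimates
        best_next = 0.0 if done else max(Q[(s2, act)] for act in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2
```

After enough episodes the greedy policy moves right from every interior state, even though a nonzero reward was only ever observed at the very end of an episode.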

Part of a video series
Accessible via: open access
Duration: 01:05:16 min
Recording date: 2018-07-12
Uploaded on: 2018-07-13 09:56:10
Language: en-US

This course covers the foundations of Artificial Intelligence (AI), in particular techniques for reasoning under uncertainty, machine learning, and language understanding.
It builds on and continues the lecture Künstliche Intelligenz I from the winter semester.

Learning objectives and competences
Subject, learning, and methodological competence

  • Knowledge: Students learn fundamental representation formalisms and algorithms of Artificial Intelligence.

  • Application: The concepts are applied to real-world examples (exercises).

  • Analysis: Through modelling in the machine, students learn to better assess human intelligence.
