18 - Artificial Intelligence II [ID:9310]

Okay, so we're still in the intro phase to machine learning, and we are essentially looking at a problem that plagues all learning paradigms: overfitting versus underfitting. Machine learning is an optimization process: as an agent, we are trying to optimize what we do with respect to an external performance measure. At the moment we are looking at machine learning from examples, and whatever examples we get only partially mirror the underlying processes we really want to learn. So there is a tendency that, instead of learning the underlying processes, we just learn the examples we have seen so far, which of course is only fair, because that is the only thing we are given. So the question is: can we do

something about overfitting, and can we do something about underfitting? Underfitting we can normally solve relatively easily by looking at more examples, if we have them. Overfitting can also sometimes be cured by more examples, because they will naturally generalize the behavior, but we also want to attack it actively: we sometimes generalize our solutions by just looking at them in and of themselves. That sometimes helps. We looked at one example of this, decision tree pruning: decision tree learning has found us a nice decision tree, and the question is whether we can make it better. The idea is to go through the terminal nodes of that tree and check, for every one of them, whether it offers enough information gain. Since we can already compute the information gain, the only real question is how much gain is enough to consider a node irrelevant. Fortunately,

statistics has an answer for us: standard significance tests. The idea for these significance tests is to use the information gain as a measure: we compare the information gain that a particular terminal node gives us to the information gain we would expect under the null hypothesis, namely that everything is just random. If it is sufficiently near randomness, for some value of "sufficiently near", we can say that this node does not give us enough information, so we throw it out. That makes our decision trees smaller, and smaller decision trees make fewer decisions, so they possibly generalize better. That's the idea. And then

there are some standard tricks of the trade. For instance, we look at the errors as a sum of squared deviations; we know how such a sum should be distributed, and comparing against that distribution gives us a measure. By some statistics voodoo we know that if this quantity is large enough, a statistician would call it significant. Don't erase me. Okay? That's the idea. What we're seeing here, and indeed what

we've been seeing the whole semester, is that AI has been on a huge shopping tour: into logic, into probabilities, into statistics, into control theory, into all kinds of things. It has been reinterpreting them in the context of trying to build intelligent agents, getting down to the nitty-gritty details and making those things efficient and practical. But many of the ideas come from other sciences. The same thing happens, by the way, with psychology, philosophy, and so on. AI takes in everything that is needed and munges it up under the heading of "we make intelligent agents". That is why, in this course, I keep alternating between showing you funny agent pictures and essentially doing math. You could think of this as an applied math course; for many intents and purposes it is, but it is an applied math course in which we implement the math more than applied mathematicians do. Here we are on a shopping tour into statistics; information gain and so on was a shopping tour into information theory and control theory.
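The pruning test sketched above can be made concrete. Here is a minimal Python sketch, assuming the setting from the lecture: a terminal node splits p positive and n negative examples into children with counts (p_k, n_k). `information_gain` measures what the split buys us in bits, and `chi_squared_statistic` computes the total squared deviation of the child counts from what the null hypothesis (the attribute is irrelevant, everything is chance) predicts; under that null hypothesis the statistic is approximately chi-squared distributed with v - 1 degrees of freedom for v children. The function names and the hard-coded 5% critical value for one degree of freedom (about 3.84) are my own choices for illustration, not the lecture's reference implementation.

```python
import math

def entropy(pos, neg):
    """Entropy in bits of a boolean sample with pos/neg examples."""
    total = pos + neg
    if total == 0 or pos == 0 or neg == 0:
        return 0.0
    p = pos / total
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def information_gain(pos, neg, splits):
    """Gain of an attribute splitting (pos, neg) into (pos_k, neg_k) children."""
    total = pos + neg
    remainder = sum((pk + nk) / total * entropy(pk, nk) for pk, nk in splits)
    return entropy(pos, neg) - remainder

def chi_squared_statistic(pos, neg, splits):
    """Deviation of the observed child counts from what pure chance
    (the null hypothesis: the attribute is irrelevant) would predict."""
    total = pos + neg
    delta = 0.0
    for pk, nk in splits:
        expected_p = pos * (pk + nk) / total  # chance prediction, positives
        expected_n = neg * (pk + nk) / total  # chance prediction, negatives
        if expected_p > 0:
            delta += (pk - expected_p) ** 2 / expected_p
        if expected_n > 0:
            delta += (nk - expected_n) ** 2 / expected_n
    return delta

# A perfectly informative split of 6 positive / 6 negative examples:
splits = [(6, 0), (0, 6)]
gain = information_gain(6, 6, splits)       # one full bit of information
delta = chi_squared_statistic(6, 6, splits)
# With v = 2 children there is v - 1 = 1 degree of freedom; the 5%
# critical value of chi-squared with 1 dof is roughly 3.84.
keep_node = delta > 3.84                    # significant, so keep the node
```

If the statistic stays below the critical value, the node's gain is "sufficiently near" what randomness would produce, and the pruning rule throws the node out.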

What we looked at next was not only optimizing for the best hypothesis under some criterion, but also finding the right hypothesis. One of the ways we're doing that

Part of a video series:

Accessible via: open access

Duration: 01:25:36 min

Recording date: 2018-06-14

Uploaded: 2018-06-21 08:09:39

Language: en-US

This course covers the foundations of Artificial Intelligence (AI), in particular techniques for reasoning under uncertainty, machine learning, and language understanding.
The course builds on the lecture Künstliche Intelligenz I from the winter semester and continues it.

Learning objectives and competencies
Subject, learning, and methodological competence

  • Knowledge: Students become familiar with fundamental representation formalisms and algorithms of Artificial Intelligence.

  • Application: The concepts are applied to examples from the real world (exercises).

  • Analysis: Through modelling in the machine, students learn to better assess human intelligence.
