19 - Artificial Intelligence II [ID:9320]

A little admin first: I've finally gotten an answer about the exam, and it's going

to be on the 16th or 17th. We don't know where, we don't know when, and I'm sure we'll find

out in time before the exam, let me put it that way. So probably a couple of days earlier.

I'm sorry about this; it's the best I can do. So, we have been talking about machine learning,

and what we did last week was essentially pose the whole problem as a minimization problem.

And the idea is that we want to minimize errors, namely misclassified new examples.
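As a first concrete sketch of this minimization view, here is a small Python snippet. The function names and the toy hypothesis h(x) = 2x are made up for illustration; the losses shown are the standard zero/one, absolute-value, and squared-error losses the lecture works with:

```python
def zero_one_loss(y_true, y_pred):
    """0/1 loss: 0 if correctly classified, 1 otherwise."""
    return 0 if y_true == y_pred else 1

def absolute_loss(y_true, y_pred):
    """L1 loss: |y - h(x)|."""
    return abs(y_true - y_pred)

def squared_loss(y_true, y_pred):
    """L2 loss: (y - h(x))^2."""
    return (y_true - y_pred) ** 2

def empirical_loss(loss, hypothesis, examples):
    """Average loss over the N examples seen so far:
    every example is weighted equally (1/N)."""
    return sum(loss(y, hypothesis(x)) for x, y in examples) / len(examples)

# A toy hypothesis h(x) = 2x on three made-up examples:
examples = [(1, 2), (2, 5), (3, 6)]
h = lambda x: 2 * x
print(empirical_loss(squared_loss, h, examples))  # (0 + 1 + 0) / 3
```

Learning then means searching the hypothesis space for the h that minimizes this average.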

Okay, so what better than just writing down the error function, or more generally the loss function,

because that allows us to talk about utility as well, and then just minimizing that over all the

hypotheses. Mathematically very simple, conceptually very simple; there is a problem there of course,

namely that the hypothesis space is very big. And of course there is the question of what exactly

the loss function is, how we measure it, and so on. And we have a couple of, say,

preliminary examples. We typically use three loss functions. One is the zero/one loss, which

is essentially the error rate: it counts zero if an example is correctly classified and one if it is

not. Then we have more continuous functions, like the absolute-value loss or the

squared-error loss. The squared-error loss is the one we like best. And so essentially we looked

at the generalization loss: the loss measure of a pair (x, y) under our hypothesis,

times the probability of that pair occurring. Nice theory, wonderful, except we don't

know P. So what we do instead is what we call the empirical loss, where we basically

only look at the examples we have seen so far: instead of a sum over all examples

weighted by their probabilities, we get a sum in which every example is weighted

equally. This is the best we can do; it's still not very good, because we're

only looking into the past, and we have the usual problems there. But it still gives us a good

formulation of the learning problem, and we are going to use it in the future. One of the kind

of wrinkles on this is that you can use essentially the same idea to combat overfitting

at the same time. The only thing we do is extend the empirical loss function with a

complexity term, which measures and thus penalizes complex hypotheses. The typical example:

for polynomials, you just add the sum of the squares of the coefficients, which basically

penalizes the squiggly curves and gives you small-degree polynomials, which are just one way of being

simpler. We call this technique regularization. If you formulate your machine learning

problem as a minimization problem, you can do these kinds of things by just tinkering with

the function you want to minimize. It gives you a very principled way of looking at this.
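A minimal sketch of this tinkering, assuming the complexity term is lambda times the sum of squared polynomial coefficients (i.e. ridge regression on polynomial features; for simplicity the constant term is penalized too, and all names here are illustrative):

```python
import numpy as np

def fit_polynomial(xs, ys, degree, lam):
    """Minimize empirical squared loss + lam * sum(w_i^2) in closed form."""
    X = np.vander(xs, degree + 1)  # columns: x^degree, ..., x^1, 1
    # Closed-form minimizer of ||Xw - y||^2 + lam * ||w||^2:
    w = np.linalg.solve(X.T @ X + lam * np.eye(degree + 1), X.T @ ys)
    return w

rng = np.random.default_rng(0)
xs = np.linspace(-1, 1, 20)
ys = xs + 0.1 * rng.standard_normal(20)  # roughly linear toy data

w_plain = fit_polynomial(xs, ys, degree=9, lam=0.0)
w_reg = fit_polynomial(xs, ys, degree=9, lam=1.0)
# The penalty shrinks the coefficient vector, disfavoring the
# "squiggly" degree-9 fit in favor of an effectively simpler one:
print(np.linalg.norm(w_plain), np.linalg.norm(w_reg))
```

The point is that the only change from plain empirical-loss minimization is the extra `lam * ...` term in the function being minimized.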

And we looked at one such, kind of, theoretical-computer-science-inspired way of doing that,

where we basically postulate a complexity measure by encoding everything in Turing

machines, typically as programs for a universal Turing machine, and then

just counting bits. Okay, so that's something you can do, and by and large this works very well,

because it's very well thought out. In particular, it gets you around having to somehow

specify, meaning conjure out of thin air, this factor lambda, which is basically what you need to

put both of these measures onto the same scale. With minimal description length, we

have everything on the same scale, namely bits, and that works quite well. Okay, so far, say, the classical

theory, developed essentially in statistics and so on, using probability, for small examples. And

everybody thought this was quite nice, but it's not real fun when you do it with pen and paper. But if

you have huge amounts of data, which really is something you only get with the internet, and huge

amounts of computation power, then you can actually turn this into the scientific-

reputation-making machine and money-making machine that machine learning has turned into in the last

one or two decades, which is about the time we have had the internet

and huge amounts of computation. So basically, what we started doing now is what we now think of as

machine learning. First we looked at some theory, and this is, in a way, the Turing machines

of learning: asking all kinds of questions and getting negative answers, just like

we have the halting problem in theoretical computer science, which basically says that not everything is

computable, and the things that are are kind of, yeah, kind of boring. Here we have a couple of

Part of a video series:

Accessible via: Open access

Duration: 01:26:39 min

Recording date: 2018-06-20

Uploaded on: 2018-06-21 17:30:35

Language: en-US

This course covers the foundations of Artificial Intelligence (AI), in particular techniques for reasoning under uncertainty, machine learning, and natural language understanding.
The course builds on the lecture Künstliche Intelligenz I from the winter semester and continues from it.

Learning objectives and competences
Subject, learning, and methodological competence

  • Knowledge: Students become acquainted with fundamental representation formalisms and algorithms of Artificial Intelligence.

  • Application: The concepts are applied to examples from the real world (exercise problems).

  • Analysis: Through modelling in the machine, students learn to better assess human intelligence capabilities.
