54 - Recap Clip 8.16: Artificial Neural Networks (Part 3) [ID:30457]

We looked at learning, and the maths, viewed from a slight distance, looks essentially like it always has before.

We're minimizing a loss function, which usually turns out to be squared-error loss, and that allows us to do gradient descent.

For that we need to compute partial derivatives, and if we do that we get a weight update rule that is relatively simple.

So in single-layer perceptrons you iterate this weight update until you have good weights.
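
As a concrete illustration of that update (a minimal sketch added here, not code from the lecture; it assumes a sigmoid unit, squared-error loss, and leaves bias handling to the caller), the gradient-descent loop for a single-layer perceptron looks roughly like this:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def train_single_layer(X, y, alpha=0.1, epochs=100):
        """Gradient descent on squared-error loss for one sigmoid unit.
        X: (n_samples, n_features) inputs, y: (n_samples,) targets in {0, 1}."""
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            for x_i, y_i in zip(X, y):
                in_i = np.dot(w, x_i)      # weighted sum of the inputs
                h = sigmoid(in_i)          # current output of the unit
                # the update below is the partial derivative of 0.5 * (y - h)^2
                # pushed through the sigmoid: (y - h) * g'(in) * x
                w = w + alpha * (y_i - h) * h * (1 - h) * x_i
        return w

Here alpha is the learning rate and the factor h * (1 - h) is the derivative of the sigmoid; repeating the epoch loop is exactly the "iterate until you have good weights" step.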

Nothing terribly interesting here, but for certain functions we're getting extremely good learning behavior.

Okay, so for the majority function we're getting vastly better performance than, say, decision tree learning.

For other functions, say the restaurant data we looked at first, perceptrons have no chance.

Why not?

Because there's a realizability problem: one-layer perceptrons cannot even express the Boolean function that's behind the restaurant data.

So there we get good performance with decision tree learning, and perceptrons just have no chance.
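
The precise Boolean function behind the restaurant data isn't spelled out in this recap, so as a stand-in, the standard minimal example of the same realizability problem is XOR: a single threshold unit with weights w1, w2 and threshold t would need w1*0 + w2*0 <= t (so t >= 0), w1 > t, w2 > t, and w1 + w2 <= t; but the middle two conditions give w1 + w2 > 2t >= t, which contradicts the last one, so no single-layer perceptron can express XOR.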

So how do we change that?

Well, instead of single-layer perceptrons we have multi-layer neural networks, where you have the input layer and the output layer (I've chosen here to have a single output, so this is a Boolean function), and we have hidden layers in between.

The neurons on this layer and on that layer look the same.

So these hidden layers, the hidden units, can do things: they can actually nest behavior. The typical example is this network, which is also a multi-layer one, just turned by 90 degrees, so you have a two-output neural network. And we can get non-linearities by just nesting linear behaviors.

If you nest twice you get something that's quadratic.

And if you think about it graphically: if you combine two of these cliffs you can get a ridge, if you combine two ridges you get a bump, and then you can kind of build stuff on top of that. So these layered neural networks have the potential to approximate essentially any surface.
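
To make that geometric picture concrete (again a sketch added here, with hand-picked weights purely for illustration), two opposite-facing soft thresholds combine into a ridge, and feeding two perpendicular ridges into a further soft-threshold unit gives a localized bump:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x1, x2 = np.meshgrid(np.linspace(-5, 5, 101), np.linspace(-5, 5, 101))

    # two opposite-facing "cliffs" along x1 combine into a ridge around x1 = 0
    ridge_x1 = sigmoid(4 * (x1 + 1)) - sigmoid(4 * (x1 - 1))
    # the same construction along x2
    ridge_x2 = sigmoid(4 * (x2 + 1)) - sigmoid(4 * (x2 - 1))
    # a soft-threshold unit on top of the two ridges yields a bump near the origin
    bump = sigmoid(6 * (ridge_x1 + ridge_x2 - 1.5))

Plotting bump as a surface shows the localized peak; stacking more units of this kind is what gives these networks the capacity to approximate more complicated surfaces.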

The problem is how do we learn with them?

And the idea here is that, given an actual input-output pair, you look at the current weights; we do an iterative procedure again, and the current weights allow you to compute forward from inputs to outputs. During learning we're in the good situation that we know what the value should have been, in this case true or false.

So we can see whether we've gotten it right or not, which allows us to compute backwards, like we do in linear regression for one level, and that gives us corrected weights on this layer and corrected virtual inputs, which then allow us to iterate the procedure.

So that's what this back-propagation rule does; it's just the update rule, except there's one thing we have to change: we can't separate each layer into single-output cases, we have to work with the whole output vector. That's the only change in the math: we have to make it multi-output from the start, and then everything becomes vectors, but we get an update rule that is essentially the same.
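
To make the forward and backward passes concrete, here is a minimal numpy sketch (my own illustration under the assumptions of this recap: sigmoid activations, squared-error loss, one hidden layer, a single output, no bias terms, outer training loop omitted) of one back-propagation step:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def backprop_step(x, y, W1, W2, alpha=0.1):
        """One forward/backward pass.
        x: (n_in,) input, y: target in {0, 1},
        W1: (n_hidden, n_in) hidden-layer weights, W2: (n_hidden,) output weights."""
        # forward: compute from inputs to outputs with the current weights
        hidden = sigmoid(W1 @ x)
        out = sigmoid(W2 @ hidden)

        # backward: compare with the known target and propagate the error
        delta_out = (y - out) * out * (1 - out)                   # error at the output unit
        delta_hidden = hidden * (1 - hidden) * (W2 * delta_out)   # errors at the hidden units

        # weight updates, same shape as the single-layer rule
        W2 = W2 + alpha * delta_out * hidden
        W1 = W1 + alpha * np.outer(delta_hidden, x)
        return W1, W2

With several output units, delta_out becomes a vector and W2 a matrix, which is exactly the "everything becomes vectors" change mentioned above; the form of the update rule stays the same.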

Part of chapter: Recaps
Accessible via: Open access
Duration: 00:06:23 min
Recording date: 2021-03-30
Uploaded on: 2021-03-31 11:46:35
Language: en-US

Recap: Artificial Neural Networks (Part 3)

Main video on the topic in chapter 8 clip 16.
