2 - Deep Learning [ID:9595]

Okay, so welcome back, everybody. It seems finding this place is still difficult. We still have 207 registrations in Stuttgart and about 150 people in the exercises, so I hope everybody got an extra spot in the exercises. Nodding, excellent, very well. Then we can continue with our lecture, and today we really want to start learning about neural networks, because we've seen last time that the perceptron is quite a nice idea, but it can only find linear decision boundaries and we couldn't solve the XOR problem. So today we want to see how we can actually solve that, and we'll see even more.

We will even see that neural networks are universal function approximators. So a neural network is actually able to learn any kind of function, and that will be extremely useful and give us a theoretical basis to go towards deep learning. But there are also a couple of important messages that you should take away from this lecture. One of these messages will be that already with one hidden layer you have such a universal function approximator, and that is to some extent contradictory to what we see right now with deep learning; but we will also give a couple of examples of why it actually makes sense to construct deeper networks and not just a single hidden layer.
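To make this concrete, here is a minimal sketch (my own illustration, not part of the lecture) of the kind of function a network with a single hidden layer computes; the universal approximation statement refers to this function class as the number of hidden units grows. NumPy, the sigmoid choice, and all variable names are my assumptions.

```python
import numpy as np

# Sketch of a one-hidden-layer network: f(x) = w2 . sigma(W1 x + b1) + b2.
def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def one_hidden_layer(x, W1, b1, w2, b2):
    """Evaluate a network with a single hidden layer on the input vector x."""
    h = sigmoid(W1 @ x + b1)   # hidden activations
    return w2 @ h + b2         # linear output layer

# Example: 3 hidden units on a 2-dimensional input, random illustrative weights.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)
w2, b2 = rng.normal(size=3), 0.0
print(one_hidden_layer(np.array([0.5, -1.0]), W1, b1, w2, b2))
```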

Okay, so today we want to talk a bit about this neural network model and expand on our previous observations on the perceptron. We want to develop perceptrons further into universal function approximators. Then we want to look a bit into activations and how we can actually use them for classification. One way to do that quite easily is the softmax function, which you will see used quite often in the literature to map a general function output onto a probability-like structure.
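As a small illustration (my own sketch, not taken from the lecture), here is the softmax function in NumPy; the shift by the maximum is a common numerical-stability detail and an assumption of mine, not something stated in the transcript.

```python
import numpy as np

def softmax(z):
    """Map arbitrary scores z to a probability-like vector (non-negative, sums to 1)."""
    z = z - np.max(z)          # shift for numerical stability; the result is unchanged
    e = np.exp(z)
    return e / np.sum(e)

scores = np.array([2.0, 1.0, -1.0])   # e.g. raw network outputs for 3 classes
print(softmax(scores))                # roughly [0.71, 0.26, 0.04]
print(softmax(scores).sum())          # 1.0
```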

We also want to talk a bit about the optimization, that is, how to determine the weights, and this is going to be the backpropagation algorithm. So today we really learn the basics about neural networks and the training process. We will then follow this with the layer abstraction: we will not only look at a single neuron and how to train that neuron's weights, but also at how we can update an entire layer and all the weights in that layer in quite an efficient way. We will look into examples for that. Okay, so let's start with the model. We've seen that the perceptron has a very simple decision rule, where we essentially have this, let's say, hyperplane or line, so we have a linear decision boundary, and with the sign function we can say, okay, we are on the one side or the other side of this plane. That's all that our perceptron can do. You remember the structure: we extend our input vector by a one, and then we connect the entries to the respective weights (note there is a small mistake on the slide here: this is supposed to be a two, and another two, but a one for all the other weights). Then we sum them up, put the sum into the sign function, and we can decide on which side we actually are. This is nothing else than an inner product of two vectors, which we evaluate by this multiplication and summation, so we can even use this short notation and write it up very easily. For a new sample we would then just evaluate this inner product, compute the signed distance, and the sign tells us which class we should associate with.
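The decision rule just described can be sketched in a few lines of NumPy (my own example; the weight values and function names are illustrative assumptions, not the lecture's notation): the input is extended by a constant 1 so the bias becomes part of the weight vector, and the class follows from the sign of the inner product.

```python
import numpy as np

def perceptron_classify(x, w):
    """Return +1 or -1 depending on which side of the hyperplane x lies on."""
    x_ext = np.append(x, 1.0)      # extend the input by a 1 (bias trick)
    signed_distance = w @ x_ext    # inner product <w, x_ext>
    return np.sign(signed_distance)

w = np.array([1.0, -2.0, 0.5])     # two input weights plus the bias weight
print(perceptron_classify(np.array([3.0, 1.0]), w))   # -> 1.0
print(perceptron_classify(np.array([0.0, 1.0]), w))   # -> -1.0
```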

Technically, we are operating here in the domain of pattern recognition, and in a traditional sense these perceptrons are located here: they are classifiers. In this class we are not treating preprocessing and feature extraction as separate topics; in fact, we will even show that we can make preprocessing and feature extraction part of the whole classification system and train all the parameters that are required there together with the classification, such that everything is optimized jointly in one training process. Previously, these steps were tuned by engineers who knew something about the problem: they would adjust the preprocessing step such that the signal quality improves, and the feature extraction step such that you compute numbers that are decisive for your problem. This is often called handcrafted features, or handcrafted feature design, in the literature. We now want to go ahead and complement this handcrafted feature design with state-of-the-art neural network architectures that let you determine all the weights of this feature extraction automatically as well, and the concepts we discuss now are also important across these architectures. So this is the famous XOR problem, and you can see it can't be solved with a line, because you would

Part of a video series:

Accessible via: Open Access

Duration: 01:24:57 min

Recording date: 2018-10-23

Uploaded on: 2018-10-24 19:29:34

Language: en-US

Deep Learning (DL) has attracted much interest in a wide range of applications such as image recognition, speech recognition and artificial intelligence, both from academia and industry. This lecture introduces the core elements of neural networks and deep learning; it comprises:

  • (multilayer) perceptron, backpropagation, fully connected neural networks

  • loss functions and optimization strategies

  • convolutional neural networks (CNNs)

  • activation functions

  • regularization strategies

  • common practices for training and evaluating neural networks

  • visualization of networks and results

  • common architectures, such as LeNet, AlexNet, VGG, GoogLeNet

  • recurrent neural networks (RNN, TBPTT, LSTM, GRU)

  • deep reinforcement learning

  • unsupervised learning (autoencoder, RBM, DBM, VAE)

  • generative adversarial networks (GANs)

  • weakly supervised learning

  • applications of deep learning (segmentation, object detection, speech recognition, ...)
