Okay, so welcome back, everybody. It seems finding this place is still difficult. We still have 207 registrations in Stuttgart and about 150 people in the exercises, so I hope everybody got an extra spot in the exercises. Nodding, excellent, very well.
Then we can continue with our lecture. Today we really want to start learning about neural networks, because we saw last time that the perceptron is quite a nice idea, but it can only find linear decision boundaries and it could not solve the XOR problem. So today we want to see how we can actually solve that, and we will see even more: with neural networks we get universal function approximators. A neural network is able to approximate essentially any function, and that will be extremely useful and give us a theoretical basis on the way towards deep learning. But there are also a couple of important messages that you should take away from this lecture. One of them is that already with one hidden layer you have such a universal function approximator, which to some extent seems contradictory to what we see right now with deep learning; but we will also give a couple of examples of why it actually makes sense to construct deeper networks and not just a single hidden layer.
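To make that claim a bit more concrete (this is the classical universal approximation statement in the spirit of Cybenko and Hornik; the recording itself does not spell out a formula): for any continuous function $f$ on a compact set $K$, any $\varepsilon > 0$, and a suitable non-constant activation $\sigma$, there exist a width $N$ and parameters $v_i, \mathbf{w}_i, b_i$ such that

    \left| \, f(\mathbf{x}) - \sum_{i=1}^{N} v_i \, \sigma\!\left(\mathbf{w}_i^\top \mathbf{x} + b_i\right) \right| < \varepsilon \qquad \text{for all } \mathbf{x} \in K.

The theorem only guarantees that such a network exists; it says nothing about how large $N$ has to be or how to find the weights, which is one reason why deeper networks can still pay off.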
Okay, so today we want to talk a bit about the neural network model and expand on our previous observations on the perceptron. We want to develop the perceptron further towards universal function approximators. Then we want to look a bit into activations and how we can actually use them for classification; one way to do that quite easily is the softmax function, which you will see quite frequently in the literature, and which maps a general function output onto a probability-like structure (written out right after this overview). We also want to talk a bit about optimization, that is, how to determine the weights, and this is going to be the backpropagation algorithm. So today we really learn the basics about neural networks and the training process. We will then follow this with the layer abstraction: not only how to train the weights of a single neuron, but also how to update an entire layer and all of its weights in a quite efficient way. We will look at examples of that.
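For reference, the softmax mentioned above can be written as follows (standard definition; the symbols are mine, not taken from the slides). Given scores $z_1, \dots, z_K$ for $K$ classes,

    \operatorname{softmax}(\mathbf{z})_j = \frac{\exp(z_j)}{\sum_{k=1}^{K} \exp(z_k)}, \qquad j = 1, \dots, K,

so all outputs are positive and sum to one, which is exactly the probability-like structure we want.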
Okay, so let's start with the model. We have seen that the perceptron has a very simple decision rule: we essentially have a hyperplane, or a line, that is, a linear decision boundary, and with the sign function we can say whether we are on the one side or the other side of this plane. That is all our perceptron can do. You remember the structure: we extend our input vector by a one, and then we connect the entries to the respective weights (note there is a small mistake on the slide: this is supposed to be a two, and another two, while it is a one for all the other weights). Then we sum everything up, put it into the sign function, and we can decide on which side we actually are. This is nothing other than an inner product of two vectors, evaluated by this multiplication and sum, so we can even use a short notation to write this up very easily. For a new sample, we would then just evaluate this inner product, compute the signed distance, and the sign tells us which class to assign.
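In the short notation (my symbols; the slide may use slightly different indices), with the input extended by a constant one so that the bias becomes part of the weight vector, the decision rule reads

    \hat{y} = \operatorname{sign}\!\left(\mathbf{w}^\top \mathbf{x}\right), \qquad \mathbf{x} = (1, x_1, \dots, x_d)^\top, \quad \mathbf{w} = (w_0, w_1, \dots, w_d)^\top,

where $\mathbf{w}^\top \mathbf{x}$ is proportional to the signed distance of $\mathbf{x}$ from the decision boundary.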
Technically, we are operating here in the domain of pattern recognition, and in the traditional sense the perceptron is located at the classifier stage of the pipeline. In this class we are not treating preprocessing and feature extraction as separate topics; in fact, we will even show that we can make preprocessing and feature extraction part of the whole classification system and train all the parameters required there together with the classifier, such that everything is optimized jointly in one training process. Previously, these steps were tuned by engineers who knew something about the problem: they would adjust the preprocessing such that the signal quality is improved, and the feature extraction such that it computes numbers that are decisive for the problem at hand. In the literature this is often called handcrafted features, or handcrafted feature design. We now want to go ahead and complement this handcrafted feature design with state-of-the-art neural network architectures that also determine all the weights of the feature extraction automatically, and the concepts we look at right now are important across these architectures.
So this is the famous XOR problem, and you can see that it cannot be solved with a single line, because you would always misclassify at least one of the four points.
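As a small preview of where this is going (this sketch is mine, not from the recording, and the weights are hand-picked rather than learned), a single hidden layer with two threshold units is already enough to realize XOR, while a single perceptron is not:

    import numpy as np

    def step(z):
        # Threshold activation, like the perceptron's sign but mapped to {0, 1}.
        return (z > 0).astype(int)

    # Inputs are extended by a leading 1 so the bias is part of the weight vector.
    X = np.array([[1, 0, 0],
                  [1, 0, 1],
                  [1, 1, 0],
                  [1, 1, 1]])

    # Hidden layer: first unit fires for "x1 OR x2", second for "x1 AND x2".
    W_hidden = np.array([[-0.5, 1.0, 1.0],    # OR
                         [-1.5, 1.0, 1.0]])   # AND
    # Output unit: "OR and not AND", i.e. XOR.
    w_out = np.array([-0.5, 1.0, -1.0])

    h = step(X @ W_hidden.T)                  # hidden activations, shape (4, 2)
    h_ext = np.hstack([np.ones((4, 1)), h])   # extend by 1 for the output bias
    y = step(h_ext @ w_out)

    print(y)  # -> [0 1 1 0], the XOR of the two inputs

The first hidden unit acts as an OR gate and the second as an AND gate; the output unit then computes "OR and not AND", which is exactly XOR.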
Deep Learning (DL) has attracted much interest in a wide range of applications such as image recognition, speech recognition and artificial intelligence, both from academia and industry. This lecture introduces the core elements of neural networks and deep learning; it comprises:
- (multilayer) perceptron, backpropagation, fully connected neural networks
- loss functions and optimization strategies
- convolutional neural networks (CNNs)
- activation functions
- regularization strategies
- common practices for training and evaluating neural networks
- visualization of networks and results
- common architectures, such as LeNet, Alexnet, VGG, GoogleNet
- recurrent neural networks (RNN, TBPTT, LSTM, GRU)
- deep reinforcement learning
- unsupervised learning (autoencoder, RBM, DBM, VAE)
- generative adversarial networks (GANs)
- weakly supervised learning
- applications of deep learning (segmentation, object detection, speech recognition, ...)