So welcome everybody to our lecture on deep learning.
So it's a great pleasure today to talk about visualization and attention mechanisms.
So we will briefly discuss why we want to talk about visualization and why it is actually quite an interesting technique, in particular in combination with deep networks.
Then we will talk a bit about different things to visualize.
So first of all we also want to visualize architectures, because we want to communicate the structure and topology of our networks to other people, like other researchers, colleagues, and so on.
And there are different ways of visualizing this.
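Just to give a rough idea of the simplest, textual form of such a visualization, here is a minimal sketch in PyTorch that prints the topology of a small, purely hypothetical CNN; graphical tools exist as well, but this is the quickest way to communicate the layer structure.

```python
import torch.nn as nn

# Hypothetical small CNN for 28x28 grayscale inputs (illustrative only).
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 14 * 14, 10),
)

# Printing the module gives a layer-by-layer textual description of the topology.
print(model)
```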
Then we want to talk a bit about visualizing the training and what we can learn from the training process itself.
Some of this we have already discussed, but we thought we could describe it with a little more clarity here as well.
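As a small illustration of what is meant here, the sketch below plots hypothetical training and validation loss curves with matplotlib; the numbers are made up purely for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical per-epoch losses recorded during training (made-up numbers).
train_loss = [2.30, 1.45, 0.98, 0.76, 0.64, 0.58]
val_loss = [2.28, 1.52, 1.10, 0.95, 0.92, 0.94]

plt.plot(train_loss, label="training loss")
plt.plot(val_loss, label="validation loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()

# A widening gap between the two curves typically indicates overfitting;
# a flat or diverging training curve points to learning-rate or data problems.
```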
And then we want to look into the visualization of parameters, like individual parameters and convolution masks, and then gradient-based visualization techniques and parameter visualization via optimization.
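For the convolution masks, one common approach is simply to display the learned first-layer filters as small images. The sketch below assumes a recent torchvision and uses a pretrained AlexNet only as a convenient example network:

```python
import matplotlib.pyplot as plt
from torchvision import models

# Pretrained network whose first convolution operates directly on RGB images.
model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
filters = model.features[0].weight.detach()              # shape: (64, 3, 11, 11)

# Normalize to [0, 1] so each filter can be shown as an RGB image.
filters = (filters - filters.min()) / (filters.max() - filters.min())

fig, axes = plt.subplots(8, 8, figsize=(8, 8))
for ax, f in zip(axes.flat, filters):
    ax.imshow(f.permute(1, 2, 0).numpy())                 # channels-last for imshow
    ax.axis("off")
plt.show()
```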
So these are the parts on visualization, and then we will briefly shift our focus towards something slightly different, which is nevertheless to some degree based on what we are discussing here, in particular the gradient-based visualization: attention mechanisms.
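To preview the gradient-based idea: we compute the gradient of a class score with respect to the input pixels and look at its magnitude as a saliency map. A minimal sketch, assuming `model` is some trained classifier and `image` a normalized input tensor of shape (3, H, W), both hypothetical here:

```python
import torch

def saliency_map(model, image, target_class):
    """Gradient magnitude of the target class score w.r.t. the input pixels."""
    model.eval()
    image = image.clone().requires_grad_(True)       # track gradients on the input
    score = model(image.unsqueeze(0))[0, target_class]
    score.backward()
    # Maximum absolute gradient over the color channels gives an (H, W) map.
    return image.grad.abs().max(dim=0).values

# Usage (hypothetical): sal = saliency_map(model, image, target_class=243),
# followed by plt.imshow(sal) to display the map.
```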
And attention mechanisms turned out to be really useful, in particular for automatic translation systems, but attention has also become very popular in other network topologies, so this is why we are discussing it at this point.
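As a small preview of what an attention mechanism computes, here is a minimal sketch of scaled dot-product attention in PyTorch; the tensor shapes are only illustrative assumptions.

```python
import torch

def scaled_dot_product_attention(Q, K, V):
    """Q: (n_queries, d); K, V: (n_keys, d). Returns one weighted value per query."""
    d = Q.shape[-1]
    scores = Q @ K.transpose(-2, -1) / d ** 0.5   # similarity of each query to each key
    weights = torch.softmax(scores, dim=-1)       # attention weights sum to 1 per query
    return weights @ V, weights

# Illustrative example with random data.
Q, K, V = torch.randn(2, 8), torch.randn(5, 8), torch.randn(5, 8)
output, weights = scaled_dot_product_attention(Q, K, V)
print(output.shape, weights.shape)                # torch.Size([2, 8]) torch.Size([2, 5])
```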
Okay, so let's motivate this a bit.
Well, why do we need visualization?
So far we have been considering neural networks somehow as a kind of black box.
It just learns something: we have some topology that we put into this black box, then we train it, but then we typically have problems understanding what is happening inside this black box.
So we would like to understand and communicate, first of all, the inner workings, which means the topology, but then also the learned state of a network, and try to interpret and figure out what it is actually doing.
And these are the main topics that we want to focus on today.
So why does this matter?
Of course we want to communicate with others, we want to identify issues during training, for example if there is no convergence or if problems like dying ReLUs emerge, and obviously we want to be able to identify faulty training and test data.
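Dying ReLUs, for example, can be spotted by counting how many units stay at exactly zero over a whole batch. The sketch below is only one possible diagnostic and assumes a hypothetical `model` containing nn.ReLU modules and an input batch `x`:

```python
import torch
import torch.nn as nn

def dead_relu_units(model, x):
    """Per ReLU layer: fraction of units that output zero for every sample in the batch."""
    fractions, hooks = {}, []
    for name, module in model.named_modules():
        if isinstance(module, nn.ReLU):
            def hook(m, inp, out, name=name):
                flat = out.flatten(1)                       # (batch, units)
                fractions[name] = (flat == 0).all(dim=0).float().mean().item()
            hooks.append(module.register_forward_hook(hook))
    with torch.no_grad():
        model(x)
    for h in hooks:
        h.remove()
    return fractions

# Example usage with a toy model:
# model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 5), nn.ReLU())
# print(dead_relu_units(model, torch.randn(32, 10)))
```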
Also something we haven't discussed so far: up to now we have been living in this ideal world where the training and test sets are correctly sampled and all is good, but when you work with real data you can actually have trouble with the collection of the data sets, and they may not really be appropriate for your problem, so you should really think about that and identify such issues if they happen.
And then we would also like to identify what the network is actually learning and why.
So there are three types of visualization: the architecture visualization, the training visualization, and the visualization of the learned parameters and weights.
So first we focus on the structure, second on what is being trained and how the training process proceeds, and the last part is on a trained network, where we are already done with the training.
So let's talk a bit about the network architecture visualization.
Of course we want to communicate architectures effectively and obviously priors may be imposed
Deep Learning (DL) has attracted much interest in a wide range of applications such as image recognition, speech recognition, and artificial intelligence, both from academia and industry. This lecture introduces the core elements of neural networks and deep learning; it comprises:
- (multilayer) perceptron, backpropagation, fully connected neural networks
- loss functions and optimization strategies
- convolutional neural networks (CNNs)
- activation functions
- regularization strategies
- common practices for training and evaluating neural networks
- visualization of networks and results
- common architectures, such as LeNet, AlexNet, VGG, GoogLeNet
- recurrent neural networks (RNN, TBPTT, LSTM, GRU)
- deep reinforcement learning
- unsupervised learning (autoencoder, RBM, DBM, VAE)
- generative adversarial networks (GANs)
- weakly supervised learning
- applications of deep learning (segmentation, object detection, speech recognition, ...)
Due to technical problems, the first few minutes of the lecture are missing. We apologize for this.