So welcome back, everybody, to our lecture on Deep Learning. Just a side note: after next week's lecture you will be able to register for the oral exam, and I will bring the registration forms. So let's continue with our lecture. Today we want to discuss visualization, and here we have a couple of topics. First of all, we want to discuss why we want to visualize things. Then, of course, the neural network architectures that we want to visualize, because we can use them to communicate with other researchers and other people involved in deep learning. Visualization of the training is also very interesting, because it helps us to understand how the training process actually proceeds. And then we have a couple of methods to visualize the learned parameters, ranging from simple parameter visualizations over gradient-based visualizations to parameter visualization via optimization.
Okay, so let's start: why do we want to work with visualization? Well, you can see that neural networks are often treated as a black box. We train something, but then we are left with the trained parameters and we don't know how to interpret them. What we want to do today is really understand how to communicate the inner workings of such a network. Here is an incomplete list of reasons why this is important. Obviously, we want to communicate between researchers. You want to identify issues during training, for example problems with convergence or the dying ReLU problem (see the sketch below for one way to monitor this). You also want to see whether there is a problem with your training or your test data set. And obviously, once we have trained our network, we also want to understand how, why, and what the network actually learned. That is also a very important aspect.
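To make the point about training issues a bit more concrete, here is a minimal sketch, not from the lecture, of how one could monitor the dying ReLU problem in PyTorch: we attach forward hooks to the ReLU layers and report the fraction of activations that are exactly zero for one batch. The toy model, layer sizes, and the dummy batch are purely illustrative assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical toy model; sizes are arbitrary assumptions for illustration.
model = nn.Sequential(
    nn.Linear(100, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 10),
)

dead_fraction = {}

def make_hook(name):
    def hook(module, inputs, output):
        # fraction of activations in this batch that are exactly zero
        dead_fraction[name] = (output == 0).float().mean().item()
    return hook

# attach a forward hook to every ReLU so we can inspect its output
for name, module in model.named_modules():
    if isinstance(module, nn.ReLU):
        module.register_forward_hook(make_hook(name))

x = torch.randn(32, 100)        # dummy batch standing in for real data
model(x)
for name, frac in sorted(dead_fraction.items()):
    print(f"ReLU {name}: {frac:.1%} of activations are zero")
```

In a real training loop you would log these fractions every few iterations; if a layer's fraction stays close to one, many of its units have effectively died.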
So there are three main types of visualization that we will discuss today: the architecture, the training, and the learned parameters. Okay, let's start with the visualization of the network architecture. To be honest, we already used this quite extensively; in particular, in the lecture on architectures we already had plenty of methods for visualizing architectures. You want to communicate an architecture efficiently, and if you look at deep learning publications, one of the first things you do with a new paper is to look at the figures that display the architecture. So these figures are important for understanding what is really being done, and obviously you want to be able to grasp quickly what the architecture is good for and which ideas have been used to build it.
These visualizations are essentially graph-based structures with different degrees of granularity, and we have already shown them in the previous lecture. You could say there are three categories. First, there are node-link diagrams, where one node corresponds to a neuron and the weighted connections are the edges. They typically have this shape: you have the input neurons, then the hidden neurons, then the output neurons, and you directly draw the connections in between. This is very detailed, it focuses on the connectivity, and for small subnetworks and building blocks it is typically the method of choice. For larger networks it may not be so well suited, and of course there are variants with explicit weights, recurrent connections, and so on (a small drawing sketch follows below).
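As a small illustration of the node-link idea, here is a minimal sketch (my own example, not code from the lecture) that draws a tiny fully connected network with matplotlib: one circle per neuron, one line per weighted connection. The layer sizes are arbitrary assumptions.

```python
import matplotlib.pyplot as plt

layer_sizes = [3, 4, 2]          # input, hidden, output neurons (arbitrary)

# compute an (x, y) position for every neuron, one column per layer
positions = []
for i, n in enumerate(layer_sizes):
    ys = [j - (n - 1) / 2.0 for j in range(n)]   # center each layer vertically
    positions.append([(float(i), y) for y in ys])

fig, ax = plt.subplots()

# edges: one line per weighted connection between consecutive layers
for left, right in zip(positions[:-1], positions[1:]):
    for x0, y0 in left:
        for x1, y1 in right:
            ax.plot([x0, x1], [y0, y1], color="gray", linewidth=0.8, zorder=1)

# nodes: one circle per neuron
for layer in positions:
    xs, ys = zip(*layer)
    ax.scatter(xs, ys, s=300, facecolors="white", edgecolors="black", zorder=2)

ax.axis("off")
plt.show()
```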
For larger networks we have already seen that you choose block diagrams instead. Here, one layer is a solid block and there are connections between the layers. That is also a very typical visualization; we have seen it, for example, for our residual (ResNet) module, where there is a weight layer, then batch normalization, then a rectified linear unit. These are always entire layers, so they are no longer single neurons; the blocks are essentially operators or layers, and the data flows through them in a defined direction (a small code sketch of such a block follows below).
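To connect the block-diagram view to code, here is a minimal PyTorch sketch (an assumption on my part, not code shown in the lecture) of the residual module described above: weight layers, batch normalization, and ReLUs appear as whole blocks rather than individual neurons, and printing the module gives a block-level summary. The channel count and kernel size are arbitrary assumptions.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Weight layer -> batch norm -> ReLU, twice, plus a skip connection."""

    def __init__(self, channels=64):   # channel count is an arbitrary assumption
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        # each line below corresponds to one block in the diagram
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)       # add the skip connection, then ReLU

block = ResidualBlock()
print(block)                            # layer-by-layer (block-level) summary
```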
The blocks can have different granularity; often this is hierarchical, and if you want to explain what a block is doing, you try to create a global view of the previous representation type, where you can go down to the neuron level, but often the function of the blocks is already