Welcome back to deep learning. Today we want to look a bit more into visualization techniques
and in particular the gradient-based and optimization-based procedures.
You wanted to know what the Matrix is, didn't you?
Okay, so let's see what I've got for you. Let's first talk about the gradient-based
visualizations. The idea here is to figure out which input pixels are most significant to a
neuron, that is, which pixels would cause a large variation in the output of our neural network
if we changed them. Try to relax. What we actually want to compute is the partial derivative of
the neuron under consideration, maybe an output neuron such as the one for the class "cat",
with respect to the respective input. This is essentially backpropagation through the entire
network, and we can then visualize this gradient as a kind of image, which we have done here
for this cat image. Of course, this is a color gradient and a bit of a noisy image, but you can
see that what is related to the cat is located in the area of the image where the cat actually
is. We will look at several different approaches to do this.
The first one is based here on reference number 20. For backpropagation we need a loss that we
can backpropagate, and we simply take a pseudo-loss: the activation of an arbitrary neuron or
layer. Typically, you want to take neurons in the output layer, because they can be associated
with a class.
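To make this concrete, here is a minimal sketch of such a gradient-based saliency map in PyTorch. The choice of model, the random placeholder input, and the class index 281 are assumptions for illustration, not part of the lecture; the key point is that the pseudo-loss is just the score of one output neuron, backpropagated all the way to the input pixels.

```python
import torch
import torchvision.models as models

# Minimal sketch (not the lecture's exact setup): gradient of one class score
# with respect to the input pixels, obtained by backpropagating through the whole network.
model = models.resnet18(weights="IMAGENET1K_V1").eval()   # any pretrained classifier works

# Placeholder for a real, preprocessed image; requires_grad lets the gradient reach the input.
image = torch.rand(1, 3, 224, 224, requires_grad=True)

scores = model(image)        # forward pass, shape (1, 1000)
scores[0, 281].backward()    # pseudo-loss: the activation of one chosen output neuron (class index assumed)

# Saliency map: gradient magnitude, maximum taken over the three color channels.
saliency = image.grad.abs().max(dim=1)[0]   # shape (1, 224, 224)
```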
Instead of using backpropagation, you can also build a nearly equivalent alternative that uses
a kind of reverse network: the DeconvNet from reference number 26. Here, the input is the
trained network and some image. You choose one activation, set all of the other activations to
zero, and then build a reverse network. The idea is that this reverse network contains
essentially the same operations as the original network, just in reverse sequence, with
so-called unpooling steps. With these unpooling steps and the reverse computation, we can also
produce a kind of gradient estimate. The nice thing about this approach is that there is no
training involved; you just have to record the pooling locations in the so-called switches, and
the forward pass of the reverse network is effectively the same as the backward pass of the
original network, apart from the rectified linear unit, which we will look at in a couple of slides.
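As a small illustration of the unpooling idea, here is a sketch in PyTorch with made-up tensor sizes; it only demonstrates the switch mechanism, not the lecture's exact implementation. Max pooling can return the locations of the maxima, and the unpooling step places values back at exactly those recorded positions, filling everything else with zeros.

```python
import torch
import torch.nn as nn

# Sketch of unpooling with recorded switches (sizes are made up for illustration).
pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)

x = torch.rand(1, 1, 4, 4)                 # some activation map
pooled, switches = pool(x)                 # forward pass also records the pooling locations
reconstructed = unpool(pooled, switches)   # values return to the recorded positions; the rest stays zero
```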
This is the construction, and here we show the visualizations of the top nine activations, the
gradient, and the corresponding image patch. For example, you can reveal with this that this
particular feature map seems to focus on green, patchy areas, so you could argue that it is more
of a background feature that tries to detect grass patches in the image. Anything we need.
Right now we're inside a computer program. So what else?
Well, there is also guided backpropagation, which is a very similar concept. The idea here is
that you want to find positively correlated features, so we are looking for positive gradients:
we assume that the positive gradients belong to features the neuron is interested in, and the
negative gradients to features the neuron is not interested in. The idea is therefore to set
all negative gradients to zero during backpropagation. We can now show the different treatments
of the ReLU during the forward and backward passes for the different kinds of gradient
backpropagation techniques. Of course, if you have these input activations, then in the forward
pass the ReLU simply cancels out all the negative values and sets them to zero. Now, what
happens in the backpropagation for the three different alternatives?
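As a tiny numerical sketch (the values are made up, not those on the slide), the ReLU forward pass and the mask we will need for the backward rules could look like this:

```python
import numpy as np

# Made-up block of values entering a ReLU in the forward pass.
z = np.array([[ 1.0, -0.5,  2.0],
              [-1.5,  0.7, -0.3],
              [ 0.2, -2.0,  1.1]])

a = np.maximum(z, 0.0)      # forward pass: negative values are set to zero
forward_mask = (z > 0.0)    # remember which entries were positive (the "switches" of the ReLU)
```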
Let's look at what typical backpropagation does; note that we show the negative entries that
come from the sensitivity in yellow. If you backpropagate this, you have to remember which
entries were negative in the forward pass and set those values to zero again, but you keep
everything that came from the sensitivity of the previous layer. If you do a DeconvNet instead,
you do not need to remember the switches from the forward pass; you set all the entries that
are negative in the sensitivity to zero and backpropagate this way. Guided backpropagation
actually does both: it remembers the forward pass and sets those elements to zero, and it also
sets the negative elements of the sensitivity to zero. So it is essentially a union of
backpropagation and the DeconvNet in terms of canceling negative values.
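To summarize the three rules with the same kind of made-up numbers as above (a sketch, not tied to the values on the slide), the backward passes differ only in which mask they apply to the incoming sensitivity:

```python
import numpy as np

# Made-up forward input z and incoming sensitivity g (same shapes as in the sketch above).
z = np.array([[ 1.0, -0.5,  2.0], [-1.5,  0.7, -0.3], [ 0.2, -2.0,  1.1]])
g = np.array([[ 0.4, -0.2,  0.6], [ 0.1, -0.9,  0.3], [-0.5,  0.8, -0.1]])
forward_mask = (z > 0.0)    # which entries were positive in the forward pass

grad_backprop  = g * forward_mask              # standard backprop: zero where the forward input was negative
grad_deconvnet = g * (g > 0.0)                 # DeconvNet: zero the negative sensitivities, ignore the forward mask
grad_guided    = g * forward_mask * (g > 0.0)  # guided backpropagation: apply both conditions
```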