So welcome everybody to our lecture on deep learning.
So it's a great pleasure today to talk about visualization and attention mechanisms.
So we will briefly discuss why we want to talk about visualization and why it is actually quite an interesting technique, in particular in combination with deep networks.
Then we will talk a bit about different things to visualize.
So first of all we also want to visualize architectures, because we want to communicate the structure and topology of our networks to other people, like other researchers, colleagues, and so on.
And there are different ways of visualizing this.
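Just to give a rough idea of the simplest, textual form of such a visualization, here is a minimal sketch in PyTorch that prints the topology of a small, purely hypothetical CNN; graphical tools exist as well, but this is the quickest way to communicate the layer structure.

```python
import torch.nn as nn

# Hypothetical small CNN for 28x28 grayscale inputs (illustrative only).
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 14 * 14, 10),
)

# Printing the module gives a layer-by-layer textual description of the topology.
print(model)
```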
Then we want to talk a bit about visualizing the training and what we can learn from the training process itself.
Some of this we have already discussed, but we thought we could describe it with a little more clarity here as well.
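As a small illustration of what is meant here, the sketch below plots hypothetical training and validation loss curves with matplotlib; the numbers are made up purely for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical per-epoch losses recorded during training (made-up numbers).
train_loss = [2.30, 1.45, 0.98, 0.76, 0.64, 0.58]
val_loss = [2.28, 1.52, 1.10, 0.95, 0.92, 0.94]

plt.plot(train_loss, label="training loss")
plt.plot(val_loss, label="validation loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()

# A widening gap between the two curves typically indicates overfitting;
# a flat or diverging training curve points to learning-rate or data problems.
```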
And then we want to look into the visualization of parameters, like individual parameters and convolution masks, and then gradient-based visualization techniques and parameter visualization via optimization.
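For the convolution masks, one common approach is simply to display the learned first-layer filters as small images. The sketch below assumes a recent torchvision and uses a pretrained AlexNet only as a convenient example network:

```python
import matplotlib.pyplot as plt
from torchvision import models

# Pretrained network whose first convolution operates directly on RGB images.
model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
filters = model.features[0].weight.detach()              # shape: (64, 3, 11, 11)

# Normalize to [0, 1] so each filter can be shown as an RGB image.
filters = (filters - filters.min()) / (filters.max() - filters.min())

fig, axes = plt.subplots(8, 8, figsize=(8, 8))
for ax, f in zip(axes.flat, filters):
    ax.imshow(f.permute(1, 2, 0).numpy())                 # channels-last for imshow
    ax.axis("off")
plt.show()
```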
So these are the parts on visualization, and then we will briefly shift our focus towards something slightly different, which is nevertheless to some degree based on what we are discussing here, in particular the gradient-based visualization: attention mechanisms.
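To preview the gradient-based idea: we compute the gradient of a class score with respect to the input pixels and look at its magnitude as a saliency map. A minimal sketch, assuming `model` is some trained classifier and `image` a normalized input tensor of shape (3, H, W), both hypothetical here:

```python
import torch

def saliency_map(model, image, target_class):
    """Gradient magnitude of the target class score w.r.t. the input pixels."""
    model.eval()
    image = image.clone().requires_grad_(True)       # track gradients on the input
    score = model(image.unsqueeze(0))[0, target_class]
    score.backward()
    # Maximum absolute gradient over the color channels gives an (H, W) map.
    return image.grad.abs().max(dim=0).values

# Usage (hypothetical): sal = saliency_map(model, image, target_class=243),
# followed by plt.imshow(sal) to display the map.
```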
And attention mechanisms turned out to be really useful, in particular for automatic translation systems, but attention has also become very popular in other network topologies, so this is why we are discussing it at this point.
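As a small preview of what an attention mechanism computes, here is a minimal sketch of scaled dot-product attention in PyTorch; the tensor shapes are only illustrative assumptions.

```python
import torch

def scaled_dot_product_attention(Q, K, V):
    """Q: (n_queries, d); K, V: (n_keys, d). Returns one weighted value per query."""
    d = Q.shape[-1]
    scores = Q @ K.transpose(-2, -1) / d ** 0.5   # similarity of each query to each key
    weights = torch.softmax(scores, dim=-1)       # attention weights sum to 1 per query
    return weights @ V, weights

# Illustrative example with random data.
Q, K, V = torch.randn(2, 8), torch.randn(5, 8), torch.randn(5, 8)
output, weights = scaled_dot_product_attention(Q, K, V)
print(output.shape, weights.shape)                # torch.Size([2, 8]) torch.Size([2, 5])
```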
Okay, so let's motivate this a bit.
Well, why do we need visualization?
So far we have been considering neural networks somehow as a kind of black box.
It just learns something: we have some topology that we put into this black box, then we train it, but then we typically have problems understanding what is happening inside this black box.
So we would like to understand and communicate, first of all, the inner workings, which means the topology, but then also the learned state of a network, and try to interpret and figure out what it is actually doing.
And these are the main topics that we want to focus on today.
So why does this matter?
Of course we want to communicate with others, we want to identify issues during training, for example if there is no convergence or if problems like dying ReLUs emerge, and obviously we want to be able to identify faulty training and test data.
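Dying ReLUs, for example, can be spotted by counting how many units stay at exactly zero over a whole batch. The sketch below is only one possible diagnostic and assumes a hypothetical `model` containing nn.ReLU modules and an input batch `x`:

```python
import torch
import torch.nn as nn

def dead_relu_units(model, x):
    """Per ReLU layer: fraction of units that output zero for every sample in the batch."""
    fractions, hooks = {}, []
    for name, module in model.named_modules():
        if isinstance(module, nn.ReLU):
            def hook(m, inp, out, name=name):
                flat = out.flatten(1)                       # (batch, units)
                fractions[name] = (flat == 0).all(dim=0).float().mean().item()
            hooks.append(module.register_forward_hook(hook))
    with torch.no_grad():
        model(x)
    for h in hooks:
        h.remove()
    return fractions

# Example usage with a toy model:
# model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 5), nn.ReLU())
# print(dead_relu_units(model, torch.randn(32, 10)))
```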
Also something we haven't discussed so far: up to now we have been living in this ideal world where the training and test sets are correctly sampled and all is good, but when you work with real data you can actually have trouble with the collection of the data sets, and they may not really be appropriate for your problem, so you should really think about that and identify such issues if they happen.
And then we would also like to identify what the network is actually learning and why.
So there are three types of visualization: the architecture visualization, the training visualization, and the visualization of the learned parameters and weights.
So first we focus on the structure, second on what is being trained and how the training process proceeds, and the last part is on a trained network, where we are already done with the training.
So let's talk a bit about the network architecture visualization.
Of course we want to communicate architectures effectively and obviously priors may be imposed
Deep Learning (DL) has attracted much interest in a wide range of applications such as image recognition, speech recognition, and artificial intelligence, both from academia and industry. This lecture introduces the core elements of neural networks and deep learning; it comprises:
- (multilayer) perceptron, backpropagation, fully connected neural networks
- loss functions and optimization strategies
- convolutional neural networks (CNNs)
- activation functions
- regularization strategies
- common practices for training and evaluating neural networks
- visualization of networks and results
- common architectures, such as LeNet, AlexNet, VGG, GoogLeNet
- recurrent neural networks (RNN, TBPTT, LSTM, GRU)
- deep reinforcement learning
- unsupervised learning (autoencoder, RBM, DBM, VAE)
- generative adversarial networks (GANs)
- weakly supervised learning
- applications of deep learning (segmentation, object detection, speech recognition, ...)
Due to technical problems, the first few minutes of the lecture are missing. We apologize for this.