So welcome back everybody to our Deep Learning Lecture.
And today's lecture will be focused
on Recurrent Neural Networks.
So we will have a short motivation, then we will introduce, let's say, simple recurrent networks, then go ahead and introduce more powerful concepts like the long short-term memory units, followed by gated recurrent units, and then we have a comparison of simple RNN units, LSTM units, and GRUs.
Those are the gated recurrent units. Then we cover some sampling strategies for RNNs, so that you can actually use RNNs not just to recognize and process sequences of observations; they can also be used to sample and produce sequences of observations. That is the sampling strategy, and then we look into a couple of examples that I found quite impressive.
Okay.
Motivation: so far we have had one fixed-size input, let's say a single image, and a feed-forward network that takes the input, processes it, and produces a result.
But actually we have lots of data that comes in sequences and is time-dependent, like speech, music, videos, or other sensor data collected over time; that could be speed, temperature, energy consumption, you name it. And if you just look at a very short snapshot, it may not be very informative. Think about translation tasks, for example: if you just have a single word, there might be multiple matching translations, and only in the correct context can you pick a good translation.
So we need temporal context. This is what we want to model with the recurrent units.
So how can we integrate context into a network?
One way you could do that is to just put the whole sequence into a big network and process it. But one problem that then occurs is that such sequences may have different lengths, and how do you deal with these different lengths?
By the way, it may sound like a bad idea, but there are also ways to approach this with convolutional neural networks, and there are actually newer papers out there showing that instead of using recurrent networks, you can also use fully convolutional neural networks, which we will talk about in more detail later and which is actually quite interesting. But simply designing one big network and just running it will not work; you also have to put in some thought to make it work with convolutional neural networks. So the simple approach would lead to inefficient memory usage, it would be difficult or even impossible to train, and then, what is the difference between spatial and temporal dimensions? Furthermore, it may not be real-time; think about translation tasks, for example.
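To make the length problem concrete, here is a minimal NumPy sketch; the fixed length of 8 steps, the layer sizes, and the zero padding are illustrative assumptions, not something from the lecture:

```python
import numpy as np

# Toy fully connected layer that hard-codes a sequence length of 8 steps.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))          # maps an 8-step sequence to 4 outputs

def feed_forward(sequence):
    return np.tanh(sequence @ W)

short = rng.normal(size=5)           # a 5-step sequence
padded = np.pad(short, (0, 8 - len(short)))   # zero-pad to length 8

print(feed_forward(padded).shape)    # (4,) -- works only after padding
# feed_forward(short) would raise a ValueError: the weight matrix fixes
# the sequence length, so every input must first be padded or cropped.
```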
The better approach then would be to model the sequence behavior within the architecture directly.
And this is where the recurrent neural networks come in; we will look at how to do that in more detail in a bit.
So let's look at the, let's say, simple recurrent network. Simple recurrent networks have the following structure: you have some cell A and a sequence of observations, and this produces a sequence of outputs. Now the cell is not just connected to the inputs and to the outputs; it is also connected to itself, so the state computed at one time step is fed back in at the next.
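As a minimal sketch of such a cell, assuming hypothetical dimensions (3 input features, 5 hidden units, 2 outputs) and the common recurrence h_t = tanh(W_x x_t + W_h h_{t-1} + b_h), it could look like this in NumPy; this is an illustration, not the lecture's exact formulation:

```python
import numpy as np

# Hypothetical sizes: 3 input features, 5 hidden units, 2 output values.
rng = np.random.default_rng(0)
W_x = 0.1 * rng.normal(size=(5, 3))  # input-to-hidden weights
W_h = 0.1 * rng.normal(size=(5, 5))  # hidden-to-hidden (recurrent) weights
W_y = 0.1 * rng.normal(size=(2, 5))  # hidden-to-output weights
b_h, b_y = np.zeros(5), np.zeros(2)

def simple_rnn(inputs):
    """Run the cell over a sequence; h carries the temporal context."""
    h = np.zeros(5)                  # initial hidden state
    outputs = []
    for x in inputs:                 # one time step at a time, any length
        h = np.tanh(W_x @ x + W_h @ h + b_h)  # new state depends on old state
        outputs.append(W_y @ h + b_y)
    return outputs

sequence = [rng.normal(size=3) for _ in range(7)]
print(len(simple_rnn(sequence)))     # 7 -- one output per observation
```

Note that the same weights are reused at every time step, which is why the loop handles sequences of any length.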