8 - Deep Learning

So welcome back, everybody, to our Deep Learning lecture. And today's lecture will be focused on Recurrent Neural Networks.

So we will have a short motivation, then we will introduce, let's say, simple recurrent networks, and go ahead and introduce more powerful concepts like the long short-term memory units, followed by the gated recurrent units (GRUs). Then we have a comparison of simple RNN units, LSTM units, and GRUs.

After that come some sampling strategies for RNNs, so that you can actually use RNNs not just to recognize and process sequences of observations; they can also be used to sample and to produce sequences of observations. That is the sampling strategy, and then we look into a couple of examples that I found quite impressive.

Okay.

Motivation: so far, we have one input, or rather one fixed-size input. Let's say a single image goes into a feed-forward network, is processed, and produces a result.

But actually, we have lots of data that comes as sequences and is time-dependent. Yeah, like speech, music, videos, or other sensor data that is collected over time. So it could be speed, temperature, energy consumption, you name it.

And if you just look at a very short snapshot, it may not be very informative. If you think, for example, about translation tasks: if you just have a single word, there might be multiple matching translations, and only in the correct context can you pick a good translation.

So we need temporal context. This is what we want to model with the recurrent units.

So how can we integrate context into a network?

One way you could do that is to just put a whole sequence into one big network and process it. But one problem that then occurs is that such sequences may have different lengths, and how do you deal with these different lengths?
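To see the length problem concretely, here is a tiny sketch; the sizes are my own illustrative choices, not from the lecture. A feed-forward layer fixes its input dimensionality when it is built, so a sequence of a different length simply does not fit.

```python
import numpy as np

# A feed-forward layer has a fixed input dimensionality.
# All sizes here are illustrative assumptions.
W = np.zeros((10, 16))   # this layer expects exactly 16 input values

x_fixed = np.zeros(16)   # a fixed-size input: fine
y = W @ x_fixed          # works, produces 10 outputs

x_longer = np.zeros(23)  # a flattened sequence of a different length
# W @ x_longer           # would raise ValueError: dimension mismatch
```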

By the way, it may sound like a bad idea, but there are also ways to approach this with convolutional neural networks. Actually, there are newer papers out there showing that instead of using recurrent networks, you can also use fully convolutional neural networks, which we will talk about in more detail later. This is actually quite interesting, but just doing it the simple way, where you design one big network and just run it, will not work. You also have to put in some thought in order to make it work with convolutional neural networks.

So the simple approach would lead to inefficient memory usage. It is difficult or even impossible to train, and, yeah, what is then the difference between spatial and temporal dimensions? Furthermore, it may not be real-time; think about translation tasks, for example.

The better approach, then, is to model the sequence behavior within the architecture directly. And this is where the recurrent neural networks come in; we will look in more detail at how to do that in a bit.

So let's look at the, let's say, simple recurrent network. The simple recurrent networks have the following structure: you have some cell A and a sequence of observations, and this produces a sequence of outputs. Now, the cell is not just connected to the inputs and to the outputs; it is also connected to itself, so the hidden state is carried over from one time step to the next.
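To make this concrete, here is a minimal sketch of such a cell in NumPy. The weight names and sizes are my own illustrative assumptions, not notation from the lecture; the point is that the same cell is applied at every time step, and the new hidden state depends on both the current input and the previous hidden state.

```python
import numpy as np

# A minimal simple (Elman-style) recurrent cell; all sizes are illustrative.
rng = np.random.default_rng(0)
input_size, hidden_size, output_size = 4, 8, 3

W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden (the recurrence)
W_hy = rng.normal(scale=0.1, size=(output_size, hidden_size))  # hidden -> output
b_h = np.zeros(hidden_size)
b_y = np.zeros(output_size)

def rnn_forward(xs):
    """Apply the same cell to every element of the sequence xs."""
    h = np.zeros(hidden_size)                    # initial hidden state
    ys = []
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)   # new state mixes input and old state
        ys.append(W_hy @ h + b_y)                # one output per time step
    return ys

# Usage: a sequence of 5 observations, each a 4-dimensional vector.
sequence = [rng.normal(size=input_size) for _ in range(5)]
outputs = rnn_forward(sequence)
print(len(outputs), outputs[0].shape)            # 5 outputs, each of shape (3,)
```

Note that nothing here constrains the length of the sequence: the loop simply runs for as many steps as there are observations, which is exactly how recurrent networks sidestep the fixed-size-input problem from the motivation.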
