Deep Learning - Recurrent Neural Networks Part 1

Welcome everybody to a new session of deep learning! Today we want to look into sequential learning and in particular recurrent neural networks. So far, we only had simple feedforward networks, where we had an essentially fixed-size input and would then generate a classification result like cat, dog, or hamster. But if we have sequences like audio, speech, language, or videos that have a temporal context, the techniques that we have seen so far are not very well suited. So we are now interested in methods that are applicable to arbitrarily long input sequences, and recurrent neural networks are exactly one method to do so. After a first review of the motivation, we will go ahead and look into simple recurrent neural networks. Then we will introduce the famous long short-term memory (LSTM) units, followed by gated recurrent units (GRUs). Then we will compare these different techniques and discuss their pros and cons. And finally, we will talk about sampling strategies for RNNs. Of course, this is way too much for a single video, so we will cover the different topics in individual short videos.

Okay, so let's look at the motivation. Well, we had one input for one single image, but this is not so great for sequential or time-dependent signals such as speech, music, video, or other sensor data, where you could even think of very simple sensors that measure energy consumption. These fixed-length snapshots are often not that informative. If you look at a single word, you probably have trouble getting the right translation, because the context is missing. Temporal context is really important, and it needs to be modeled appropriately.

So the question now is: how can we integrate this context into the network? The simple approach would be to feed the whole sequence into one big network. This is potentially a bad idea, because we have inefficient memory usage, it is difficult or even impossible to train, and we would never figure out the difference between spatial and temporal dimensions; we would just treat them all the same. Actually, it may not be such a bad idea for rather simple tasks, as you can see in the reference at the bottom of the slide, because they actually investigated this and found quite surprising results with CNNs. One problem that you have, of course, is that it will not be real-time, because you need the entire sequence before processing. So the approach that we suggest in this and the next couple of videos is to model sequential behavior within the architecture, and this gives rise to recurrent neural networks.

So let's have a look at simple recurrent neural networks. The main idea is that you introduce a hidden state h_t that is carried on over time. It can change, but it essentially feeds back into the recurrent cell A. So A is our recurrent cell, and it has this hidden state that allows us to encode what the temporal information seen so far has brought to us. Now we have some input x_t, and this will then generate some output y_t. By the way, the first models date back to the 1970s and early 1980s, like Hopfield networks. Here, we will stick with the simple recurrent neural network, or Elman network, as introduced in reference number 5. Feedforward networks only feed information forward. With recurrent networks, in contrast, we can now model loops, we can model memory and experience, and we can learn sequential relationships. So we can provide continuous predictions as the data comes in, and this enables us to process everything in real time.

Now, this is again our basic recurrent neural network: we have some input x_t that is multiplied with a weight matrix, we have the additional input of the hidden state from the previous time step, and then we have essentially a feedback loop, where we use the information from the present and the recent past to compute the output y_t.
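To make this concrete, the standard formulation of such an Elman-style cell can be written as follows; the weight names W_{xh}, W_{hh}, W_{hy} are a common convention and may differ from the symbols on the lecture slides:

  h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h)
  y_t = \sigma(W_{hy} h_t + b_y)

Here, W_{hh} h_{t-1} is the feedback loop that carries the recent past into the current step, and \sigma is an output nonlinearity, e.g. a softmax for classification.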

If we do that, we essentially end up with an unfolded structure. So if you want to evaluate the recurrent unit, you start with some x_0 that you process with your unit A, which generates an output y_0 and a new hidden state h_0. Now, h_0 is fed forward to the next instance of A, where the weights are coupled, so we have an exact copy of the same unit at the next time step, but h is of course different. Then we feed in x_1, process it, generate y_1, and produce a new hidden state h_1, and so on, until we reach the end of the sequence.
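To illustrate this unfolding, here is a minimal NumPy sketch of evaluating an Elman cell over a toy sequence; the dimensions, weight names, and random initialization are illustrative assumptions, not the lecture's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 4, 8, 3

# One weight matrix per connection (hypothetical sizes).
W_xh = rng.normal(scale=0.1, size=(d_hidden, d_in))      # input  -> hidden
W_hh = rng.normal(scale=0.1, size=(d_hidden, d_hidden))  # hidden -> hidden (feedback loop)
W_hy = rng.normal(scale=0.1, size=(d_out, d_hidden))     # hidden -> output
b_h = np.zeros(d_hidden)
b_y = np.zeros(d_out)

def elman_step(x_t, h_prev):
    """One time step: update the hidden state, then compute the output."""
    h_t = np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)
    y_t = W_hy @ h_t + b_y   # often followed by a softmax for classification
    return y_t, h_t

# Unfolding: the very same weights are reused at every time step;
# only the hidden state h changes as the sequence is consumed.
xs = [rng.normal(size=d_in) for _ in range(5)]  # toy sequence x_0 ... x_4
h = np.zeros(d_hidden)                          # initial hidden state
for t, x_t in enumerate(xs):
    y, h = elman_step(x_t, h)
    print(f"t={t}, y_t={np.round(y, 3)}")
```

Note that the same weight matrices appear at every time step; only the hidden state h changes, which is exactly the weight coupling across the unfolded copies of A described above.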

Part of a video series
Access: Open Access
Duration: 00:10:38 min
Recording date: 2020-10-12
Uploaded on: 2020-10-12 19:06:21
Language: en-US


This video introduces the topic of recurrent neural networks and the Elman Cell.


Further Reading:
A gentle Introduction to Deep Learning
