Deep Learning - Plain Version 2020

Welcome back to the final part of our video series on recurrent neural networks. Today we want to talk a bit about sampling from recurrent neural networks. By sampling, I mean that we want to use recurrent neural networks to actually generate sequences of symbols.

So how can we actually do that? Well, if you train your neural network in the right way, you can set it up so that it predicts the probability distribution of the next element. So if you train it to predict the next symbol in a sequence, you can also use it to generate sequences. The idea is that you start with the empty symbol, use the RNN to generate some output, and then feed this output back in as the input at the next time step. If you keep doing so, you can generate whole sequences from your trained recurrent neural network.

The simplest strategy is to perform a greedy search. Here we start with the empty symbol and then simply pick the most likely element as the input to the RNN at the next time step, generate the next one, and the next one, and so on. This produces exactly one sample sequence per experiment. So with greedy search we get exactly one sentence, and the sentence that we construct here is "let's go through time".
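As a rough illustration, here is a minimal greedy decoding sketch in Python with NumPy. The toy rnn_step function, the small VOCAB, and the random weights are hypothetical stand-ins for a trained recurrent network; only the decoding loop itself is the point.

```python
import numpy as np

# Toy stand-in for a trained RNN: a single recurrent step that returns the next
# hidden state and a probability distribution over a tiny vocabulary.
# VOCAB, the weights, and rnn_step are hypothetical placeholders, not the
# lecture's actual model.
VOCAB = ["<s>", "let's", "go", "through", "time", "</s>"]
rng = np.random.default_rng(0)
W_h = rng.normal(size=(8, 8))
W_x = rng.normal(size=(8, len(VOCAB)))
W_o = rng.normal(size=(len(VOCAB), 8))

def rnn_step(h, token_id):
    x = np.eye(len(VOCAB))[token_id]      # one-hot encoding of the input symbol
    h = np.tanh(W_h @ h + W_x @ x)        # recurrent state update
    logits = W_o @ h
    p = np.exp(logits - logits.max())     # softmax over the vocabulary
    return h, p / p.sum()

def greedy_decode(max_len=10):
    h, token = np.zeros(8), VOCAB.index("<s>")   # start from the empty symbol
    out = []
    for _ in range(max_len):
        h, p = rnn_step(h, token)
        token = int(np.argmax(p))                # always pick the most likely symbol
        if VOCAB[token] == "</s>":
            break
        out.append(VOCAB[token])
    return out

print(greedy_decode())
```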

Well, the drawback is, of course, that no look-ahead is possible. Maybe "let's" is the most likely symbol after "let's go", so you could end up generating loops like "let's go, let's go", and so on. You are not able to detect that "let's go through time" has a higher total probability. So greedy search tends to repeat sequences of frequent words like "and", "the", "some", and so on in speech.

Now we are interested in alleviating this problem, and this can be done with beam search. The concept of beam search is to keep the k most likely elements, where k is the beam width or beam size. Out of all possible sequences that have these k elements as a prefix, you then take the k most probable ones.

In the example that we show here on the right-hand side, we start with the empty symbol and take the two most likely candidates, which are "let's" and "through". If we take "let's", we generate "go", and we can continue this process; with our beam of size two, we always keep the two most likely sequences in the beam search. So now we generate two sequences at a time: one is "let's go through time" and the other one is "through let's go time". You see that we can use this beam idea to generate multiple sequences and, in the end, determine which one we like best, or which one has the highest total probability. So we can generate multiple sequences in one go, which typically also contain better sequences than the one from greedy search, and I would say this is one of the most common techniques for sampling from an RNN.
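Below is a small beam-search sketch that continues the toy setup (rnn_step, VOCAB) from the greedy example above; the log-probability bookkeeping and the choice of k = 2 are illustrative assumptions, not a specific library API.

```python
import numpy as np

# Beam search over the toy rnn_step / VOCAB defined in the greedy sketch above.
# Each beam entry stores (accumulated log-probability, hidden state, token ids).
def beam_search(k=2, max_len=10):
    start = VOCAB.index("<s>")
    beams = [(0.0, np.zeros(8), [start])]
    for _ in range(max_len):
        candidates = []
        for logp, h, seq in beams:
            if VOCAB[seq[-1]] == "</s>":          # finished sequences stay as they are
                candidates.append((logp, h, seq))
                continue
            h_next, p = rnn_step(h, seq[-1])
            for tok in range(len(VOCAB)):         # expand every possible next symbol
                candidates.append((logp + np.log(p[tok] + 1e-12), h_next, seq + [tok]))
        # keep only the k prefixes with the highest total log-probability
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    return [(logp, [VOCAB[t] for t in seq[1:] if VOCAB[t] != "</s>"])
            for logp, _, seq in beams]

for logp, words in beam_search(k=2):
    print(round(logp, 2), " ".join(words))
```

Keeping log-probabilities rather than raw probabilities avoids numerical underflow when the accumulated product over a long sequence becomes very small.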

Of course, there are also other options, such as random sampling. Here the idea is that you select the next symbol according to the output probability distribution. Remember that we encoded our symbols as one-hot vectors, so we can essentially interpret the output of the RNN as a probability distribution and then sample from this distribution. This allows us to generate many different sequences. So, for example, if "let's" has an output probability of 0.8, it is sampled in roughly eight out of ten cases.
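A minimal sketch of this random sampling strategy, again reusing the hypothetical rnn_step and VOCAB from the greedy example; the seed and the maximum sequence length are arbitrary choices.

```python
import numpy as np

# Random sampling: instead of taking the argmax, draw the next symbol from the
# RNN's output distribution (toy rnn_step / VOCAB as defined above).
sample_rng = np.random.default_rng(1)

def sample_sequence(max_len=10):
    h, token = np.zeros(8), VOCAB.index("<s>")       # start from the empty symbol
    out = []
    for _ in range(max_len):
        h, p = rnn_step(h, token)
        token = int(sample_rng.choice(len(VOCAB), p=p))  # sample according to p
        if VOCAB[token] == "</s>":
            break
        out.append(VOCAB[token])
    return out

# Each call can produce a different sequence; a symbol with probability 0.8
# is drawn in roughly 8 out of 10 cases.
for _ in range(3):
    print(" ".join(sample_sequence()))
```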

Part of a video series
Accessible via: Open Access
Duration: 00:12:46 min
Recording date: 2020-10-12
Uploaded on: 2020-10-12 19:36:20
Language: en-US

Deep Learning - Recurrent Neural Networks Part 5

This video explains sequence generation using RNNs.

For reminders to watch new videos, follow on Twitter or LinkedIn.

RNN Folk Music
FolkRNN.org
MachineFolkSession.com
The Glass Herry Comment 14128

Links
Character RNNs
CNNs for Machine Translation
Composing Music with RNNs


Further Reading:
A gentle Introduction to Deep Learning
