Good.
So now let's go on.
Before lunch we stopped at the point that the framework of the HCNN is a nice idea, but to bring it to practical success we need additional ideas, and not only one.
The first point we discussed was the handling of the learning rate, which is simple.
The next point was that with such large systems you have to think about sparsity and how to handle it.
In the end you can handle this with a simple rule of thumb: 50 divided by the dimensionality of s, that is, the dimensionality of the matrix here.
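As a minimal sketch of that rule of thumb (not the lecturer's code; the function name and mask construction are my own assumptions), one could build a random sparsity mask for the state transition matrix like this:

```python
import numpy as np

def sparse_transition_mask(dim_s, rng=None):
    """Random 0/1 mask for a dim_s x dim_s state transition matrix.

    Rule of thumb from the lecture: keep roughly 50 / dim_s of the
    connections, i.e. about 50 incoming connections per state neuron.
    """
    rng = np.random.default_rng() if rng is None else rng
    density = min(1.0, 50.0 / dim_s)          # e.g. dim_s = 500 -> 10% non-zeros
    return (rng.random((dim_s, dim_s)) < density).astype(float)

mask = sparse_transition_mask(500)
print(int(mask.sum()), "non-zero connections out of", 500 * 500)
```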
So you have to live with large sparse matrices if you are really interested in large dynamical systems. That is the basis; otherwise you are unable to do the computation for large systems.
The next topic was how to improve the memory of the system, and this is a big step in making it practical. Because if you do this teacher forcing at each step, it is too much help. In the beginning you need it.
Please, when you start with this, first of all start with p equal to zero and then run it to as good a solution as possible. What is as good as possible? If it is not good, then simply increase the dimensionality of s to the point where the error against the targets is really near zero. Exactly zero is nonsense, but you should reach a small value there. Then you have a dimensionality which is large enough.
Combined with the dimensionality, you have the sparsity level.
About the length of the unfolding we do not have to discuss, because we unfold along the whole time series. So there are not many metaparameters to think about.
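The dimensioning procedure just described could be summarized in a small heuristic loop. This is only a sketch of the advice above; `train_hcnn` is a hypothetical placeholder for whatever training routine is used, and the tolerance and growth factor are my own assumptions.

```python
def choose_state_dimension(train_hcnn, start_dim=50, tol=1e-3, max_dim=2000):
    """Grow dim(s) until the training error against the targets is near zero.

    'Exactly zero is nonsense', so we only ask for a small tolerance.
    train_hcnn is a placeholder: it should train one model and return the
    final target error for the given state dimension and sparsity.
    """
    dim_s = start_dim
    while dim_s <= max_dim:
        sparsity = min(1.0, 50.0 / dim_s)                 # rule of thumb from above
        error = train_hcnn(dim_s=dim_s, sparsity=sparsity,
                           teacher_forcing_dropout=0.0)   # start with p = 0
        if error < tol:
            return dim_s                                  # large enough
        dim_s *= 2                                        # still too small
    raise RuntimeError("target error not reached below max_dim")
```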
The only point is that with such a long unfolding, over a hundred steps, long memory is definitely something you have to think about. And we had three different possibilities for this.
Large sparse networks alone, large sparse networks in combination with partial teacher forcing, or with LSTM. And if you really have large sparse networks, then it looks like the LSTM is not reasonable, because the learning itself absorbs it into your network part. Only if you have smaller systems, not large sparse ones, could the LSTM be reasonable. But partial teacher forcing is reasonable in every case.
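To make partial teacher forcing concrete, here is a minimal sketch of how I read it: at each unfolded time step the observation replaces the model's expectation only for a random subset of components, so p = 0 recovers full teacher forcing, matching the advice above to start with p equal to zero. The function name and the exact mixing rule are assumptions, not the lecturer's implementation.

```python
import numpy as np

def partial_teacher_forcing(expectation, observation, p, rng):
    """Replace the expectation by the observation, except for a random
    fraction p of the components, which keep the network's own expectation.

    p = 0 -> full teacher forcing (every component corrected)
    p = 1 -> no forcing at all (pure closed-loop iteration)
    """
    keep_model = rng.random(expectation.shape) < p
    return np.where(keep_model, expectation, observation)

rng = np.random.default_rng(0)
y_hat = np.array([0.20, -0.10, 0.40])    # expectation at time t
y_obs = np.array([0.25,  0.00, 0.30])    # observed target at time t
print(partial_teacher_forcing(y_hat, y_obs, p=0.3, rng=rng))
```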
I have shown it to you in this exercise here.
It's starting here.
So this is a bad solution, even after 30,000 epochs of learning, and even after checking that the learning rate is OK, in the sense that the distribution of the parameters in the matrix A is fine. So you could say that even the start distribution here is OK: it is not too far away from a uniform distribution, and it is not artificially shrunk between some minus value and some plus value. It is not that we clipped it at minus 2.5 and plus 2.5; no, it is what the system itself is learning. So from a technical viewpoint you have a good solution. Nevertheless, from the viewpoint of the outcome and the generalization behavior, it is awful.
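As a small illustration of the check described above, that the entries of A are not artificially squeezed to a boundary such as ±2.5, one could look at simple summary statistics of the learned weights. This diagnostic and its threshold handling are my own sketch, not the lecturer's tooling.

```python
import numpy as np

def check_weight_distribution(A, clip_value=2.5):
    """Report whether the non-zero entries of A pile up at +/- clip_value.

    A large fraction sitting exactly at the boundary would indicate an
    artificial constraint (clipping or a bad learning rate) rather than a
    distribution the system has learned by itself.
    """
    w = A[A != 0.0]                          # ignore weights removed by the sparse mask
    at_boundary = np.mean(np.isclose(np.abs(w), clip_value, atol=1e-3))
    print(f"range [{w.min():.3f}, {w.max():.3f}], std {w.std():.3f}")
    print(f"fraction at +/-{clip_value}: {at_boundary:.1%}")
```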
So we have one more instrument.