5 - Mathematical Basics of Artificial Intelligence, Neural Networks and Data Analytics II [ID:41406]

So, good morning again, everyone.

Before the recording started, I repeated the material from yesterday. Now I will go on with new material.

But the starting point of all this: we will continue to speak about large closed dynamical systems. A large closed dynamical system, as an easy architecture, is the picture above. But this is unsolvable directly, so we had to invent architectural teacher forcing, something I did about 12 years ago. Then you get this picture here, which allows you to do the learning in the past part of the unfolding, so that we have an extended learning problem. But if you are really able to learn down to zero error, then there is no more information flow along the green lines here, and you have solved the original problem. So this is the HCNN architecture. You can say it is the most basic architecture with which you can try to learn large closed dynamical systems.
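Since the pictures from the slides are not reproduced here, a minimal sketch of one pass through such an architecture may help. It assumes an HCNN-style formulation in which the state transition is s_{t+1} = tanh(A·r_t), the observables are read out as the first components of the state, and teacher forcing replaces exactly these components by the measured data during learning; the function and variable names are mine, and the actual equations on the slide may differ.

```python
import numpy as np

def hcnn_rollout_with_teacher_forcing(A, s0, observations):
    """Unfold the closed system s_{t+1} = tanh(A r_t) along one history.

    observations: array of shape (T, d) with the measured data y_t.
    The first d components of the state are read out as the model output.
    Teacher forcing replaces exactly these components by the observations
    before the next transition; once the residuals are zero, the correction
    vanishes and the system runs fully closed on its own state.
    """
    d = observations.shape[1]
    s = s0.copy()
    residuals = []
    for y_obs in observations:
        y_model = s[:d]                    # output = observable part of the state
        residuals.append(y_model - y_obs)  # error term used for learning
        r = s.copy()
        r[:d] = y_obs                      # teacher forcing: inject the observation
        s = np.tanh(A @ r)                 # closed state transition
    return np.array(residuals)
```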

What I want to show you now is that even more ideas are needed to learn these historical consistent neural networks. About the name historical consistent: historical, first of all, because there is only one history that you learn, since you have to extend the unfolding along the whole time series; and consistent because you have a symmetric handling of the past and the future.

First of all, if you work with data, you have to think about preprocessing of the data. The systems here are powerful enough that you can keep the preprocessing restricted to relatively simple transformations. I will show you three different ones, and in these HCNN environments I have never used anything more complicated than these three examples. So what is the first type of preprocessing?

The idea is that the original raw data are the capital Y, and the data presented to the learning of the neural network are the small y. The first idea is to do a linear transformation: I compute the average value of the raw data and then apply the transformation. The obvious consequence is that you get data fluctuating around zero, because you have subtracted the average value, and you have a scaling parameter which makes the values small enough that, apart from outliers, most of the data lie in the interval between minus one and plus one. So this is a linear transformation: you have to find the average value of the raw data and a scaling parameter such that the result stays between minus one and plus one.
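Written out, the transformation described here should look roughly as follows (the symbol A for the scaling parameter is inferred from the "one over A" mentioned for the post-processing below; the notation on the slide may differ):

\[
y_t = A\,\bigl(Y_t - \bar{Y}\bigr), \qquad \bar{Y} = \frac{1}{T}\sum_{t=1}^{T} Y_t ,
\]

with the scaling parameter \(A\) chosen so that, apart from outliers, the \(y_t\) lie in \([-1, 1]\).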

This is an easy computation not only for the preprocessing but also for the post-processing. Please keep in mind that your customers are not interested in a forecast of these scaled values; they want a forecast of the time series in raw data units. This means that in a post-processing step you have to invert the preprocessing, and that is easy with such a transformation: if you have a forecast in the scaled range, you multiply by one over A and add the offset back, and then you have the post-processed forecast in a form your customer will like. I say it again: they do not like the forecast of the scaled values, they like the forecast of the values they actually want.
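As a small illustration of this pre- and post-processing pair, here is a sketch; the function names and the concrete choice of the scaling parameter are my own and only one of several reasonable options:

```python
import numpy as np

def preprocess(Y):
    """Shift and scale raw data Y so the result fluctuates around zero,
    with most values (outliers excepted) inside [-1, 1]."""
    Y_mean = Y.mean()
    A = 1.0 / (2.0 * Y.std())        # one possible choice of the scaling parameter
    return A * (Y - Y_mean), A, Y_mean

def postprocess(y, A, Y_mean):
    """Invert the preprocessing: map forecasts back to raw data units."""
    return y / A + Y_mean

# Example: scale a series, then map a (made-up) network forecast back.
Y = np.array([98.0, 101.5, 99.2, 103.8, 100.4])
y, A, Y_mean = preprocess(Y)
forecast_raw = postprocess(0.3, A, Y_mean)   # 0.3 is a forecast in scaled units
```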

The second possibility is for time series with a trend, but before I come to that, one more example for the first transformation above: let's speak about temperature. Normally, in our area, the temperature is always a positive number, most of the time between zero and 30 or 35 degrees Celsius. If you apply such a shift, you get something that fluctuates around zero; this is purely for the computation, not because I like such low temperatures. Now to the next possibility: quantities which might have a trend.

As I told you before, trends in the network inputs are not good, because the neural network adjusts all its parameters so that the internal information flows live between minus one and plus one, which is the limitation of the tanh. If you had a trend running from the past into the future, then in generalization you would always sit in the saturation of the tanh, on one side or the other.
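To make this saturation argument concrete, here is a tiny numerical illustration of my own (not from the lecture): once a trending input drifts beyond the range the tanh can resolve, the output is effectively constant.

```python
import numpy as np

# A trending input stays small in the training window, but the trend keeps
# pushing it further in the generalization (future) window.
t = np.arange(20)
x = 0.3 * t                  # linear trend
h = np.tanh(x)               # what a tanh unit passes on

for ti, xi, hi in zip(t, x, h):
    print(f"t={ti:2d}  input={xi:5.2f}  tanh={hi:.4f}")
# Beyond roughly |input| > 2 the output is stuck near +1 (or -1 for a falling
# trend), so the unit can no longer distinguish different future values.
```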
