2 - Machine Learning for Physicists [ID:7601]

The following content has been provided by the University of Erlangen-Nürnberg.

Okay, hello everyone. Good evening, welcome back to the second lecture. So the plan for today is

the following. I will first remind you of some of the things that we said last time about the

basics of neural networks. Then I will tell you about something that is a bit technical,

but it's a very useful trick for processing many samples in parallel. And then I will teach you something about the universality of neural networks, which means that if you give me any nonlinear function, I can in principle produce a neural network that computes this function for you. And after that we'll start discussing the training. So how do you teach a neural network?

So just to remind you of what we said last time. So this is a neural network. The purpose is to

produce some output given some input and the input might be for example a picture and the

output could be a description of this picture. And you want to teach the network by showing it

very many training examples without ever having to hand code an algorithm yourself. Still you hope

that the neural network in the end will have learned the mapping from input to output. So

then we discussed what a simple neural network looks like. And let me just switch to the description

right away. So this is a very simple neural network where you have some neurons that encode the input

values. So each blue dot represents a neuron or a unit, and it will have a value. The values in the lower layer you prescribe (these are the input values), while the values in the upper layer the neural network computes for you. It's a very simple network because it does not have any hidden layers; rather, it goes straight from input to

output. And in general any neural network works by a combination of linear mappings and very simple

nonlinear functions. So the linear mapping I've written down here. If you label the output neurons in the upper layer by j and the input neurons by k, then what happens is: you take some arbitrary weighted superposition of the input neurons, possibly add some offset (or bias term, as we call it), and you get the value of an output neuron. These weights in the superposition depend both on the input and the output neuron, so these weights are, so to speak, the strengths of the connections that we're showing here. And both for artificial neural networks and for real biological neural networks, training means that these weights change in time.
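Written out as a formula, this is a compact restatement of what was just said, with y^in_k the input values, w_jk the weights, b_j the biases, and z_j the resulting value of output neuron j (the notation follows the lecture's description):

    z_j = \sum_k w_{jk} \, y^{\mathrm{in}}_k + b_j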

Now, one big part of all of this will be that linear algebra can make things simpler, at least in terms of notation. This is obviously a matrix applied to a vector, so I can also write it as the matrix W times the vector y^in, and then add another vector b that contains these bias values. So it's rather simple if you write it like that, and after this linear superposition you then apply a nonlinear function: for each output neuron you plug the value that you got, z_j, into some nonlinear function.
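In the vector notation just described, the two steps of a layer read as follows (writing f for the nonlinear function, a symbol chosen here for brevity):

    z = W \, y^{\mathrm{in}} + b, \qquad y^{\mathrm{out}}_j = f(z_j)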

And this nonlinear function typically is very simple: it could be a smooth step function, or it could be something that is zero for negative input values and rises linearly for positive values. So there are many possible nonlinear functions; the important thing is only that you have a nonlinear function.
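For concreteness, the two examples just mentioned can be written as follows (the smooth step function is the sigmoid that comes up again below; the piecewise-linear one is commonly known as the ReLU, a name not used in this passage):

    f(z) = \frac{1}{1 + e^{-z}} \quad \text{(sigmoid)}, \qquad f(z) = \max(0, z) \quad \text{(ReLU)}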

Okay, so then I explained to you that we are doing this with Python. At the end of this lecture, of course, we will discuss whether you have all succeeded in installing the software or whether you still have some questions.

And then we said: how do you program this little piece of linear algebra? You take a matrix W and compute a matrix-vector product, which in Python you can write as dot(W, y) using NumPy's dot function, and the output of this operation will of course have dimension N_out. Just as for any matrix-vector product, the index you sum over disappears, and N_out is the dimension that remains.
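As a minimal NumPy sketch of this shape bookkeeping (the dimensions 3 and 2 are placeholders, chosen to match the example discussed next):

    import numpy as np

    N_in, N_out = 3, 2            # number of input / output neurons
    W = np.zeros([N_out, N_in])   # weight matrix: one row per output neuron
    y_in = np.zeros(N_in)         # vector of input values

    z = np.dot(W, y_in)           # matrix-vector product sums over the input index
    print(z.shape)                # (2,): only the output dimension N_out remains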

And then we went on to implement this. So we said, for example, take this network: we have three input neurons and two output neurons, and we want to create a matrix of weights W, which in this case has dimensions two by three, because its shape is output dimension by input dimension. You also have the bias vector, whose size is just given by the number of output neurons, since each of them needs its own bias. In this particular example I implemented them to be random, I took some arbitrary vector for the input values, and then I simply applied both the linear algebra operation and the subsequent nonlinear function. And in this case we took a smooth step function, also called a sigmoid; a physicist would say a Fermi distribution.
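Here is a short sketch of the example as described, with random weights and biases, a three-to-two layer, and the sigmoid as the nonlinear function; the variable names and the random ranges are my own choices, not necessarily those of the lecture notebook:

    import numpy as np

    N_in, N_out = 3, 2   # three input neurons, two output neurons

    # random weight matrix (2x3: output dimension by input dimension) and bias vector
    W = np.random.uniform(low=-1, high=1, size=[N_out, N_in])
    b = np.random.uniform(low=-1, high=1, size=N_out)

    y_in = np.array([0.2, -0.5, 1.0])   # some arbitrary input values

    z = np.dot(W, y_in) + b             # linear step: weighted superposition plus bias
    y_out = 1 / (1 + np.exp(-z))        # sigmoid ("Fermi function"), applied elementwise

    print(y_out)                        # two output values, each between 0 and 1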

Is there still some question about this from last time? Okay, so here's a graphical visualization which I mentioned briefly last time. So suppose we have a network that has two input values and

Part of a video series
Access: Open Access
Duration: 01:23:50 min
Recording date: 2017-05-11
Uploaded: 2017-05-14 09:02:01
Language: en-US
