I think we'll start in English.
Did we stop here last week?
Maybe.
I think we're somewhere around here.
Okay, so let's begin.
Thank you very much for being here.
Today we'll actually do a theoretical analysis of performance modeling.
Our aim is to explain why the initial Blue Gene machine, and actually also its successors, were designed the way they were.
So what you can see here is, now let me skip back one slide.
So we were talking about Amdahl's Law, and we were trying to motivate why, when estimating the speedup, which is defined as the sequential time divided
by the parallel time, people usually say: I would rather have few but faster nodes
than many nodes that are each not as capable.
And the reason for that is: if I'm trying to estimate the parallel time, Amdahl basically
says the time with a problem size n and p processors can be estimated as what we
are calling here sigma(n), the sequential calculations depending on the problem size, plus the parallelizable calculations
phi(n) divided by p; that is, T(n, p) = sigma(n) + phi(n)/p.
And we assume that if I have a faster CPU, the CPU will also be able to speed up the
sequential part.
That's basically what we assume.
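The estimate just stated can be sketched in a few lines of code. This is a minimal illustration, not part of the lecture; the 5% sequential fraction is a made-up example value:

```python
# Amdahl's estimate: parallel time with problem size n and p processors
# is the sequential part sigma(n) plus the parallelizable part phi(n) / p.

def parallel_time(sigma, phi, p):
    """Estimated parallel time T(n, p) = sigma(n) + phi(n) / p."""
    return sigma + phi / p

def speedup(sigma, phi, p):
    """Speedup = sequential time / parallel time."""
    return (sigma + phi) / parallel_time(sigma, phi, p)

# Hypothetical workload: 5% sequential, 95% parallelizable (times
# normalized so the sequential run takes 1.0). The speedup can never
# exceed 1 / 0.05 = 20, no matter how many processors we add.
sigma, phi = 0.05, 0.95
for p in (1, 10, 100, 1000):
    print(p, round(speedup(sigma, phi, p), 2))
```

Note how the speedup creeps toward, but never reaches, 20: this is exactly the saturation behavior discussed below.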
However, if we really want to parallelize a code, the parallel time probably needs to
encompass this kappa(n, p), which we use to describe the overhead.
Because usually when you do network communication, you also have to pay some sort of overhead,
right?
Network is never free.
Ideally, we can assume that communication and calculation happens in parallel.
That's what we hope.
It's not always true, but for the type of application we'll look at in the next slides,
this is actually true.
So this formula here changes to that formula: the parallel time becomes sigma(n) plus the maximum of phi(n)/p and kappa(n, p), and please assume this k on the slide is actually a kappa.
I'm just writing this down on the board because we'll use that formula later on a couple of
times.
Now, this is what we said last time, the Amdahl effect: basically, for large problem sizes,
your code might scale.
For smaller problem sizes, you will have problems.
Actually if you don't take into account overhead, then these curves should just approach a certain
saturation.
They should not go down.
If you also include overhead, it might actually go down here.
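To see this downturn numerically, here is a small sketch. The overhead model kappa(n, p) growing logarithmically with p is a hypothetical choice (think of a tree-structured reduction), not something fixed by the lecture:

```python
import math

# Parallel time with communication overlapped with computation:
# T(n, p) = sigma(n) + max(phi(n) / p, kappa(n, p)).

def speedup(sigma, phi, kappa, p):
    t_par = sigma + max(phi / p, kappa(p))
    return (sigma + phi) / t_par

# Hypothetical overhead that grows with the processor count.
kappa = lambda p: 0.002 * math.log2(p)

for p in (1, 16, 256, 4096, 65536):
    print(p, round(speedup(0.05, 0.95, kappa, p), 2))
```

With these numbers the speedup rises at first, peaks, and then falls again as the growing overhead term wins the max: exactly the curves going down rather than merely saturating.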
Now, this is what I would really like to get to.
This is an example where I try to actually make some performance estimates to see how
well certain architectures would be suited for a compute problem.
Let me just tune the slide size a bit.
I'm still battling with the Nvidia driver.
Sorry that the slides don't look very good.
Basically, the problem we would like to solve, or the question we would like to answer,
is this: we have a 3D stencil code.
Access: Open Access
Duration: 01:21:14 min
Recording date: 2014-12-09
Uploaded: 2019-04-03 20:59:04
Language: en-US