Welcome to the last lecture of Supercomputers, or rather, Architectures of Supercomputers, sorry.
I probably should silence my mobile phone.
Last week we were talking about current systems in Germany, and I would like to finish that
overview with a brief look at systems in Erlangen and in Bavaria.
And afterwards I would like to try to give you a glimpse into the future.
Actually when I was looking at my slides I realized that I should delete some of the
slides because traditionally at the end of the term I give an outlook and much of the
outlook was actually already state of the art or even boring state of the art.
So that's why I deleted some slides on Power 7, because now we have Power 8, and I deleted
some slides on Stampede, a system at the Texas Advanced Computing Center which is no longer
a future system but a current one; they are already thinking about replacing it.
However, let's move back to Erlangen.
We have two systems, two major systems currently.
One is LiMa: 500 nodes, about 6,000 Intel Xeon cores, and 40-gigabit InfiniBand.
In total 12 terabytes of RAM.
So that's about 2 gigabytes per core.
That used to be the ratio for most compute clusters: 2 gigabytes of RAM per core.
So if you look at some of the other supercomputers, especially the larger machines, you
will almost always find this 1-to-2 ratio.
This is slowly changing.
RAM has become so much cheaper than cores that today in newer systems you might find
a ratio of 1 to 3 or 1 to 4.
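The 2-gigabytes-per-core figure follows directly from the numbers just given; here is a quick sanity check in Python (using the LiMa figures from the lecture):

```python
# RAM-per-core ratio for LiMa, from the numbers in the lecture:
# 12 TB of RAM in total, spread over about 6,000 Xeon cores.
ram_total_tb = 12
cores = 6000

gb_per_core = ram_total_tb * 1024 / cores
print(f"{gb_per_core:.2f} GB per core")  # -> 2.05 GB per core
```

That lands right at the classic 2 GB/core ratio; a newer 1-to-3 or 1-to-4 system would simply triple or quadruple `ram_total_tb` for the same core count.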
If we look at Emmy, which is currently the newest system, you will see that it has almost
the same number of nodes.
It's just 60 nodes more, which is not much of an improvement.
It does have almost twice the number of cores, simply because, while we still have two
sockets per node, the CPUs we can buy today have more cores.
If you look at the clock speed, that's interesting, because the current CPUs are clocked at
2.2 gigahertz while the previous ones were running at 2.6 gigahertz.
So that's just 400 megahertz less, which may not sound like much, but it's still a sizable
difference.
How much less is that, as a percentage of the original clock speed?
About 15 percent.
If we then account for the increased number of cores, you will see that performance still
went up, but not by a factor of 2; perhaps by a factor of 1.5.
And if your code doesn't scale to this number of cores, it probably won't achieve better
performance on Emmy.
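That back-of-the-envelope estimate can be sketched explicitly. A minimal sketch, assuming node performance scales with cores times clock (which ignores IPC, vector width, and memory bandwidth), and assuming the newer nodes have two 10-core sockets, since the exact Emmy core count was not stated in the lecture:

```python
# Rough node-level speedup estimate from the older system to the newer one.
# Older system: 6,000 cores / 500 nodes = 12 cores per node at 2.6 GHz.
# Newer system: assumed 20 cores per node (two 10-core sockets, "almost
# twice the cores") at 2.2 GHz. Performance ~ cores x clock.
old_cores, new_cores = 12, 20        # cores per node
old_clock, new_clock = 2.6, 2.2      # GHz

clock_factor = new_clock / old_clock
speedup = (new_cores / old_cores) * clock_factor
print(f"clock reduction: {(1 - clock_factor) * 100:.0f}%")  # -> 15%
print(f"estimated node speedup: {speedup:.2f}x")            # -> 1.41x
```

So roughly a factor of 1.5 rather than 2, matching the estimate above.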
Actually, I think Thomas Seidel from the high performance computing group did some
measurements, and what he found was that for the systems they are buying, single-thread
performance is actually declining.
On the previous system, LiMa, they had higher single-thread performance than they now have
on Emmy, and I think even Woody, the system before that, had higher single-thread
performance than LiMa.
So it is actually slightly going down.
If you're able to vectorize, to use multithreading, and to use multiple nodes, then you
should be all right.
And of course, if you assume that something is running on a high performance computing
system, then you always assume that it's able to scale.
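Why that scaling assumption matters can be illustrated with Amdahl's law, which is not from the lecture but is the standard model here: if only a fraction of a code runs in parallel, extra cores stop helping very quickly.

```python
# Amdahl's law: speedup on n cores when a fraction p of the work is
# parallelizable and the rest stays serial. Illustrative sketch only.
def amdahl_speedup(p: float, n: int) -> float:
    """Ideal speedup on n cores with parallel fraction p."""
    return 1.0 / ((1.0 - p) + p / n)

# Speedup on a 20-core node for different parallel fractions:
for p in (0.5, 0.9, 0.99):
    print(f"p = {p}: {amdahl_speedup(p, 20):.1f}x")
# p = 0.5:  1.9x   -- half-serial code barely benefits
# p = 0.9:  6.9x
# p = 0.99: 16.8x  -- only near-fully-parallel code uses 20 cores well
```

This is exactly why code that doesn't vectorize or multithread well sees little benefit from newer, wider, lower-clocked nodes.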
Presenters
Access: Open access
Duration: 01:32:08 min
Recording date: 2015-01-27
Uploaded: 2019-04-03 17:19:04
Language: en-US