11 - Architectures of Supercomputers

Welcome to the last lecture of Architectures of Supercomputers.

I probably should silence my mobile phone.

Last week we were talking about current systems in Germany, and I would like to finish that overview with a brief look at the systems in Erlangen and in Bavaria.

And afterwards I would like to try to give you a glimpse into the future.

Actually, when I was looking at my slides, I realized that I should delete some of them, because traditionally at the end of the term I give an outlook, and much of that outlook was already state of the art, or even boring state of the art.

That's why I deleted some slides on POWER7, because now we have POWER8, and I deleted some slides on Stampede, a system set up at the Texas Advanced Computing Center, which is no longer a future system but a current one; they are already thinking about replacing it.

However, let's move back to Erlangen.

We currently have two major systems.

One is LiMa: 500 nodes, about 6,000 Intel Xeon cores, 40-gigabit InfiniBand, and 12 terabytes of RAM in total.

That's about 2 gigabytes per core, which used to be the ratio for most compute clusters: 2 gigabytes of RAM per core.

So if you look at some of the other supercomputers, especially the larger machines, you will almost always find this ratio of 1 core to 2 gigabytes.

This is slowly changing: RAM has become so much cheaper than cores that in newer systems you might find a ratio of 1 to 3 or 1 to 4.
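
As a quick sanity check on that ratio, here is a small Python sketch; the 12 terabytes and 6,000 cores are the LiMa figures from above, while the 1:3 and 1:4 lines simply extrapolate the newer ratios just mentioned:

```python
# RAM-per-core ratio, using the LiMa figures quoted above.
lima_ram_tb = 12      # total RAM in terabytes
lima_cores = 6000     # total Intel Xeon cores

gb_per_core = lima_ram_tb * 1000 / lima_cores
print(f"LiMa: {gb_per_core:.1f} GB of RAM per core")  # -> 2.0

# Newer systems trend toward 1:3 or 1:4 as RAM gets cheaper.
for ratio in (3, 4):
    ram_tb = lima_cores * ratio / 1000
    print(f"A 6000-core system at 1:{ratio} needs {ram_tb:.0f} TB of RAM")
```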

If we look at Emmy, which is currently the newest system, you will see that it has almost the same number of nodes; it's just 60 nodes more, which is not much of an improvement.

It does have almost twice the number of cores, though: we still have two sockets per node, but the CPUs we can buy today simply have more cores per socket.

If you look at the clock speed, that's interesting, because the current CPUs are clocked at 2.2 gigahertz, while the previous ones were running at 2.66 gigahertz.

That's a difference of just 460 megahertz, but it's still a sizable fraction.

How much less is that than the original clock speed? A bit over 17 percent, so let's say about 20 percent.

If we then account for the increased number of cores, you will see that the performance still went up, but not by a factor of 2; perhaps by a factor of 1.5.
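
To make that estimate concrete, here is a minimal Python sketch. The 2.66 and 2.2 gigahertz clocks are the figures above; the core-count factor of 1.8 is my stand-in for "almost twice the number of cores", not an exact number:

```python
# Clock-speed drop from LiMa to Emmy (clock figures from the lecture).
old_clock_ghz = 2.66
new_clock_ghz = 2.2

drop_mhz = (old_clock_ghz - new_clock_ghz) * 1000
drop_pct = drop_mhz / (old_clock_ghz * 1000) * 100
print(f"Clock drop: {drop_mhz:.0f} MHz ({drop_pct:.1f} % of the old clock)")

# Aggregate estimate: "almost twice the cores" (assumed factor 1.8)
# times the lower clock gives roughly the factor 1.5 from the lecture.
core_factor = 1.8   # assumption, not an exact figure
perf_factor = core_factor * new_clock_ghz / old_clock_ghz
print(f"Estimated node performance factor: {perf_factor:.2f}")  # -> 1.49
```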

And if your code doesn't scale to this number of cores, it probably won't achieve better performance on Emmy.

Actually, I think Thomas Seidel from the high-performance computing group did some measurements, and what he found was that for the systems they are buying, single-thread performance is declining.

On the previous system, LiMa, they had higher single-thread performance than now on Emmy, and I think even the system before that, Woody, had higher single-thread performance than LiMa.

So it's actually going down slightly.

If you're able to vectorize, to use multithreading, and to use multiple nodes, then you should be all right.

And of course, if you assume that something is running on a high-performance computing system, then you always assume that it's able to scale.
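
The lecture doesn't name it, but the classic way to quantify "able to scale" is Amdahl's law. This sketch (assuming roughly 20 cores for an Emmy-era node, a figure I'm inferring, and 6,000 cores for cluster scale) shows how even a small serial fraction caps the achievable speedup:

```python
# Amdahl's law: speedup with N cores when a fraction s of the
# runtime is inherently serial. (Illustrative; not from the lecture.)
def amdahl_speedup(n_cores: int, serial_fraction: float) -> float:
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)

for s in (0.01, 0.05, 0.10):
    node = amdahl_speedup(20, s)       # ~ one node (assumed 20 cores)
    cluster = amdahl_speedup(6000, s)  # ~ cluster scale
    print(f"serial fraction {s:.0%}: node speedup {node:.1f}x, "
          f"cluster speedup {cluster:.1f}x")
```

Even with just 5 percent serial code, the cluster-scale speedup stalls below 20x, which is why single-thread performance still matters for codes that don't scale.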

Part of a video series
Accessible via: open access
Duration: 01:32:08
Recording date: 2015-01-27
Uploaded on: 2019-04-03 17:19:04
Language: en-US
