6 - Architectures of Supercomputers [ID:10230]
Okay, go, go, go. Thank you all for coming here. I'm sorry for the last-minute change

to our schedule, which meant we had to deviate from yesterday's plan.

Today we won't be talking about the exercises, but really just about the lecture. Vanna,

that's another chair. The reason for that is I initially planned to squeeze both together

into one session, but I don't think that will work out very well because the next series

of exercises will be more complex than the previous ones and I would like to spend some

more time discussing the previous ones and explaining the next ones. So we'll do all

those exercises next week. And yeah, I would like to catch up with the lecture because

the lecture is behind schedule, whereas the exercises are on schedule even if we shift them to next week.

Before we start with the actual lecture, I would like to pick up some questions asked

last week. Someone asked whether Nehalem was also an architecture that featured an on-board

graphics core from Intel. Actually, it didn't, but its successor Westmere (sorry, I really

just dug that up from Wikipedia) did contain a graphics core, which

was attached to the memory controller. If you recall Intel's tick-tock rhythm of introducing

new architectures, you may remember that Westmere was not a new microarchitecture but the die

shrink of Nehalem. So very few things were changed, and this is probably also an explanation

why the graphics core was attached to the memory controller instead of moving it here

into the core part of the CPU. Because this way they could make those changes with minimal

impact on the actual CPU design. Today the graphics core is usually attached to the

last-level cache, the level 3 cache on most Intel architectures. So newer microarchitectures have

changed the way graphics cores are integrated, but Westmere was the first to feature an on-chip

graphics core.

Before we start with the actual lecture, I would also like to tell you something

from my research life. You probably know when you're doing software engineering that you

should write unit tests to check whether your code, or at least your functions, do what

you expect them to do, and that you should run them automatically. What few people

do is performance testing. Quite often when I talk to my colleagues, they say performance

testing is something they will do for the next release, or that they do some performance

testing just before they publish a new release. But usually, in my experience, that's too

late because if you have been working for three months and you've been changing your

code a lot and you suddenly realize that your code is running at just half the speed of the

previous release, I'm not sure you really want to dig through thousands of lines of

code to see which change actually caused the performance degradation. So my view is: if

you're testing for functionality, you should also test for performance, at least if you're

doing HPC. And this should be done automatically, on different platforms, and basically all the time.
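
A minimal sketch of what such an automated performance check could look like, written in Python. Everything here is an assumption for illustration: the function names, the `kernel` stand-in for the code under test, and the 10% tolerance are hypothetical and not taken from the speaker's library. The idea is simply to time a workload, compare it against a stored baseline, and fail the test run when it has slowed down too much.

```python
import statistics
import time


def measure_seconds(func, repeats=5):
    """Return the median wall-clock time of func() over several runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def check_regression(current, baseline, tolerance=0.10):
    """Return True if `current` is at most `tolerance` slower than `baseline`."""
    return current <= baseline * (1.0 + tolerance)


def kernel():
    # Hypothetical stand-in for the routine whose performance is tracked.
    return sum(i * i for i in range(100_000))


if __name__ == "__main__":
    # Assumption: in a real setup the baseline would be loaded from the
    # stored result of an earlier run instead of being measured here.
    baseline = measure_seconds(kernel)
    current = measure_seconds(kernel)
    status = "ok" if check_regression(current, baseline) else "REGRESSION"
    print(f"baseline={baseline:.6f}s current={current:.6f}s {status}")
```

Run from a nightly job or a continuous integration system on each target platform, a check like this flags a slowdown close to the change that caused it, rather than months later.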

What I'm showing you here is a performance plot from my own library. Usually I just care

for this line, which should be red, and this one should actually be green. Anyway, what happened

is that somewhere around the 17th of November, performance dropped. When I first saw this,

I wasn't really sure what had happened. I took a look at my log and discovered that I hadn't

actually changed the code. So the same code was running on the same machine, suddenly with

one third less performance, which was really surprising to me. So I dug through the log

files on that machine to figure out what happened and actually it was rebooted at that time.

So something changed in the system configuration. I tried to figure out what changed in the

system configuration but actually there was no change. At least none I could discover.
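
One way to make this kind of detective work easier is to store a small description of the system next to every benchmark result, so that two runs can be compared afterwards. The following Python sketch is only an illustration under stated assumptions: the function names and the choice of fields are hypothetical, and a handful of fields like these will not capture every configuration change (as the story here shows, some changes leave no obvious trace).

```python
import os
import platform


def system_fingerprint():
    """Collect a few basic facts about the machine running the benchmark."""
    return {
        "hostname": platform.node(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "python": platform.python_version(),
        "cpu_count": os.cpu_count(),
    }


def diff_fingerprints(old, new):
    """Return {field: (old_value, new_value)} for every field that differs."""
    fields = sorted(set(old) | set(new))
    return {
        name: (old.get(name), new.get(name))
        for name in fields
        if old.get(name) != new.get(name)
    }


if __name__ == "__main__":
    # Assumption: `before` would normally be read back from the record
    # stored with an earlier benchmark run.
    before = system_fingerprint()
    after = system_fingerprint()
    print(diff_fingerprints(before, after))
```

An empty result from `diff_fingerprints` only says that the recorded fields match, not that the two systems are identical.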

So yesterday I was really frustrated. You probably remember when I was sitting in the

same room with you. What happened yesterday evening was our head node also rebooted. I'm

not sure why that was. I think it ran out of memory and the watchdog kicked in. Anyhow,

I re-ran the performance tests. Just a second. As of this morning, the performance

tests look like this. It's all back to normal. Actually it's a bit above that because when

I was debugging my performance tests I also discovered that I wasn't doing any process

Access: Open access

Duration: 01:12:45

Recording date: 2014-12-03

Uploaded: 2019-04-03 10:29:04

Language: en-US
