72 - HPC Cafe on October 8, 2024: HPC Café: File systems and efficient data handling [ID:55103]

So, welcome everybody.

My name is Johannes Veh, and I will tell you something about file systems and efficient

data handling.

And as you may know, this is a rather important topic for us because from time to time we

see issues with file systems, sometimes because something in the file system went wrong, oftentimes

because some user is doing things which the file system was not designed for.

So let's talk about what our file systems can and cannot do and learn what we can do better

in the future to not overload file systems.

But before we get started with the file systems themselves, we have to think about what the

hardware is underneath.

So you may have seen these building blocks in your local machine, your notebook, your

computer.

So we have HDDs, these are spinning disks.

It's a technology which is a few decades old, quite reliable, cheap, but slowish because

there's a rotating disk inside and the read head has to get to the point where your

data is.

So you may have to wait up to one entire spin before you can read data.

And you may have to wait up to one entire spin before you can write data.

So it's slowish.
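
As a rough illustration of that waiting time (the rotational latency), here is a minimal Python sketch. The spindle speeds of 5,400, 7,200 and 15,000 rpm are assumed typical values, not figures from the talk. In the worst case the head waits a full rotation for the data to come around; on average it is about half a rotation.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full platter rotation, in milliseconds."""
    return 60_000.0 / rpm


def average_rotational_latency_ms(rpm: float) -> float:
    """On average the wanted sector is half a rotation away from the head."""
    return rotation_time_ms(rpm) / 2.0


# Assumed, typical spindle speeds (not taken from the talk).
for rpm in (5400, 7200, 15000):
    worst = rotation_time_ms(rpm)
    avg = average_rotational_latency_ms(rpm)
    print(f"{rpm:>5} rpm: worst case {worst:.2f} ms, average {avg:.2f} ms")
```

At 7,200 rpm one rotation takes about 8.3 ms, so a disk that has to reposition for every small request can serve only on the order of a hundred such requests per second.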

Then we have SSDs.

I think this technology is about 20 years old or something like that.

You probably have heard about them.

And we have NVMe drives, which are basically SSDs on steroids.

They are about a factor of 10 faster.

And yes, an NVMe drive is what you probably have in a modern notebook on

your desk when you're working.

And our storage systems are built from these disks as well.

But as you may know, a single disk has a quite limited capacity.

So we don't use a single disk.

We use a few hundred of them and build several racks of storage.

So what you can see here are two pictures of the storage array, which is underneath

/home/hpc and /home/vault.

So this is our main storage.

Let's call it like that.

It consists of roughly a thousand HDDs, 20-ish SSDs, and it weighs almost three tons.

OK, so what we do by building a storage array out of these many disks is we increase capacity.

We somewhat increase speed because we now have several disks which can all read at the

same time and write at the same time with some limit.

So speed does not scale with the number of disks: we don't get a thousand times faster than

a single disk.

We get maybe a factor of 10 or 20 or something like that because there is a lot of software

in between to get all these things done.
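
To put numbers on that, here is a small Python sketch of the arithmetic. It uses the figures from the talk (roughly a thousand disks, a speed-up of 10 to 20). The per-disk throughput of 200 MB/s is a hypothetical value chosen only for illustration.

```python
NUM_DISKS = 1000           # "roughly a thousand HDDs" (from the talk)
SINGLE_DISK_MB_S = 200.0   # hypothetical per-disk throughput, illustration only


def scaling_efficiency(speedup: float, num_disks: int) -> float:
    """Fraction of ideal linear scaling that is actually achieved."""
    return speedup / num_disks


def array_throughput_mb_s(speedup: float, single_disk_mb_s: float) -> float:
    """Aggregate throughput if the array is `speedup` times one disk."""
    return speedup * single_disk_mb_s


# Speed-up factors mentioned in the talk: "maybe a factor of 10 or 20".
for speedup in (10, 20):
    eff = scaling_efficiency(speedup, NUM_DISKS)
    tput = array_throughput_mb_s(speedup, SINGLE_DISK_MB_S)
    print(f"speed-up {speedup:>2}x: {tput:6.0f} MB/s, {eff:.1%} of ideal linear scaling")
```

Under these assumptions the array reaches only 1 to 2 percent of what a thousand fully independent disks could deliver in theory.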

So we are still slowish because these are HDDs, not SSDs.

And building something like this entirely out of SSDs would be quite expensive, so we

don't do that.

So this is the hardware underneath.

So what about file systems?

I said I will talk about file systems.

So a file system is a directory structure.

And you know this again from your local machine, from your laptop.

Part of a video series:
Part of a chapter:
HPC Café

Accessible via

Open access

Duration

00:35:30 min

Recording date

2024-10-08

Uploaded on

2024-10-28 15:36:07

Language

en-US

Speaker: Dr. Johannes Veh

Slides: TBA

Abstract: Performance issues often arise from the inefficient handling of data stored in file systems. Many of these problems can be avoided with basic knowledge of file systems and of what they can (not) do. This talk will cover the file systems available at NHR@FAU, their properties, usage scenarios, and common best practices for managing files.
