So, welcome everybody.
My name is Johannes Veh and I will tell you something about file systems and efficient
data handling.
And as you may know, this is a rather important topic for us because from time to time we
see issues with file systems, sometimes because something in the file system went wrong, oftentimes
because some user is doing things which the file system was not designed for.
So let's talk about what our file system can and cannot do and learn what we can do better
in the future to not overload file systems.
But before we get started with the file systems themselves, we have to think about what the
hardware underneath is.
So you may have seen these building blocks in your local machine, your notebook, your
computer.
So we have HDDs, these are spinning disks.
It's a technology which is a few decades old, quite reliable, cheap, but slowish because
there's a rotating disk inside and the read head has to get to the point where your
data is.
So in the worst case you have to wait almost one entire spin before you can read data,
and the same applies before you can write data.
So it's slowish.
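To put a rough number on that wait, here is a minimal sketch. The spindle speeds (5400, 7200 and 15000 rpm) are assumed typical values and are not from the talk, and the function names are made up for illustration. It only covers the rotational part of the delay, not the time the head needs to move to the right track.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full platter rotation, in milliseconds."""
    return 60_000.0 / rpm


def average_rotational_latency_ms(rpm: float) -> float:
    """On average the data is half a rotation away from the head."""
    return rotation_time_ms(rpm) / 2.0


if __name__ == "__main__":
    # Assumed spindle speeds, chosen only for illustration.
    for rpm in (5400, 7200, 15000):
        full = rotation_time_ms(rpm)
        avg = average_rotational_latency_ms(rpm)
        print(f"{rpm:>6} rpm: full rotation {full:.2f} ms, average wait {avg:.2f} ms")
```

At an assumed 7200 rpm, one rotation takes about 8.3 milliseconds, so a disk that has to wait for the platter on every small access can only serve on the order of a hundred such accesses per second.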
Then we have SSDs.
I think this technology is about 20 years old or something like that.
You probably have heard about them.
And we have the NVMe drives, which are basically SSDs on steroids.
They are about a factor of 10 faster.
And the NVMe drives are what you probably have in a modern notebook on
your desk when you're working.
And our storage systems are built from these disks as well.
But as you may know, a single disk has a quite limited capacity.
So we don't use a single disk.
We use a few hundred of them and build several racks of storage.
So what you can see here are two pictures of the storage array, which is underneath
/home/hpc and /home/vault.
So this is our main storage.
Let's call it like that.
It consists of roughly a thousand HDDs, 20-ish SSDs, and it weighs almost three tons.
OK, so what we do by building a storage array out of these many disks is we increase capacity.
We somewhat increase speed because we now have several disks which can all read and write at the
same time, within some limit.
So the speed does not scale with the number of disks, so we don't get a thousand times faster than
a single disk.
We get maybe a factor of 10 or 20 or something like that because there is a lot of software
in between to get all these things done.
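As a quick back-of-the-envelope check of those numbers, the sketch below compares the quoted speedup of 10 to 20 with ideal linear scaling over roughly a thousand disks. The function name is hypothetical, and the figures are the rough estimates from the talk, not measurements.

```python
def scaling_efficiency(num_disks: int, observed_speedup: float) -> float:
    """Fraction of ideal linear scaling that is actually achieved."""
    return observed_speedup / num_disks


if __name__ == "__main__":
    NUM_DISKS = 1000  # "roughly a thousand HDDs", a rough figure from the talk
    for speedup in (10, 20):  # "a factor of 10 or 20 or something like that"
        eff = scaling_efficiency(NUM_DISKS, speedup)
        print(f"speedup {speedup}x over one disk: {eff:.1%} of ideal linear scaling")
```

So with these rough figures the array reaches only about one to two percent of what perfect scaling over all disks would give.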
So we are still slowish because it's HDDs, it's not SSDs.
And building something like this entirely out of SSDs would be quite expensive, so we
don't do that.
So this is the hardware underneath.
So what about file systems?
I said I will talk about file systems.
So a file system is a directory structure.
And you know this again from your local machine, from your laptop.
Access: Open access
Duration: 00:35:30
Recording date: 2024-10-08
Uploaded: 2024-10-28 15:36:07
Language: en-US
Speaker: Dr. Johannes Veh
Slides: TBA
Abstract: Performance issues often arise from the inefficient handling of data stored in file systems. Many of these problems can be avoided with basic knowledge of file systems and of what they can (not) do. This talk will cover the file systems available at NHR@FAU, their properties, usage scenarios, and common best practices for managing files.