Today's talk is about the new features in MPI 4.
Here is the history of the MPI standards; you see here MPI 4, and here already the upcoming MPI 4.1, which is scheduled for the end of next year.
And in principle I have about 19 topics on MPI 4 and another 5 topics on what is scheduled for next year in MPI 4.1, and this means I have only between 1 and 2 minutes per topic.
But I have many, many slides with background information.
This means I want to be really fast and go only through the main things.
My co-author is Tobias Haas; he has, for example, looked at partitioned communication and provided the example there, and we will go over that later.
And all these slides are based on my complete MPI course at HLRS, which is a 5-day course covering MPI from MPI 1 up to MPI 4.
Therefore the acknowledgments also apply to this course.
First topic is large counts.
With MPI 3, additional routines were added so that you can examine, for example, how long a message really is, and so on, beyond the 2 billion limit of an integer, because the counts in MPI are declared as integers.
And now with MPI 4, we really define, for each MPI routine that has an integer count, an additional large-count routine with an MPI_Count argument.
And it looks like this: you see here the original MPI_Recv in C and the modern Fortran version, and for both you see the large-count version, where the count arguments are now MPI_Count, which means 8-byte integers.
In C, the routine gets an additional suffix _c in its name, and in Fortran it is directly overloaded with the same name.
And with the old Fortran interface nothing was done; as background information, it is in principle already on the way to being outdated and going away.
So, here are some real examples of how such interfaces now look.
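To make this concrete, here is a minimal C sketch (not from the slides) using the large-count routines MPI_Send_c and MPI_Recv_c from MPI 4; the 3 GiB buffer size and the two-rank pattern are just illustrative assumptions:

```c
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* More than 2^31 elements: too large for the int count
       of the original MPI_Send / MPI_Recv. */
    MPI_Count count = (MPI_Count)3 * 1024 * 1024 * 1024;
    char *buf = malloc((size_t)count);

    if (rank == 0) {
        /* Large-count version: _c suffix, MPI_Count argument. */
        MPI_Send_c(buf, count, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv_c(buf, count, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                   MPI_STATUS_IGNORE);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}
```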
Next topic, persistent collectives.
There was an idea: let's do persistent collectives.
Already in MPI 1 we had these persistent point-to-point communication routines, and the idea behind persistent routines is this: when you have a time-step loop, for example, and inside the time-step loop you have each time the same communication pattern, then the MPI library can optimize your communication pattern.
You initialize all the communication once, and then in each time step you just call MPI_Startall, MPI_Startall, MPI_Startall, and then the MPI library has a chance to optimize.
And of course collectives may additionally be optimized for the given hardware network, and that's the idea behind it.
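As a sketch of this pattern in C (the persistent collective calls MPI_Allreduce_init, MPI_Start, MPI_Wait, and MPI_Request_free are MPI 4 routines; the loop length and the reduction values are placeholder assumptions):

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    double local = 1.0, global;
    MPI_Request req;

    /* Initialize the persistent collective once, outside the loop;
       here the MPI library may build an optimized communication plan. */
    MPI_Allreduce_init(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
                       MPI_COMM_WORLD, MPI_INFO_NULL, &req);

    for (int step = 0; step < 100; step++) {  /* time-step loop */
        MPI_Start(&req);                      /* start the operation */
        /* ... computation that can overlap with the communication ... */
        MPI_Wait(&req, MPI_STATUS_IGNORE);    /* complete this repetition */
    }

    /* Persistent requests must be freed explicitly;
       MPI_Wait does not deallocate them. */
    MPI_Request_free(&req);
    MPI_Finalize();
    return 0;
}
```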
And the problem, when we tried to do this, was that the whole wording in MPI about what a nonblocking routine is, and so on, was not really written in a precise way.
Therefore the idea was to really write it in a precise way now, and as a result of that we really changed the definition of nonblocking.
In the past, nonblocking meant only incomplete, and now we changed it: nonblocking now means incomplete and also local. The reason is very simple:
The init routines of the persistent collectives will be incomplete, but they will not be local, because for the optimizations they need to communicate.
They have to communicate, and calling a routine nonblocking when it is not local, i.e., when it may block, would not work.
You can't do such definitions; it makes no sense for the users.
And the result of that was that we have a completely new naming.