89 - NHR PerfLab Seminar 2025-05-6: The Artificial Scientist: In-Transit Machine Learning of Plasma Simulations [ID:57988]

So thanks, Gaurav, for inviting me to this seminar.

I see that you folks have been running this for a while.

So it's a pleasure to give a talk here.

All right.

So this is work with a lot of collaborators.

I will show you who they are in a minute.

But I am from University of Delaware.

If you don't know where Delaware is, it's about a 90-minute train ride from New York,

Manhattan, if that helps put things into perspective.

So it's the East Coast in the US.

So this work is basically several years in the making, with many scientists involved,

where we have been trying out different ways of using HPC, AI, and machine learning

throughout the process.

Lots of tangible outcomes have come out of this work in the past several years.

So to quickly give you a background of my group.

So this is my group in Delaware.

It's a group of PhD students, master's students, and a lot of undergrad students.

We do both foundational work as well as interdisciplinary science work,

where we build compiler implementations and open source compiler tool chains,

LLVM, for example.

We explore directive-based programming models, OpenMP and OpenACC,

for real-world applications, and migrate them to large supercomputers.

As you know, that is easier said than done, right?

So it exposes a lot of challenges both with the hardware and the software framework.

So we work our way through those.

In the last four or five years, we have also been engaging with local hospitals,

with their data sets and building predictive models for cancer research.

So there is an ML and AI component over there.

And very recently, we started to use large language models for exploring

how we could generate tests to validate compiler implementations for OpenMP

as well as OpenACC, which are, as you may know,

directive-based programming models for C, C++, and Fortran.

So that's a group.

And many, many thanks to all the funders who have enabled research

in the last several years.

So this project, while it's primarily focusing on PIConGPU,

particle-in-cell on GPU, the long and the short of it is,

there is a grander challenge, right?

As you know, there are a lot of applications that produce a ton of data,

volumes and volumes of it.

And sometimes it isn't feasible to take in all the data at once

in one big dashboard and try to extract sensible information

out of it, right?

So you could think of data reduction, but data reduction also needs to be

done cleverly so you're not losing meaningful information.

So we often struggle to pin down the scientific questions

when we're dealing with data sets beyond what our machines can handle, right?

So that being the question, we are wondering how to do this,

how to build a framework, how to build a methodology

where we can look into large data sets.
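One simple flavor of such "clever" data reduction is to keep streaming summary statistics instead of raw snapshots, so no step of the pipeline ever holds the full data set. A minimal sketch using Welford's online algorithm (the synthetic Gaussian values here are purely illustrative, standing in for simulation output):

```python
import random

class StreamingStats:
    """Welford's online algorithm: reduce a data stream to a running
    mean and variance without storing the raw values."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

random.seed(0)
stats = StreamingStats()
for _ in range(100_000):
    # Each value stands in for one reading streamed out of a simulation.
    stats.update(random.gauss(5.0, 2.0))

# Prints values close to 5.0 and 4.0 (mean 5.0, std 2.0 squared).
print(round(stats.mean, 2), round(stats.variance, 2))
```

The reduction is lossy by construction; the "clever" part is choosing statistics (moments, histograms, learned features) that still answer the scientific question.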

Part of a video series:
Part of a chapter:
NHR@FAU PerfLab Seminar

Accessible via: Open access
Duration: 00:42:39 min
Recording date: 2025-05-28
Uploaded on: 2025-05-28 17:36:06
Language: en-US

Speaker: Sunita Chandrasekaran, Department of Computer and Information Sciences, University of Delaware
Title: The Artificial Scientist: In-Transit Machine Learning of Plasma Simulations
Abstract: 
With the rapid advancements in the computer architecture space, the migration of legacy applications to new architectures remains a continuous challenge. To effectively navigate this ever-evolving hardware landscape, software and toolchains must evolve in tandem, staying ahead of the curve in terms of architectural innovation. While this synchronization between hardware and software is inherently complex, it is essential for fully harnessing the potential of advanced hardware platforms. In this context, a marriage between HPC and AI is gaining increasing prominence. By effectively orchestrating the workflow of HPC and AI, we can not only accelerate scientific progress but also achieve significant gains in computational efficiency. One promising strategy to further optimize large-scale workflows is to stream simulation data directly into machine learning (ML) frameworks. This approach bypasses traditional file system bottlenecks, allowing for the transformation of data in transit, asynchronously with both the simulation process and model training. This talk will explore these strategies in detail, demonstrating the synergy between hardware innovation and software adaptation. Using a real-world scientific application as a case study, Particle-in-Cell on GPU (PIConGPU), the talk will showcase how these techniques can be applied at scale to drive both scientific and computational advancements.
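The in-transit pattern the abstract describes can be sketched as a producer/consumer pipeline: the simulation pushes snapshots into a bounded in-memory stream, and a training loop consumes them asynchronously, with no file system in between. A minimal single-process sketch (the toy "simulation" and running-mean "training" are illustrative stand-ins; a real workflow would couple PIConGPU to an ML framework through a staging transport rather than a Python queue):

```python
import queue
import random
import threading

def run_simulation(out_queue, n_steps=50):
    """Toy producer: emits one field snapshot per simulation step,
    streaming it out instead of writing it to disk."""
    for step in range(n_steps):
        snapshot = [random.gauss(0.0, 1.0) for _ in range(128)]
        out_queue.put((step, snapshot))
    out_queue.put(None)  # sentinel: simulation finished

def train_on_stream(in_queue, results):
    """Toy consumer: transforms each snapshot in transit (here, a
    running-mean update standing in for a model training step)."""
    count, running_mean = 0, 0.0
    while True:
        item = in_queue.get()
        if item is None:
            break
        _, snapshot = item
        count += 1
        batch_mean = sum(snapshot) / len(snapshot)
        running_mean += (batch_mean - running_mean) / count
    results["steps_seen"] = count
    results["running_mean"] = running_mean

random.seed(0)
# Bounded queue: applies back-pressure so the simulation cannot
# outrun the trainer by an unbounded amount of buffered data.
stream = queue.Queue(maxsize=8)
results = {}
producer = threading.Thread(target=run_simulation, args=(stream,))
consumer = threading.Thread(target=train_on_stream, args=(stream, results))
producer.start(); consumer.start()
producer.join(); consumer.join()
print(results["steps_seen"])  # 50
```

The key design point is the bounded buffer: both sides run concurrently, and neither the full simulation output nor any intermediate file ever needs to exist at once.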
Slides: 
Short Bio:
Sunita Chandrasekaran is an Associate Professor with the Department of Computer and Information Sciences at the University of Delaware, USA and runs the Computational Research Programming Lab. She co-directs the AI Center of Excellence. She also leads the NSF Democratization Access to RSE (DARSE) program at UD. Her research spans high performance computing, exascale computing, machine learning and interdisciplinary science. She is a member of the DOE Advanced Scientific Computing Advisory Committee (ASCAC) and the Vice Chair of the State of Delaware AI commission. She has held various leadership positions in HPC conferences and workshops over the past several years.
For a list of past and upcoming NHR PerfLab seminar events, see: https://hpc.fau.de/research/nhr-perfl...