So thanks, Gaurav, for inviting me to this seminar.
I see that you folks have been running this for a while.
So it's a pleasure to give a talk here.
All right.
So this is work with a lot of collaborators.
I will show you who they are in a minute.
But I am from the University of Delaware.
If you don't know where Delaware is, it's about a 90-minute train ride from Manhattan, New York,
if that helps put things into perspective.
So it's the East Coast in the US.
So this work is basically several years of work with many scientists in the making,
where we have been trying different pieces of using HPC, AI, and machine learning
throughout the process.
Lots of tangible outcomes have come out of this work in the past several years.
So, to quickly give you some background on my group.
So this is my group in Delaware.
It's a group of PhD students, master's students, and a lot of undergrad students.
We do both foundational work and interdisciplinary science work,
where we build compiler implementations and open-source compiler toolchains,
LLVM, for example.
We explore directive-based programming models, OpenMP and OpenACC,
for real-world applications, and migrate them to large supercomputers.
As you know, that is easier said than done, right?
So it exposes a lot of challenges both with the hardware and the software framework.
So we work our way through those.
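To make "directive-based" concrete, here is a minimal sketch (an editor's illustration, not code from the talk): the same vector-add loop offloaded to an accelerator once with an OpenMP target directive and once with an OpenACC directive; the compiler generates the device code from the annotations.

```c
/* Minimal sketch (editor's illustration, not from the talk): a vector-add
 * loop offloaded to an accelerator via directives. */

/* OpenMP: map the arrays to the device and parallelize the loop there. */
void vadd_omp(int n, const float *a, const float *b, float *c) {
    #pragma omp target teams distribute parallel for \
            map(to: a[0:n], b[0:n]) map(from: c[0:n])
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* OpenACC: the same idea, with a different spelling of the directive. */
void vadd_acc(int n, const float *restrict a, const float *restrict b,
              float *restrict c) {
    #pragma acc parallel loop copyin(a[0:n], b[0:n]) copyout(c[0:n])
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}
```

The same source can compile for CPU or GPU depending on compiler flags, which is part of why validating those compiler implementations matters.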
In the last four or five years, we have also been engaging with local hospitals,
working with their data sets and building predictive models for cancer research.
So there is an ML and AI component over there.
And very recently, we started to use large language models to explore
how we could generate tests to validate compiler implementations for OpenMP
as well as OpenACC, which are, as you may know,
directive-based programming models for C, C++, and Fortran.
So that's the group.
And many, many thanks to all the funders who have enabled research
in the last several years.
So this project, while it's primarily focused on PIConGPU,
particle-in-cell on GPU, the long and the short of it is,
there is a grander challenge, right?
As you know, there are a lot of applications which produce a ton of data,
volumes and volumes of data.
And sometimes it isn't feasible to visualize all the data at once
in one big dashboard and try to extract sensible information
out of it, right?
So you could think of data reduction, but data reduction also needs to be
done cleverly so you're not losing meaningful information.
So we often struggle to pin down the scientific questions
when we're dealing with data sets beyond what our machines can handle, right?
So, that being the question, we are wondering how to do this,
how to build a framework, how to build a methodology
where we can look into large data sets.
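As one illustration of what "clever" reduction can mean (an editor's sketch, not the speaker's method): instead of retaining every sample, a streaming pass can keep compact summaries, here a running mean and variance via Welford's algorithm, so the full data set never has to sit in memory at once.

```c
/* Illustrative sketch: reduce a data stream to running statistics
 * (Welford's algorithm) instead of storing every sample. */
#include <stdio.h>

typedef struct { long n; double mean, m2; } Stats;

/* Fold one new sample into the running summary. */
void stats_update(Stats *s, double x) {
    s->n += 1;
    double delta = x - s->mean;
    s->mean += delta / s->n;            /* running mean */
    s->m2   += delta * (x - s->mean);   /* sum of squared deviations */
}

double stats_variance(const Stats *s) {
    return s->n > 1 ? s->m2 / (s->n - 1) : 0.0;  /* sample variance */
}

int main(void) {
    Stats s = {0, 0.0, 0.0};
    /* Stand-in for a large simulation output stream. */
    for (int i = 0; i < 1000000; i++)
        stats_update(&s, (double)(i % 100));
    printf("n=%ld mean=%f var=%f\n", s.n, s.mean, stats_variance(&s));
    return 0;
}
```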