Adaptive Parallel and Distributed Software Systems

The following content has been provided by the University of Erlangen-Nürnberg.

Thank you very much for the nice introduction, and thank you very much for inviting me here.

So initially I thought I would just give an overview of the research group that I'm heading at TU Dresden.

But if some of you are interested in a technical talk that covers a project in detail, then I'm flexible to actually cover one of the projects in detail.

So one option is that I just give you an overview, so that you get an idea of what kind of projects we are doing in Dresden,

or the other option is that I cover one of the projects in detail.

So if you have any preference, I can adapt accordingly.

So let me know if you prefer one or the other, I guess.

If not, then I'll just go with the overview, and in the later half, if you have time or interest, I can go over one of the projects in detail.

Okay. So as Wolfgang said, I'm at the University of Dresden.

And just to give a very brief background of my research so far: I have been at TU Dresden since May 2015, that is, since May last year.

And there I'm responsible for heading a group in a Center of Excellence, working on resilient software systems.

So we are trying to build new processor architectures.

And my role there is to build software systems that are resilient in this context, for these new processors.

Before taking a position at TU Dresden, I did my PhD at the Max Planck Institute for Software Systems.

And there I spent five and a half years in the software systems group headed by Professor Peter Druschel.

During my PhD, I also did an internship at MSR Cambridge in the Systems and Networking group.

And there I worked on in-memory data analytics, basically large-scale distributed graph processing in memory.

And before starting my PhD at MPI, I was a technical staff member at IBM Research in India, as well as at IBM Research Watson, where I was part of the high-performance computing group, optimizing parallel algorithms on IBM architectures.

So in general, my research interests are in systems, and I mainly work on systems for parallel and distributed computing.

But as you will see, I do projects in big data, storage systems, and operating systems as well.

So it's sort of a mixed bag of everything, but all targeting designing, implementing, and evaluating computer systems.

So those are my main research interests.

So in this talk, what I wanted to do is to give a brief background of what I did as a PhD student in Saarbrücken for five and a half years,

and then what kind of projects that we are doing in Dresden since last May.

So the first half of the talk will be about the work that I did as a PhD student.

So my PhD was about building systems for incremental computation; in particular, when I say systems, I mean systems for parallel and distributed computing.

So this work was done, of course, with my advisors as well as my fellow collaborators at Max Planck Institute for Software Systems.

So the high-level motivation behind my PhD was that we often see a workflow where we have to rerun our application over slowly evolving input data sets.

So as the input changes, we have to recompute our output based on those input changes.

And this workflow is quite prevalent across different domains, be it scientific computing, big data analytics, or reactive systems.

All of them have this common theme of recomputing the output as the input changes over time.

For instance, if you think about big data analytics, companies like Google or Facebook, they repeatedly recompute their indexes as the new web crawl comes in.

So you basically are incrementally updating your output as the new input comes in.

So at a very high level, my thesis was about how do we enable efficient execution of applications with slowly evolving data sets in successive runs.

So can we do it more efficiently in terms of performance or energy as the input changes?

And the high-level idea, or the observation on which we built our work, was that often, when the input changes by a small amount, we can actually update the output by doing a small amount of work. So this way, we can actually gain efficiency.
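To make this observation concrete, here is a minimal, hypothetical sketch in Python. It illustrates the general idea only, not any of the actual systems from the thesis: a word-count job over a small corpus, where a change to one document is folded into the existing output instead of recomputing everything from scratch.

```python
from collections import Counter

def wordcount_from_scratch(docs):
    """Naive approach: recompute the word count over the entire corpus."""
    counts = Counter()
    for doc in docs.values():
        counts.update(doc.split())
    return counts

def wordcount_incremental(counts, old_doc, new_doc):
    """Incremental approach: subtract the old version of the changed document
    and add the new one; work is proportional to the change, not the corpus."""
    counts.subtract(old_doc.split())
    counts.update(new_doc.split())
    return counts

# Hypothetical corpus: only document "d2" changes between runs.
docs = {"d1": "the quick brown fox", "d2": "jumps over the lazy dog"}
counts = wordcount_from_scratch(docs)              # first run: full computation
old, new = docs["d2"], "jumps over the sleepy dog"
docs["d2"] = new
counts = wordcount_incremental(counts, old, new)   # later run: small update
assert +counts == wordcount_from_scratch(docs)     # same output, far less work
```

The from-scratch run touches the whole corpus on every change, while the incremental run touches only the changed document; that gap is exactly the efficiency argument above.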

So before I tell you how we did that, let me tell you what the state of the art was for dealing with this kind of workflow, the incremental workflow.

So broadly speaking, the state of the art can be classified into two main categories.

The first approach is quite naive, which is that as the input changes, we simply recompute everything from scratch.

Of course, this is highly inefficient because you are redoing the entire work from scratch.

But on the other hand, it is quite useful because it is quite easy to design.

Here the programmer, or the application programmer, doesn't have to redesign their application for this workflow.

You can actually use your existing applications as they are, without any changes.

On the other hand, the user or the programmer can actually design an application specific algorithm to incrementally update the output.

So here the idea is that you can change your application to incorporate a logic or an algorithm to incrementally update the output.

And these algorithms are actually well studied in the algorithms community, under the umbrella of dynamic algorithms.

The literature shows that these algorithms can be asymptotically faster than recomputation from scratch.
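To illustrate why such dynamic algorithms can be asymptotically faster, here is a small, hedged Python example that is not from the talk: incremental graph connectivity under edge insertions. Recomputing connectivity from scratch after every inserted edge costs O(V + E) per update (for example, via BFS), whereas a union-find structure handles each insertion and query in near-constant amortized time.

```python
class UnionFind:
    """Incremental (insertion-only) connectivity, a classic dynamic algorithm:
    each edge insertion and connectivity query takes near-constant amortized
    time, versus O(V + E) per update when recomputing with BFS from scratch."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        # Path compression keeps trees shallow.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        # Process one newly inserted edge (a, b), using union by size.
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]

    def connected(self, a, b):
        return self.find(a) == self.find(b)

# Hypothetical evolving graph on 5 vertices: edges arrive one at a time.
uf = UnionFind(5)
for a, b in [(0, 1), (1, 2), (3, 4)]:
    uf.union(a, b)                 # small amount of work per input change
print(uf.connected(0, 2))          # True: 0-1-2 form one component
print(uf.connected(0, 4))          # False: 3-4 is a separate component
```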

Presenter

Dr. Pramod Bhatotia

Accessible via

Open access

Duration

00:55:43 min

Recording date

2016-08-08

Uploaded on

2016-08-12 11:06:52

Language

de-DE

Parallel and distributed systems are a pervasive component of the modern computing environment. Today, large-scale data-centers or supercomputing facilities have become ubiquitous, consisting of heterogeneous geo-distributed clusters with hundreds of thousands of general-purpose multicores, energy-efficient cores, and specialized accelerators such as GPUs, FPGAs, etc. Such computing infrastructure powers not only some of the most popular consumer applications--Internet services such as web search and social networks--but also a growing number of scientific, big data, and enterprise workloads. Due to the growing importance of these diverse applications, my research focuses on building software systems for this new computing infrastructure.
In this talk, I present an overview of my research group "Parallel and Distributed Systems" at TU Dresden. The mission of my group is to build adaptive software systems targeting parallel and distributed computing. For adaptiveness, we follow three core design principles: (1) _Resiliency_ against fail-stop and Byzantine faults for ensuring the safety and security of applications; (2) _Efficiency_ of applications by enabling a systematic trade-off between application performance (latency/throughput) and resource utilization/energy consumption; and (3) _Scalability_ to seamlessly support ever-growing application workloads with an increasing number of cores, while at the same time embracing the heterogeneity of the underlying computing platform.
As I show in my talk, we follow these three design principles at all levels of the software stack, covering operating systems, storage/file-systems, parallelizing compilers and run-time libraries, all the way up to distributed middleware. Our approach transparently supports existing applications -- we neither require a radical departure from the current models of programming nor complex, error-prone application-specific modifications.
