14 - Programming Techniques for Supercomputers/ClipID:32668

Recording date 2021-05-11

Language

English

Organisational Unit

Friedrich-Alexander-Universität Erlangen-Nürnberg

Producer

Friedrich-Alexander-Universität Erlangen-Nürnberg

This lecture investigates the performance of the Schönauer vector triad benchmark across the full memory hierarchy of a single core of an Intel Haswell processor. By analysing the data transfers through the memory hierarchy, a performance model is established that qualitatively describes the performance levels for data sets residing in the different memory hierarchy levels. Further, dense matrix-vector multiplication is investigated to identify performance improvements from increasing the temporal reuse of vector data. As a first optimization strategy, outer-loop unroll-and-jam is identified and successfully tested.
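The two kernels named in the abstract can be sketched in C as follows. This is a minimal illustration and not the code used in the lecture: the function names, the column-major storage of the matrix, and the unroll factor of 2 are assumptions made for this example. The unroll-and-jam variant processes two columns per pass over the result vector, so each element of y is loaded and stored once per pair of columns instead of once per column.

```c
#include <stddef.h>

/* Schönauer vector triad: a[i] = b[i] + c[i] * d[i].
   Each iteration performs 3 loads, 1 store and 2 flops. */
void vector_triad(size_t n, double *restrict a, const double *restrict b,
                  const double *restrict c, const double *restrict d)
{
    for (size_t i = 0; i < n; ++i)
        a[i] = b[i] + c[i] * d[i];
}

/* Baseline dense matrix-vector multiplication, y += A * x.
   Assumption: A is stored column-major (rows x cols), so the
   inner loop over r accesses A with stride 1. */
void dmvm_naive(size_t rows, size_t cols, const double *restrict A,
                const double *restrict x, double *restrict y)
{
    for (size_t c = 0; c < cols; ++c)
        for (size_t r = 0; r < rows; ++r)
            y[r] += A[c * rows + r] * x[c];
}

/* Outer-loop unroll-and-jam with factor 2 (factor chosen for illustration).
   The outer loop over c is unrolled twice and the two inner loops are
   fused, so y[r] is loaded and stored once for every two columns. */
void dmvm_unroll2(size_t rows, size_t cols, const double *restrict A,
                  const double *restrict x, double *restrict y)
{
    size_t c = 0;
    for (; c + 1 < cols; c += 2)
        for (size_t r = 0; r < rows; ++r)
            y[r] += A[c * rows + r] * x[c]
                  + A[(c + 1) * rows + r] * x[c + 1];

    /* Remainder loop: handles the last column when cols is odd. */
    for (; c < cols; ++c)
        for (size_t r = 0; r < rows; ++r)
            y[r] += A[c * rows + r] * x[c];
}
```

Note that the jammed loop changes the order of the floating-point additions, so results may differ from the baseline in the last bits for general inputs.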
