56 - NHR PerfLab Seminar 2023-09-05: DGEMM on Integer Tensor Cores (ClipID: 49059)

Recording date: 2023-09-05

Via: Free

Language: English

Organisational Unit: Zentrum für Nationales Hochleistungsrechnen Erlangen (NHR@FAU)

Producer: Zentrum für Nationales Hochleistungsrechnen Erlangen (NHR@FAU)

Speaker: Hiroyuki Ootomo, Tokyo Institute of Technology

Title: DGEMM on Integer Tensor Cores

Date and time: Tuesday, September 5, 2 p.m. – 3 p.m.

Abstract:

To meet the growing demand for dense matrix-matrix multiplication from the deep learning community, numerous vendors are developing processors with specialized matrix multiplication units, such as NVIDIA Tensor Cores and Google TPUs. This hardware is designed to perform matrix multiplication efficiently at low precision, exploiting two facts: deep learning tolerates low-precision operations, and its computation relies heavily on matrix multiplication. For machine learning inference, fixed-point computation is commonplace: the input values, output values, and model parameters are all quantized. Thus, many processors are now equipped with fast integer matrix multiplication units. This talk introduces a double-precision-equivalent matrix multiplication that uses Int8 Tensor Cores and the Ozaki scheme, a method for computing high-precision matrix products on lower-precision computing units.
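To make the idea in the abstract concrete, here is a minimal sketch of slice-based matrix multiplication in the spirit of the Ozaki scheme. It is an illustration under stated assumptions, not the speaker's implementation: the names `split_vector` and `ozaki_matmul`, the choice of 6 bits per slice, and the use of 10 slices are all hypothetical. Plain Python integers stand in for the Int8 inputs and integer accumulators of the hardware, and the final accumulation is done exactly in one big integer, whereas a real implementation would accumulate the partial products in FP64.

```python
import math

SLICE_BITS = 6    # assumed bits per slice; every slice value fits in int8
NUM_SLICES = 10   # assumed slice count; 10 * 6 = 60 bits covers an FP64 mantissa


def split_vector(vec):
    """Split FP64 values that share one exponent into small integer slices."""
    max_abs = max((abs(v) for v in vec), default=0.0)
    _, exp = math.frexp(max_abs)                 # guarantees max_abs < 2**exp
    total_bits = SLICE_BITS * NUM_SLICES
    mask = (1 << SLICE_BITS) - 1
    slices = [[0] * len(vec) for _ in range(NUM_SLICES)]
    for idx, v in enumerate(vec):
        # Fixed-point value relative to the shared exponent (truncates toward zero).
        fixed = int(math.ldexp(v, total_bits - exp))
        sign = -1 if fixed < 0 else 1
        mag = abs(fixed)
        for p in range(NUM_SLICES):
            shift = SLICE_BITS * (NUM_SLICES - 1 - p)
            slices[p][idx] = sign * ((mag >> shift) & mask)
    return exp, slices


def ozaki_matmul(A, B):
    """Multiply A (m x k) by B (k x n) using only integer dot products."""
    k, n = len(B), len(B[0])
    total_bits = SLICE_BITS * NUM_SLICES
    rows = [split_vector(row) for row in A]                        # scale per row of A
    cols = [split_vector([B[l][j] for l in range(k)]) for j in range(n)]  # per column of B
    C = []
    for exp_a, a_slices in rows:
        out_row = []
        for exp_b, b_slices in cols:
            acc = 0
            for p in range(NUM_SLICES):
                for q in range(NUM_SLICES):
                    # Exact integer dot product: the part an integer
                    # matrix unit would compute.
                    dot = sum(x * y for x, y in zip(a_slices[p], b_slices[q]))
                    acc += dot << (SLICE_BITS * (2 * NUM_SLICES - 2 - p - q))
            # Undo the fixed-point scaling of both operands.
            out_row.append(math.ldexp(float(acc), exp_a + exp_b - 2 * total_bits))
        C.append(out_row)
    return C
```

Because every slice product is computed exactly in integer arithmetic, the only error in this sketch comes from truncating each input to 60 bits relative to the largest magnitude in its row or column, plus one final rounding to FP64. The sketch evaluates all slice pairs for simplicity.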

Short bio:

Hiroyuki Ootomo is a Ph.D. candidate at Tokyo Institute of Technology, studying under Dr. Rio Yokota. His research interests lie in high-performance computing, especially mixed-precision computing on specialized hardware, randomized numerical linear algebra, and quantum circuit simulation. His current work focuses on fast, high-accuracy GEMM on NVIDIA Tensor Cores and its applications.

For a list of past and upcoming NHR PerfLab seminar events, see: https://hpc.fau.de/research/nhr-perflab-seminar-series/
