Bachelor's thesis presentation. Rihab is advised by Ravil Dorozhinskii.
Rihab Torjmen: Implementation of Sparse-GEMM-Kernel for ADER-DG in SeisSol on GPUs
SCCS Colloquium
Although the multiplication of a sparse matrix with a dense matrix is a central tool in
computing and machine learning and has been studied extensively, the most efficient way to
perform it has yet to be established, since the operation poses many challenges. The major
goal of this research is to perform dense x sparse matrix multiplications on the GPU,
exploiting its ability to compute them faster and more practically. The solutions are
developed for SeisSol, a scientific software package for the numerical simulation of seismic
wave phenomena and earthquake dynamics. It is based on the Arbitrary high-order DERivatives
Discontinuous Galerkin (ADER-DG) method, which is used to solve coupled elastodynamic wave
propagation and dynamic rupture problems.[2] SeisSol essentially helps us predict
earthquakes, an innovative field of study because earthquakes are so difficult to predict.
Such predictions can help save lives: warnings of potentially harmful earthquakes could be
issued early enough to allow a proper response to the disaster, minimizing loss of life and
destruction.[3]
ADER-DG methods are element-based, and their expressions are built on small matrices that
fall into two types: dense and sparse. If done correctly, exploiting the sparsity of a matrix
makes the multiplication much faster than the conventional dense approach, especially in our
project, since the sparsity pattern is given in advance and, in most cases, even the values
are provided to the user (either at compile time or at run time).
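To illustrate why a known sparsity pattern saves work, the following is a minimal sketch of a dense x sparse multiplication in plain Python. It assumes the sparse matrix is stored in compressed sparse column (CSC) form; the storage format and the helper names `dense_to_csc` and `dense_times_csc` are illustrative assumptions and are not taken from SeisSol.

```python
def dense_to_csc(matrix):
    """Convert a row-major list-of-lists matrix to CSC arrays (hypothetical helper)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    values, row_idx, col_ptr = [], [], [0]
    for j in range(n_cols):
        for i in range(n_rows):
            if matrix[i][j] != 0.0:
                values.append(matrix[i][j])   # nonzero value
                row_idx.append(i)             # its row index
        col_ptr.append(len(values))           # end of column j in `values`
    return values, row_idx, col_ptr


def dense_times_csc(a, values, row_idx, col_ptr):
    """Compute C = A * B for dense A (m x k) and CSC-stored sparse B (k x n)."""
    m = len(a)
    n = len(col_ptr) - 1
    c = [[0.0] * n for _ in range(m)]
    for j in range(n):
        # Only the stored nonzeros of column j are visited; zeros cost nothing.
        for p in range(col_ptr[j], col_ptr[j + 1]):
            k = row_idx[p]
            b_kj = values[p]
            for i in range(m):
                c[i][j] += a[i][k] * b_kj
    return c


A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
B = [[0.0, 2.0],
     [0.0, 0.0],
     [1.0, 0.0]]
values, row_idx, col_ptr = dense_to_csc(B)
C = dense_times_csc(A, values, row_idx, col_ptr)
print(C)
```

In this toy case B has two nonzeros out of six entries, so the inner loop runs four multiply-add operations instead of the twelve a dense triple loop would need.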
The issue is that performing this multiplication on the CPU, or as a conventional dense
multiplication that ignores the zeros, is inefficient and wastes a lot of memory. In our
scenario, we therefore use the GPU and a different type of multiplication to achieve better
performance. Previous projects have implemented the multiplication of two dense matrices and
shown that using the GPU instead of the CPU is less time-consuming and more efficient.
The SeisSol numerical scheme essentially consists of small sparse and dense matrix
multiplications, where small means that the matrices have fewer than 100 entries per
dimension.[19] Additionally, all sparsity patterns are known a priori. Prior to executing
SeisSol, we define the architecture we want to optimize for and the convergence order, which
determines the matrix sizes. Then, for each matrix multiplication, our code generator
produces a specialized matrix multiplication routine.
Each of these routines may be a sparse x dense, dense x dense, or sparse x sparse multiplication.
When the matrix multiplication is sparse, we hardcode the sparsity pattern into the generated
code and arrange the operations in such a way that the compiler vectorises them
automatically. This research study adds the dense x sparse multiplication to extend the
concept and come closer to the eventual aim of being able to anticipate any sort of
earthquake.
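The idea of hardcoding a sparsity pattern into generated code can be sketched with a toy generator. This is only an illustration under stated assumptions: the function `generate_dense_x_sparse` and the kernel name are hypothetical, and the sketch emits Python source, whereas a real generator would emit architecture-specific code for the target CPU or GPU.

```python
def generate_dense_x_sparse(name, m, pattern, n):
    """Emit Python source for C = A * B with B's sparsity pattern hardcoded.

    pattern: list of (k, j) positions of B's nonzeros, known at generation time.
    The nonzero values of B are passed at run time, in the order of `pattern`.
    """
    lines = [f"def {name}(a, b_values):",
             f"    c = [[0.0] * {n} for _ in range({m})]"]
    for p, (k, j) in enumerate(pattern):
        for i in range(m):
            # One fully unrolled multiply-add per (row of A, nonzero of B).
            lines.append(f"    c[{i}][{j}] += a[{i}][{k}] * b_values[{p}]")
    lines.append("    return c")
    return "\n".join(lines) + "\n"


# Hypothetical 3 x 2 sparse matrix with nonzeros at (2, 0) and (0, 1).
pattern = [(2, 0), (0, 1)]
source = generate_dense_x_sparse("kernel_2x3_3x2", m=2, pattern=pattern, n=2)

namespace = {}
exec(source, namespace)            # compile the generated routine
kernel = namespace["kernel_2x3_3x2"]
result = kernel([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], [1.0, 2.0])
print(source)
print(result)
```

The generated routine contains no index arrays and no branches on zero entries: every multiply-add that remains is one that contributes to the result. Straight-line code of this kind is what gives a compiler the opportunity to vectorise.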