Master's Thesis presentation. Keerthi is advised by Dr. Felix Dietrich and Severin Reiz.
Keerthi Gaddameedi: Efficient and Scalable Kernel Matrix Approximation using Hierarchical Decomposition.
SCCS Colloquium
Dimensionality reduction approaches help extract only the necessary information from large datasets. Manifold learning is one such non-linear dimensionality reduction approach. It is an unsupervised method that learns the intrinsic geometric structure of non-linear data, called a manifold, which has a much lower dimension than the ambient space. datafold is a Python package that provides data-driven models for finding a parametrization of manifolds in non-linear, high-dimensional data and for identifying non-linear dynamical systems on these manifolds. Kernel matrices appear frequently in dimensionality reduction algorithms. The proximity between points is computed in the form of a kernel matrix, and the lower-dimensional intrinsic geometry is extracted by performing an eigendecomposition of the kernel matrix and discarding all eigenvalues below a threshold. Matrix-vector multiplications are among the most expensive operations in these decomposition algorithms, and hierarchical algorithms are used to mitigate this cost. GOFMM is a library that provides hierarchical algorithms for matrix approximation and evaluation. For the eigendecomposition, the implicitly restarted Arnoldi iteration from the SciPy package is used, and every iteration includes matrix-vector multiplications. Instead of computing these products with the full matrix, we use GOFMM's hierarchical algorithms to first "compress" the matrix, discarding irrelevant information, and then perform the mat-vec on the compressed representation. This integration of both libraries is then tested for accuracy and scalability on an HPC cluster.
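The idea of plugging a custom mat-vec into an iterative eigensolver can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the Gaussian kernel, the sample data, the helper names (`gaussian_kernel`, `top_eigenpairs`, `dense_matvec`) and the relative threshold are all hypothetical. Neither GOFMM nor datafold is called here; a plain dense product stands in for the compressed mat-vec. Because the kernel matrix is symmetric, the sketch uses SciPy's `eigsh`, the symmetric ARPACK solver (implicitly restarted Lanczos), rather than the general Arnoldi routine `eigs`.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh


def gaussian_kernel(points, epsilon):
    """Dense Gaussian kernel matrix (an assumed kernel choice for this sketch)."""
    sq_norms = np.sum(points**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * points @ points.T
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negative round-off
    return np.exp(-sq_dists / (2.0 * epsilon**2))


def top_eigenpairs(matvec, n, k):
    """Largest-magnitude k eigenpairs of a symmetric operator given only its mat-vec."""
    operator = LinearOperator((n, n), matvec=matvec, dtype=np.float64)
    values, vectors = eigsh(operator, k=k, which="LM")
    order = np.argsort(values)[::-1]  # sort in descending order
    return values[order], vectors[:, order]


# Hypothetical data: a circle (1D manifold) embedded in 3D ambient space.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, 300)
points = np.column_stack([np.cos(angles), np.sin(angles), np.zeros_like(angles)])

K = gaussian_kernel(points, epsilon=0.5)


def dense_matvec(v):
    # Placeholder: a hierarchical library would evaluate this product
    # approximately on a compressed representation of K instead.
    return K @ v


eigenvalues, eigenvectors = top_eigenpairs(dense_matvec, n=K.shape[0], k=5)

# Discard eigenvalues below a (hypothetical) relative threshold.
kept = eigenvalues[eigenvalues > 1e-3 * eigenvalues[0]]
```

The solver only ever sees the `matvec` callable, so swapping the dense product for an approximate, compressed one would not require changes to the eigensolver call itself. The accuracy of the resulting eigenpairs then depends on the approximation error of the replacement mat-vec.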