PhD Defense by Lucas "Luke" Erlandson

Title: The Use of 3D Matrix Multiplication in Chebyshev-Filtered Subspace Iteration

Date: Wednesday, May 11th

Time: 10:00 AM - 12:00 PM Eastern

Location (Virtual):

Location (Physical): Coda C1315 "Grant Park" (Coda access is required, contact Luke if you need guest access)


Lucas "Luke" Erlandson

School of Computational Science and Engineering

College of Computing

Georgia Institute of Technology



Dr. Edmond Chow (Advisor, School of Computational Science and Engineering, Georgia Institute of Technology)

Dr. Felix Herrmann (School of Earth and Atmospheric Sciences, Georgia Institute of Technology)

Dr. Tobin Isaac (School of Computational Science and Engineering, Georgia Institute of Technology)

Dr. Ruipeng Li (Center for Applied Scientific Computing, Lawrence Livermore National Lab)

Dr. Yuanzhe Xi (Department of Mathematics, Emory University)



Electronic structure calculations can be used to accurately calculate the motion and properties of electrons. These calculations require computing (approximate) solutions to the Schrödinger equation H Psi = E Psi, where H is the Hamiltonian operator, Psi is a wave function, and E is the energy. Many domains rely on electronic structure calculations, including chemistry, materials science, and physics. However, such calculations are prohibitively expensive unless approximations are made. One commonly used method is Kohn-Sham Density Functional Theory, owing to its high accuracy-to-cost ratio. However, at the core of Kohn-Sham Density Functional Theory is the solution of an eigenvalue problem, which becomes intractable for large problems. Chebyshev-filtered subspace iteration (ChebFSI) is a method that reduces the cost associated with the eigensolve by instead refining a subspace via Chebyshev polynomials.
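To make the idea of refining a subspace with Chebyshev polynomials concrete, here is a minimal NumPy sketch on a small dense symmetric matrix. It is an illustration under stated assumptions, not the implementation developed in the dissertation: the function names `chebyshev_filter` and `chebfsi`, the fixed polynomial degree and iteration count, and the use of the largest Ritz value as the lower edge of the damped interval are all choices made for this example.

```python
import numpy as np


def chebyshev_filter(H, X, degree, a, b):
    """Apply the degree-`degree` Chebyshev polynomial of H to the block X.

    The interval [a, b] is mapped to [-1, 1], where the polynomial stays
    bounded by 1; components with eigenvalues below a are amplified.
    """
    e = (b - a) / 2.0  # half-width of the damped interval
    c = (b + a) / 2.0  # centre of the damped interval
    Y_prev = X
    Y = (H @ X - c * X) / e
    for _ in range(2, degree + 1):
        # Three-term recurrence T_{j+1}(t) = 2 t T_j(t) - T_{j-1}(t)
        Y_next = 2.0 * (H @ Y - c * Y) / e - Y_prev
        Y_prev, Y = Y, Y_next
    return Y


def chebfsi(H, k, upper_bound, degree=10, iterations=30, seed=0):
    """Approximate the k lowest eigenpairs of the symmetric matrix H.

    `upper_bound` must be an upper bound on the spectrum of H (assumed to be
    supplied by the caller in this sketch).
    """
    rng = np.random.default_rng(seed)
    n = H.shape[0]
    # Random orthonormal starting block of k columns
    X, _ = np.linalg.qr(rng.standard_normal((n, k)))
    ritz_vals = np.linalg.eigvalsh(X.T @ H @ X)
    for _ in range(iterations):
        a = ritz_vals[-1]  # damp everything above the largest Ritz value
        X = chebyshev_filter(H, X, degree, a, upper_bound)
        X, _ = np.linalg.qr(X)  # re-orthonormalize the filtered block
        # Rayleigh-Ritz step: only a small k-by-k eigenproblem is solved
        ritz_vals, V = np.linalg.eigh(X.T @ H @ X)
        X = X @ V
    return ritz_vals, X


# Toy problem (assumed for illustration): eigenvalues 0, 1, ..., 49
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((50, 50)))
H = Q @ np.diag(np.arange(50.0)) @ Q.T
H = (H + H.T) / 2.0  # remove rounding asymmetry
vals, vecs = chebfsi(H, k=6, upper_bound=np.linalg.norm(H, 2))
print(vals[:3])
```

In this sketch the only eigensolve is on a k-by-k projected matrix, while the bulk of the work is in products of H and of the n-by-k block X. The spectral norm is used for the upper bound purely for convenience on a toy matrix; it would be far too expensive at scale, where a cheap estimate would be needed instead.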


In this dissertation, an investigation into the computationally expensive kernels is conducted. These kernels include the matrix-matrix products, the Hamiltonian, and the eigensolve, and the investigation culminates in the high-performance parallel computation engine (libPCE), with particular focus on a distributed GPU implementation. Many of the matrices encountered within ChebFSI are highly non-square, and as such traditional distributed matrix-matrix products tend to perform inefficiently. Thus, we investigate the use of state-of-the-art matrix-matrix products, which aim to achieve higher efficiency in such cases. Furthermore, we investigate what is required to provide a high-performance distributed GPU code for the Hamiltonian and eigensolve. These routines are packaged as a replacement for the computation routines currently used in DFT codes, including the SPARC package (Simulation Package for Ab-initio Real-space Calculations).


The contributions of this dissertation are as follows: first, we investigate and provide justification for which of the available eigensolvers are useful for different cases, depending on the problems faced and the hardware available. Second, we investigate the use of 3D matrix-matrix products compared to traditional distributed matrix-matrix products, on both CPU and GPU, for the problems faced. Third, we combine these investigations with a high-performance Hamiltonian implementation to provide a distributed GPU package. Finally, we demonstrate the efficacy of these developments through numerical experiments.



