Ph.D. Proposal Oral Exam - Si Li
Title: Managing Transient Reliability and Performance in GPU Applications
Committee:
Dr. Yalamanchili, Advisor
Dr. Wills, Chair
Dr. Kim
Abstract: The objective of the proposed research is to develop a framework for low-cost, software-based error detection in GPU applications that adapts to dynamic changes in kernel resilience characteristics and in environmental reliability factors. The proposed research consists of three parts: an adaptive software reliability enhancement (SRE) framework; a dynamic reliability management (DRM) scheme that leverages the SRE framework to control trade-offs between performance and reliability; and an SRE technique tailored to the unique properties of GPU execution. By accounting for variation in reliability requirements, applications can reach the same level of resilience with lower overhead than with any single technique.
Status
- Workflow Status: Published
- Created By: Daniela Staiculescu
- Created: 04/26/2016
- Modified By: Fletcher Moore
- Modified: 10/07/2016