PhD Proposal by Wonhee Cho

Title: Tiered Transactional Distributed Storage System

 

Wonhee Cho

School of Computer Science

College of Computing

Georgia Institute of Technology

 

Date: Thursday, April 13th, 2017

Time: 1:00pm-3:00pm (EDT)

Location: Klaus 1212

 

Committee:

-----------------------

Dr. Umakishore Ramachandran (Advisor, School of Computer Science, Georgia Tech)

Dr. Sudipta Sengupta (Co-Advisor, Microsoft Research)

Dr. Moin Qureshi (School of Electrical and Computer Engineering, Georgia Tech)

Dr. Santosh Pande (School of Computer Science, Georgia Tech)

 

Abstract:

-----------------------

Today's prevailing memory-based distributed storage systems achieve extremely low latency and make applications such as interactive graph-based applications viable. However, as industries digitize and traditional data sources keep growing, the data center industry faces a critical challenge: handling data volumes that exceed memory capacity. In addition, frequent software releases and updates by large groups of developers require that data-handling APIs be simple and consistent. We believe that conventional wisdom in storage systems offers a solution to these urgent challenges: exploiting a tiered architecture.

 

The research questions for this tiered structure are:

1) how can we architect a tiered distributed storage system that provides simple APIs, consistent semantics, and performance comparable to that of a pure memory design when most data fits in memory, and

2) how can we design the flash storage tier to perform transactions and recovery efficiently, so that performance is comparable to a pure memory design under optimal conditions and degrades gracefully as read I/O becomes involved, while also guaranteeing that data can be recovered and replicated quickly after a failure?

In this research, we use the FaRM system as a base memory-based distributed storage system.

 

To address these challenges, we propose a tiered transactional distributed storage system and explore a set of techniques in a top-down manner. We first revise the allocation mechanism and its address semantics to provide simple and consistent APIs; we then implement efficient flash-based storage subsystems that support a fast commit protocol and reduce I/O overhead. Finally, we address the issues in recovering flash-based storage and propose efficient recovery mechanisms. With a careful design of the flash-based storage and its architecture, embedded in a memory-based distributed storage system, we can provide effective solutions to the data center industry’s requirements.

Status

  • Workflow Status: Published
  • Created By: Tatianna Richardson
  • Created: 04/10/2017
  • Modified By: Tatianna Richardson
  • Modified: 04/10/2017
