SCS Recruitment Seminar: Faisal Nawab, "Efficient Coordination for Global-Scale Data Management"

Abstract:

Replicating data across data centers (geo-replication) provides higher levels of fault tolerance and data availability. The Wide-Area Network (WAN) latency separating data centers is orders of magnitude larger than traditional network latency within a data center. This makes it expensive to preserve the consistency of data copies. However, consistency and high-level access abstractions, like database transactions, are favored by developers because they hide the complexity of the underlying replica and concurrency control. This has led to the adoption of consistent transactions in large-scale, geo-replicated systems.


In this talk, I will present the fundamental challenges in designing geo-replicated data management systems. Specifically, transaction latency is high due to the need to coordinate between data centers spread across the world. Traditionally, coordination is performed by polling other data centers for permission to execute, which makes at least one Round-Trip Time (RTT) of latency inevitable. In geo-replication, this cost is high and thus leads to the question: Is it possible to avoid the polling paradigm of coordination? Message Futures is a protocol that demonstrates a new paradigm of continuous, proactive coordination.

In this paradigm, transactions can coordinate in sub-RTT latency. Breaking the RTT latency barrier motivates the next part of the talk, in which I derive a lower bound for coordination latency. The proposed lower-bound model inspires the design of a coordination protocol called Helios, which aims to achieve the lower-bound latency. The talk will also discuss many of the practical aspects of building large, scalable data management and communication platforms for geo-replicated systems. I conclude the talk with future opportunities for global-scale data management in the context of edge computing, the Internet of Things, and data science.

Bio:

Faisal Nawab is a Ph.D. candidate at the University of California, Santa Barbara. His dissertation research lies at the intersection of Big Data management and distributed cloud computing systems.

Specifically, he is interested in the challenges that arise in geographically distributed data management systems. Faisal has worked with HP Labs and Microsoft Research on data management systems over emerging memory technologies such as non-volatile memory.

His research has been published in leading database conferences, such as VLDB, SIGMOD, and ICDE.

Status

  • Workflow Status: Published
  • Created By: Devin Young
  • Created: 01/19/2017
  • Modified By: Fletcher Moore
  • Modified: 04/13/2017
