SCS Seminar: Spyros Blanas

TITLE: Scaling Database Systems to High-performance Computers

ABSTRACT:

Processing massive datasets quickly requires warehouse-scale computers. Furthermore, many massive datasets are stored in formats like HDF5 and NetCDF that cannot be directly queried using SQL. In this talk, we will present ArrayBridge, a common interoperability layer for array file formats. ArrayBridge allows scientists to use SciDB, TensorFlow, and HDF5-based code in the same file-centric analysis pipeline without converting between file formats. Under the hood, ArrayBridge manages I/O to leverage the massive concurrency of warehouse-scale parallel file systems while keeping backwards compatibility with applications that use the unmodified HDF5 API. Once the data has been loaded in memory, the bottleneck in many array-centric queries becomes the speed of data repartitioning between different nodes. We will present an RDMA-aware data shuffling operator that directly converses with the network adapter in InfiniBand verbs and can repartition data up to 4X faster than MPI. We conclude by highlighting additional research challenges that need to be overcome to scale database systems to massive computers.


BIO:

Spyros Blanas is an assistant professor in the Department of Computer Science and Engineering at the Ohio State University. His research interest is high-performance database systems, and his current goal is to build a database system for high-end computing facilities. He has received the IEEE TCDE Rising Star Award and a Google Research Faculty award. He completed his Ph.D. at the University of Wisconsin–Madison, and part of his Ph.D. dissertation was commercialized in Microsoft’s flagship data management product, SQL Server, as the Hekaton in-memory transaction processing engine.

 

Status

  • Workflow Status: Published
  • Created By: Tess Malone
  • Created: 09/11/2018
  • Modified By: Tess Malone
  • Modified: 09/11/2018
