SCS Lecture Series - Christopher Ré - DeepDive: A Dark Data System

Event Details
  • Date/Time:
    • Friday November 20, 2015 - Saturday November 21, 2015
      6:00 pm - 6:59 pm
  • Location: Klaus 2447

Francella M. Tonge


Summary Sentence: SCS Lecture Series - Christopher Ré - DeepDive: A Dark Data System


“DeepDive: A Dark Data System” – Christopher Ré, Assistant Professor – School of Computer Science, Stanford University


Friday, November 20, 2015 @ 2PM

Klaus 2447


A “TGIF” will follow at 3 p.m. on the 2nd floor of the Klaus commons area overlooking the Atrium. Light snacks and refreshments will be served.





Many pressing questions in science are macroscopic: they require scientists to integrate information from numerous data sources, often expressed in natural language or in graphics. These forms of media are fraught with imprecision and ambiguity, and so are difficult for machines to understand. Here I describe DeepDive, a new type of system designed to cope with these problems. It combines extraction, integration, and prediction into one system. For some paleobiology and materials science tasks, DeepDive-based systems have surpassed human volunteers in data quantity and quality (recall and precision). DeepDive is also used by scientists in areas including genomics and drug repurposing, by a number of companies involved in various forms of search, and by law enforcement in the fight against human trafficking. DeepDive does not allow users to write algorithms; instead, it asks them to write only features. A key technical challenge is scaling up the resulting inference and learning engine, and I will describe our line of work, including Hogwild! and DimmWitted, on computing without traditional synchronization methods.


DeepDive is open source on GitHub and available from DeepDive.Stanford.Edu.





Christopher (Chris) Ré is an assistant professor in the Department of Computer Science at Stanford University and a Robert N. Noyce Family Faculty Scholar. The goal of his work is to enable users and developers to build applications that more deeply understand and exploit data. Chris received his PhD from the University of Washington in Seattle under the supervision of Dan Suciu. For his PhD work in probabilistic data management, Chris received the SIGMOD 2010 Jim Gray Dissertation Award. He then spent four wonderful years on the faculty of the University of Wisconsin, Madison, before moving to Stanford in 2013. He helped discover the first join algorithm with worst-case optimal running time, which won the best paper award at PODS 2012. He also helped develop a framework for feature engineering that won the best paper award at SIGMOD 2014. In addition, work from his group has been incorporated into scientific efforts including the IceCube neutrino detector and PaleoDeepDive, and into Cloudera's Impala and products from Oracle, Pivotal, and Microsoft's Adam. He received an NSF CAREER Award in 2011, an Alfred P. Sloan Fellowship in 2013, a Moore Data Driven Investigator Award in 2014, the VLDB Early Career Award in 2015, and the MacArthur Foundation Fellowship in 2015.

Additional Information

In Campus Calendar

College of Computing, School of Computer Science

Invited Audience
Undergraduate students, Faculty/Staff, Public, Graduate students
Christopher Ré, College of Computing, Georgia Tech, School of Computer Science, SCS
  • Created By: Birney Robert
  • Workflow Status: Draft
  • Created On: Nov 16, 2015 - 10:12am
  • Last Updated: Apr 13, 2017 - 5:17pm