{"619597":{"#nid":"619597","#data":{"type":"event","title":"PhD Defense by Steffen Maass","body":[{"value":"\u003Cp\u003ETitle: Systems Abstractions for Big Data Processing on a Single Machine\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003ESteffen Maass\u003C\/p\u003E\r\n\r\n\u003Cp\u003ESchool of Computer Science\u003C\/p\u003E\r\n\r\n\u003Cp\u003ECollege of Computing\u003C\/p\u003E\r\n\r\n\u003Cp\u003EGeorgia Institute of Technology\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDate: Wednesday, April 3, 2019\u003C\/p\u003E\r\n\r\n\u003Cp\u003ETime: Noon EDT\u003C\/p\u003E\r\n\r\n\u003Cp\u003ELocation: KACB 3100\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003ECommittee:\u003C\/p\u003E\r\n\r\n\u003Cp\u003E------------\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. Taesoo Kim (Advisor, School of Computer Science, Georgia Tech)\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. Ada Gavrilovska (School of Computer Science, Georgia Tech)\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. Umakishore Ramachandran (School of Computer Science, Georgia Tech)\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. Tushar Krishna (School of Electrical Engineering, Georgia Tech)\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. 
Willy Zwaenepoel (Faculty of Engineering and Information Technologies, The University of Sydney)\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EAbstract:\u003C\/p\u003E\r\n\r\n\u003Cp\u003E-----------\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003ELarge-scale internet services, such as Facebook or Google, are using clusters of many servers for problems such as search, machine learning, and social networks.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EHowever, while it may be possible to apply the tools used at this scale to smaller, more common problems as well, this dissertation presents approaches to large-scale data processing on only a single machine.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThis approach has obvious cost benefits and lowers the barrier to entry for large-scale data processing.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThis dissertation approaches this problem by redesigning applications to enable trillion-scale graph processing on a single machine and the processing of evolving, billion-scale graphs, and by presenting an operating-system-level optimization.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EFirst, this dissertation presents a new out-of-core graph processing engine, called Mosaic, for executing graph algorithms on trillion-scale datasets on a single machine.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EMosaic makes use of many-core processors and PCIe-SSDs coupled with a novel graph encoding scheme to allow processing of graphs of up to one trillion edges on a single machine.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EMosaic also employs a locality-preserving curve to allow for high compression and high locality when storing graphs and executing algorithms.\u003C\/p\u003E\r\n\r\n\u003Cp\u003ESecond, this dissertation presents Cytom, a new engine for processing evolving graphs based on insights about achieving high compression and locality while improving 
load-balancing when processing a graph that changes rapidly.\u003C\/p\u003E\r\n\r\n\u003Cp\u003ECytom also introduces a novel programming model that takes advantage of its subgraph-centric approach coupled with the setting of evolving graphs.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThis is an important enabling step for emerging workloads when processing graphs that change over time.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EFinally, we present an asynchronous scheme for clearing the processors\u0026#39; translation lookaside buffers (TLBs) in response to the high overhead of the current, synchronous process known as a TLB shootdown.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThis process is critical for system services such as freeing memory, NUMA memory migration, and page swapping in emerging, disaggregated data centers; these services are often used when processing large amounts of data.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThe key idea of this scheme, Latr, is a lazy mechanism to remove entries from the cores\u0026#39; TLBs while ensuring correctness by lazily releasing virtual memory only after Latr\u0026#39;s lazy shootdown mechanism finishes.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThis scheme removes the current overhead of costly inter-processor interrupts.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EWe show that this mechanism affects many applications, ranging from web servers, which might be used as caching frontends for big data processing, to key-value stores and graph processing.\u003C\/p\u003E\r\n","summary":null,"format":"limited_html"}],"field_subtitle":"","field_summary":"","field_summary_sentence":[{"value":"Systems Abstractions for Big Data Processing on a Single Machine"}],"uid":"27707","created_gmt":"2019-03-25 18:10:32","changed_gmt":"2019-03-25 18:10:32","author":"Tatianna 
Richardson","boilerplate_text":"","field_publication":"","field_article_url":"","field_event_time":{"event_time_start":"2019-04-03T13:00:00-04:00","event_time_end":"2019-04-03T16:00:00-04:00","event_time_end_last":"2019-04-03T16:00:00-04:00","gmt_time_start":"2019-04-03 17:00:00","gmt_time_end":"2019-04-03 20:00:00","gmt_time_end_last":"2019-04-03 20:00:00","rrule":null,"timezone":"America\/New_York"},"extras":[],"groups":[{"id":"221981","name":"Graduate Studies"}],"categories":[],"keywords":[{"id":"100811","name":"Phd Defense"}],"core_research_areas":[],"news_room_topics":[],"event_categories":[{"id":"1788","name":"Other\/Miscellaneous"}],"invited_audience":[{"id":"78761","name":"Faculty\/Staff"},{"id":"78771","name":"Public"},{"id":"174045","name":"Graduate students"},{"id":"78751","name":"Undergraduate students"}],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[],"email":[],"slides":[],"orientation":[],"userdata":""}}}