{"195181":{"#nid":"195181","#data":{"type":"event","title":"SCS Talk: Ganesh Ananthanarayanan, University of California at Berkeley","body":[{"value":"\u003Cp\u003E\u003Cstrong\u003ESCS Talk:\u003C\/strong\u003E Ganesh Ananthanarayanan, University of California at Berkeley\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003ETitle:\u0026nbsp;\u003C\/strong\u003EBig Data Analytics with All-or-Nothing Parallel Jobs\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EAbstract\u003C\/strong\u003E\u003C\/p\u003E\u003Cp\u003EExtensive data analysis has become the enabler for diagnostics and decision making in many modern systems. These analyses have both competitive as well as social benefits. To cope with the deluge in data that is growing faster than Moore\u2019s law, computation frameworks have resorted to massive parallelization of analytics jobs into many fine-grained tasks. These frameworks promised to provide efficient and fault-tolerant execution of these tasks. However, meeting this promise in clusters spanning hundreds of thousands of machines is challenging and a key departure from earlier work on parallel computing.\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EA simple but key aspect of parallel jobs is the \u003Cem\u003Eall-or-nothing\u003C\/em\u003E property: unless all tasks of a job are provided equal improvement, there is no speedup in the completion of the job. This talk will demonstrate how the all-or-nothing property impacts replacement algorithms in distributed caches for parallel jobs. Our coordinated caching system, PACMan, makes global caching decisions and employs a provably optimal cache replacement algorithm. A highlight of our evaluation using workloads from Facebook and Bing datacenters is that PACMan\u2019s replacement algorithm outperforms even Belady\u2019s MIN (that uses an oracle) in speeding up jobs. 
Along the way, I will also describe how we broke the myth of disk-locality\u2019s importance in datacenter computing and solutions to mitigate straggler tasks.\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EBio\u003C\/strong\u003E\u003C\/p\u003E\u003Cp\u003EGanesh Ananthanarayanan is a PhD candidate at the University of California at Berkeley, working with Prof. Ion Stoica in the AMP Lab. His research interests are in systems and networking, with a focus on cloud computing and large-scale data analytics systems. Prior to joining Berkeley, he worked for two years at Microsoft Research\u2019s Bangalore office.\u0026nbsp;\u003C\/p\u003E","summary":null,"format":"limited_html"}],"field_subtitle":"","field_summary":[{"value":"\u003Cp\u003E\u003Cstrong\u003ESCS Talk:\u003C\/strong\u003E Ganesh Ananthanarayanan, University of California at Berkeley\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003ETitle:\u003C\/strong\u003E\u0026nbsp;Big Data Analytics with All-or-Nothing Parallel Jobs\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EBio: \u0026nbsp;\u003C\/strong\u003EGanesh Ananthanarayanan is a PhD candidate at the University of California at Berkeley, working with Prof. Ion Stoica in the AMP Lab. His research interests are in systems and networking, with a focus on cloud computing and large-scale data analytics systems. 
Prior to joining Berkeley, he worked for two years at Microsoft Research\u2019s Bangalore office.\u0026nbsp;\u003C\/p\u003E","format":"limited_html"}],"field_summary_sentence":[{"value":"Big Data Analytics with All-or-Nothing Parallel Jobs"}],"uid":"27734","created_gmt":"2013-02-25 15:48:08","changed_gmt":"2016-10-08 02:02:42","author":"Antonette Benford","boilerplate_text":"","field_publication":"","field_article_url":"","field_event_time":{"event_time_start":"2013-03-12T12:00:00-04:00","event_time_end":"2013-03-12T13:00:00-04:00","event_time_end_last":"2013-03-12T13:00:00-04:00","gmt_time_start":"2013-03-12 16:00:00","gmt_time_end":"2013-03-12 17:00:00","gmt_time_end_last":"2013-03-12 17:00:00","rrule":null,"timezone":"America\/New_York"},"extras":["free_food"],"hg_media":{"195161":{"id":"195161","type":"image","title":"Ganesh Ananthanarayanan, UC Berkeley","body":null,"created":"1449179891","gmt_created":"2015-12-03 21:58:11","changed":"1475894846","gmt_changed":"2016-10-08 02:47:26","alt":"Ganesh Ananthanarayanan, UC Berkeley","file":{"fid":"196402","name":"ananthanarayanan_photo.jpg","image_path":"\/sites\/default\/files\/images\/ananthanarayanan_photo_0.jpg","image_full_path":"http:\/\/hg.gatech.edu\/\/sites\/default\/files\/images\/ananthanarayanan_photo_0.jpg","mime":"image\/jpeg","size":89482,"path_740":"http:\/\/hg.gatech.edu\/sites\/default\/files\/styles\/740xx_scale\/public\/images\/ananthanarayanan_photo_0.jpg?itok=Z5jfFbbH"}}},"media_ids":["195161"],"groups":[{"id":"47223","name":"College of Computing"}],"categories":[],"keywords":[],"core_research_areas":[],"news_room_topics":[],"event_categories":[],"invited_audience":[],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[{"value":"\u003Cp\u003EKishore Ramachandran @ (404) 385-5136 \u003Ca 
href=\u0022mailto:rama@cc.gatech.edu\u0022\u003Erama@cc.gatech.edu\u003C\/a\u003E\u0026nbsp;\u003C\/p\u003E","format":"limited_html"}],"email":[],"slides":[],"orientation":[],"userdata":""}}}