{"686472":{"#nid":"686472","#data":{"type":"event","title":"PhD Defense by Gaurav Tarlok Kakkar","body":[{"value":"\u003Cp\u003EDear faculty members and fellow students,\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EYou are cordially invited to my\u0026nbsp;Ph.D.\u0026nbsp;thesis\u0026nbsp;defense.\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003ETitle:\u0026nbsp;\u003C\/strong\u003EDesigning ML-Centric Data Systems for Efficiency and Usability\u003Cbr\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EDate:\u0026nbsp;\u003C\/strong\u003EFriday, November 21st, 2025\u003Cbr\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003ETime:\u0026nbsp;\u003C\/strong\u003E12-2 PM, EST\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003ELocation:\u0026nbsp;\u003C\/strong\u003EKlaus Advanced Computing Building (KACB), Room 1212\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EGaurav Tarlok Kakkar\u003C\/p\u003E\u003Cp\u003EComputer Science Ph.D. Student\u003C\/p\u003E\u003Cp\u003ESchool of Computer Science\u003Cbr\u003EGeorgia Institute of Technology\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003ECommittee:\u003C\/strong\u003E\u003C\/p\u003E\u003Col\u003E\u003Cli\u003EDr. Joy Arulraj (Advisor), School of Computer Science, Georgia Tech\u003C\/li\u003E\u003Cli\u003EDr. Sham Navathe, School of Computer Science, Georgia Tech\u003C\/li\u003E\u003Cli\u003EDr. Kexin Rong, School of Computer Science, Georgia Tech\u003C\/li\u003E\u003Cli\u003EDr. Steve Mussmann, School of Computer Science, Georgia Tech\u003C\/li\u003E\u003Cli\u003EDr. 
Fatma \u00d6zcan, Google System Research\u003C\/li\u003E\u003C\/ol\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EAbstract:\u003C\/strong\u003E\u003C\/p\u003E\u003Cp\u003EOver the past six decades, relational databases have been remarkably successful in managing structured data. However, the growing demand for analytics over unstructured data such as videos, images, and text, driven by modern machine learning (ML) workloads, exposes fundamental limitations in traditional database systems. Bridging this gap requires a new class of data systems that treat ML models as first-class citizens, integrating them directly into the query engine and providing optimizations tailored to their unique characteristics.\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EThis dissertation presents the design, implementation, and evaluation of techniques that form the foundation of ML-centric data management systems. It introduces four systems, EVA, Seiden, Aero, and PRISM, that collectively address challenges of efficiency and usability across multimodal workloads.\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EEVA accelerates exploratory video analytics by automatically materializing and reusing the results of expensive user-defined functions (UDFs) through a symbolic reuse framework. Seiden revisits the \u201cproxy model\u201d assumption in visual databases and demonstrates that indexing directly with oracle models and exploration\u2013exploitation sampling delivers superior execution performance and query accuracy. Aero extends adaptive query processing (AQP) to ML workloads by using runtime feedback to reorder predicates and dynamically scale resources, achieving performance improvements over static optimizers. 
Finally, PRISM optimizes natural language to SQL (NL2SQL) pipelines by treating monetary cost as a first-class objective and systematically navigating the trade-off between accuracy and cost.\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003ETogether, these contributions lay the foundation for the next generation of data systems designed for AI-driven workloads.\u003C\/p\u003E","summary":"","format":"limited_html"}],"field_subtitle":"","field_summary":[{"value":"\u003Cp\u003EDesigning ML-Centric Data Systems for Efficiency and Usability\u003C\/p\u003E","format":"limited_html"}],"field_summary_sentence":[{"value":"Designing ML-Centric Data Systems for Efficiency and Usability"}],"uid":"27707","created_gmt":"2025-11-17 14:10:31","changed_gmt":"2025-11-17 14:12:41","author":"Tatianna Richardson","boilerplate_text":"","field_publication":"","field_article_url":"","field_event_time":{"event_time_start":"2025-11-21T12:00:00-05:00","event_time_end":"2025-11-21T14:00:00-05:00","event_time_end_last":"2025-11-21T14:00:00-05:00","gmt_time_start":"2025-11-21 17:00:00","gmt_time_end":"2025-11-21 19:00:00","gmt_time_end_last":"2025-11-21 19:00:00","rrule":null,"timezone":"America\/New_York"},"location":"Klaus Advanced Computing Building (KACB), Room 1212","extras":[],"groups":[{"id":"221981","name":"Graduate Studies"}],"categories":[],"keywords":[{"id":"100811","name":"Phd Defense"}],"core_research_areas":[],"news_room_topics":[],"event_categories":[{"id":"1788","name":"Other\/Miscellaneous"}],"invited_audience":[{"id":"78771","name":"Public"}],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[],"email":[],"slides":[],"orientation":[],"userdata":""}}}