{"681470":{"#nid":"681470","#data":{"type":"event","title":"PhD Defense by Gaurav Verma","body":[{"value":"\u003Cp\u003ETitle: Robust, Efficient, and Adaptable Multimodal AI for Vertical Applications\u003C\/p\u003E\u003Cp\u003EDate: Thursday, April 10, 2025\u003Cbr\u003ETime: 2:30\u20134:30 PM Eastern Time (US)\u003Cbr\u003ELocation: Coda 1315 (Grant Park)\u003Cbr\u003EVirtual Meeting: Zoom\u003C\/p\u003E\u003Cp\u003EGaurav Verma\u003Cbr\u003Ehttps:\/\/gaurav22verma.github.io\/\u003Cbr\u003EComputer Science PhD Candidate\u003Cbr\u003ESchool of Computational Science and Engineering\u003Cbr\u003EGeorgia Institute of Technology\u003C\/p\u003E\u003Cp\u003ECommittee:\u003Cbr\u003EDr. Srijan Kumar - Advisor, Georgia Tech, Computational Science \u0026amp; Engineering\u003Cbr\u003EDr. Munmun De Choudhury - Georgia Tech, School of Interactive Computing\u003Cbr\u003EDr. Duen Horng (Polo) Chau - Georgia Tech, Computational Science \u0026amp; Engineering\u003Cbr\u003EDr. Chao Zhang - Georgia Tech, Computational Science \u0026amp; Engineering\u003Cbr\u003EDr. Ani Nenkova - Adobe Research, Document Intelligence Lab\u003C\/p\u003E\u003Cp\u003EAbstract:\u003Cbr\u003ELarge artificial intelligence (AI) models have garnered attention for their impressive, sometimes superhuman, performance on benchmarks, yet their practical adoption in verticals like web safety and well-being presents many challenges. Issues such as brittleness to realistic input variations, sensitivity to prompt formatting in large language models (LLMs), performance degradation in specialized settings, and limited effectiveness among certain user groups significantly limit the real-world utility of large AI models.\u003C\/p\u003E\u003Cp\u003ETo systematically address these challenges, this thesis introduces a framework for transforming foundational large AI models into real-world solutions by advancing their vertical-agnostic properties and overcoming challenges in vertical-specific applications. 
Focusing first on vertical-agnostic properties, the thesis advances multimodal AI models\u2014those integrating vision and language\u2014by tackling three critical areas: robustness to realistic data variations, efficient cross-modal mapping, and adaptability to novel tasks. Key contributions include evaluating model robustness to grounded multimodal variations, proposing a method to quantify text visualness for efficient cross-modal retrieval and generation, and developing techniques for rapidly adapting multimodal agents to custom workflows.\u003C\/p\u003E\u003Cp\u003EBuilding upon these foundational contributions, the thesis then targets vertical applications, demonstrating the necessity of tailored data, modeling, and evaluation approaches. In collaboration with domain experts in web safety and well-being, it characterizes and detects violence-provoking speech and leverages LLMs to uncover actionable mental well-being insights. Further, it reveals how insights from vertical-specific studies can loop back to improve large AI models, notably by addressing inequities across languages through multimodal learning. 
The thesis also underscores user interfacing with the emerging capabilities of large AI models as a vital and open area for future exploration.\u003C\/p\u003E","summary":"","format":"limited_html"}],"field_subtitle":"","field_summary":[{"value":"\u003Cp\u003ERobust, Efficient, and Adaptable Multimodal AI for Vertical Applications\u003C\/p\u003E","format":"limited_html"}],"field_summary_sentence":[{"value":"Robust, Efficient, and Adaptable Multimodal AI for Vertical Applications"}],"uid":"27707","created_gmt":"2025-03-31 18:25:26","changed_gmt":"2025-03-31 18:25:53","author":"Tatianna Richardson","boilerplate_text":"","field_publication":"","field_article_url":"","field_event_time":{"event_time_start":"2025-04-10T14:30:34-04:00","event_time_end":"2025-04-10T16:30:34-04:00","event_time_end_last":"2025-04-10T16:30:34-04:00","gmt_time_start":"2025-04-10 18:30:34","gmt_time_end":"2025-04-10 20:30:34","gmt_time_end_last":"2025-04-10 20:30:34","rrule":null,"timezone":"America\/New_York"},"location":"Coda 1315 (Grant Park)","extras":[],"groups":[{"id":"221981","name":"Graduate Studies"}],"categories":[],"keywords":[{"id":"100811","name":"Phd Defense"}],"core_research_areas":[],"news_room_topics":[],"event_categories":[{"id":"1788","name":"Other\/Miscellaneous"}],"invited_audience":[{"id":"78771","name":"Public"}],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[],"email":[],"slides":[],"orientation":[],"userdata":""}}}