PhD Defense by Chia-Wen Kuo

You are cordially invited to attend my dissertation defense on Wednesday, November 29th.

Title: Knowledge-Augmented Vision-and-Language Assistant
Date: Wednesday, November 29th, 2023
Time: 11:00 AM - 12:30 PM PST
Location: Virtual (Zoom)

Chia-Wen Kuo

Robotics PhD Candidate

School of Electrical and Computer Engineering

Georgia Institute of Technology

Committee:

Dr. Zsolt Kira (Advisor) - School of Interactive Computing, Georgia Institute of Technology

Dr. Chao Zhang - School of Computational Science and Engineering, Georgia Institute of Technology

Dr. Chunyuan Li - Principal Research Scientist, Microsoft Research

Dr. Judy Hoffman - School of Interactive Computing, Georgia Institute of Technology

Dr. Larry Heck - School of Electrical and Computer Engineering, Georgia Institute of Technology

Abstract:

The fusion of vision and language (VL) in artificial intelligence is a crucial step toward truly intelligent systems, echoing a fundamental aspect of human cognition: the ability to see the world and articulate what is seen. This integration has transformative potential across many sectors, notably in improving how humans interact with technology. However, developing effective VL models is challenging because the knowledge available to both the vision and language components is often incomplete or missing. This limitation impairs the models' ability to describe visual content accurately and to answer complex, real-world questions. My research, presented as a series of three works, addresses these challenges. The first work, Xmodal-Ctx, introduces external knowledge into VL models to overcome their contextual limitations. The second, HAAV, extends this by integrating a diverse array of knowledge sources, improving the models' understanding of visual content. The final work, K-Aug, scales these ideas to larger, more complex multimodal models, addressing how high-quality knowledge sources are integrated and applied. Together, these works aim to bridge the knowledge gaps in VL models, improving their ability to interpret and describe visual content in a context-rich and linguistically coherent manner.





Groups

Graduate Studies

Status

Workflow Status: Published
Created By: Tatianna Richardson
Created: 11/27/2023
Modified By: Tatianna Richardson
Modified: 11/27/2023
