<node id="675369">
  <nid>675369</nid>
  <type>event</type>
  <uid>
    <user id="27707"><![CDATA[27707]]></user>
  </uid>
  <created>1720209446</created>
  <changed>1720209446</changed>
  <title><![CDATA[PhD Defense by Yang Chen]]></title>
  <body><![CDATA[<p><strong>Title:&nbsp;</strong>Extracting Knowledge with Multimodal and Multilingual LLMs</p><p><strong>Date/Time</strong>: July 12, 2024, 3:00 PM to 5:00 PM EDT [12-2 PM PDT]</p><p><strong>Location</strong>: Coda C1115 Druid Hills</p><p><strong>Zoom</strong>:&nbsp;<a href="https://gatech.zoom.us/j/99753876757">https://gatech.zoom.us/j/99753876757</a></p><p>&nbsp;</p><p><strong>Yang Chen&nbsp;</strong>(<a href="https://edchengg.github.io/">Homepage</a>)</p><p>Ph.D. Candidate in Computer Science</p><p>School of Interactive Computing</p><p>Georgia Institute of Technology</p><p>&nbsp;</p><p><strong>Committee:</strong></p><p>Dr. Alan Ritter (advisor), School of Interactive Computing, Georgia Tech</p><p>Dr. Wei Xu (co-advisor), School of Interactive Computing, Georgia Tech</p><p>Dr. Kartik Goyal, School of Interactive Computing, Georgia Tech</p><p>Dr. Hexiang (Frank) Hu, Google DeepMind</p><p>Dr. Ming-Wei Chang, Google DeepMind</p><p>&nbsp;</p><p><strong>Abstract:</strong></p><p>Large language models (LLMs) have revolutionized natural language processing by learning vast amounts of knowledge from online text corpora. These models can use pre-trained knowledge to perform a wide range of tasks, and recent advances have extended them to vision-language inputs. However, extracting and using knowledge from these multimodal and multilingual LLMs presents several challenges: accurately benchmarking visual world knowledge, addressing privacy concerns related to memorized personal information, and effectively extracting textual knowledge from multilingual corpora, particularly for low-resource languages. This thesis addresses these challenges by developing and benchmarking methods to extract knowledge with multimodal and multilingual LLMs.</p><p>In this presentation, I will first introduce InfoSeek, a visual information-seeking benchmark designed to evaluate visual knowledge in multimodal LLMs. 
Using InfoSeek, I will demonstrate how fine-tuning multimodal LLMs on a training set elicits pre-trained knowledge and enables generalization to unseen entities. Additionally, I will show how a retrieval-based system can improve accuracy by accessing external resources such as Wikipedia. Building on these findings, I will then discuss an emerging privacy concern raised by the deployment of state-of-the-art multimodal LLMs: their geolocation capabilities could be used to extract private user information from social media posts.</p><p>The second part of the presentation will focus on extracting knowledge from multilingual corpora, with a particular emphasis on low-resource languages. I will introduce TransFusion, a learning framework that leverages translation models to enhance LLM performance on low-resource language tasks. Our experiments demonstrate improvements in named entity recognition for African languages across several settings, including instruction tuning, prompting, and supervised fine-tuning. Finally, I will present EasyProject, a key component that uses a fine-tuned translation model to generate annotated information extraction data across multiple languages.</p>]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[Extracting Knowledge with Multimodal and Multilingual LLMs]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[<p>Extracting Knowledge with Multimodal and Multilingual LLMs</p>]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2024-07-12T15:00:00-04:00]]></value>
      <value2><![CDATA[2024-07-12T17:00:00-04:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[Coda C1115 Druid Hills]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>221981</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[Graduate Studies]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1788</tid>
        <value><![CDATA[Other/Miscellaneous]]></value>
      </item>
      </field_categories>
  <field_keywords>
          <item>
        <tid>100811</tid>
        <value><![CDATA[PhD Defense]]></value>
      </item>
      </field_keywords>
  <field_userdata><![CDATA[]]></field_userdata>
</node>
