<node id="690800">
  <nid>690800</nid>
  <type>event</type>
  <uid>
    <user id="27707"><![CDATA[27707]]></user>
  </uid>
  <created>1781715061</created>
  <changed>1781715105</changed>
  <title><![CDATA[PhD Defense by Nikolai Warner]]></title>
  <body><![CDATA[<p><strong>Title</strong>: Improving Out-of-Distribution Generalization in Human-Centric Multimodal Vision</p><p>&nbsp;</p><p><strong>Date</strong>: Monday, June 22, 2026</p><p><strong>Time</strong>: 1:00 - 3:00 PM ET</p><p><strong>Location</strong>: Coda C0915 Atlantic + Remote (<a href="https://teams.microsoft.com/meet/250438047509225?p=kzLrPnM2Ap8Ny0Dq9t">https://teams.microsoft.com/meet/250438047509225?p=kzLrPnM2Ap8Ny0Dq9t</a>)</p><p><strong>Meeting ID</strong>: 250 438 047 509 225 | Passcode: 2Tn3Us6F</p><p>&nbsp;</p><p><strong>Nikolai Warner</strong></p><p>Robotics Ph.D. Candidate</p><p>George W. Woodruff School of Mechanical Engineering</p><p>Georgia Institute of Technology</p><p>&nbsp;</p><p><strong>Committee</strong></p><p>Dr. Irfan Essa (Advisor) - School of Interactive Computing, Georgia Institute of Technology</p><p>Dr. Thomas Ploetz - School of Interactive Computing, Georgia Institute of Technology</p><p>Dr. Zsolt Kira - School of Interactive Computing, Georgia Institute of Technology</p><p>Dr. Judy Hoffman - School of Interactive Computing, Georgia Institute of Technology</p><p>Dr. Apaar Sadhwani - Amazon</p><p>&nbsp;</p><p><strong>Abstract</strong></p><p>Despite steady in-distribution progress on human-centric vision tasks and the emergence of powerful foundation models, in-the-wild and out-of-distribution performance still lags. This dissertation studies four such tasks (interactive segmentation, non-rigid image editing, 3D human pose estimation, and motion-language alignment) and traces their out-of-distribution gap to two distinct failures: a signal-side failure, where the input modality is ill-posed for the task, and a noise-side failure, where the supervision channel carries distribution-specific nuisance. 
On the signal side, DAISeg enriches click-conditioned segmentation with an open-vocabulary saliency channel (from +3 mIoU on seen classes up to +10.5 on unseen, beating SAM under text-conditioned clicks), and AugLift hands the 2D-to-3D lifter a per-joint depth lower bound (−8.9% OOD MPJPE across four architectures, plus cross-dataset SOTA when combined with DG techniques). On the noise side, IPC-Edit constructs supervision that had no public equivalent, filtering and composing three noisy proxies into a 13.5K-pair corpus for identity-preserving non-rigid editing (68.5% identity preservation vs. 61%), while MoCHA denoises supervision that already exists, distilling an LLM canonicalization operator that strips annotator style from captions and setting a new cross-distribution SOTA (T2M R@1 from 13.74 to 26.59, +94%).</p>]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[Improving Out-of-Distribution Generalization in Human-Centric Multimodal Vision]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[<p>see below</p>]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2026-06-22T13:00:00-04:00]]></value>
      <value2><![CDATA[2026-06-22T15:00:00-04:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[Coda C0915 Atlantic + Remote]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>221981</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[Graduate Studies]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1788</tid>
        <value><![CDATA[Other/Miscellaneous]]></value>
      </item>
      </field_categories>
  <field_keywords>
          <item>
        <tid>100811</tid>
        <value><![CDATA[Phd Defense]]></value>
      </item>
      </field_keywords>
  <field_userdata><![CDATA[]]></field_userdata>
</node>
