<node id="673377">
  <nid>673377</nid>
  <type>event</type>
  <uid>
    <user id="27707"><![CDATA[27707]]></user>
  </uid>
  <created>1709745969</created>
  <changed>1709745969</changed>
  <title><![CDATA[PhD Proposal by Pramod Chunduri]]></title>
  <body><![CDATA[<p><span><span><strong><span>Title:</span></strong><span>&nbsp;Advanced Query Processing Systems for Unstructured Data Management</span></span></span></p>

<p><span><span>&nbsp;</span></span></p>

<p><span><span><strong><span>Date:&nbsp;</span></strong><span>Monday, March 11th, 2024</span></span></span></p>

<p><span><span><strong><span>Time:</span></strong><span>&nbsp;1:00 - 2:30 PM EDT</span></span></span></p>

<p><span><span><strong><span>Location:</span></strong><span>&nbsp;Klaus 1315</span></span></span></p>

<p><span><span><strong><span>Virtual Link:</span></strong>&nbsp;<span><a href="https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZGQ5ZWQyMTMtMGFlYy00NjYwLThmZTQtNGViOTNjYWQ5ZmMw%40thread.v2/0?context=%7b%22Tid%22%3a%22482198bb-ae7b-4b25-8b7a-6d7f32faa083%22%2c%22Oid%22%3a%221bd9e6ce-aac7-482a-b32e-21f38a0d6c53%22%7d" title="https://gatech.zoom.us/j/99025084102">Teams</a></span></span></span></p>

<p><span><span>&nbsp;</span></span></p>

<p><span><span>&nbsp;</span></span></p>

<p><span><span><strong><span>Pramod Chunduri</span></strong></span></span></p>

<p><span><span>(<a href="https://pchunduri6.github.io/">https://pchunduri6.github.io/</a>)</span></span></p>

<p><span><span><span>Database Systems&nbsp;Ph.D.&nbsp;Student</span></span></span></p>

<p><span><span><span>School of Computer Science</span></span></span></p>

<p><span><span><span>Georgia Institute of Technology</span></span></span></p>

<p><span><span>&nbsp;</span></span></p>

<p><span><span>&nbsp;</span></span></p>

<p><span><span><strong><span>Committee:</span></strong></span></span></p>

<p><span><span><span>Dr. Joy Arulraj (Advisor) – School of Computer Science, Georgia Institute of Technology</span></span></span></p>

<p><span><span><span><span>Dr. Kexin Rong – School of Computer Science, Georgia Institute of Technology</span></span></span></span></p>

<p><span><span><span><span>Dr. Xu Chu – School of Computer Science, Georgia Institute of Technology</span></span></span></span></p>

<p><span><span><span><span>Dr. Shamkant Navathe – School of Computer Science, Georgia Institute of Technology</span></span></span></span></p>

<p><span><span>&nbsp;</span></span></p>

<p><span><span>&nbsp;</span></span></p>

<p><span><span><strong><span>Abstract:</span></strong></span></span></p>

<p><span><span><span><span><span><span>The exponential increase in unstructured data, such as video, images, audio, and text, presents significant challenges for efficient processing and analysis. While machine learning (ML), particularly deep learning (DL), has made impressive strides in developing models for these tasks, applying these models to large-scale data in practice is hindered by high costs, the inability to query fine-grained information, and the difficulty of selecting appropriate models for specific tasks. My thesis aims to address these challenges by developing efficient, accurate, and practical query processing systems for unstructured data management.</span></span></span></span></span></span></p>

<p>&nbsp;</p>

<p><span><span><span><span><span><span>In this proposal, I present three query processing systems to achieve this objective. First, I present ZEUS, a video analytics system that uses reinforcement learning to localize complex actions in videos quickly while maintaining a user-specified accuracy. Second, I present SketchQL, a user-friendly, sketch-based query system that lets users retrieve fine-grained video moments intuitively, significantly improving both the usability and the accuracy of video moment retrieval.</span></span></span></span></span></span></p>

<p>&nbsp;</p>

<p><span><span><span><span><span><span>Finally, I propose an automated model selection framework for heterogeneous model ecosystems. In the past year, large language models (LLMs) have made major advances in unstructured text processing. A wide range of models is now available, both as proprietary API-based offerings and as open-source models. These models are expensive to run, and their performance varies widely across user queries. Our preliminary work demonstrates that careful model selection can significantly reduce query costs while reaching state-of-the-art accuracy. We aim to build a novel model routing strategy for heterogeneous LLMs that optimizes the cost, latency, and accuracy of unstructured text processing.</span></span></span></span></span></span></p>
]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[Advanced Query Processing Systems for Unstructured Data Management]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[<p><span><span><span>Advanced Query Processing Systems for Unstructured Data Management</span></span></span></p>
]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2024-03-11T13:00:00-04:00]]></value>
      <value2><![CDATA[2024-03-11T14:30:00-04:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[Klaus 1315]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>221981</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[Graduate Studies]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1788</tid>
        <value><![CDATA[Other/Miscellaneous]]></value>
      </item>
      </field_categories>
  <field_keywords>
          <item>
        <tid>102851</tid>
        <value><![CDATA[Phd proposal]]></value>
      </item>
      </field_keywords>
  <field_userdata><![CDATA[]]></field_userdata>
</node>
