<node id="390781">
  <nid>390781</nid>
  <type>event</type>
  <uid>
    <user id="27842"><![CDATA[27842]]></user>
  </uid>
  <created>1427360868</created>
  <changed>1492118379</changed>
  <title><![CDATA[CSIP Seminar]]></title>
  <body><![CDATA[<p><strong>Speaker:</strong> Ryan Curtin (CSIP)</p><p><strong>Title:</strong> <em>Dual-tree k-means with bounded single-iteration runtime</em></p><p><strong>Abstract:</strong><br />k-means is a widely used clustering algorithm, but for k clusters and a dataset of size N, each iteration of Lloyd's algorithm costs O(kN) time. Although there are tree-based techniques to accelerate single Lloyd iterations, none of them is tailored to the case of large k, which is increasingly common as dataset sizes grow. We propose a dual-tree algorithm that gives the exact result of Lloyd iterations; when using cover trees, we are able to bound the worst-case runtime of each iteration as O(N + k log k) (the bound also depends on dataset-dependent quantities). To our knowledge, these are the first sub-O(kN) bounds for exact Lloyd iterations. We then show that this theoretically favorable algorithm significantly outperforms other approaches in practice, especially for large N and k.</p><p><strong>Speaker Bio:</strong><br />Ryan Curtin is a nearly finished Ph.D. student in the School of Electrical and Computer Engineering at Georgia Tech; his research focuses on accelerating the core primitives that make up machine learning algorithms, generally via trees and other hierarchical structures. He is also the primary developer and maintainer of the mlpack machine learning library (<a href="http://www.mlpack.org/">http://www.mlpack.org/</a>). He doesn't enjoy writing biographies, and is also a pinball wizard.</p>]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[Dual-tree k-means with bounded single-iteration runtime]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2015-04-03T16:00:00-04:00]]></value>
      <value2><![CDATA[2015-04-03T17:00:00-04:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Undergraduate students]]></value>
      </item>
          <item>
        <value><![CDATA[Faculty/Staff]]></value>
      </item>
          <item>
        <value><![CDATA[Graduate students]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[<p>Andrew Massimino</p><p><a href="mailto:massimino@gatech.edu">massimino@gatech.edu</a></p>]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>1255</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[School of Electrical and Computer Engineering]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1795</tid>
        <value><![CDATA[Seminar/Lecture/Colloquium]]></value>
      </item>
      </field_categories>
  <field_keywords>
      </field_keywords>
  <field_userdata><![CDATA[]]></field_userdata>
</node>
