<node id="456051">
  <nid>456051</nid>
  <type>event</type>
  <uid>
    <user id="27707"><![CDATA[27707]]></user>
  </uid>
  <created>1444123299</created>
  <changed>1475892854</changed>
  <title><![CDATA[PhD Defense by Naila Farooqui]]></title>
  <body><![CDATA[<p><strong>Title:&nbsp;&nbsp;Runtime Specialization for Heterogeneous CPU-GPU Resources</strong></p><p>&nbsp;</p><p><strong>Naila Farooqui</strong></p><p>School of Computer Science</p><p>College of Computing<br /> Georgia Institute of Technology<br /> <br /> Date: October 19, 2015 (Monday)<br /> Time: 12:00 PM - 2:00&nbsp;PM&nbsp;(ET)<br /> Location: KACB 3100<br /> <br /><strong> Committee:</strong></p><p>---------------</p><p>Dr. Karsten Schwan (Advisor, School of Computer Science, Georgia Tech)</p><p>Dr. Sudhakar Yalamanchili (School of Electrical and Computer Engineering, Georgia Tech)</p><p>Dr. Ada Gavrilovska (School of Computer Science, Georgia Tech)</p><p>Dr. Richard Vuduc (School of Computational Science and Engineering, Georgia Tech)</p><p>Dr. Vanish Talwar (Research Scientist, PernixData)</p><p>Dr. Rajkishore Barik (Research Scientist, Intel Labs)</p><p><em><strong>Abstract:</strong></em></p><p>------------</p><p>Heterogeneous parallel architectures such as those composed of CPUs and GPUs are a tantalizing compute fabric for performance-hungry developers. 
While these platforms enable order-of-magnitude performance increases for many data-parallel application domains, there remain several open challenges: (i) the distinct execution models inherent in the heterogeneous devices present on such platforms drive the need to dynamically match workload characteristics to the underlying resources, (ii) the complex architecture and programming models of such systems require substantial application knowledge and effort-intensive program tuning to achieve high performance, and (iii) as such platforms become prevalent, there is a need to extend their utility from running known regular data-parallel applications to the broader set of input-dependent, irregular applications common in enterprise settings.</p><p>The key contribution of our research is to enable runtime specialization on such hybrid CPU-GPU platforms by matching application characteristics to the underlying heterogeneous <a name="section-executive-summary.tex-25"></a>resources for both regular and irregular workloads. Our approach enables profile-driven resource management and optimizations for such platforms, providing high application performance and system throughput. 
Towards this end, this research will: (a) enable dynamic instrumentation for GPU-based parallel architectures, specifically targeting the complex Single-Instruction Multiple-Data (SIMD) execution model, to gain real-time introspection into application behavior; (b) leverage such dynamic performance data to support novel online resource management methods that improve application performance and system throughput, particularly for irregular, input-dependent applications; (c) automate some of the programmer effort required to exercise specialized architectural features of such platforms via instrumentation-driven dynamic code optimizations; and (d) propose a specialized, affinity-aware work-stealing scheduler for integrated CPU-GPU processors that efficiently distributes work at runtime across all CPU and GPU cores for improved load balance, taking into account <a name="section-executive-summary.tex-36"></a>both application characteristics and architectural differences of the underlying devices.<a name="section-executive-summary.tex-37"></a></p>]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[Runtime Specialization for Heterogeneous CPU-GPU Resources]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2015-10-19T13:00:00-04:00]]></value>
      <value2><![CDATA[2015-10-19T15:00:00-04:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>221981</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[Graduate Studies]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1788</tid>
        <value><![CDATA[Other/Miscellaneous]]></value>
      </item>
      </field_categories>
  <field_keywords>
          <item>
        <tid>100811</tid>
        <value><![CDATA[Phd Defense]]></value>
      </item>
      </field_keywords>
  <field_userdata><![CDATA[]]></field_userdata>
</node>
