<node id="689474">
  <nid>689474</nid>
  <type>event</type>
  <uid>
    <user id="36319"><![CDATA[36319]]></user>
  </uid>
  <created>1775487104</created>
  <changed>1775487395</changed>
  <title><![CDATA[School of CSE Seminar Series: Abhinav Bhatele]]></title>
  <body><![CDATA[<p><strong>Speaker:</strong>&nbsp;Abhinav Bhatele, associate professor at the University of Maryland<br><strong>Date and Time:</strong>&nbsp;April 17, 2:00-3:00 p.m.<br><strong>Location:</strong>&nbsp;Coda 114<br><strong>Host:</strong>&nbsp;Rich Vuduc</p><p><strong>Title:</strong>&nbsp;<em>Breaking the Scaling Wall in Distributed Deep Learning</em></p><p><strong>Abstract:</strong> Significant advances in computer architecture (development of extremely powerful server-class GPUs) and parallel computing (scalable libraries for dense and sparse linear algebra) have contributed to the ongoing AI revolution. In particular, distributed training of deep neural networks (DNNs) relies on scalable matrix multiplication algorithms and efficient communication on high-speed interconnects. Pre-training and fine-tuning large language models (LLMs) with hundreds of billions to trillions of parameters and graph neural networks (GNNs) on extremely large graphs requires hundreds to tens of thousands of GPUs. However, such training often suffers from significant scaling bottlenecks such as high communication overheads and load imbalance.</p><p>In this talk, I will present several systems research directions that directly impact AI model training. First, I will describe my group's work in using a three-dimensional parallel algorithm for matrix multiplication in large-scale LLM training. Second, I will demonstrate the application of the same algorithm to full-graph and mini-batch GNN training when working with extremely large graphs. Finally, I will discuss the need for scalable collective communication routines for large-scale DNN training.</p><p><strong>Bio:</strong> Abhinav Bhatele is an associate professor in the Department of Computer Science, and director of the <a href="https://pssg.cs.umd.edu/">Parallel Software and Systems Group</a>, at the University of Maryland, College Park. His research interests are broadly in systems and AI, with a focus on parallel computing and distributed AI. He has published research in parallel programming models and runtimes, network design and simulation, applications of machine learning to parallel systems, parallel deep learning, and analyzing, visualizing, modeling, and optimizing the performance of parallel software and systems. Abhinav has received best paper awards at Euro-Par 2009, IPDPS 2013, IPDPS 2016, and PDP 2024, and a best poster award at SC 2023. He was selected as a recipient of the <a href="http://www.ieee-tcsc.org/early.php">IEEE TCSC Award for Excellence in Scalable Computing (Early Career)</a> in 2014, the <a href="https://www.llnl.gov/news/laboratory-researchers-recognized-accomplishments-early-and-mid-career-0">LLNL Early and Mid-Career Recognition</a> award in 2018, the NSF CAREER award in 2021, the <a href="http://www.ieee-tcsc.org/middle.php">IEEE TCSC Award for Excellence in Scalable Computing (Middle Career)</a> in 2023, and the <a href="https://cs.illinois.edu/about/awards/alumni-awards/alumni-awards-past-recipients/66697">UIUC CS Early Career Academic Achievement Alumni Award</a> in 2024.</p><p>Abhinav received a B.Tech. degree in Computer Science and Engineering from I.I.T. Kanpur, India in May 2005, and M.S. and Ph.D. degrees in Computer Science from the University of Illinois at Urbana-Champaign in 2007 and 2010, respectively. He was a post-doc and later computer scientist in the Center for Applied Scientific Computing at Lawrence Livermore National Laboratory from 2011 to 2019. Abhinav was an associate editor of the IEEE Transactions on Parallel and Distributed Systems (TPDS) from 2022 to 2024. He was one of the General Chairs of IEEE Cluster 2022, and Research Papers Chair of ISC 2023.</p>]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[The School of CSE hosts a seminar by University of Maryland Associate Professor Abhinav Bhatele]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[<p><strong>Speaker:</strong>&nbsp;Abhinav Bhatele, associate professor at the University of Maryland<br><strong>Date and Time:</strong>&nbsp;April 17, 2:00-3:00 p.m.<br><strong>Location:</strong>&nbsp;Coda 114<br><strong>Host:</strong>&nbsp;Rich Vuduc</p><p><strong>Title:</strong>&nbsp;<em>Breaking the Scaling Wall in Distributed Deep Learning</em></p>]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2026-04-17T14:00:00-04:00]]></value>
      <value2><![CDATA[2026-04-17T15:00:00-04:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Alumni]]></value>
      </item>
          <item>
        <value><![CDATA[Faculty/Staff]]></value>
      </item>
          <item>
        <value><![CDATA[Postdoc]]></value>
      </item>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
          <item>
        <value><![CDATA[Graduate students]]></value>
      </item>
          <item>
        <value><![CDATA[Undergraduate students]]></value>
      </item>
      </field_audience>
  <field_media>
          <item>
        <nid>
          <node id="679866">
            <nid>679866</nid>
            <type>image</type>
            <title><![CDATA[Abhinav-Bhatele.jpg]]></title>
            <body><![CDATA[]]></body>
                          <field_image>
                <item>
                  <fid>264076</fid>
                  <filename><![CDATA[Abhinav-Bhatele.jpg]]></filename>
                  <filepath><![CDATA[/sites/default/files/2026/04/06/Abhinav-Bhatele.jpg]]></filepath>
                  <file_full_path><![CDATA[http://hg.gatech.edu//sites/default/files/2026/04/06/Abhinav-Bhatele.jpg]]></file_full_path>
                  <filemime>image/jpeg</filemime>
                  <image_740><![CDATA[]]></image_740>
                  <image_alt><![CDATA[CSE Seminar Abhinav Bhatele]]></image_alt>
                </item>
              </field_image>
            
                      </node>
        </nid>
      </item>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[<p>Rich Vuduc (richie@cc.gatech.edu)</p>]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[Coda, Room 114]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>47223</item>
          <item>50877</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[College of Computing]]></item>
          <item><![CDATA[School of Computational Science and Engineering]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1795</tid>
        <value><![CDATA[Seminar/Lecture/Colloquium]]></value>
      </item>
      </field_categories>
  <field_keywords>
          <item>
        <tid>166983</tid>
        <value><![CDATA[School of Computational Science and Engineering]]></value>
      </item>
      </field_keywords>
  <field_userdata><![CDATA[]]></field_userdata>
</node>
