{"681269":{"#nid":"681269","#data":{"type":"event","title":"Ph.D. Dissertation Defense - Raveesh Garg","body":[{"value":"\u003Cp\u003E\u003Cstrong\u003ETitle\u003C\/strong\u003E\u003Cem\u003E:\u0026nbsp; Architectures and Data Orchestration for Tensor Algebra Applications with Low-reuse Operations\u003C\/em\u003E\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003ECommittee:\u003C\/strong\u003E\u003C\/p\u003E\u003Cp\u003EDr. Tushar Krishna, ECE, Chair, Advisor\u003C\/p\u003E\u003Cp\u003EDr. Michael Pellauer, Brown\/NVIDIA, Co-Advisor\u003C\/p\u003E\u003Cp\u003EDr. Hyesoon Kim, CoC\u003C\/p\u003E\u003Cp\u003EDr. Richard Vuduc, CoC\u003C\/p\u003E\u003Cp\u003EDr. Callie Hao, ECE\u003C\/p\u003E\u003Cp\u003EDr. Bahar Asgari, U of Maryland\u003C\/p\u003E","summary":"","format":"limited_html"}],"field_subtitle":"","field_summary":[{"value":"\u003Cp\u003ETensor-algebra applications are made up of operations that are independently analyzed and optimized when mapping them onto custom accelerators. While this approach works well for GEMMs with good reuse, we show that several tensor-algebra applications have operations with skewed aspect ratios, which severely limit reuse and make them memory bound. Therefore, optimizing an individual kernel or tensor operation in isolation is severely limiting. This work presents architectures and mapping strategies for efficient execution that improve the speedup on tensor-algebra applications where individual operations have low or mixed reuse. We first propose a taxonomy to classify dataflows for pipelining between two operations, and a cost model, OMEGA. We then propose PipeOrgan, which extends this to a larger pipeline depth and proposes a new class of spatial organization strategies for parallel pipeline dataflows, which minimize on-chip communication from producer to consumer. 
However, generalizing inter-operation reuse on tensor-algebra applications also requires considering applications like CG with complex DAGs, and delayed downstream dependencies where traditional inter-operation pipelining does not apply. We propose SCORE, a scheduling strategy that classifies dependencies in the DAG and proposes dataflow and tiling strategies at the register file level for applications like Conjugate Gradient. However, the design space of buffer allocation strategies at the level of on-chip scratchpad explodes, especially for CG, which puts a burden on the scratchpads that need to allocate the lines in the buffer explicitly. Caches, on the other hand, have policies with a myopic line-level view. Therefore, we propose CELLO, an accelerator that uses SCORE to determine the dataflow, and CHORD, a proposed hybrid implicit\/explicit buffer mechanism for buffer allocation that reduces the burden on the compiler. Finally, we propose HARP, a taxonomy to classify emerging hierarchical and heterogeneous architectures for mixed-reuse workloads such as LLMs with high- and low-reuse operations in the same application cascade.\u003C\/p\u003E","format":"limited_html"}],"field_summary_sentence":[{"value":"Architectures and Data Orchestration for Tensor Algebra Applications with Low-reuse Operations "}],"uid":"28475","created_gmt":"2025-03-20 22:58:36","changed_gmt":"2025-03-20 22:59:15","author":"Daniela Staiculescu","boilerplate_text":"","field_publication":"","field_article_url":"","field_event_time":{"event_time_start":"2025-03-27T11:00:00-04:00","event_time_end":"2025-03-27T13:00:00-04:00","event_time_end_last":"2025-03-27T13:00:00-04:00","gmt_time_start":"2025-03-27 15:00:00","gmt_time_end":"2025-03-27 17:00:00","gmt_time_end_last":"2025-03-27 17:00:00","rrule":null,"timezone":"America\/New_York"},"location":"Room 3126, Klaus ","extras":[],"related_links":[{"url":"https:\/\/gatech.zoom.us\/j\/93986270716","title":"Zoom link"}],"groups":[{"id":"434381","name":"ECE 
Ph.D. Dissertation Defenses"}],"categories":[],"keywords":[{"id":"100811","name":"Phd Defense"},{"id":"1808","name":"graduate students"}],"core_research_areas":[],"news_room_topics":[],"event_categories":[{"id":"1788","name":"Other\/Miscellaneous"}],"invited_audience":[{"id":"78771","name":"Public"}],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[],"email":[],"slides":[],"orientation":[],"userdata":""}}}