{"683142":{"#nid":"683142","#data":{"type":"event","title":"PhD Proposal by Ali Hassani","body":[{"value":"\u003Cp\u003E\u003Cstrong\u003ETitle:\u003C\/strong\u003E \u003Cstrong\u003ENeighborhood Attention: Reducing the O(n^2) complexity of Attention at the threadblock level\u003C\/strong\u003E\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EAli Hassani\u003C\/strong\u003E\u003C\/p\u003E\u003Cp\u003EPh.D. Student in Computer Science\u003C\/p\u003E\u003Cp\u003ESchool of Interactive Computing\u003C\/p\u003E\u003Cp\u003EGeorgia Institute of Technology\u003C\/p\u003E\u003Cp\u003E\u003Ca href=\u0022https:\/\/alihassanijr.com\u0022 title=\u0022https:\/\/alihassanijr.com\u0022\u003Ealihassanijr.com\u003C\/a\u003E\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EDate \u0026amp; Time:\u003C\/strong\u003E\u0026nbsp;Thursday 7\/31\/2025 12:00 PM - 2:00 PM Eastern Time\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003ELocation:\u003C\/strong\u003E\u0026nbsp;Coda C1103 Lindberg + \u003Ca href=\u0022https:\/\/gatech.zoom.us\/j\/99422563124?pwd=kSII1Cab0ooku6rpPtf2hR5Uoylb9O.1\u0022 title=\u0022https:\/\/gatech.zoom.us\/j\/99422563124?pwd=kSII1Cab0ooku6rpPtf2hR5Uoylb9O.1\u0022\u003EZoom Meeting\u003C\/a\u003E\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003ECommittee:\u003C\/strong\u003E\u003C\/p\u003E\u003Cp\u003EDr. Humphrey Shi (Advisor)\u0026nbsp;- School of Interactive Computing, Georgia Institute of Technology\u003C\/p\u003E\u003Cp\u003EDr. Kartik Goyal\u0026nbsp;- School of Interactive Computing, Georgia Institute of Technology\u003C\/p\u003E\u003Cp\u003EDr. Judy Hoffman\u0026nbsp;- School of Interactive Computing, Georgia Institute of Technology\u003C\/p\u003E\u003Cp\u003EDr. 
Wen-mei Hwu - Electrical \u0026amp; Computer Engineering, University of Illinois at Urbana-Champaign.\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EAbstract:\u003C\/strong\u003E\u003C\/p\u003E\u003Cp\u003EAttention is at the heart of most foundational AI models, across tasks and modalities. In many of those cases, it incurs a significant amount of computation, which is quadratic in complexity and is often cited as one of its greatest limitations. As a result, many approaches have been proposed to alleviate this issue, with one of the most common being a masked or reduced attention span.\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EIn this work, we revisit sliding window approaches, which were commonly believed to be inherently inefficient, and we propose a new framework called Neighborhood Attention (NA). Through it, we solve design flaws in the original sliding window attention works, attempt to implement the approach efficiently for modern hardware accelerators, specifically GPUs, and conduct experiments that highlight the strengths and weaknesses of these approaches.\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EAt the same time, we bridge the parameterization and properties of Convolution and Attention by showing that NA exhibits inductive biases and receptive fields similar to those of convolutions, while still being capable of capturing inter-dependencies, both short and long range, similar to attention.\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EWe then show the necessity of, and the challenges that arise from, infrastructure, especially in the context of modern implementations such as Flash Attention, and develop even more efficient and performance-optimized implementations for NA. 
Through these implementations, we achieve orders of magnitude improvement over naive implementations, and up to 2X improvement in inference and 1.4X improvement in training time.\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003EWe finally show the limitations of the existing methodology, and outline research topics that can address them. All of our work is open sourced through the \u003Ca href=\u0022https:\/\/natten.org\u0022 title=\u0022https:\/\/natten.org\u0022\u003ENATTEN project\u003C\/a\u003E.\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EMeeting Details:\u003C\/strong\u003E\u003C\/p\u003E\u003Cp\u003EJoin Zoom Meeting\u003C\/p\u003E\u003Cp\u003E\u003Ca href=\u0022https:\/\/gatech.zoom.us\/j\/99422563124?pwd=kSII1Cab0ooku6rpPtf2hR5Uoylb9O.1\u0022 title=\u0022https:\/\/gatech.zoom.us\/j\/99422563124?pwd=kSII1Cab0ooku6rpPtf2hR5Uoylb9O.1\u0022\u003Ehttps:\/\/gatech.zoom.us\/j\/99422563124?pwd=kSII1Cab0ooku6rpPtf2hR5Uoylb9O.1\u003C\/a\u003E\u003C\/p\u003E\u003Cp\u003EMeeting ID: 994 2256 3124\u003C\/p\u003E\u003Cp\u003EPasscode: 435466\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E","summary":"","format":"limited_html"}],"field_subtitle":"","field_summary":[{"value":"\u003Cp\u003E\u003Cstrong\u003ENeighborhood Attention: Reducing the O(n^2) complexity of Attention at the threadblock level\u003C\/strong\u003E\u003C\/p\u003E","format":"limited_html"}],"field_summary_sentence":[{"value":"Neighborhood Attention: Reducing the O(n^2) complexity of Attention at the threadblock level"}],"uid":"27707","created_gmt":"2025-07-15 17:59:32","changed_gmt":"2025-07-15 18:00:02","author":"Tatianna Richardson","boilerplate_text":"","field_publication":"","field_article_url":"","field_event_time":{"event_time_start":"2025-07-31T12:00:00-04:00","event_time_end":"2025-07-31T14:00:00-04:00","event_time_end_last":"2025-07-31T14:00:00-04:00","gmt_time_start":"2025-07-31 
16:00:00","gmt_time_end":"2025-07-31 18:00:00","gmt_time_end_last":"2025-07-31 18:00:00","rrule":null,"timezone":"America\/New_York"},"location":"Coda C1103 Lindberg + Zoom Meeting","extras":[],"groups":[{"id":"221981","name":"Graduate Studies"}],"categories":[],"keywords":[{"id":"102851","name":"Phd proposal"}],"core_research_areas":[],"news_room_topics":[],"event_categories":[{"id":"1788","name":"Other\/Miscellaneous"}],"invited_audience":[{"id":"78771","name":"Public"}],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[],"email":[],"slides":[],"orientation":[],"userdata":""}}}