PhD Defense by Qingru Zhang
Title: On the Efficiency and Steerability of Self-Attention Mechanism of Large Language Models
Date: April 9th, 2025
Time: 2:00 pm – 3:30 pm (ET)
Location: Online
Zoom link: https://gatech.zoom.us/j/99605972633?pwd=sXxqHgVu2d3bj129p7kQnqadNk6Xqg.1
Qingru Zhang
Machine Learning PhD Candidate
School of Computational Science and Engineering
Georgia Institute of Technology
Committee
1. Dr. Tuo Zhao (ISYE, Georgia Tech) (Advisor)
2. Dr. Chao Zhang (CSE, Georgia Tech)
3. Dr. Anqi Wu (CSE, Georgia Tech)
4. Dr. Bo Dai (CSE, Georgia Tech)
5. Dr. Xiaodong Liu (Microsoft Research)
Abstract
Large language models (LLMs) have demonstrated exceptional performance across a wide range of real-world tasks. These models leverage the self-attention mechanism to capture intricate dependencies between tokens, resulting in precise contextual understanding. However, when handling prompts that contain long background contexts, the self-attention mechanism often faces two challenges: (1) significant memory and computational overhead when processing long sequences, and (2) difficulty in fully comprehending contexts and performing complex reasoning. In this thesis, we focus on two crucial aspects of self-attention, efficiency and steerability, and explore innovative prompting techniques to address these challenges. In the first part, we tackle the computational and memory overhead of long-sequence modeling by introducing mixed attention spans and compressing Key-Value caches, achieving near-lossless performance at significantly reduced cost. In the second part, we propose a post-hoc attention steering method that guides LLM attention to better align with contextual information and user instructions. In the final part, we present innovative prompting strategies that enhance LLM reading comprehension via steerable prompting and improve complex reasoning through a parallel decomposition approach. Together, these contributions advance the scalability, controllability, and reasoning capabilities of LLMs.
Status
- Workflow Status: Published
- Created By: Tatianna Richardson
- Created: 04/04/2025
- Modified By: Tatianna Richardson
- Modified: 04/04/2025