Ph.D. Dissertation Defense Announcement: Sixu Li
Title: The Algorithm-Hardware Specialization Spectrum of Multi-Stage Intelligence Pipelines: From Dedicated Accelerators to Heterogeneous Systems
Date: Tuesday, May 19, 2026
Time: 4PM to 6PM Eastern Time
Location: Klaus 2100
Virtual Meeting: https://gatech.zoom.us/my/li.sixu
Committee members:
Dr. Yingyan (Celine) Lin, College of Computing, Georgia Institute of Technology
Dr. Josiah Hester, College of Computing, Georgia Institute of Technology
Dr. Hyesoon Kim, College of Computing, Georgia Institute of Technology
Dr. Tushar Krishna, College of Computing, Georgia Institute of Technology
Dr. Thierry Tambe, Department of Electrical Engineering, Stanford University
Abstract:
This dissertation develops a hardware-centric specialization framework for multi-stage intelligence pipelines, using 3D intelligence as the primary study domain. Pipelines in this domain span perception and reconstruction, rendering, and high-level reasoning; their stages differ fundamentally in computational regularity, memory behavior, and control-flow dynamics, and therefore fit current hardware to very different degrees.
We characterize this heterogeneity through algorithmic entropy, a hardware-oriented measure of execution unpredictability that decomposes into two orthogonal axes: intra-operator entropy (X, datapath irregularity) and inter-operator entropy (Y, scheduling unpredictability), quantified via GPU profiling across nine representative workloads. The resulting two-dimensional space partitions workloads into quadrants, each mapping to a substrate along a hierarchical specialization spectrum:
- Q1 (low X, low Y) → dedicated ASIC
- Q2 (high X, low–moderate Y) → enhanced fixed-function GPU
- Q3 (low X, high Y) → heterogeneous GPU-PIM
- Q4 (high X, high Y) → commodity GPU/CPU baseline
Three systems instantiate this principle, one per populated quadrant:
- Fusion-3D (Q1): a dedicated 3D-reconstruction ASIC with hierarchical spatial tiling and a unified on-chip pipeline, extended to multi-chip for large scenes. Achieves 2.5× / 6× throughput over prior accelerators in reconstruction and inference, validated on a silicon prototype.
- GauRast (Q2): a 3D Gaussian Splatting rasterizer that extends the existing GPU fixed-function rasterizer with lightweight neural-rendering operations. Achieves 6× / 4× end-to-end speedup on original / optimized 3DGS pipelines at ≤0.2% SoC area overhead.
- ORCHES (Q3): a GPU-PIM heterogeneous system for test-time-compute (TTC) LLM/VLM reasoning, combining adaptive workload assignment, branch-prediction-guided pipelining, and fragmentation-aware memory structuring. Achieves 4.16× / 3.10× end-to-end speedup on text / vision reasoning over SOTA GPU baselines.
Across the three quadrants, profile-guided substrate selection delivers substantial speedups for 3D intelligence workloads while specializing only the axis where existing hardware fails. Beyond 3D intelligence, we anticipate the same (X, Y) methodology serving as a starting point for substrate selection in other multi-stage, heterogeneous-workload domains.
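The quadrant-to-substrate mapping described in the abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation: the abstract gives no numeric thresholds or scoring formulas, so the sketch assumes X and Y are already normalized to [0, 1] and split at a single hypothetical cut-off of 0.5. A single cut-off also simplifies Q2's "low–moderate Y" range to "below the cut-off". The names `EntropyProfile`, `quadrant`, `select_substrate`, and `HIGH_THRESHOLD`, and the example scores, are all illustrative.

```python
from dataclasses import dataclass

# Assumption: scores are normalized to [0, 1]; the abstract states no
# thresholds, so this cut-off is purely illustrative.
HIGH_THRESHOLD = 0.5

# Quadrant-to-substrate mapping as listed in the abstract.
SUBSTRATE_BY_QUADRANT = {
    "Q1": "dedicated ASIC",
    "Q2": "enhanced fixed-function GPU",
    "Q3": "heterogeneous GPU-PIM",
    "Q4": "commodity GPU/CPU baseline",
}


@dataclass(frozen=True)
class EntropyProfile:
    """Hypothetical container for one workload's profiled scores."""
    name: str
    x: float  # intra-operator entropy (datapath irregularity)
    y: float  # inter-operator entropy (scheduling unpredictability)


def quadrant(profile: EntropyProfile, threshold: float = HIGH_THRESHOLD) -> str:
    """Classify a workload into Q1..Q4 using one cut-off per axis."""
    high_x = profile.x >= threshold
    high_y = profile.y >= threshold
    if high_x and high_y:
        return "Q4"
    if high_x:
        return "Q2"
    if high_y:
        return "Q3"
    return "Q1"


def select_substrate(profile: EntropyProfile, threshold: float = HIGH_THRESHOLD) -> str:
    """Look up the substrate the abstract associates with the quadrant."""
    return SUBSTRATE_BY_QUADRANT[quadrant(profile, threshold)]


if __name__ == "__main__":
    # Made-up scores, not measurements from the dissertation.
    examples = [
        EntropyProfile("workload_a", x=0.1, y=0.2),
        EntropyProfile("workload_b", x=0.9, y=0.2),
        EntropyProfile("workload_c", x=0.1, y=0.8),
        EntropyProfile("workload_d", x=0.9, y=0.9),
    ]
    for p in examples:
        print(f"{p.name}: {quadrant(p)} -> {select_substrate(p)}")
```

In practice the cut-offs would come from the profiling data rather than a fixed constant, and Q2's "low–moderate" band suggests the Y axis may need more than two levels.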
Status
- Workflow status: Published
- Created by: Tatianna Richardson
- Created: 05/07/2026
- Modified By: Tatianna Richardson
- Modified: 05/07/2026