Ph.D. Dissertation Defense Announcement: Sixu Li

Title: The Algorithm-Hardware Specialization Spectrum of Multi-Stage Intelligence Pipelines: From Dedicated Accelerators to Heterogeneous Systems

 

Date: Tuesday, May 19, 2026

Time: 4PM to 6PM Eastern Time

Location: Klaus 2100

Virtual Meeting: https://gatech.zoom.us/my/li.sixu 

 

Committee members:

Dr. Yingyan (Celine) Lin, College of Computing, Georgia Institute of Technology

Dr. Josiah Hester, College of Computing, Georgia Institute of Technology

Dr. Hyesoon Kim, College of Computing, Georgia Institute of Technology

Dr. Tushar Krishna, College of Computing, Georgia Institute of Technology

Dr. Thierry Tambe, Department of Electrical Engineering, Stanford University

 

Abstract: 

This dissertation develops a hardware-centric specialization framework for multi-stage intelligence pipelines, using 3D intelligence as the primary study domain. Pipelines in this domain span perception and reconstruction, rendering, and high-level reasoning; their stages differ fundamentally in computational regularity, memory behavior, and control-flow dynamics, and therefore interact with current hardware to very different degrees.

We characterize this heterogeneity through algorithmic entropy, a hardware-oriented measure of execution unpredictability that decomposes into two orthogonal axes: intra-operator entropy (X, datapath irregularity) and inter-operator entropy (Y, scheduling unpredictability), quantified via GPU profiling across nine representative workloads. The resulting two-dimensional space partitions workloads into quadrants, each mapping to a substrate along a hierarchical specialization spectrum: Q1 (low X, low Y) → dedicated ASIC; Q2 (high X, low–moderate Y) → enhanced fixed-function GPU; Q3 (low X, high Y) → heterogeneous GPU-PIM; Q4 (high on both axes) → commodity GPU/CPU baseline. Three systems instantiate this principle, one per populated quadrant:

Fusion-3D (Q1): a dedicated 3D-reconstruction ASIC with hierarchical spatial tiling and a unified on-chip pipeline, extended to a multi-chip configuration for large scenes. Achieves 2.5× / 6× the throughput of prior accelerators in reconstruction / inference, validated on a silicon prototype.

GauRast (Q2): a 3D Gaussian Splatting rasterizer that extends the existing GPU fixed-function rasterizer with lightweight neural-rendering operations. Achieves 6× / 4× end-to-end speedup on original / optimized 3DGS pipelines at ≤0.2% SoC area overhead.

ORCHES (Q3): a GPU-PIM heterogeneous system for test-time-compute (TTC) LLM/VLM reasoning, combining adaptive workload assignment, branch-prediction-guided pipelining, and fragmentation-aware memory structuring. Achieves 4.16× / 3.10× end-to-end speedup on text / vision reasoning over SOTA GPU baselines.

Across the three quadrants, profile-guided substrate selection delivers substantial speedups for 3D intelligence workloads while specializing only the axis where existing hardware fails. Beyond 3D intelligence, we anticipate the same (X, Y) methodology serving as a starting point for substrate selection in other multi-stage, heterogeneous-workload domains.
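
As an illustration only, the quadrant-to-substrate mapping described in the abstract can be sketched as a small lookup. The abstract does not state how X and Y are scaled or where the low/high boundaries lie, so the normalized [0, 1] scores, the 0.5 cut-offs, and the names `Substrate` and `select_substrate` are all hypothetical, chosen here just to make the mapping concrete.

```python
from enum import Enum


class Substrate(Enum):
    # Quadrant labels and substrates as listed in the abstract.
    DEDICATED_ASIC = "Q1: dedicated ASIC"
    ENHANCED_GPU = "Q2: enhanced fixed-function GPU"
    GPU_PIM = "Q3: heterogeneous GPU-PIM"
    COMMODITY = "Q4: commodity GPU/CPU baseline"


def select_substrate(x: float, y: float,
                     x_high: float = 0.5, y_high: float = 0.5) -> Substrate:
    """Map intra-operator (x) and inter-operator (y) entropy to a substrate.

    x_high and y_high are hypothetical cut-offs on assumed normalized scores;
    they are not values taken from the dissertation.
    """
    high_x = x >= x_high
    high_y = y >= y_high
    if not high_x and not high_y:
        return Substrate.DEDICATED_ASIC   # Q1: low X, low Y
    if high_x and not high_y:
        return Substrate.ENHANCED_GPU     # Q2: high X, low-to-moderate Y
    if not high_x and high_y:
        return Substrate.GPU_PIM          # Q3: low X, high Y
    return Substrate.COMMODITY            # Q4: high on both axes


if __name__ == "__main__":
    # Made-up scores, one per quadrant, purely to exercise the mapping.
    for scores in [(0.1, 0.2), (0.8, 0.3), (0.2, 0.9), (0.9, 0.9)]:
        print(scores, "->", select_substrate(*scores).value)
```

A single threshold per axis is the simplest reading of the four quadrants; the actual profiling-based criteria may well be finer-grained.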

Status

  • Workflow status: Published
  • Created by: Tatianna Richardson
  • Created: 05/07/2026
  • Modified by: Tatianna Richardson
  • Modified: 05/07/2026
