PhD Proposal by Shangjie Xue
Dear faculty members and fellow students,
You are cordially invited to attend my dissertation proposal.
Title: Actionable Scene Representation
Shangjie Xue
Computer Science PhD Student
School of Interactive Computing
Georgia Institute of Technology
Date: Friday, August 15, 2025
Time: 8:30-9:30 pm EDT
Location: Zoom (https://gatech.zoom.us/j/94505002515?pwd=7DP6PQbU1JhALeXyQQ5a9vXhGbYryN.1)
Committee:
Dr. Danfei Xu - School of Interactive Computing, Georgia Institute of Technology
Dr. Frank Dellaert - School of Interactive Computing, Georgia Institute of Technology
Dr. Panagiotis Tsiotras - School of Aerospace Engineering, Georgia Institute of Technology
Dr. Zsolt Kira - School of Interactive Computing, Georgia Institute of Technology
Abstract:
Enabling robots to operate in unknown environments is essential for their deployment in everyday human settings, yet it remains a major challenge in robotics. In such environments, robots must plan actions that both gather information about the scene and achieve designated goals. To accomplish this reliably, the robot must predict the outcome of its actions—specifically, both the future state resulting from an action and the information gained by performing it. This requires an actionable scene representation capable of modeling both state transitions and information gain.
This thesis proposes a unified framework to address this challenge, consisting of three parts:

(i) Uncertainty quantification for active mapping: Motivated by the insight that predictions in previously unseen regions are inherently unreliable, a principled framework is proposed to quantify uncertainty in 3D scene representations, particularly radiance fields. We first introduce the Neural Visibility Field (NVF), a method that quantifies uncertainty in Neural Radiance Fields (NeRF) through visibility estimation, thereby enabling active mapping. We then present the Gaussian Splatting Anisotropic Visibility Field (GAVIS), a method that achieves efficient and accurate uncertainty quantification in 3D Gaussian Splatting (3DGS) via analytical visibility modeling, further enhancing active mapping performance.

(ii) Representation for model-based manipulation: We develop model-based manipulation methods for predicting the future states resulting from robot actions. First, we introduce the Neural Field Dynamics Model, a learning-based dynamics model for granular material manipulation that represents both the end-effector and objects as density fields, enabling gradient-based trajectory optimization. We then present the Motion Foundation Model, which represents both grippers and objects as any-point tracks, enabling language-conditioned manipulation.

(iii) Unified framework for exploratory manipulation: As future work, we aim to integrate these two capabilities into a single framework that jointly predicts future states and the corresponding information gain, enabling robots to perform uncertainty-driven exploratory manipulation in unknown environments.