PhD Proposal by Yin Li

Event Details
  • Date/Time:
    • Monday June 20, 2016
      1:00 pm - 3:00 pm
  • Location: TSRB GVU Cafe

Title: Learning Embodied Models of Actions from First Person Video

Yin Li
Computer Science Ph.D. Student
School of Interactive Computing
College of Computing
Georgia Institute of Technology

Date: Monday, June 20th, 2016
Time: 1:00pm to 3:00pm (EST)
Location: TSRB GVU Cafe
Committee:
---------------
Dr. James M. Rehg (Advisor), School of Interactive Computing, Georgia Institute of Technology 

Dr. Irfan Essa, School of Interactive Computing, Georgia Institute of Technology 

Dr. James Hays, School of Interactive Computing, Georgia Institute of Technology 

Dr. Kristen Grauman, Department of Computer Science, University of Texas at Austin

Abstract:
-----------

The development of wearable cameras and advances in computer vision make it possible, for the first time in history, to collect and analyze a large-scale record of our daily visual experiences in the form of first person videos. My thesis work focuses on the automatic analysis of these first person videos, a field known as First Person Vision (FPV). My goal is to develop novel embodied representations for understanding the camera wearer's actions by leveraging first person visual cues derived from first person videos, including body motion, hand locations, and gaze. This "embodied" representation differs from traditional visual representations in that it derives from the purposive body movements of the first person and captures the concept of objects within the context of actions.

 

By considering actions as intentional body movements, I propose to investigate three important aspects of first person actions. First, I present a method to estimate egocentric gaze, which reveals the visual trajectory of an action. Our work demonstrates for the first time that egocentric gaze can be reliably estimated using only head motion and hand locations derived from first person video, without the need for object or action information. Second, I develop a method for first person action recognition. Our work demonstrates that an embodied representation combining egocentric cues and visual cues can inform the location of actions and significantly improve recognition accuracy. Finally, I propose the novel task of object interaction prediction, which aims to uncover the plan of a future object manipulation and thus explain purposive motions. I will develop novel learning schemes for this task and use it to learn an embodied object representation.

 
