
PhD Proposal by Ashley Edwards


Title: Perceptual Goal Specifications for Reinforcement Learning


 



Date:
Wednesday, November 22nd, 2017

Time:
1:00pm - 3:00pm (EST)


Location:
CCB 153


 

Ashley Edwards

Ph.D. Student

School of Interactive Computing

College of Computing

Georgia Institute of Technology


 

Committee:

---------------

Dr. Charles Isbell (Advisor, School of Interactive Computing, Georgia Institute of Technology)

Dr. Tucker Balch (School of Interactive Computing, Georgia Institute of Technology)

Dr. Sonia Chernova (School of Interactive Computing, Georgia Institute of Technology)

Dr. Mark Riedl (School of Interactive Computing, Georgia Institute of Technology)


 

Abstract: 

--------------- 

Rewards often act as the sole feedback for reinforcement learning problems. This signal is surprisingly powerful—it can motivate agents to solve tasks without any further guidance for how to accomplish them. Nevertheless, rewards do not come for free, and are typically hand-engineered for each problem. Furthermore, rewards are often defined as a function of an agent’s state variables. These components have traditionally been tuned to the domain and include information such as the location of the agent or other objects in the world. The reward function is thus inherently based on domain-specific representations. While such reward specifications can be sufficient to produce optimal behavior, more complex tasks may be difficult to express in this manner. Suppose a robot is tasked with building origami figures. The environment would need to provide a reward each time the robot made a correct figure, requiring the program designer to define a notion of correctness for each desired configuration. Constructing a reward function for each model might become tedious and even difficult—what should the inputs even be?

 

Humans regularly exploit learning materials outside of the physical realm of a task, be it diagrams, videos, text, or speech. For example, we might look at an image of a completed origami figure to determine whether our own model is correct. This proposal will describe similar approaches for presenting tasks to agents. In particular, I will introduce methods for specifying perceptual goals both within and outside of the agent’s environment, along with perceptual reward functions derived from these goals. This will allow us to represent goals in settings where we can more easily find or construct solutions, without requiring us to modify the reward function when the task changes. In this proposal, I aim to demonstrate that rewards derived from perceptual goal specifications are: easier to specify than task-specific reward functions; more readily generalizable across tasks; and equally effective at enabling task completion. I will validate these claims with the following contributions:

 

1) Hand-Defined Perceptual Reward Functions specified through a hand-defined similarity metric that enable intra-domain and cross-domain goal specifications (a minimal illustrative sketch follows this list).

 

2) Semi-Supervised Perceptual Reward Functions learned in a semi-supervised manner that enable cross-domain goal specifications.

 

3) Unsupervised Perceptual Reward Functions learned from videos in an unsupervised manner that enable intra-domain and cross-domain goal specifications.
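
As a rough illustration of the first contribution (a sketch only, not the method to be presented in the proposal), a hand-defined perceptual reward can be as simple as a similarity score between the agent's current camera observation and a goal image, such as a photo of the completed origami figure. The function name, the Euclidean similarity metric, and the usage line below are assumptions made purely for illustration.

import numpy as np


def perceptual_reward(observation, goal_image):
    """Illustrative perceptual reward: the negative Euclidean distance between
    the agent's current observation (an image) and a goal image, so reward
    increases as the observation comes to resemble the goal. The metric here
    is a placeholder chosen for illustration, not the proposal's method."""
    obs = np.asarray(observation, dtype=np.float32).ravel()
    goal = np.asarray(goal_image, dtype=np.float32).ravel()
    return -float(np.linalg.norm(obs - goal))


# Hypothetical usage: reward the agent for making the scene look like an
# image of the completed origami figure, with no task-specific reward code.
# reward = perceptual_reward(current_frame, goal_frame)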


 

 
