Student: Ulrich Viereck
Date: Monday, December 10th, 2018
Time: 9:00am – 10:30am
Location: ISEC 138
Title: Learning a visuomotor controller by planning trajectories in simulation
We want to build robots that are useful in unstructured real-world applications, such as doing work in the household. This requires reliable robotic manipulation skills, such as grasping an object, placing a cap on a bottle, or inserting a peg into a hole. However, reliable robotic manipulation in unstructured and dynamic environments is still a challenge. A closed-loop controller with visual feedback can make manipulation more reliable because it can correct for disturbances in the environment, such as object motion during grasping, or avoid collisions with unexpected objects.
Recently, deep learning has been used to learn visuomotor controllers that map raw sensor input directly to control actions, and there have been advances in learning such policies in simulation using deep Reinforcement Learning (RL). However, RL might not be the best approach for this problem. RL is less sample efficient than supervised learning: rewards are usually sparse, so learning requires reaching the goal by chance. Another challenge is perceptually aliased states, where different world states produce similar observations. If states are perceptually aliased, the true state of the world is highly uncertain, making it difficult to choose the best action.
In this thesis, I propose using full state planning to solve visuomotor learning as a supervised learning problem rather than an RL problem. A planner in simulation could be used in several ways. First, the robot could learn a policy directly from the sequences of observations and actions along trajectories generated by the planner. Second, the robot could learn a value function similar to RL, but instead of using bootstrapping to estimate the value of states, we would use the planner to compute state values. Because we are learning a value function, this approach would also allow fine-tuning the policy with RL. In addition to using a full state planner, I propose detecting perceptually aliased states and learning policies that avoid them. Perceptually aliased states could be detected by perturbing a candidate state and comparing the new observation with the original; we would then avoid these states by treating them as obstacles and planning trajectories around them.
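The first idea, cloning planner trajectories with supervised learning, and the perturbation test for aliasing can be sketched in a toy setting. Everything below is an illustrative assumption, not the thesis's actual models: a hypothetical 1-D world with the goal at the origin, a trivial full-state planner, and a linear policy standing in for a deep network.

```python
import numpy as np

def planner_action(state):
    # Hypothetical full-state planner: with access to the true state,
    # the optimal control in this toy world is to move toward the origin.
    return -np.sign(state)

def rollout(start, steps=10):
    # Generate one planner trajectory of (observation, action) pairs.
    obs, acts = [], []
    s = float(start)
    for _ in range(steps):
        a = planner_action(s)
        obs.append([s])   # observation == state in this toy example
        acts.append(a)
        s = s + a         # apply the planner's action
    return np.array(obs), np.array(acts)

# Collect a supervised dataset from several planner trajectories.
X, y = [], []
for start in [-5.0, -3.0, 2.0, 4.0]:
    o, a = rollout(start)
    X.append(o)
    y.append(a)
X, y = np.vstack(X), np.concatenate(y)

# Fit a linear policy (observation -> action) by least squares;
# this is supervised learning on planner data, not RL.
w, *_ = np.linalg.lstsq(np.c_[X, np.ones(len(X))], y, rcond=None)

def policy(obs):
    return np.sign(w[0] * obs + w[1])

def looks_aliased(render, state, delta=0.5, tol=1e-3):
    # Perturbation test: nudge the state and compare observations.
    # If the observation barely changes despite a real state change,
    # the state may be perceptually aliased.
    return abs(render(state + delta) - render(state)) < tol
```

With `render = abs` (an observation that hides the sign of the state), `looks_aliased(abs, -0.25)` flags the region near the origin where mirrored states are indistinguishable, while `looks_aliased(abs, 1.0)` does not; such flagged states could then be treated as obstacles for the planner.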
Ulrich Viereck is a Ph.D. student in Computer Science at Northeastern University’s College of Computer and Information Science, advised by Prof. Rob Platt. His research interests lie in the areas of robotic manipulation, machine learning and control. Ulrich received his Master’s and Bachelor’s in Electrical Engineering & Information Technologies from Karlsruhe Institute of Technology (KIT), Germany.
- Robert Platt, Northeastern University (advisor)
- Christopher Amato, Northeastern University
- Hanumant Singh, Northeastern University
- Kate Saenko, Boston University