From Computational Learning and Motor Control Lab

Research: Reinforcement Learning

Reinforcement Learning for Robotics and Computational Motor Control

While supervised statistical learning techniques have significant applications in model learning and imitation learning, they do not suffice for all motor learning problems, particularly when no expert teacher or idealized desired behavior is available. Thus, both robotics and the understanding of human motor control require self-improvement driven by reward (or cost). The development of efficient reinforcement learning methods is therefore essential for the success of learning in motor control.

However, reinforcement learning in high-dimensional spaces, such as those of manipulator and humanoid robotics, is extremely difficult: complete exploration of the underlying state-action space is impossible, and few existing techniques scale to this domain.
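To give a rough sense of scale, the following back-of-the-envelope sketch counts the cells of a uniform grid over a state-action space. The 7-joint arm, the 10 bins per dimension, and the visit rate of 1000 cells per second are illustrative assumptions, not figures from our experiments.

```python
def grid_cells(bins_per_dimension: int, dimensions: int) -> int:
    """Number of cells in a uniform grid over a continuous space."""
    return bins_per_dimension ** dimensions


# Hypothetical 7-joint arm: 7 joint positions + 7 joint velocities (state)
# and 7 joint torques (action). All numbers here are assumptions.
state_dims, action_dims = 14, 7
cells = grid_cells(10, state_dims + action_dims)

# Time needed to visit every cell once at an assumed 1000 visits per second.
seconds_per_year = 365 * 24 * 3600
years_at_1khz = cells / (1000 * seconds_per_year)

print(f"{cells:.1e} state-action cells, "
      f"about {years_at_1khz:.1e} years at 1000 visits per second")
```

Even this coarse discretization is far beyond what any robot could visit, which is why exhaustive exploration is not an option in this domain.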

Nevertheless, humans evidently do not need such extensive exploration in order to learn new motor skills; instead, they rely on a combination of watching a teacher and subsequent self-improvement. In more technical terms: first, a control policy is obtained by imitation, and then it is improved using reinforcement learning. It is essential that only local policy search techniques, e.g., policy gradient methods, are applied, as a rapid change to the policy could result in complete unlearning of the policy and might also produce an unstable control policy that can damage the robot.
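The two-stage scheme can be illustrated with a minimal sketch on a toy one-dimensional task. Everything in it is an assumption made for illustration: the Gaussian policy with a single mean parameter, the quadratic reward, the synthetic teacher demonstrations, and the names `imitate`, `reward`, and `policy_gradient_step`. It uses a plain likelihood-ratio (REINFORCE-style) gradient with a clipped step to keep each update local; it is not the algorithm used on our robots.

```python
import random


def imitate(demonstrations):
    """Stage 1: fit the mean of a Gaussian policy to the teacher's actions."""
    return sum(demonstrations) / len(demonstrations)


def reward(action, target=1.0):
    """Toy reward (assumption): negative squared distance to a target action."""
    return -(action - target) ** 2


def policy_gradient_step(theta, rng, sigma=0.2, n_rollouts=200,
                         step_size=0.05, max_step=0.05):
    """Stage 2: one local improvement step with a likelihood-ratio gradient."""
    actions = [rng.gauss(theta, sigma) for _ in range(n_rollouts)]
    rewards = [reward(a) for a in actions]
    baseline = sum(rewards) / n_rollouts  # reduces the variance of the estimate
    # d/dtheta log N(a | theta, sigma^2) = (a - theta) / sigma^2
    grad = sum((r - baseline) * (a - theta) / sigma ** 2
               for a, r in zip(actions, rewards)) / n_rollouts
    # Clip the update so the policy only ever changes by a small amount.
    step = max(-max_step, min(max_step, step_size * grad))
    return theta + step


rng = random.Random(0)
# Synthetic teacher (assumption): demonstrations that are good but not optimal.
demos = [0.5 + rng.gauss(0.0, 0.05) for _ in range(20)]

theta_imitation = imitate(demos)
theta = theta_imitation
for _ in range(100):
    theta = policy_gradient_step(theta, rng)

print(f"after imitation: {theta_imitation:.3f}, after improvement: {theta:.3f}")
```

The clipping in `policy_gradient_step` is one simple way to enforce locality: no single update can move the policy mean by more than `max_step`, so the behavior acquired by imitation is refined gradually rather than overwritten.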

In order to bring reinforcement learning to robotics and computational motor control, we have developed a variety of novel reinforcement learning algorithms, such as the Natural Actor-Critic and the Episodic Natural Actor-Critic. These methods are particularly well-suited for policies based upon motor primitives and are being applied to motor skill learning in humanoid robotics and legged locomotion.
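The idea behind the episodic variant can be sketched for a single policy parameter. The sketch below is a simplified illustration under stated assumptions (one-step episodes, a Gaussian policy with fixed standard deviation, a toy quadratic return, and the hypothetical function name `natural_gradient_estimate`); it is not the published algorithm. It regresses episode returns on the log-likelihood gradient of each episode plus a constant offset: the slope serves as the natural gradient estimate, and the offset plays the role of a baseline.

```python
import random


def natural_gradient_estimate(theta, rng, sigma=0.2, n_episodes=2000,
                              target=1.0):
    """Least-squares slope of returns on log-likelihood gradients."""
    psi = []      # d/dtheta log pi(a | theta) for each episode
    returns = []  # return of each (one-step) episode
    for _ in range(n_episodes):
        a = rng.gauss(theta, sigma)
        psi.append((a - theta) / sigma ** 2)
        returns.append(-(a - target) ** 2)  # toy return (assumption)

    mean_psi = sum(psi) / n_episodes
    mean_ret = sum(returns) / n_episodes
    cov = sum((p - mean_psi) * (r - mean_ret)
              for p, r in zip(psi, returns)) / n_episodes
    var = sum((p - mean_psi) ** 2 for p in psi) / n_episodes
    # Simple linear regression with an intercept: slope = cov / var.
    return cov / var


w = natural_gradient_estimate(theta=0.5, rng=random.Random(0))
print(f"estimated natural gradient: {w:.4f}")
```

For this toy problem the vanilla gradient at `theta = 0.5` is `-2 * (theta - target) = 1.0` and the Fisher information of the policy mean is `1 / sigma**2`, so the estimate should lie close to `sigma**2 * 1.0 = 0.04`. The rescaling by the inverse Fisher information is what distinguishes the natural gradient from the vanilla one.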

Contact persons: Jan Peters, Stefan Schaal

Page last modified on September 06, 2008, at 05:13 PM