R:SS Workshop: Bridging the gap between high-level discrete representations and low-level continuous behaviors
Date: June 28, 2009
Location: University of Washington Electrical Engineering Building, Room EEB045
Registration Information
Organizers
Dana Kulic (University of Waterloo), dkulic@ece.uwaterloo.ca
Pieter Abbeel (University of California at Berkeley), pabbeel@cs.berkeley.edu
Jan Peters (Max Planck Institute for Biological Cybernetics), mail@jan-peters.net
Objectives and Topics
Recently, robotics researchers have been investigating the modeling of human and robot behavior in terms of motion primitives. This research direction, based on biological and neuroscience findings, posits that human behavior is composed of motor primitive units, which can be acquired by a robot through imitation learning or practice. Motion primitives offer an approach for discretizing continuous behavior, representing a "bottom-up" approach for organizing robot behavior. On the other hand, in AI and planning fields, there has been a longstanding area of research in planning and acting in the discrete domain, or through modeling changes in the world as an instantaneous change in discrete state. This approach can be thought of as a "top-down" approach for organizing robot behavior. In this workshop, we propose to bring together researchers from both areas to discuss approaches for "bridging the gap" and combining continuous domain approaches with discrete representations.
The aim of the workshop is to bring together researchers working on motion primitives as a way of discretizing continuous behavior, and discuss ways in which these approaches can be extended through hierarchical organization and combined with planning and other discrete domain approaches. Specific themes of the workshop include:
- motion primitive representations and task abstractions
- learning and parsing sequences and plans of motion primitives
- imitation learning and learning from observation based on motion primitives
- hierarchical reinforcement learning
- apprenticeship learning of composed tasks
- hybrid task control
- hierarchical organization of behaviors
- learning operator conditions for primitives
- plan recognition
- plan generation and modification
The workshop will include talks by a number of the top researchers in the field, who will articulate current approaches and the progress to date. A key goal of the workshop is to provide a venue to allow discussion on how current approaches may be combined to integrate the capability for acting in the continuous domain while reasoning in the discrete domain.
The workshop is supported by the Technical Committee on Robot Learning of the IEEE Robotics and Automation Society.
Workshop Schedule
Session 1 (9:00 - 10:40)
9:00 – 9:20
Dana Kulic
This talk will present our recent work on designing humanoid robots capable of continuous, on-line learning through observation of human movement. Learning behavior and motion primitives from observation is a key skill for humanoid robots, enabling humanoids to take advantage of their similar body structure to humans. First, approaches for designing the appropriate motion representation and abstraction will be discussed. Next, an approach for on-line, incremental learning of whole body motion primitives and primitive sequencing from observation of human motion will be described. The talk will conclude with an overview of preliminary experimental results and a discussion of future research directions.
9:20 – 9:40
Aude Billard
Low-level encoding of motions in autonomous dynamical systems (DS) is advantageous in that it gives a compact, generic and time-independent representation of the motion, robust to spatial and temporal perturbations. Ensuring that the resulting DS is stable remains a major issue, largely overlooked so far. This talk will present recent efforts of ours to develop a method to model non-linear dynamics of motion through mixtures of Gaussians, while ensuring that the resulting mixture generates stable trajectories at the desired attractor. It will conclude with a brief overview of potential extensions of this approach for learning sequences of motions through hierarchical Hidden Markov Models, thus offering a first step toward a higher-level representation of motion.
9:40 – 10:00
Jun Morimoto
Hierarchical Learning Architectures for Motor Control: Application to Learning Stand-up and Walking Behaviors
We introduce our attempts to use hierarchical learning architectures for motor control tasks. We apply our proposed methods to a discrete stand-up movement and a periodic walking behavior. In general, naive application of non-linear optimal control approaches to robots that have many degrees of freedom does not work, since we need to consider a high-dimensional state space. Therefore, hierarchical learning methods have been proposed to deal with such a high-dimensional state space. In our study, we propose a two-layered hierarchical reinforcement learning method for learning discrete movements. In the lower layer, a continuous state-action space is considered to learn a low-level controller that outputs joint torque. In the upper layer, a discrete state-action space is considered to learn a policy that outputs sub-goals for the lower-layer learning system. The proposed hierarchical learning architecture was applied to a 3-link robot that is not fixed to the ground. The real 3-link robot successfully learned to stand up within a reasonable number of learning trials. On the other hand, for learning periodic movements, using Poincare maps to represent upper-layer dynamics can be a natural approach. In our study, we pre-design a lower-layer periodic pattern generator and extract a low-dimensional, task-relevant feature space at a Poincare section for the upper layer to improve policies that modulate the lower-layer pattern generator. The proposed learning method was applied to a humanoid robot. The humanoid robot could improve stepping and walking behaviors by using the proposed learning method.
10:00 – 10:20
Michael Pardowitz
In robot learning systems, the problem of task segmentation and task decomposition has not received sufficient attention. In this talk I will outline a method relying on psychological gestalt theories originally developed for visual perception and apply it to the domain of action segmentation. I will propose a computational model for gestalt-based segmentation called the Competitive Layer Model (CLM). The CLM relies on features mutually supporting or inhibiting each other to form segments by competition. I will outline how gestalt laws for actions can be learned from human demonstrations and how they can be beneficial to the CLM segmentation method. Finally, I will present an outlook on how the road from action segmentation might continue towards everyday robot task learning systems.
10:20 – 10:40
Chad Jenkins
When given a robot platform, the question arises about how users will program autonomous controllers for accomplishing various tasks of interest. Robot learning from demonstration (LfD) has emerged in recent years as a highly viable direction for developing robot controllers implicitly from human demonstration, in contrast to programs explicitly coded by hand. With LfD in mind, we have developed nonparametric methods based on pairwise similarity kernels for learning primitives directly from demonstration data. This work includes uncovering human motion primitives using spatio-temporal "registration kernels" with dimension reduction and decision making primitives with Infinite Mixtures of Gaussian Process Experts. While successful for learning primitives, the question still remains whether learning or manual coding will become "the path of least resistance" for developing robot controllers. Revisiting our initial proposition, I argue that learning still has several distinct challenges towards becoming a viable alternative. These challenges center around the various forms of finite state machines that can be learned from demonstration (or human guidance) and how these approaches can unify.
Coffee Break (10:40 - 11:00)
Session 2 (11:00 - 12:30)
11:00 – 11:20
Michael Beetz
Towards Automated Models of Robot and Human Everyday Manipulation Activities
I propose automated probabilistic models of everyday activities (AM-EvA) as a novel technical means for the perception, interpretation, and analysis of everyday manipulation tasks and activities of daily life. AM-EvAs are based on action-related concepts in everyday activities such as action-related places (the place where cups are taken from the cupboard), capabilities (the objects that can be picked up single-handedly), etc. These concepts are probabilistically derived from a set of previous activities that are fully and automatically observed by computer vision and additional sensor systems or through a robot collecting experiences. AM-EvA models enable robots to analyze and reason about activities in situation and activity contexts. They make the classification and assessment of actions and situations objective and can justify the probabilistic interpretation with respect to the activities the concepts have been learned from. I describe the current state of implementation of the AM-EvA system that realizes this idea of automated models of everyday activities and show example results from the observation and analysis of humans and robots setting a table.
11:20 – 11:40
Danica Kragic
From Scenes to Concepts: Active Vision for Detecting, Fixating, Manipulating Objects and Learning of Human Actions
The ability to autonomously acquire new knowledge through interaction with the environment is one of the major research goals in the field of robotics. The knowledge can be acquired only if suitable perception-action capabilities are present. In other words, a robotic system has to be able to detect, attend to and manipulate objects in the environment. In the first part of the talk, we present the results of our long-term work in the area of vision-based sensing and control. The work on finding, attending to, recognizing and manipulating objects in domestic environments is discussed. More precisely, we present a stereo-based active vision system framework in which aspects of top-down and bottom-up attention and foveated attention are put into focus, and demonstrate how the system can be utilized for object grasping.
The second part of the talk presents our work on the visual analysis of human manipulation actions which are of interest for e.g. human-robot interaction applications where a robot learns how to perform a task by watching a human. A method for classifying manipulation actions in the context of the objects manipulated, and classifying objects in the context of the actions used to manipulate them is presented. The action-object correlation over time is then modeled using conditional random fields. Experimental comparison shows improvement in classification rate when the action-object correlation is taken into account, compared to separate classification of manipulation actions and manipulated objects.
11:40 – 12:00
Tetsunari Inamura
We have previously proposed a mimesis model that can recognize, generate, abstract and imitate others' behavior based on the concept of mirror neuron systems. The model abstracts others' motion patterns and links them to a primitive symbol representation based on self-body configuration. It, however, doesn't take into account the structural difference between the self and the other. Furthermore, unobservable inner sensory information such as torque cannot be treated.
In this study, we utilize symbol communication to solve these problems and to develop an adaptive acquisition method that estimates others' sensor patterns, which is an implementation of the bridge between the self and the other's sensorimotor patterns. Through experiments on virtual humanoid robots, we confirmed the feasibility of this symbol-communication-based estimation method.
12:00 – 12:20
Tamim Asfour
Cognitive robots that are able to learn to operate in the real world and to interact and communicate with humans must model and reflectively reason about their perceptions and actions in order to learn, act, predict and react appropriately. Such understanding can only be attained by embodied agents and requires the simultaneous consideration of perception and action.
In this talk, we will present the concept of Object-Action Complexes (OACs), which has been introduced by the European project PACO-PLUS (www.paco-plus.org) to emphasize the notion that objects and actions are inseparably intertwined and that categories are therefore determined (and also limited) by the actions a cognitive agent can perform and by the attributes of the world it can perceive. Entities (“things”) in the world of a robot (or human) will only become semantically useful objects through the actions that the agent can/will perform on them.
Object-Action Complexes (OACs) are proposed as a universal representation enabling efficient planning and execution of purposeful action at all levels of the cognitive architecture. OACs combine the representational and computational efficiency for purposes of search (the frame problem) of STRIPS rules (Fikes 1971) and the object- and situation-oriented concept of affordance (Gibson 1950, Sahin 2007) with the logical clarity of the event calculus (Kowalski et al. 1986, Steedman 2002). Affordance is the relation between a situation, usually including an object of a defined type, and the actions that it allows. While affordances have mostly been analyzed in their purely perceptual aspect, the OAC concept defines them more generally as state-transition functions suited to prediction. Such functions can be used for efficient forward-chaining planning, learning, and execution of actions represented simultaneously at multiple levels in an embodied agent architecture.
12:20 – 12:30
Poster Teasers
Lunch (12:30 - 14:00)
Session 3 (14:00 - 15:40)
14:00 – 14:20
Jan Peters
Bridging the GAP with Motor Primitives
Dynamic systems-based Motor Primitives (DMP) have proven a useful tool for creating low-level discrete and rhythmic behaviors. Imitation and Reinforcement Learning with DMPs have been thoroughly researched and efficient methods have recently been introduced. In this talk, we focus on presenting (i) how such DMPs can become increasingly context-sensitive using goal learning, and (ii) how such DMPs can be selected based on context information. We show both published and preliminary results.
14:20 – 14:40
Dieter Fox
Over the last decade, the robotics community has developed highly efficient and robust solutions to state estimation problems such as robot localization, people tracking, and map building. With the availability of various techniques for spatially consistent sensor integration, an important next goal is to enable robots to reason about the many objects located in our everyday environments and to reason about spatial concepts such as rooms, hallways, streets, and intersections. An additional requirement for successful operation in populated environments is the ability to recognize the intent of humans and to adapt to their behavior patterns.
Extracting the necessary information from raw sensor data requires bridging the gap between continuous sensor data and discrete, high-level concepts. In this talk I will present some work using graphical models and machine learning techniques to extract such information from sensor data. Examples include place and object recognition from vision and laser data, and human activity recognition from wearable sensor data.
14:40 – 15:00
Rod Grupen
TBD
15:00 – 15:20
Zico Kolter
High-level Control Using Multiple Inaccurate Models, with Application to Extreme Driving
We consider the problem of controlling a complex dynamical system, such as skidding a car sideways into a parking spot; in this task and many others, it can be extremely difficult to accurately model the system in certain regions of the state space. In this work, we present a high-level control method that uses a probabilistic combination of multiple inaccurate models in order to control such systems. In particular, we show that by combining 1) a "nominal" model that may be highly inaccurate in certain regions and 2) a model that captures only the observed behavior of an expert demonstration, we can accurately control a system without the need to explicitly model the most complex regions. We apply this algorithm to the aforementioned task of skidding a car sideways into a parking spot, and demonstrate state-of-the-art performance on this highly challenging control task.
15:20 – 15:40
Pieter Abbeel
We consider the problem of learning to follow a desired trajectory when given a small number of demonstrations from a sub-optimal expert. We present an algorithm that (i) extracts the initially unknown desired trajectory from the sub-optimal expert’s demonstrations and (ii) learns a local model suitable for control along the learned trajectory. We apply our algorithm to the problem of autonomous helicopter flight. In all cases, the autonomous helicopter’s performance exceeds that of our expert helicopter pilot’s demonstrations. Moreover, our results significantly extend the state-of-the-art in autonomous helicopter aerobatics. In particular, our results include the first autonomous tic-tocs, loops and hurricane, vastly superior performance on previously performed aerobatic maneuvers (such as in-place flips and rolls), and a complete airshow, which requires autonomous transitions between these and various other maneuvers.
Coffee Break (15:40 - 16:00)
Session 4 (16:00 - 17:40)
16:00 – 16:20
Volker Krueger
From observations to actions
In this talk we will discuss how we can construct action grammars from observations and use them to recognize the actions of humans. The actions we are considering are, e.g., pointing at an object, reaching for an object, and moving an object. Recognizing an action is understood here as identifying (a) what action is performed, e.g., reaching, and (b) where the reach is directed.
For building action grammars, we observe a human acting on objects. Actions are clustered according to their effect on the scene, i.e., an object being moved or rotated. Human actions are considered to be the same if they have the same effect on the scene.
Given the action clusters, we use parametric Hidden Markov Models to model these action clusters. Furthermore, a stochastic context-free grammar is generated based on the observations while taking the action clusters as an alphabet.
For recognizing actions, we in principle require a 3D human body tracker. Here, we present our 3D human body tracker, where we treat the 3D tracking problem as an action recognition problem: Instead of tracking in the human joint space, we track in the space of possible actions and action parameters. We call this "tracking in action space". For tracking, we use Bayesian propagation over time, where the propagation step is defined through the stochastic grammar.
16:20 – 16:40
Christopher Geib
This talk presents a new algorithm for symbolic level plan recognition. This algorithm is based on representing the plans to be recognized with a lexicalized grammatical formalism called Combinatory Categorial Grammar (CCG). We show that representing plans with CCGs has a number of desirable computational properties and will discuss how lexicalized action grammar representations may make bridging the gap between discrete representations and continuous behaviors easier.
16:40 – 17:00
Erion Plaku
Synergistic Combination of High-Level Discrete Planning and Low-Level Motion Planning
This talk presents a novel multi-layered framework, which efficiently plans the sequence of low-level motions and control inputs the robotic system needs to execute so that the resulting trajectory satisfies a given high-level specification. While current methods have focused on reachability, i.e., computing a collision-free trajectory to a goal state, this framework goes beyond reachability and considers far richer specifications given by Finite State Machines, Linear Temporal Logic, and other discrete models widely used in AI and logic.
A distinctive feature of the framework is that the planning layers are not independent, but, in fact, work in tandem. The framework leverages from sampling-based motion planning the underlying idea of searching for a solution trajectory by selectively sampling and exploring the continuous state and control spaces. Drawing from research in AI and logic, high-level discrete planning uses in novel ways information gathered by motion planning to identify increasingly feasible high-level plans that the sampling-based motion planning can further explore to significantly advance the search. Extensive experiments in simulation with high-dimensional dynamical models of ground and flying vehicles demonstrate significant computational speedups over related work.
In addition, the framework is well-suited to compute witness trajectories that indicate violation of safety properties in hybrid systems. As hybrid systems, which combine discrete and continuous dynamics, are often used to model safety-critical protocols in robotics, automated highway systems, and air-traffic management, the verification of safety properties, which assert that nothing "bad" happens, becomes increasingly important. Experiments on high-dimensional hybrid system models of vehicle navigation tasks and aircraft conflict-resolution protocols demonstrate the effectiveness of this synergistic combination of high-level discrete planning and motion planning in discovering violations of safety properties.
17:00 – 17:20
Bhaskara Marthi
Angelic Hierarchical Planning
I will describe a recent approach for integrating planning at various levels of abstraction. The approach is based on a so-called "angelic semantics": essentially, higher levels work with (bounds on) the reachable sets of lower level actions, making the assumption that specific states from the reachable sets will eventually be chosen in a best-case way given the overall plan. This semantics allows correctly ruling out or committing to plans at the high level, vastly reducing the size of the search space. I will also present preliminary results showing how task and robotic motion planning can be combined using this framework.
17:20 – 17:40
Hadas Kress-Gazit
Specifying and achieving high level reactive tasks using logic, synthesis and control primitives
One of the major challenges facing the robotics research community is how to control a robot at a high level. Instead of hard coding and tuning every aspect of every task, one would like to be able to give more abstract instructions, such as “search the rooms” or “find the red ball”, and then have the robot fill in the details. This talk will describe an approach to designing robot controllers in which reactive, environment-dependent tasks are formally captured using temporal logic and then automatically transformed into correct-by-construction hybrid controllers using synthesis techniques and feedback control primitives.
Poster Session (17:40 - 18:40)
Diego Pardo and Cecilio Angulo
Synthesizing Motor Behaviors from Robot Experience
Yi Li and Yiannis Aloimonos
The Joint Synergies: Partitioning Human MoCap Data into Action Segments
Mohammad Ghavamzadeh
Hierarchical Hybrid Reinforcement Learning Algorithms
Elena Gribovskaya and Aude Billard
Combining Task-Level and Trajectory-Level Learning for Teaching Robots Bimanual Coordinated Tasks
David C. Conner, Hadas Kress-Gazit, Howie Choset, Alfred A. Rizzi and George J. Pappas
Feedback policies as motion primitives – design and composition
George Konidaris and Andrew Barto
Toward the Autonomous Acquisition of Robot Skill Hierarchies
Manuel Lopes, Francisco Melo and Luis Montesano
Active Learning for Reward Estimation in Inverse Reinforcement Learning
Stephen Hart and Rod Grupen
A Hybrid Control Architecture for Developmental Learning
Ozan Caldiran, Kadir Haspalamutgil, Abdullah Ok, Can Palaz, Esra Erdem, and Volkan Patoglu
From High-Level Reasoning to Low-Level Control