Publications by Topic

This list is generated automatically; for a chronological overview, please see the publications by year. Because a publication can be filed under several topics through its keywords, the same entry may appear under more than one heading on this page.

Reinforcement Learning

Record Number: 10132
Reference Type: Conference Proceedings
Author(s): Wierstra, D.; Foerster, A.; Peters, J.; Schmidhuber, J.
Year: 2007
Title: Solving Deep Memory POMDPs with Recurrent Policy Gradients
Journal/Conference/Book Title: Proceedings of the International Conference on Artificial Neural Networks (ICANN)
Keywords: policy gradients, reinforcement learning
Abstract: This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov decision problems (POMDPs) that require long-term memories of past observations. The approach involves approximating a policy gradient for a Recurrent Neural Network (RNN) by backpropagating return-weighted characteristic eligibilities through time. Using a “Long Short-Term Memory” architecture, we are able to outperform other RL methods on two important benchmark tasks. Furthermore, we show promising results on a complex car driving simulation task.
Notes: jan
Link to PDF: http://www-clmc.usc.edu/publications//D/Wierstra_ICANN_2007.pdf

Control

Record Number: 10132
Reference Type: Conference Proceedings
Author(s): Wierstra, D.; Foerster, A.; Peters, J.; Schmidhuber, J.
Year: 2007
Title: Solving Deep Memory POMDPs with Recurrent Policy Gradients
Journal/Conference/Book Title: Proceedings of the International Conference on Artificial Neural Networks (ICANN)
Keywords: policy gradients, reinforcement learning
Abstract: This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov decision problems (POMDPs) that require long-term memories of past observations. The approach involves approximating a policy gradient for a Recurrent Neural Network (RNN) by backpropagating return-weighted characteristic eligibilities through time. Using a “Long Short-Term Memory” architecture, we are able to outperform other RL methods on two important benchmark tasks. Furthermore, we show promising results on a complex car driving simulation task.
Notes: jan
Link to PDF: http://www-clmc.usc.edu/publications//D/Wierstra_ICANN_2007.pdf

Learning Motor Primitives

Record Number: 10132
Reference Type: Conference Proceedings
Author(s): Wierstra, D.; Foerster, A.; Peters, J.; Schmidhuber, J.
Year: 2007
Title: Solving Deep Memory POMDPs with Recurrent Policy Gradients
Journal/Conference/Book Title: Proceedings of the International Conference on Artificial Neural Networks (ICANN)
Keywords: policy gradients, reinforcement learning
Abstract: This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov decision problems (POMDPs) that require long-term memories of past observations. The approach involves approximating a policy gradient for a Recurrent Neural Network (RNN) by backpropagating return-weighted characteristic eligibilities through time. Using a “Long Short-Term Memory” architecture, we are able to outperform other RL methods on two important benchmark tasks. Furthermore, we show promising results on a complex car driving simulation task.
Notes: jan
Link to PDF: http://www-clmc.usc.edu/publications//D/Wierstra_ICANN_2007.pdf

Robotics

Record Number: 10132
Reference Type: Conference Proceedings
Author(s): Wierstra, D.; Foerster, A.; Peters, J.; Schmidhuber, J.
Year: 2007
Title: Solving Deep Memory POMDPs with Recurrent Policy Gradients
Journal/Conference/Book Title: Proceedings of the International Conference on Artificial Neural Networks (ICANN)
Keywords: policy gradients, reinforcement learning
Abstract: This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov decision problems (POMDPs) that require long-term memories of past observations. The approach involves approximating a policy gradient for a Recurrent Neural Network (RNN) by backpropagating return-weighted characteristic eligibilities through time. Using a “Long Short-Term Memory” architecture, we are able to outperform other RL methods on two important benchmark tasks. Furthermore, we show promising results on a complex car driving simulation task.
Notes: jan
Link to PDF: http://www-clmc.usc.edu/publications//D/Wierstra_ICANN_2007.pdf

Human Motor Control

Record Number: 10132
Reference Type: Conference Proceedings
Author(s): Wierstra, D.; Foerster, A.; Peters, J.; Schmidhuber, J.
Year: 2007
Title: Solving Deep Memory POMDPs with Recurrent Policy Gradients
Journal/Conference/Book Title: Proceedings of the International Conference on Artificial Neural Networks (ICANN)
Keywords: policy gradients, reinforcement learning
Abstract: This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov decision problems (POMDPs) that require long-term memories of past observations. The approach involves approximating a policy gradient for a Recurrent Neural Network (RNN) by backpropagating return-weighted characteristic eligibilities through time. Using a “Long Short-Term Memory” architecture, we are able to outperform other RL methods on two important benchmark tasks. Furthermore, we show promising results on a complex car driving simulation task.
Notes: jan
Link to PDF: http://www-clmc.usc.edu/publications//D/Wierstra_ICANN_2007.pdf

Book Reviews

Record Number: 10132
Reference Type: Conference Proceedings
Author(s): Wierstra, D.; Foerster, A.; Peters, J.; Schmidhuber, J.
Year: 2007
Title: Solving Deep Memory POMDPs with Recurrent Policy Gradients
Journal/Conference/Book Title: Proceedings of the International Conference on Artificial Neural Networks (ICANN)
Keywords: policy gradients, reinforcement learning
Abstract: This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov decision problems (POMDPs) that require long-term memories of past observations. The approach involves approximating a policy gradient for a Recurrent Neural Network (RNN) by backpropagating return-weighted characteristic eligibilities through time. Using a “Long Short-Term Memory” architecture, we are able to outperform other RL methods on two important benchmark tasks. Furthermore, we show promising results on a complex car driving simulation task.
Notes: jan
Link to PDF: http://www-clmc.usc.edu/publications//D/Wierstra_ICANN_2007.pdf

Most of the publications can also be found through Google Scholar, which additionally provides citation lists, although these are incomplete.


Page last modified on September 11, 2008, at 12:57 AM
Designed by: N.Ohanyan & J.Peters. Powered by PmWiki.