Bayesian Reinforcement Learning

Event Type: Seminar

Date: March 10, 2008

Time: 10:00AM - 11:30AM

Venue: CC-2-2545

Abstract:

A large number of problems in science and engineering can be characterized as sequential decision-making under uncertainty, from robotics and manufacturing to information retrieval and game playing. Many interesting sequential decision-making tasks can be formulated as reinforcement learning (RL) problems. In RL problems, an agent interacts with an unfamiliar, dynamic, and stochastic environment, with the goal of finding an action-selection strategy, or policy, that optimizes some measure of its long-term performance. Despite extensive research and numerous successes in a number of different domains, there remain several fundamental obstacles hindering the widespread application of RL methodology to real-world problems. Recent advances have shown that a Bayesian approach to RL offers viable solutions to some of these major limitations, such as the lack of confidence intervals for performance predictions, the difficulty of appropriately reconciling exploration with exploitation, and the lack of a systematic method for encoding prior knowledge and for formulating domain assumptions.

In this talk, I will present two Bayesian policy gradient RL algorithms. These algorithms use Gaussian processes to define a prior distribution over the performance gradient and obtain closed-form expressions for its posterior distribution, conditioned on the observed data. The posterior mean serves as the policy gradient estimate and is used to update the policy, while the posterior covariance allows us to gauge the reliability of the update. I will present empirical results indicating that the Bayesian policy gradient algorithms require fewer samples to obtain accurate gradient estimates and therefore converge faster than conventional Monte Carlo-based policy gradient algorithms.

Speaker: Mohammad Ghavamzadeh

Speaker Bio:

Mohammad Ghavamzadeh received a Ph.D. degree in Computer Science from the University of Massachusetts Amherst in 2005. Since September 2005, he has been a postdoctoral fellow in the Department of Computing Science at the University of Alberta, working with Prof. Richard Sutton. His research interests lie primarily in Artificial Intelligence and Machine Learning, with an emphasis on decision making under uncertainty using principled mathematical tools from probability theory, decision theory, and statistics. His current research focuses mostly on using recent advances in statistical machine learning, especially Bayesian reasoning and kernel methods, to develop more efficient reinforcement learning algorithms.