Bayesian Reinforcement Learning
Event Type: Seminar
Date: March 10, 2008
Time:
10:00AM - 11:30AM
Venue:
CC-2-2545
Abstract:
A large number of problems in science and engineering can be
characterized as sequential decision-making under uncertainty,
from robotics and manufacturing to information retrieval and
game playing. Many interesting sequential decision-making tasks
can be formulated as reinforcement learning (RL) problems. In RL
problems, an agent interacts with an unfamiliar, dynamic and
stochastic environment, with the goal of finding an action-selection
strategy, or policy, to optimize some measure of its long-term
performance. Despite extensive research and numerous successes
in a number of different domains, there remain several fundamental
obstacles hindering the widespread application of RL methodology
to real-world problems. Recent advances have shown that a Bayesian
approach to RL offers viable solutions to some of these major
limitations, such as the lack of confidence intervals for performance
predictions, the difficulty of appropriately reconciling exploration
with exploitation, and the lack of a systematic method for encoding
prior knowledge and for formulating domain assumptions.
In this talk, I will present two Bayesian policy gradient RL algorithms.
These algorithms use Gaussian processes to define a prior distribution
over the performance gradient, and obtain closed-form expressions for
its posterior distribution, conditioned on the observed data. The
posterior mean serves as the policy gradient estimate and is used to
update the policy, while the posterior covariance allows us to gauge
the reliability of the update. I will present empirical results that
indicate the Bayesian policy gradient algorithms require fewer
samples to obtain accurate gradient estimates and therefore converge
faster than conventional Monte-Carlo-based policy gradient
algorithms.
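As a rough illustration of how a Gaussian process prior can yield a
closed-form posterior over an integral quantity, the sketch below applies
Bayesian quadrature to a one-dimensional toy integral. It is not the
algorithm presented in the talk: the abstract does not specify the
construction, and the RBF kernel, the Gaussian weighting density, the
function name bayesian_quadrature, and all parameter values are
assumptions chosen for illustration. The posterior mean plays the role of
the estimate, and the posterior variance indicates how much to trust it.

```python
import numpy as np

def bayesian_quadrature(x, y, length_scale=1.0, mu=0.0, s=1.0, jitter=1e-8):
    """Posterior mean and variance of I = integral of f(x) N(x; mu, s^2) dx.

    Assumes a zero-mean GP prior on f with an RBF kernel
    k(a, b) = exp(-(a - b)^2 / (2 * length_scale^2)) and noise-free
    observations y[i] = f(x[i]). Toy setting, hypothetical names.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    l2, s2 = length_scale ** 2, s ** 2

    # Gram matrix of the observed inputs (jitter added for numerical stability).
    K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / l2)
    K += jitter * np.eye(len(x))

    # z[i] = integral of k(x, x[i]) N(x; mu, s^2) dx, available in closed form.
    z = np.sqrt(l2 / (l2 + s2)) * np.exp(-0.5 * (x - mu) ** 2 / (l2 + s2))

    # z0 = double integral of k(x, x') against the weighting density twice.
    z0 = np.sqrt(l2 / (l2 + 2.0 * s2))

    mean = z @ np.linalg.solve(K, y)
    # Clip at zero to guard against small negative values from round-off.
    var = max(z0 - z @ np.linalg.solve(K, z), 0.0)
    return mean, var

# Usage: estimate E[cos(X)] for X ~ N(0, 1); the exact value is exp(-1/2).
x_obs = np.linspace(-4.0, 4.0, 15)
estimate, uncertainty = bayesian_quadrature(x_obs, np.cos(x_obs))
print(estimate, uncertainty)
```

In this toy setting the posterior variance depends only on where the
function was evaluated, not on the observed values, so it can be used to
judge whether more samples are needed before acting on the estimate.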
Speaker:
Mohammad Ghavamzadeh
Speaker Bio:
Mohammad Ghavamzadeh received a Ph.D. degree in Computer Science from
the University of Massachusetts Amherst in 2005. Since September 2005 he
has been a postdoctoral fellow at the Department of Computing Science
at the University of Alberta, working with Prof. Richard Sutton. His
research interests lie primarily in Artificial Intelligence and
Machine Learning, with an emphasis on decision making under uncertainty
using principled mathematical tools from probability theory, decision
theory, and statistics. His current research is mostly focused on
using recent advances in statistical machine learning, especially
Bayesian reasoning and kernel methods, to develop more efficient
reinforcement learning algorithms.