Combining Actor-Critic Methods with Model Predictive Control via Stein Variational Inference

Cai, Shizhe

Access status:

USyd Access

Type

Thesis

Thesis type

Masters by Research

Author/s

Cai, Shizhe

Abstract

Deep Reinforcement Learning (DRL) has demonstrated remarkable success in continuous control tasks. However, it often requires extensive training data, struggles with complex long-horizon planning, and may fail to satisfy safety constraints during operation. Meanwhile, Model Predictive Control (MPC) provides explainability and constraint satisfaction but typically yields only locally optimal solutions and demands careful manual design of cost functions. To address these complementary limitations, this thesis develops and validates Q-guided Stein variational model predictive Actor-Critic (Q-STAC), a novel framework that bridges these approaches by integrating Bayesian Model Predictive Control (Bayesian MPC) with actor-critic reinforcement learning through Stein Variational Gradient Descent (SVGD). A core innovation within this framework is the direct optimization of control sequences using learned Q-values as objectives, an approach that eliminates the need for explicit cost function design, leverages the system dynamics to improve sample efficiency, and ensures that control signals remain within safe boundaries. Extensive experiments on 2D navigation, robotic manipulation tasks, and a real-world picking task demonstrate that Q-STAC achieves superior sample efficiency, robustness, and optimality compared to state-of-the-art (SOTA) algorithms.

Date

2026

Rights statement

The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.

Faculty/School

Faculty of Engineering, School of Computer Science

Awarding institution

The University of Sydney

Subjects

Reinforcement Learning
Model Predictive Control
Bayesian Inference
Stein Variational Gradient Descent