Actor-Critic Reinforcement Learning for a Lunar Lander

Jan 31, 2022

Lunar Lander Landing between Flags

Description

The objective of the project is to learn a policy that smoothly lands a lunar lander between two flags with minimal fuel use and without crashing. To this end, an actor-critic method is trained with Generalized Advantage Estimation (GAE): the critic estimates the state value, and GAE combines these value estimates into advantage estimates for the actor's policy gradient. This substantially reduces the variance of the policy gradient estimates at the cost of a small bias that is controlled by the GAE parameter λ.
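As a rough illustration of how GAE turns critic outputs into advantages, the following is a minimal NumPy sketch, not the project's actual implementation. The function name `compute_gae`, its arguments, and the default values of `gamma` and `lam` are assumptions made for this example.

```python
import numpy as np


def compute_gae(rewards, values, last_value, dones, gamma=0.99, lam=0.95):
    """Compute GAE advantages and value targets for one rollout.

    Hypothetical helper for illustration only.
    rewards[t], values[t], dones[t] describe step t; last_value is the
    critic's estimate for the state that follows the final step.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    dones = np.asarray(dones, dtype=np.float64)

    advantages = np.zeros_like(rewards)
    next_value = float(last_value)
    running = 0.0

    # Walk the rollout backwards, accumulating discounted TD residuals.
    for t in reversed(range(len(rewards))):
        not_done = 1.0 - dones[t]  # stop bootstrapping at episode ends
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        running = delta + gamma * lam * not_done * running
        advantages[t] = running
        next_value = values[t]

    # Regression targets for the critic.
    value_targets = advantages + values
    return advantages, value_targets
```

With `lam=1` the advantages reduce to discounted Monte Carlo returns minus the value baseline (low bias, high variance), and with `lam=0` they reduce to one-step TD residuals (higher bias, low variance). Intermediate values trade off between the two.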

Reinforcement Learning

Chenhao Li

Reinforcement Learning for Robotics