Reinforcement Learning (EIM)

Reinforcement Learning with Application to Autonomous Systems (EIM)
Learning Objectives
Students
- gain insight into the theory and applications of reinforcement learning
- learn to analyze the challenges in a reinforcement learning application and to identify promising learning approaches
- are able to assess which problems reinforcement learning is particularly well suited to and what disadvantages it has
- understand, explain, and classify the relevant basic concepts
More information as well as lecture notes are available in the Moodle course.
Lecture
# | Name | Summary |
---|---|---|
1 | Markov Decision Process | Markov Processes, Markov Reward Processes, Markov Decision Processes, Bellman Expectation Equations, Bellman Optimality Equations |
2 | Dynamic Programming | Policy Evaluation, Policy Iteration, Optimal Policy, Generalized Policy Iteration, Value Iteration |
3 | Monte Carlo Methods | Monte Carlo Prediction, Monte Carlo Policy Evaluation, Monte Carlo Control, Exploration-Exploitation Tradeoff |
4 | Temporal Difference Learning | On-policy TD Control, Off-policy TD Control, SARSA, Q-Learning |
5 | Function Approximation | Incremental Methods, Gradient Descent, Prediction Algorithms for the Linear Case, Control Algorithms for the Linear Case, Batch Methods |
6 | Policy Gradient Methods | Policy-based Methods, Deterministic vs. Stochastic Policies, Gradient-based Estimator, Monte-Carlo REINFORCE, Actor-Critic Architectures |
7 | Introduction to Deep Learning | Components of Deep Learning Architectures, Activation Functions, Output Functions, Typical Applications, Image Classification, Object Segmentation, Object Detection |
8 | Beyond DQN | Double DQN, Dueling DQN, Rainbow DQN, Trust Region Policy Optimization, Soft Actor-Critic |
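As a rough illustration of how the Bellman optimality equation from lecture 1 turns into the value iteration algorithm from lecture 2, here is a minimal sketch. The two-state MDP, the transition table `P`, and the helper names `q_value` and `value_iteration` are assumptions made up for this example and are not taken from the course material.

```python
# Hypothetical toy MDP (not taken from the course material).
# P[state][action] is a list of (probability, next_state, reward) tuples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}


def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(P, gamma=0.9, tol=1e-8):
    """Apply the Bellman optimality backup until the values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # Backup: V(s) <- max_a sum_s' p(s'|s,a) * (r + gamma * V(s'))
            best = max(q_value(P, V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place update, reused within the same sweep
        if delta < tol:
            break
    # Greedy policy with respect to the converged value function
    policy = {s: max(P[s], key=lambda a: q_value(P, V, s, a, gamma)) for s in P}
    return V, policy


V, policy = value_iteration(P)
print(V, policy)
```

In this toy problem the greedy policy can be checked by hand: staying in state 1 yields a reward of 2 per step, so its value satisfies V(1) = 2 + 0.9 V(1), and state 0 is worth one step of reward 1 plus the discounted value of state 1.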
Lab Work
For the majority of lab sessions, we will use OpenAI Gym environments. Besides implementing deep reinforcement learning agents on your own, you will use the deep reinforcement learning library Stable-Baselines3.
# | Name | Summary |
---|---|---|
1 | Tic-Tac-Toe | Tabular solution for the Tic-Tac-Toe game using a simplified temporal difference method. |
2 | Frozen Lake | We solve the Frozen Lake environment with both policy iteration and value iteration. |
3 | Blackjack | Monte Carlo methods are reviewed using the Blackjack environment. |
4 | Taxi | Homework assignment: implement SARSA to solve the Taxi environment. |
5 | Lunar Lander | First implementation of function approximation with a simple neural network. |
6 | Racecar | More advanced deep neural networks are evaluated in combination with more advanced strategies, e.g. Dueling DQN. |
7 | DonkeyCar | DonkeyCar offers a more realistic simulation of a racecar using the Unity engine. We use the simulator to train a reinforcement learning agent to drive autonomously. |
8 | RC Car | We evaluate the performance of the DonkeyCar-trained agent on a real remote-controlled car. |
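The early labs rely on tabular temporal difference updates. The following is a minimal sketch of off-policy TD control (Q-learning) under stated assumptions: the five-state chain environment, the function names `step`, `epsilon_greedy`, and `q_learning`, and all hyperparameters are hypothetical and chosen only for illustration. It is not the Taxi assignment and does not use OpenAI Gym or Stable-Baselines3.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic chain dynamics; reward 1.0 only when the goal is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def epsilon_greedy(Q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily (random ties)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(Q[state])
    return rng.choice([a for a in ACTIONS if Q[state][a] == best])


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = epsilon_greedy(Q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstrap from the best next action,
            # regardless of which action the behavior policy takes next.
            target = reward if done else reward + gamma * max(Q[next_state])
            Q[state][action] += alpha * (target - Q[state][action])
            state = next_state
    return Q


Q = q_learning()
# Greedy action per non-terminal state after training
greedy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES - 1)]
print(greedy)
```

Replacing `max(Q[next_state])` in the target with the value of the action actually selected in the next state would turn this into the on-policy variant, SARSA.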