Syllabus


Udacity Deep Reinforcement Learning

Trainers

  • Arthur Juliani, Deep Learning Researcher at Unity
  • Avilay Parekh, Principal Machine Learning Engineer at Unity
  • Melody Guan, Machine Learning Ph.D. at Stanford University
  • Peter Welinder, Research Scientist at OpenAI
  • Vincent Gao, Software Engineer (Machine Learning) at Unity

Estimated Time

  • 4 months at 10-15 hrs/week
  • Total: 160-240 hrs

Contents

1. Foundations of Reinforcement Learning

  • how to define real-world problems as Markov Decision Processes (MDPs) so that they can be solved with reinforcement learning
  • implement classical methods such as SARSA and Q-learning to solve several environments in OpenAI Gym
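
A minimal sketch of tabular Q-learning, one of the classical methods named above. To stay self-contained it does not use OpenAI Gym: the five-state chain environment, the `step` function, and all hyperparameter values are hypothetical and chosen only for illustration.

```python
import random

N_STATES = 5      # hypothetical toy MDP: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic transition; returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection, breaking ties at random.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning bootstraps from the best next action (off-policy).
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
```

SARSA differs only in the target: it bootstraps from the action the agent actually takes next rather than from the maximum over next actions.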

2. Value-Based Methods

  • how to leverage neural networks when solving complex problems using the Deep Q-Networks (DQN) algorithm
  • double Q-learning
  • prioritized experience replay
  • dueling networks
  • create an artificially intelligent game-playing agent that can navigate a spaceship
  • use a Gazebo simulation to train a rover to navigate an environment without running into walls
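
A small sketch of how the bootstrap target changes between standard DQN and the double Q-learning variant usually called Double DQN. The function names and the example Q-values are assumptions made for illustration; in a real agent the two lists would come from the online and target networks evaluated on the next state.

```python
def dqn_target(reward, next_q_target, gamma, done):
    """Standard DQN: the target network both selects and evaluates the action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma, done):
    """Double DQN: the online network selects, the target network evaluates."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for one next state with three actions.
next_q_online = [1.0, 2.0, 0.5]
next_q_target = [3.0, 1.5, 0.2]

standard = dqn_target(1.0, next_q_target, gamma=0.5, done=False)
double = double_dqn_target(1.0, next_q_online, next_q_target, gamma=0.5, done=False)
```

Decoupling action selection from action evaluation is intended to reduce the overestimation that comes from taking a maximum over noisy value estimates.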

3. Policy-Based Methods

  • Proximal Policy Optimization (PPO)
  • Advantage Actor-Critic (A2C)
  • Deep Deterministic Policy Gradient (DDPG)
  • optimization techniques such as evolution strategies and hill climbing
  • how to apply deep reinforcement learning techniques to finance and explore an algorithm for optimal execution of portfolio transactions
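
A minimal sketch of hill climbing, one of the optimization techniques listed above, applied to a parameter vector. In a policy-search setting the score would be the average return of a policy with those parameters; here `toy_return` is a hypothetical stand-in objective, and the noise scale and iteration count are arbitrary illustrative values.

```python
import random


def hill_climb(score_fn, init_params, noise_scale=0.1, iterations=200, seed=0):
    """Stochastic hill climbing: keep a random perturbation only if it scores higher."""
    rng = random.Random(seed)
    best_params = list(init_params)
    best_score = score_fn(best_params)
    for _ in range(iterations):
        candidate = [p + rng.gauss(0.0, noise_scale) for p in best_params]
        candidate_score = score_fn(candidate)
        if candidate_score > best_score:
            best_params, best_score = candidate, candidate_score
    return best_params, best_score


def toy_return(params):
    """Hypothetical stand-in for an episode return; maximum of 0 at (1, -2)."""
    return -((params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2)


params, score = hill_climb(toy_return, [0.0, 0.0], noise_scale=0.3, iterations=2000)
```

Evolution strategies follow a related idea but, roughly speaking, combine information from a whole population of perturbations per step instead of keeping a single best candidate.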

4. Multi-Agent Reinforcement Learning

  • Most of reinforcement learning is concerned with a single agent that seeks to demonstrate proficiency at a single task. In this agent’s environment, there are no other agents.
  • However, if we’d like our agents to become truly intelligent, they must be able to communicate with and learn from other agents. In the final part of this Nanodegree program, we will extend the traditional framework to include multiple agents.
  • Monte Carlo Tree Search (MCTS), the search technique behind DeepMind’s AlphaZero

Projects

  • Project 1: Navigation. In the first project, you’ll leverage neural networks to train an agent to navigate a virtual world and collect as many yellow bananas as possible while avoiding blue bananas.

  • Project 2: Continuous Control. In the second project, you’ll write an algorithm to train a robotic arm to reach moving target positions.

  • Project 3: Collaboration and Competition. In the final project of the Nanodegree program, you’ll design your own algorithm to train a pair of agents to play tennis.


Tools

  • All of the projects in this Nanodegree program use the rich simulation environments from the Unity Machine Learning Agents (ML-Agents) software development kit (SDK). It is a flexible and intuitive framework that enables:

    • Academic and industry researchers to study complex behaviors from visual content and realistic physics
    • Industrial and enterprise researchers to implement large-scale parallel training regimes for robotics, autonomous vehicles, and other industrial applications
    • Game developers to tackle challenges, such as using agents to dynamically adjust the game-difficulty level
  • Tanks


Installation of tools


Reach out


Ref


Books to read


Good to know

  • Learn by doing
  • Data from nearly 100,000 Udacity graduates show that commitment and persistence are the highest predictors of whether or not a student will graduate.