Mastering Atari Games with Discrete World Models

There are two main categories of Reinforcement Learning (RL) algorithms: model-based and model-free. Model-free algorithms learn a policy directly from interaction with the environment, without building a world model, whilst model-based algorithms first learn an explicit world model and then use it to synthesise the policy.

The paper argues that intelligent agents need to generalise from past experience to achieve goals in complex environments. World models, the basis of model-based RL, facilitate this generalisation and allow behaviours to be learned from imagined outcomes, which increases sample efficiency.

While learning world models from image inputs has recently become feasible for some tasks, modelling Atari games accurately enough to derive successful behaviours has remained an open challenge for many years. This paper introduces DreamerV2, a reinforcement learning agent that learns behaviours purely from predictions in the compact latent space of a powerful world model. The world model uses discrete representations and is trained separately from the policy. The policy, or behaviour, is trained purely inside the world model using an actor-critic approach. Because the future is predicted using only the compact, discrete latent space, training is far more efficient than interacting with the actual environment.
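To make the idea of learning from imagined outcomes concrete, here is a minimal toy sketch in Python. It is not DreamerV2 itself: the "world model" below is a hand-written function over four discrete latent codes rather than a learned neural network, and every name and number in it (model_step, imagine_rollout, discounted_return, NUM_STATES, NUM_ACTIONS) is a hypothetical choice made for illustration. It only shows the general pattern of rolling a policy forward inside a model and scoring the imagined trajectory, without touching the real environment.

```python
# Toy stand-in for a learned world model over discrete latent states.
# All names and sizes are hypothetical and chosen only for illustration.
NUM_STATES = 4   # discrete latent codes 0..3
NUM_ACTIONS = 2  # actions 0 and 1


def model_step(state, action):
    """Predict the next latent state and reward without using the real environment."""
    next_state = (state + 1 + action) % NUM_STATES
    reward = 1.0 if next_state == 0 else 0.0
    return next_state, reward


def imagine_rollout(start_state, policy, horizon):
    """Roll the policy forward inside the model, collecting (state, action, reward)."""
    trajectory = []
    state = start_state
    for _ in range(horizon):
        action = policy(state)
        next_state, reward = model_step(state, action)
        trajectory.append((state, action, reward))
        state = next_state
    return trajectory


def discounted_return(trajectory, gamma=0.99):
    """Discounted sum of imagined rewards, the quantity a critic would estimate."""
    total = 0.0
    for _, _, reward in reversed(trajectory):
        total = reward + gamma * total
    return total


# Compare two fixed policies purely in imagination.
slow = imagine_rollout(0, lambda s: 0, horizon=4)
fast = imagine_rollout(0, lambda s: 1, horizon=4)
print(discounted_return(slow, gamma=0.5))  # 0.125
print(discounted_return(fast, gamma=0.5))  # 0.625
```

In this toy setting the second policy collects more imagined reward, so an actor trained on such rollouts would be pushed towards action 1. In a real agent the model, the actor and the critic would all be learned networks.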

DreamerV2 is the first agent that achieves human-level performance on the Atari benchmark of 55 tasks by learning behaviours inside a separately trained world model. With the same computational budget and wall-clock time, DreamerV2 reaches 200M frames and surpasses the final performance of the top single-GPU agents that use popular model-free algorithms such as IQN and Rainbow.

Large Reinforcement Learning algorithms such as this often require a large amount of computing power. Business Systems International (BSI) is the largest NVIDIA GPU supplier in Europe, and we provide custom solutions of complete AI and Machine Learning environments that enable the training of complex machine learning models such as DreamerV2.

This article was provided by our AI researcher Bill Shao.

To learn more...

BSI is a Dell Technologies Titanium Partner. Our Dell Technologies AI solutions can be viewed here and our AI inception programme here.

Get in touch to discover how we could optimise your business with AI.

Tags:

AI

Reinforcement Learning

RL

Algorithms

Posted in AI Blog By Bill Shao on 10/08/2022 11:58
