In a previous blog post, we used Keras to play FlappyBird. Similarly, we will now use another deep learning toolkit, TensorFlow, to develop DQN and Double DQN and play another game: Breakout (Atari 2600).

Here, we will use the OpenAI Gym toolkit to construct our environment. The detailed implementation is as follows:
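The Gym interface boils down to `reset()` and `step(action)`. Here is a minimal, self-contained sketch of that interaction loop; `StubBreakout` is a hypothetical stand-in for `gym.make("Breakout-v0")` (which needs the Atari dependencies installed), kept only so the snippet runs anywhere:

```python
import numpy as np

class StubBreakout:
    """Stand-in with the same reset/step interface as a Gym Atari env.
    In a real setup this would be: env = gym.make("Breakout-v0")."""
    def __init__(self, num_actions=4, max_steps=10):
        self.num_actions = num_actions
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return np.zeros((210, 160, 3), dtype=np.uint8)  # raw Atari frame shape

    def step(self, action):
        assert 0 <= action < self.num_actions
        self.t += 1
        observation = np.zeros((210, 160, 3), dtype=np.uint8)
        reward = 0.0
        done = self.t >= self.max_steps
        return observation, reward, done, {}

env = StubBreakout()
obs = env.reset()
total_reward, done = 0.0, False
while not done:
    action = np.random.randint(env.num_actions)  # random policy for now
    obs, reward, done, info = env.step(action)
    total_reward += reward
print(obs.shape)  # (210, 160, 3)
```

The raw Breakout observation is a 210x160 RGB frame, which is why the preprocessing below is needed before it can be fed to the network.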

Then let's look at some demos:

For the deep learning part, we need to crop the image to a square:
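As a sketch of the cropping step: a common choice (not the only one) is to drop the score bar at the top and a strip at the bottom of the 210x160 frame, leaving a 160x160 square of the play area:

```python
import numpy as np

def crop_to_square(frame):
    """Keep rows 34..193 of a 210x160x3 Atari frame, discarding the
    score bar and bottom strip; the exact rows are a common convention."""
    return frame[34:194, :, :]

frame = np.zeros((210, 160, 3), dtype=np.uint8)
square = crop_to_square(frame)
print(square.shape)  # (160, 160, 3)
```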

OK, now let us use TensorFlow to develop the DQN algorithm first.

First of all, we need to import some packages and initialize the environment.

As mentioned above, we need to crop and preprocess the input before feeding the raw image into the algorithm, so we define a StateProcessor class to do this.
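A dependency-free sketch of what such a StateProcessor does (the blog's version would use TensorFlow image ops; the nearest-neighbour resize here is just to keep the example self-contained):

```python
import numpy as np

class StateProcessor:
    """Grayscale -> crop -> resize to 84x84, in plain NumPy."""
    def process(self, frame):
        # Luminance-weighted grayscale conversion
        gray = frame.astype(np.float32) @ np.array([0.299, 0.587, 0.114],
                                                   dtype=np.float32)
        square = gray[34:194, :]                    # 160x160 play area
        rows = np.linspace(0, 159, 84).astype(int)  # nearest-neighbour resize
        cols = np.linspace(0, 159, 84).astype(int)
        return square[np.ix_(rows, cols)].astype(np.uint8)

sp = StateProcessor()
state = sp.process(np.zeros((210, 160, 3), dtype=np.uint8))
print(state.shape)  # (84, 84)
```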

We first convert the image to grayscale and then resize it to 84 by 84 (the size used in the DQN paper). Then we construct the neural network that estimates the value function; its structure is the same as in the DQN paper.
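For reference, the DQN network is three convolutions (32 filters of 8x8 with stride 4, 64 of 4x4 with stride 2, 64 of 3x3 with stride 1, all VALID padding) followed by a 512-unit dense layer and an output per action. The spatial sizes can be checked by hand:

```python
def conv_out(size, kernel, stride):
    """Output width along one dimension for a VALID convolution."""
    return (size - kernel) // stride + 1

size = 84
for kernel, stride in [(8, 4), (4, 2), (3, 1)]:  # (kernel, stride) per layer
    size = conv_out(size, kernel, stride)

print(size)              # 7: final feature maps are 7x7
print(size * size * 64)  # 3136 features feeding the 512-unit dense layer
```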

As mentioned in the DQN paper, the algorithm uses two networks with the same architecture: an online network and a target network. We need to copy the online network's parameters to the target network every $t$ steps.
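Conceptually, the copy is just an element-wise overwrite of one parameter set by the other. In TensorFlow this would be a list of `tf.assign` ops built by matching variables across two scopes; plain arrays convey the idea:

```python
import numpy as np

def copy_model_parameters(q_params, target_params):
    """Overwrite the target network's parameters with the online
    network's (the tf.assign version pairs variables by scope name)."""
    for name, value in q_params.items():
        target_params[name] = value.copy()

q = {"W1": np.ones((4, 4)), "b1": np.zeros(4)}
target = {"W1": np.zeros((4, 4)), "b1": np.ones(4)}
copy_model_parameters(q, target)
print(np.array_equal(target["W1"], q["W1"]))  # True
```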

We also need a policy to select actions.
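DQN uses an epsilon-greedy policy: with probability epsilon act uniformly at random, otherwise act greedily with respect to the estimated Q-values. A sketch (the `q_values_fn` argument stands in for a forward pass of the network):

```python
import numpy as np

def make_epsilon_greedy_policy(q_values_fn, num_actions):
    """Return pi(state, epsilon) -> action probabilities: every action
    gets epsilon / num_actions, and the greedy action gets the
    remaining 1 - epsilon on top."""
    def policy_fn(state, epsilon):
        probs = np.ones(num_actions) * epsilon / num_actions
        best = np.argmax(q_values_fn(state))
        probs[best] += 1.0 - epsilon
        return probs
    return policy_fn

policy = make_epsilon_greedy_policy(lambda s: np.array([0.1, 0.9, 0.3, 0.2]), 4)
probs = policy(None, 0.1)
print(probs)  # greedy action (index 1) gets 0.925, the others 0.025 each
```

An action is then drawn with `np.random.choice(num_actions, p=probs)`.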

Now let us develop the DQN algorithm itself (we skip the details here because we explained them earlier).
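The heart of the update is the target computation: $y = r + \gamma \max_{a'} Q_{\text{target}}(s', a')$, with the bootstrap term zeroed at terminal states. A toy batch (made-up numbers) shows the vectorized form used inside the training step:

```python
import numpy as np

gamma = 0.99
rewards = np.array([1.0, 0.0])
dones = np.array([False, True])
# Q_target(s', .) for two transitions, four actions each (toy values)
q_next_target = np.array([[0.5, 2.0, 1.0, 0.0],
                          [1.0, 1.0, 1.0, 1.0]])

# y = r + gamma * max_a' Q_target(s', a'), with no bootstrap at terminals
targets = rewards + (1.0 - dones) * gamma * q_next_target.max(axis=1)
print(targets)  # [2.98, 0.0]
```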

Finally, run it.
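The run loop trains on minibatches sampled from a replay memory rather than on consecutive frames. A minimal sketch of that buffer (the capacity of 500 here is just for illustration; DQN-scale buffers hold hundreds of thousands of transitions):

```python
import random
from collections import deque, namedtuple

Transition = namedtuple("Transition",
                        ["state", "action", "reward", "next_state", "done"])

replay_memory = deque(maxlen=500)  # capped buffer; oldest transitions fall off

# Fill with toy transitions; 600 appends leave only the newest 500
for t in range(600):
    replay_memory.append(Transition(state=t, action=t % 4, reward=0.0,
                                    next_state=t + 1, done=False))

batch = random.sample(replay_memory, 32)          # uniform minibatch
states, actions, rewards, next_states, dones = map(list, zip(*batch))
print(len(replay_memory), len(batch))  # 500 32
```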

Next, we will develop the Double DQN algorithm. It requires only a few small changes.

In the DQN `q_learning` method,

we just change these lines to:
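The change decouples action selection from action evaluation: the online network picks the next action, but the target network scores it. With toy Q-values for one transition, the two targets compare as:

```python
import numpy as np

gamma = 0.99
reward = 1.0
q_next_online = np.array([0.5, 2.0, 1.0, 0.0])  # Q(s', .) from online network
q_next_target = np.array([0.9, 0.4, 1.5, 0.2])  # Q(s', .) from target network

# DQN: target network both selects and evaluates the next action
y_dqn = reward + gamma * q_next_target.max()            # 1 + 0.99 * 1.5

# Double DQN: online network selects, target network evaluates
best_action = np.argmax(q_next_online)                  # action 1
y_double = reward + gamma * q_next_target[best_action]  # 1 + 0.99 * 0.4
print(round(y_dqn, 3), round(y_double, 3))  # 2.485 1.396
```

Because the max over a noisy estimate is biased upward, Double DQN's target is typically smaller, which is exactly the overestimation it is designed to curb.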