Using Random Search and Google Cloud for Reinforcement Learning with Pygame Learning Environment

In my last post, I extended OpenAI's baselines to work with Pygame Learning Environment (PLE). While I was largely successful and able to get good results on some of the PLE games, I wanted to build on that work to solve Monster Kong (a clone of the classic Donkey Kong). However, training on my GPU and fiddling with hyperparameters is time-consuming; training runs can take over 24 hours. The goal was to move training to the Google Cloud Platform and to create a script that performs random search over the available hyperparameters. Unfortunately, I was unsuccessful in my attempt. It's doable, but I have devoted enough time to this specific problem and wanted to move on. I'll likely pick up the pieces at a later date.

Random Search
I updated my fork of OpenAI's baselines to allow randomized hyperparameter selection over a variety of parameters. See the run_ple_dqn.py file for the details. Nothing particularly noteworthy here.

The big issue is that while random search works fine for training one model, issues appear when trying to train multiple models. Calling run_ple_dqn multiple times causes collisions with TensorFlow's variable scope, and I was unable to clear the graphs between runs. The simplest solution is to invoke run_ple_dqn through multiple command-line calls, which I did with a launcher called random_search_launcher.py. I thought this worked fine, but for some reason the launcher kills the training after running for a couple of hours. Random search will still work if you only train one model. If you have enough cloud instances, you can run one random search model on each instance, but I would have preferred the flexibility to run multiple models per instance (i.e., run 4 random search models on 2 instances rather than 1 model on 8 instances). To fix this, someone would have to either modify OpenAI's baselines code to allow multiple models to be created in one process, or modify random_search_launcher to make more robust command-line calls. I think the latter is probably the easier fix.
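The launcher idea above can be sketched as follows. The hyperparameter names and ranges here are hypothetical placeholders (the real ones live in run_ple_dqn.py); the key point is that each trial runs in its own Python process, so each one gets a fresh TensorFlow graph and variable scope and no collisions occur.

```python
import random
import subprocess

# Hypothetical hyperparameter search space; the actual flags and
# ranges would come from run_ple_dqn.py.
SEARCH_SPACE = {
    "--lr": [1e-3, 5e-4, 1e-4],
    "--buffer-size": [10000, 50000],
    "--exploration-fraction": [0.1, 0.3],
}

def sample_args():
    """Randomly pick one value per hyperparameter flag."""
    args = []
    for flag, choices in SEARCH_SPACE.items():
        args.extend([flag, str(random.choice(choices))])
    return args

def launch_runs(n_runs):
    """Run each random-search trial in a separate Python process.

    Because every child process starts with a fresh interpreter,
    TensorFlow graphs and variable scopes never collide across trials.
    """
    for _ in range(n_runs):
        cmd = ["python3", "run_ple_dqn.py"] + sample_args()
        # Blocks until the trial finishes before starting the next one.
        subprocess.run(cmd, check=True)
```

To run trials in parallel on one instance, subprocess.Popen could be used instead of subprocess.run, collecting the handles and waiting on them in a batch.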

Monster Kong
Monster Kong crashes both my GPU machine and the cloud instance I tried, and I couldn't find a good way to reduce the game's size. Getting Monster Kong to work would require some testing to find out why the game causes these crashes.

Google Cloud Platform
Eventually, I was able to create a virtual machine (VM) and get my OpenAI baselines with PLE implementation code to run. It was a struggle to install the packages and troubleshoot the various errors. If I were going to use this VM repeatedly, I'd likely invest in making a Docker image to save myself the hassle of going through that again. A mostly complete list of the commands needed to get the VM ready to run:

sudo apt-get update
sudo apt-get install gcc g++ make cmake zlib1g-dev python python-dev python3 python3-dev
sudo apt-get install python3-pip python3-dev
sudo pip3 install tensorflow 
sudo pip3 install cloudpickle opencv-python
sudo apt update
sudo apt install -y libsm6 libxext6
sudo apt-get install openmpi-bin
sudo apt install libopenmpi-dev
sudo pip3 install mpi4py
sudo apt-get install git-all

sudo git clone https://github.com/openai/gym.git
cd gym
sudo pip3 install -e .
cd ..

sudo git clone https://github.com/AurelianTactics/PyGame-Learning-Environment.git
cd PyGame-Learning-Environment/
sudo pip3 install -e .
cd ..

sudo git clone https://github.com/AurelianTactics/baselines.git
cd baselines
sudo pip3 install -e .
cd ..

sudo apt-get install python3-pip python3-dev python-virtualenv
sudo mkdir tf
sudo virtualenv --system-site-packages -p python3 tf
source ~/tf/bin/activate
sudo pip3 install --upgrade tensorflow
sudo pip3 install pygame

That left me with two issues: PLE wouldn't run without a soundcard or without a display. The display was the quicker fix. I added this code before the PLE environment was created:

    import os
    # The "dummy" SDL video driver lets pygame run without a display.
    # (Assigning to os.environ also calls putenv under the hood.)
    os.environ["SDL_VIDEODRIVER"] = "dummy"

Sound was more difficult. After some searching, I figured I was left with two choices: create a dummy soundcard using ALSA (which seemed quite involved) or disable sound in pygame. I went with the latter. Basically, in my PLE fork I replaced pygame.init() with specific inits like pygame.font.init() and pygame.display.init(). Perhaps more inits are needed, but it seemed to work fine.
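The sound workaround described above can be sketched as follows. The exact set of subsystems PLE needs is an assumption here; pygame.font and pygame.display were enough in my case, and skipping pygame.init() avoids initializing the mixer, which is what requires a soundcard.

```python
import os
import pygame

# Headless setup: the dummy SDL video driver removes the display
# requirement, as shown earlier.
os.environ["SDL_VIDEODRIVER"] = "dummy"

# Instead of pygame.init(), which initializes every subsystem
# (including the sound mixer), initialize only what PLE needs.
pygame.font.init()
pygame.display.init()
```

If a game in your PLE fork touches other subsystems, the corresponding init calls (e.g. pygame.joystick.init()) would need to be added as well.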

Upcoming
I hope my failures help lead those trying something similar down the right track. I'll revisit random search and Google Cloud instances at a later date. Up next, I'm going to finish the UC Berkeley Deep RL assignments. After that, I'd like to try applying Tensorforce using a policy gradient method like PPO or A3C.
