Count-Based Exploration with Neural Density Models

The PixelCNN model used in this paper is a lightweight variant of the Gated PixelCNN introduced in (van den Oord et al., 2016a). It consists of a 7 × 7 masked convolution, followed by two residual blocks with 1 × 1 masked convolutions with 16 feature planes, and another 1 × 1 masked convolution producing 64 feature planes, which a final masked convolution maps to the output logits. Inputs are 42 × 42 greyscale images, with pixel values quantized to 8 bins.
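Two pieces of the description above can be made concrete: the causal masks that PixelCNN convolutions use, and the quantization of greyscale pixels into 8 bins. The sketch below, in plain NumPy, follows the standard PixelCNN mask convention (type 'A' for the first layer, which also hides the centre pixel; type 'B' for later layers) and assumes the usual uniform binning of 8-bit pixel values; neither detail beyond what the text states is confirmed by the paper.

```python
import numpy as np

def conv_mask(kernel_size, mask_type):
    """Build a PixelCNN-style causal mask for a square convolution kernel.

    Type 'A' (first layer) zeroes the centre pixel and everything after it
    in raster order; type 'B' (subsequent layers) keeps the centre pixel.
    """
    k = kernel_size
    mask = np.ones((k, k), dtype=np.float32)
    centre = k // 2
    # Zero out weights to the right of (and, for type 'A', at) the centre.
    mask[centre, centre + (1 if mask_type == 'B' else 0):] = 0.0
    # Zero out all rows below the centre row.
    mask[centre + 1:, :] = 0.0
    return mask

def quantize(img, bins=8):
    """Quantize 8-bit greyscale pixels (0..255) into `bins` uniform bins.

    Uniform binning is an assumption; the paper only states 8 bins.
    """
    return (img.astype(np.int64) * bins) // 256
```

For the 7 × 7 first-layer mask, a pixel may depend only on the 24 positions preceding it in raster order; the 1 × 1 masked convolutions in the residual blocks reduce to per-pixel feature mixing, since a 1 × 1 type-'B' mask is identically 1.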