The Cart-Pole Experiment as a Benchmark for Trainable Controllers

It is widely believed that balancing an inverted pendulum is a difficult nonlinear control task. Many researchers have used a variant of the inverted pendulum problem, the cart-pole, to demonstrate the success of their neural network learning methods. It has long been known that a linear control law, implemented by a single artificial neuron, can control the cart-pole. What has not been noted before is that a random search in weight space can quickly uncover coefficients (weights) for controllers that work over a wide range of initial conditions. This result indicates that success in finding a satisfactory neural controller is not sufficient proof of the effectiveness of an unsupervised training method. By analysing the dynamics of the linear controller, we reformulate the cart-pole problem to make it a more stringent test for neural training methods. A review of the literature on unsupervised training methods for cart-pole controllers shows that the published results are difficult to compare and that, for most of the methods, there is no clear evidence of better performance than the random search method.
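To make the random-search baseline concrete, the following is a minimal sketch of the idea, not the procedure used in this paper. It makes several assumptions that the abstract does not state: the physical constants and failure limits are commonly used textbook cart-pole values, the single neuron is given a threshold output (a bang-bang force), integration is by a simple Euler step, weights are drawn uniformly from [-1, 1], and the starting state has a small tilt. The names `step`, `balance_steps`, and `random_search` are hypothetical.

```python
import math
import random

# Assumed constants (common textbook cart-pole values, not taken from the source).
GRAVITY = 9.8                     # m/s^2
MASS_CART = 1.0                   # kg
MASS_POLE = 0.1                   # kg
HALF_LENGTH = 0.5                 # m, pivot to pole centre of mass
FORCE_MAG = 10.0                  # N, magnitude of the bang-bang force
DT = 0.02                         # s, Euler integration step
X_LIMIT = 2.4                     # m, track half-width
THETA_LIMIT = math.radians(12.0)  # rad, failure angle


def step(state, force):
    """Advance the cart-pole by one Euler step under the given force."""
    x, x_dot, theta, theta_dot = state
    total_mass = MASS_CART + MASS_POLE
    pole_mass_length = MASS_POLE * HALF_LENGTH
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    temp = (force + pole_mass_length * theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        HALF_LENGTH * (4.0 / 3.0 - MASS_POLE * cos_t ** 2 / total_mass)
    )
    x_acc = temp - pole_mass_length * theta_acc * cos_t / total_mass
    return (
        x + DT * x_dot,
        x_dot + DT * x_acc,
        theta + DT * theta_dot,
        theta_dot + DT * theta_acc,
    )


def balance_steps(weights, state, max_steps=1000):
    """Count the steps survived by a single-neuron (linear threshold) controller."""
    for t in range(max_steps):
        x, _, theta, _ = state
        if abs(x) > X_LIMIT or abs(theta) > THETA_LIMIT:
            return t
        # Linear control law: weighted sum of the four state variables.
        activation = sum(w * s for w, s in zip(weights, state))
        force = FORCE_MAG if activation >= 0.0 else -FORCE_MAG
        state = step(state, force)
    return max_steps


def random_search(max_trials=1000, max_steps=1000, seed=0):
    """Draw random weight vectors until one balances for max_steps, or give up."""
    rng = random.Random(seed)
    start = (0.0, 0.0, 0.05, 0.0)  # assumed small initial tilt
    for trial in range(1, max_trials + 1):
        weights = [rng.uniform(-1.0, 1.0) for _ in range(4)]
        if balance_steps(weights, start, max_steps) == max_steps:
            return weights, trial
    return None, max_trials  # no successful controller within the budget


if __name__ == "__main__":
    weights, trials = random_search()
    print("weights:", weights, "trials used:", trials)
```

The number of trials reported by such a search gives a baseline against which a training method can be compared. Under these assumptions the sketch tests only one initial condition; a stricter evaluation would score each candidate weight vector over a range of starting states.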