TTOpt: A Maximum Volume Quantized Tensor Train-based Optimization and its Application to Reinforcement Learning