β-Multivariational Autoencoder for Entangled Representation Learning in Video Frames