Exploiting CXL-based Memory for Distributed Deep Learning