Forecasting subway demand in large-scale networks: a deep learning approach