Coincidence time resolution (CTR) in modern time-of-flight (TOF) PET scanners is limited by properties of the detector system, namely scintillator size and material, as well as the single-photon time resolution (SPTR) and photon detection efficiency of the readout pixels. Recent studies have shown that incorporating depth-of-interaction (DOI) information into TOF readout improves CTR. Acquiring multiple timestamps per event has also been shown to improve CTR through estimation of the leading-edge slope. We propose using convolutional neural networks (CNNs) to improve CTR in PET modules with excellent DOI resolution and access to multiple timestamps per event. Monte Carlo simulations were used to generate PET-like data for 4-to-1 coupled, single-ended readout, depth-encoding modules with $1.5 \times 1.5 \times 20$ mm$^3$ LYSO crystals and prismatoid light guide arrays. We trained our CNN to perform TOF estimation between two simulated modules with and without DOI information, and with single and multiple timestamps. The results show that using DOI and multiple timestamps provides more than 50% improvement in CTR over standard single-timestamp acquisition when sufficient SPTR (<100 ps) is available. Our CNN demonstrates the importance of multiple timestamps and excellent DOI resolution for improving CTR in practical depth-encoding PET modules.
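
To make the leading-edge slope idea concrete, the following is a minimal sketch and not the CNN method proposed here. It assumes each event provides timestamps at several discriminator thresholds (the threshold values, units, and the function names `extrapolate_arrival_time` and `ctr_fwhm` are hypothetical), fits a straight line to the leading edge, and extrapolates to zero amplitude. CTR is then reported as the FWHM of the time-difference distribution, here under a Gaussian assumption.

```python
import numpy as np


def extrapolate_arrival_time(thresholds, timestamps):
    """Estimate pulse arrival time from multiple leading-edge timestamps.

    Fits timestamp = slope * threshold + intercept and returns the intercept,
    i.e. the time at which the fitted leading edge crosses zero amplitude.
    Assumes a locally linear leading edge (an illustrative simplification).
    """
    slope, intercept = np.polyfit(thresholds, timestamps, 1)
    return intercept


def ctr_fwhm(time_differences):
    """CTR as the FWHM of coincidence time differences.

    Assumes the distribution is Gaussian, so FWHM = 2*sqrt(2*ln 2) * sigma.
    """
    sigma = np.std(time_differences)
    return 2.0 * np.sqrt(2.0 * np.log(2.0)) * sigma


if __name__ == "__main__":
    # Hypothetical thresholds (arbitrary units) and timestamps (ps)
    # lying exactly on a line with a 100 ps zero-crossing.
    thresholds = np.array([1.0, 2.0, 3.0])
    timestamps = 100.0 + 5.0 * thresholds
    print(extrapolate_arrival_time(thresholds, timestamps))
```

For the noise-free example above, the extrapolated arrival time recovers the 100 ps zero-crossing up to floating-point error. With a single timestamp per event no such slope correction is possible, which is one intuition for why multiple timestamps can help.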