AMPeD: An Analytical Model for Performance in Distributed Training of Transformers