Sum-Product-Attention Networks: Leveraging Self-Attention in Energy-Based Probabilistic Circuits