Emotion Classification from Speech and Text in Videos Using a Multimodal Approach