Survey on Speech Emotion Recognition with Expressive Speech Synthesis
- Create Date 29 March 2025
- Last Updated 29 March 2025
Authors:
Abhimanue S¹, Dr. Jyothish K John²
¹Department of Computer Science and Engineering, Federal Institute of Science And Technology, Angamaly, 683 577, Kerala, India
²Department of Computer Science and Engineering, Federal Institute of Science And Technology, Angamaly, 683 577, Kerala, India
Abstract - Emotion plays a key role in identifying a person's state, that is, whether they are angry, sad, happy, and so on. This paper presents an integrated framework that recognizes emotions from speech, generates emotionally aware responses, and synchronizes facial expressions to deliver an animated video response. The system provides real-time, empathetic interaction for emotional support. It focuses on identifying the speaker's emotion, in particular whether the person is depressed or going through a hard time, so that it can use emotion-incorporated synthesis to help them overcome feelings of distress and isolation. The proposed system responds like a human and keeps the user company when no person is available to provide emotional support, acting as a companion or friend in a time of need. The framework combines Wav2Vec for deep learning-based speech emotion recognition, Coqui TTS for emotion-aware voice synthesis, and Neuro Sync for synchronized facial animation, rendered through Unreal Engine’s MetaHuman models. Unlike existing solutions that rely on handcrafted features or on separate, disconnected components, our framework is an end-to-end solution that achieves 74.506% emotion recognition accuracy while generating contextually appropriate emotional responses with synchronized facial expressions in real time. This multimodal approach differs from traditional methods by bridging emotion recognition and response generation, creating a more natural and empathetic human-computer interaction experience designed specifically for emotional support applications.
Key Words: Emotion recognition, emotion-incorporated synthesis, emotional support agent
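
The three stages named in the abstract (speech emotion recognition, emotion-aware synthesis, and facial animation) can be pictured as a simple chain of calls. The following is only a minimal sketch of that data flow, not the authors' implementation: every name in it (run_pipeline, AgentResponse, STYLE_FOR_EMOTION, and the stub_* functions) is hypothetical, the emotion-to-style mapping is an assumption, and the stubs stand in for the Wav2Vec classifier, Coqui TTS, and Neuro Sync without calling any of those libraries.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Assumed mapping from the user's detected emotion to a supportive speaking
# style for the reply. The abstract does not specify this mapping.
STYLE_FOR_EMOTION: Dict[str, str] = {
    "angry": "calm",
    "sad": "warm",
    "happy": "cheerful",
    "neutral": "neutral",
}

Frame = Dict[str, float]  # hypothetical per-frame facial control values


@dataclass
class AgentResponse:
    detected_emotion: str
    speaking_style: str
    reply_text: str
    audio: bytes
    face_frames: List[Frame]


def run_pipeline(
    waveform: Sequence[float],
    classify: Callable[[Sequence[float]], Dict[str, float]],
    compose_reply: Callable[[str], str],
    synthesize: Callable[[str, str], bytes],
    animate: Callable[[bytes], List[Frame]],
) -> AgentResponse:
    """Chain the stages: recognize emotion, reply, synthesize, animate."""
    scores = classify(waveform)
    # Take the highest-scoring label as the detected emotion.
    emotion = max(scores, key=lambda label: scores[label])
    style = STYLE_FOR_EMOTION.get(emotion, "neutral")
    text = compose_reply(emotion)
    audio = synthesize(text, style)
    frames = animate(audio)
    return AgentResponse(emotion, style, text, audio, frames)


# Stub stages (placeholders for the real models, which are not invoked here).
def stub_classify(waveform: Sequence[float]) -> Dict[str, float]:
    # Fixed scores standing in for a speech emotion classifier's output.
    return {"angry": 0.1, "sad": 0.7, "happy": 0.1, "neutral": 0.1}


def stub_compose_reply(emotion: str) -> str:
    return f"I hear that you are feeling {emotion}. I am here with you."


def stub_synthesize(text: str, style: str) -> bytes:
    # Returns tagged text as bytes in place of synthesized audio.
    return f"[{style}] {text}".encode("utf-8")


def stub_animate(audio: bytes) -> List[Frame]:
    # Derives a few dummy facial frames from the first bytes of the audio.
    return [{"jaw_open": (byte % 10) / 10.0} for byte in audio[:5]]
```

Passing the stages in as callables keeps the orchestration independent of any particular model, so each stub could later be swapped for a real recognizer, synthesizer, or animation driver without changing run_pipeline.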