Speech Emotion Recognition Using Deformable Convolutional Neural Networks

Create Date 17 May 2025
Last Updated 17 May 2025

A. Pramod Reddy1, Kura Abhiram2, Kunchem Rakesh3, M. Sai Kiran4, M Naresh5

1-5 Department of CSE, TKR College of Engineering & Technology

2-5 B.Tech Students

ABSTRACT

Speech Emotion Recognition (SER) enhances human-computer interaction by enabling machines to detect and respond to emotional cues in speech. This project proposes a deep learning-based SER system built on Deformable Convolutional Neural Networks (DCNNs), which dynamically adjust their receptive fields to capture nuanced speech patterns that standard CNNs, with their fixed sampling grid, often miss. It draws on three benchmark datasets (RAVDESS, CREMA-D, and TESS), which together provide a rich and diverse emotional speech corpus. Preprocessed audio is transformed into MFCCs and Mel-spectrograms, which are stacked to form a dual-channel input for the DCNN. The model accurately classifies eight core emotions across varied speakers and recording conditions. Results show that DCNNs significantly outperform conventional CNNs, highlighting their potential in applications such as virtual assistants, mental health tools, and customer service systems.

Keywords: SER, DCNN, TESS, CREMA-D, MFCC
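
The abstract says that a deformable convolution adjusts its receptive field instead of sampling a fixed grid. The sketch below illustrates that idea for a single output value of a 3x3 kernel: each kernel tap is shifted by a (dy, dx) offset and the input is read at the resulting fractional position with bilinear interpolation. It is not the authors' implementation; the abstract gives no architecture details, so the function names (`bilinear_sample`, `deformable_conv_at`), the toy 5x5 feature map, the averaging kernel, and the hand-picked offsets are all assumptions made for illustration. In a trained network the offsets would be predicted from the input by an additional convolution rather than supplied by hand.

```python
import math


def bilinear_sample(feature, y, x):
    """Read a 2-D list at fractional (y, x); positions outside the map count as 0."""
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    # Blend the four integer neighbours, weighted by their distance to (y, x).
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                weight = (1.0 - abs(y - yi)) * (1.0 - abs(x - xi))
                value += weight * feature[yi][xi]
    return value


def deformable_conv_at(feature, kernel, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred at (cy, cx).

    offsets[i][j] is a (dy, dx) pair added to the regular grid position of
    kernel tap (i, j). With all offsets at zero this reduces to an ordinary
    3x3 convolution (cross-correlation) at that position.
    """
    total = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            y = cy + (i - 1) + dy
            x = cx + (j - 1) + dx
            total += kernel[i][j] * bilinear_sample(feature, y, x)
    return total


# Hypothetical toy input standing in for one channel of a spectrogram patch.
feature = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]  # simple averaging kernel

zero_offsets = [[(0.0, 0.0)] * 3 for _ in range(3)]
shifted_offsets = [[(0.0, 0.5)] * 3 for _ in range(3)]  # half a column to the right

regular = deformable_conv_at(feature, kernel, zero_offsets, 2, 2)
deformed = deformable_conv_at(feature, kernel, shifted_offsets, 2, 2)
print(regular, deformed)
```

With zero offsets the result is the plain 3x3 average around the centre (about 12.0 for this input); shifting every tap half a column to the right moves the sampled window and raises the average to about 12.5. Libraries such as torchvision provide a batched, learnable version of this operation, which would be the practical choice for a dual-channel MFCC and Mel-spectrogram input.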
