Deep Face Gen: Speech-Driven Face Image Synthesis
- Create Date 25 February 2025
- Last Updated 25 February 2025
P. Kamakshi Thai¹, P. Manisha², L. Abhigyna Reddy³ and M. Nagendhar Reddy⁴
¹ Assistant Professor, Department of CSE (AI&ML), ACE Engineering College.
²,³,⁴ Students, Department of CSE (AI&ML), ACE Engineering College.
Abstract:
This paper proposes a framework based on Generative Adversarial Networks (GANs) for synthesizing facial images from audio input. The system is intended to translate large volumes of audio into intelligible facial images automatically, without human intervention. The model derives image features from audio waveforms and uses them to reconstruct facial images. It is trained on a labeled dataset so that each generated face corresponds to the identity of the speaker. The method achieves a reported accuracy of 96.88% on ungrouped data and 93.91% on grouped data. These results indicate that the approach can generate accurate facial representations from audio, offering an automated way to convert speech into visual data.
Keywords: Generative Adversarial Networks, facial image synthesis, audio-to-image, speech-to-visual, automated reconstruction.
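The pipeline summarized in the abstract (audio waveform to features, features to a face image, adversarial training against a discriminator) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the audio features, network layers, or loss, so the per-frame log-energy features, the single linear layer per network, the 8x8 image size, and the standard non-saturating GAN loss below are all assumptions chosen to keep the example short and dependency-free. All function names are hypothetical.

```python
import math
import random

def frame_log_energy(waveform, frame_len):
    """Split a waveform into non-overlapping frames; return log-energy per frame.
    Assumption: a stand-in for whatever audio features the paper actually uses."""
    feats = []
    for start in range(0, len(waveform) - frame_len + 1, frame_len):
        frame = waveform[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        feats.append(math.log(energy + 1e-8))
    return feats

def init_layer(rng, n_out, n_in):
    """Randomly initialize one linear layer (weights, bias)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def linear(params, x):
    """Apply a linear layer to a flat input vector."""
    weights, bias = params
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def generator(params, audio_feats, noise):
    """Map audio features plus noise to a flat image with pixels in [-1, 1]."""
    return [math.tanh(v) for v in linear(params, audio_feats + noise)]

def discriminator(params, image, audio_feats):
    """Return the probability that `image` is a real face for this audio."""
    logit = linear(params, image + audio_feats)[0]
    return 1.0 / (1.0 + math.exp(-logit))

def gan_losses(d_real, d_fake, eps=1e-8):
    """Standard GAN losses: discriminator loss and non-saturating generator loss."""
    d_loss = -(math.log(d_real + eps) + math.log(1.0 - d_fake + eps))
    g_loss = -math.log(d_fake + eps)
    return d_loss, g_loss

rng = random.Random(0)

# Synthetic 1-second tone standing in for a speech recording (8000 samples).
waveform = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(8000)]
audio_feats = frame_log_energy(waveform, frame_len=1000)  # 8 features

noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
g_params = init_layer(rng, n_out=64, n_in=len(audio_feats) + len(noise))
d_params = init_layer(rng, n_out=1, n_in=64 + len(audio_feats))

fake_image = generator(g_params, audio_feats, noise)    # flat 8x8 "face"
real_image = [rng.uniform(-1.0, 1.0) for _ in range(64)]  # placeholder for a real face

d_real = discriminator(d_params, real_image, audio_feats)
d_fake = discriminator(d_params, fake_image, audio_feats)
d_loss, g_loss = gan_losses(d_real, d_fake)
print(f"d_loss={d_loss:.4f} g_loss={g_loss:.4f}")
```

The sketch performs a single forward pass only. In an actual training loop, the discriminator parameters would be updated to lower `d_loss` and the generator parameters to lower `g_loss`, typically with a deep learning framework and convolutional networks rather than the single linear layers used here.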