Vulgar Comment Classification Using BERT-Based Models: A Comprehensive Study
- Create Date 20 May 2025
- Last Updated 20 May 2025
Authors:
Sourabh Kumar, Sudhir Kumar
Abstract: The proliferation of user-generated content on social media has led to an upsurge in vulgar and offensive comments, posing significant challenges for online platforms. This research presents a comprehensive investigation of BERT-based models for the classification of vulgar comments. We systematically explore data preprocessing, model architectures, training strategies, and ensemble methods. Our experiments, conducted on benchmark datasets such as OLID and Jigsaw, demonstrate that BERT-based ensembles outperform traditional and standalone deep learning models, achieving up to 94.7% accuracy. The study also provides insights into the detection of implicit toxicity and the adaptation of models to evolving online language.