Comparative Analysis of Machine Learning Algorithms for Email Spam Detection using TF-IDF

Published 16 April 2026

Updated 16 April 2026

Patinavalasa Durga Prasad*1, Pathri Deepthi Sri2, Doddi Praveen Kumar3, Sayyed Akbar Alisha4, Vasireddi Saran Manikanta5, Suneel Kumar Duvvuri6

1Student, M.Sc (Computer Science), Government College (Autonomous), Rajahmundry, Andhra Pradesh, India.

2Student, B.Sc (Artificial Intelligence), Government College (Autonomous), Rajahmundry, Andhra Pradesh, India.

3Student, B.Sc (Artificial Intelligence), Government College (Autonomous), Rajahmundry, Andhra Pradesh, India.

4Student, B.Sc (Artificial Intelligence), Government College (Autonomous), Rajahmundry, Andhra Pradesh, India.

5Student, B.Sc (Artificial Intelligence), Government College (Autonomous), Rajahmundry, Andhra Pradesh, India.

6Assistant Professor, Department of Computer Science, Government College (Autonomous), Rajahmundry, Andhra Pradesh, India.

Abstract – Spam email detection has become a critical challenge in modern communication systems due to the increasing volume of unwanted and malicious emails. This research presents a comparative analysis of multiple machine learning algorithms for efficient spam classification. The study uses a labeled dataset of spam and ham messages, which is preprocessed and transformed using Term Frequency–Inverse Document Frequency (TF-IDF) vectorization. Five machine learning algorithms, namely Gaussian Naive Bayes, K-Nearest Neighbors (KNN), Decision Tree, Random Forest, and Support Vector Machine (SVM), are implemented and evaluated. The dataset is split into training and testing sets, and performance is measured using accuracy and the confusion matrix.
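The workflow summarized in the abstract can be sketched in Python with scikit-learn. This is a minimal illustration, not the authors' implementation. The toy corpus, the helper name `compare_classifiers`, the 75/25 split, the linear SVM kernel, `n_neighbors=3`, and the random seeds are all assumptions made for the example. The sketch fits the TF-IDF vocabulary on the training split only, which is a common way to avoid leaking test data and may differ from the order of steps used in the study.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy corpus standing in for the labeled dataset (1 = spam, 0 = ham).
messages = [
    "Win a free prize now, click here",
    "Congratulations, you won a cash reward",
    "Claim your free lottery ticket today",
    "Urgent: your account needs verification, click link",
    "Cheap loans approved instantly, apply now",
    "Exclusive offer, buy one get one free",
    "You have been selected for a free gift card",
    "Limited time discount on all products, act now",
    "Are we still meeting for lunch tomorrow?",
    "Please send me the project report by Friday",
    "Can you call me when you get home?",
    "The lecture has been moved to room 204",
    "Thanks for your help with the assignment",
    "I will be late to the office today",
    "Let us review the notes before the exam",
    "Happy birthday, hope you have a great day",
]
labels = [1] * 8 + [0] * 8


def compare_classifiers(texts, y, test_size=0.25, seed=42):
    """Train five classifiers on TF-IDF features; return accuracy and confusion matrix per model."""
    # Stratified split keeps the spam/ham ratio equal in both sets.
    train_txt, test_txt, y_train, y_test = train_test_split(
        texts, y, test_size=test_size, random_state=seed, stratify=y
    )

    # Learn the vocabulary and IDF weights from the training texts only.
    vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
    # GaussianNB requires dense input, so the sparse matrices are densified.
    X_train = vectorizer.fit_transform(train_txt).toarray()
    X_test = vectorizer.transform(test_txt).toarray()

    models = {
        "Gaussian Naive Bayes": GaussianNB(),
        "KNN": KNeighborsClassifier(n_neighbors=3),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "SVM": SVC(kernel="linear"),
    }

    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predicted = model.predict(X_test)
        results[name] = (
            accuracy_score(y_test, predicted),
            # Rows are true classes, columns are predicted classes, ordered [ham, spam].
            confusion_matrix(y_test, predicted, labels=[0, 1]),
        )
    return results


if __name__ == "__main__":
    for name, (accuracy, matrix) in compare_classifiers(messages, labels).items():
        print(f"{name}: accuracy={accuracy:.2f}")
        print(matrix)
```

On a corpus this small the reported accuracies carry no meaning; the sketch only shows how the same split and the same features can be shared across all five models so that their scores are directly comparable.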