Lead Scoring for CRM Systems Using Machine Learning: A Comparative Study of Classification Algorithms
Authors:
Gulamjilani Gulamfarid Shaikh
Department of Artificial Intelligence & Data Science
Parul Institute of Engineering & Technology
Faculty of Engineering & Technology, Parul University
Vadodara, Gujarat, India
Email: shaikhgulamjilani21@gmail.com
ABSTRACT
Lead scoring is a fundamental feature of modern Customer Relationship Management (CRM) platforms: it ranks prospects by their probability of conversion so that sales teams can prioritize follow-up actions. Traditional rule-based approaches, which rely purely on rules drawn from the literature, are subjective and can miss complex behaviours. This research proposes a data-driven lead scoring framework based on machine learning classification. Four supervised models, Logistic Regression (LR), K-Nearest Neighbours (KNN), Support Vector Machine (SVM) and Random Forest, were trained on a public lead dataset with 9,240 records. The models were evaluated using accuracy, precision, recall, F1-score and ROC-AUC. The Random Forest classifier outperformed the other models on the overall performance metrics (accuracy = 0.79, precision = 0.78, recall = 0.77, F1-score = 0.78 and ROC-AUC = 0.77). Feature importance analysis shows that Total Time Spent on Website, Lead Origin and Total Visits are the top predictors of lead conversion. Real-time deployment of the trained model was demonstrated through a web application built with Flask. This work was conducted as part of a B.Tech final semester internship at Enlighten Infosystems, an IT company developing CRM software.
Keywords: Lead Scoring, Customer Relationship Management, Machine Learning, Random Forest, Classification, Flask, Python, Predictive Analytics
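For readers unfamiliar with the evaluation metrics named in the abstract, the following minimal Python sketch shows how each one can be computed for a binary conversion label. It is an illustration only and not the study's evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions introduced here, and ROC-AUC is computed with the standard pairwise-ranking definition.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = converted)."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("at least one labelled example is required")
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random converted lead is scored
    higher than a random non-converted lead (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute ROC-AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not taken from the study's dataset).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]                    # 1 = lead converted
scores = [0.9, 0.4, 0.7, 0.3, 0.6, 0.2, 0.8, 0.1]    # predicted conversion probability
y_pred = [1 if s >= 0.5 else 0 for s in scores]      # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall and F1 depend on the chosen threshold, whereas ROC-AUC depends only on how the scores rank converted leads against non-converted ones, which is why the two kinds of metric can differ for the same model.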