Predicting Red Wine Quality with Using Machine Learning
- Create Date 18 July 2025
- Last Updated 18 July 2025
G. Manoj Kumar, D. Satyanarayana
Assistant Professor, MCA Final Semester, Master of Computer Applications,
Sanketika Vidya Parishad Engineering College, Vishakhapatnam, Andhra Pradesh, India
Abstract:
This project focuses on predicting wine quality with machine learning techniques, specifically for red wine samples. The dataset, sourced from the UCI Machine Learning Repository, contains physicochemical features such as acidity, sugar content, chlorides, and alcohol level. These features are analyzed and visualized to identify trends and correlations with the wine quality ratings. Exploratory data analysis includes statistical summaries, bar plots, and a correlation heatmap to show how the variables relate to one another and to the target label.
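The exploratory step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the data frame is a small synthetic stand-in (the column names follow the UCI red wine table, but every value and the toy relationship between the features and quality are assumptions made for the example).

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the red wine table (NOT real measurements).
rng = np.random.default_rng(0)
n = 500
alcohol = rng.normal(10.4, 1.0, n)
volatile_acidity = rng.normal(0.53, 0.18, n)
chlorides = rng.normal(0.087, 0.04, n)

# Assumed toy relationship: quality rises with alcohol and falls with
# volatile acidity; chlorides are left unrelated to quality.
score = (5.6 + 0.6 * (alcohol - 10.4)
         - 2.0 * (volatile_acidity - 0.53)
         + rng.normal(0, 0.6, n))

wine = pd.DataFrame({
    "alcohol": alcohol,
    "volatile acidity": volatile_acidity,
    "chlorides": chlorides,
    "quality": np.clip(np.round(score), 3, 8).astype(int),
})

summary = wine.describe()   # per-column count, mean, std, quartiles
corr = wine.corr()          # Pearson correlation matrix (the data behind a heatmap)

# Rank the features by their correlation with the quality rating.
quality_corr = corr["quality"].drop("quality").sort_values(ascending=False)
print(quality_corr)
```

With the real dataset, `wine` would instead be loaded from the downloaded CSV file, and `corr` could be passed to a plotting routine such as seaborn's `heatmap` to draw the correlation heatmap.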
The quality attribute is binarized to turn the task into a binary classification problem: wines with a quality score of 7 or higher are labeled high quality, and all others are labeled low quality. The dataset is then split into training and testing subsets so that the model is evaluated on data it has not seen during training. A Random Forest Classifier is employed because of its robustness and its ability to handle feature interactions effectively. The model is trained on the training subset and evaluated on the held-out test set, where it achieves a notable level of accuracy.
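A minimal sketch of this pipeline, assuming scikit-learn, is shown below. Only the quality threshold of 7 comes from the abstract. The synthetic data, the 80/20 split, the stratification, the number of trees, and the random seeds are illustrative assumptions, so the printed accuracy says nothing about the accuracy reported in the paper.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in features (NOT real measurements); names follow the UCI table.
rng = np.random.default_rng(42)
n = 600
X = pd.DataFrame({
    "alcohol": rng.normal(10.4, 1.0, n),
    "volatile acidity": rng.normal(0.53, 0.18, n),
    "sulphates": rng.normal(0.66, 0.17, n),
})

# Assumed toy rule that generates an integer quality score from the features.
score = (5.6 + 0.8 * (X["alcohol"] - 10.4)
         - 2.0 * (X["volatile acidity"] - 0.53)
         + rng.normal(0, 0.5, n))
quality = np.clip(np.round(score), 3, 8).astype(int)

# Label binarization: 1 = high quality (score >= 7), 0 = low quality.
y = (quality >= 7).astype(int)

# Hold out 20% of the samples for testing (split ratio is an assumption);
# stratify keeps the high/low ratio the same in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {test_accuracy:.3f}")
```

Because high-quality wines are the minority class under a threshold of 7, accuracy alone can look good even for a model that rarely predicts the high class, so it is worth inspecting precision and recall as well.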
This project demonstrates the practical application of machine learning in the food and beverage industry, showing how data-driven approaches can support quality control and decision-making in wine production. Further enhancements such as hyperparameter tuning, feature engineering, or model comparison could be explored to improve predictive performance.
Wine quality assessment is traditionally performed by human experts through sensory analysis, which can be subjective and time-consuming. This study explores the application of machine learning techniques to predict the quality of red wine from its physicochemical properties. Using a publicly available dataset containing features such as acidity, sugar content, pH, and alcohol level, several supervised learning algorithms, including Linear Regression, Decision Trees, Random Forest, Support Vector Machines, and Gradient Boosting, were implemented and evaluated. The models were trained to classify wine quality on a scale typically ranging from 0 to 10. Performance was assessed using accuracy, precision, recall, and F1-score. Among the models tested, ensemble methods such as Random Forest and Gradient Boosting yielded the highest prediction accuracy.
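A comparison of this kind could be set up along the following lines. The sketch rests on several assumptions: the data come from scikit-learn's `make_classification` rather than the wine dataset (11 features are generated to match the number of physicochemical inputs in the UCI table), the target is binary rather than the 0 to 10 scale, and Logistic Regression is swapped in for Linear Regression because the latter does not output class labels. All hyperparameters are library defaults or arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced binary data standing in for the wine features.
X, y = make_classification(
    n_samples=800, n_features=11, n_informative=5,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale-sensitive models are wrapped in a pipeline with standardization.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    # Precision, recall, and F1 for the positive (minority) class.
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, pred, average="binary", zero_division=0
    )
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Any ranking printed for this synthetic data is incidental and should not be read as confirming or contradicting the result stated in the abstract.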