Machine Learning Models for Cannabis Strain Rating With EDA
- Create Date 11 May 2025
- Last Updated 11 May 2025
Authors:
Harsh Bherwani, Joel Paul Madhavan, Nesar K S
Project Guide - Dr Ravi Lanke
Abstract:
This project investigates the application of machine learning models to predict user ratings of cannabis strains from features such as flavor, effects, and strain type. Using a dataset of 2,351 strains sourced from Kaggle, the study compares the predictive accuracy of three regression models: Linear Regression, Random Forest, and XGBoost. After preprocessing and feature engineering, including one-hot encoding and CountVectorizer transformations, the models were trained and then evaluated using Mean Squared Error (MSE), R-squared (R²), and Mean Absolute Error (MAE). XGBoost emerged as the best-performing model with an MAE of 0.2591, suggesting that it is well suited to capturing complex, non-linear relationships between strain features and ratings. The findings highlight the utility of predictive modeling in the cannabis industry for enhancing user satisfaction and developing personalized strain recommendations.
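For readers unfamiliar with the three evaluation metrics named in the abstract, the following is a minimal sketch of how MSE, MAE, and R² are computed from actual and predicted ratings. It is not the authors' code: the function names are illustrative, and the rating values are hypothetical numbers invented for this example, not results from the study.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the squared residuals; large errors are penalized heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute residual, expressed in the same units as the rating."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Share of the variance in the true ratings explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - ss_res / ss_tot


# Hypothetical ratings, invented for illustration only (not taken from the dataset).
actual = [4.0, 4.5, 3.5, 5.0]
predicted = [4.2, 4.4, 3.9, 4.7]

mse = mean_squared_error(actual, predicted)
mae = mean_absolute_error(actual, predicted)
r2 = r_squared(actual, predicted)

print(f"MSE={mse:.4f}  MAE={mae:.4f}  R2={r2:.4f}")
```

On these made-up values the MAE is 0.25, meaning the predictions are off by a quarter of a rating point on average. The reported MAE of 0.2591 for XGBoost can be read the same way, in the units of the strain rating.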