Predicting Airline Delays Using XGBoost and Random Forest
- Create Date 18 July 2025
- Last Updated 18 July 2025
1CH. VASUNDHARA, 2DUMMU NAVEEN
1Assistant Professor, 2MCA Final Semester,
1Master of Computer Applications,
1Sanketika Vidya Parishad Engineering College, Vishakhapatnam, Andhra Pradesh, India
Abstract:
Flight delays pose considerable challenges for both passengers and airlines, resulting in financial burdens, operational inefficiencies, and reduced customer satisfaction. In response to these issues, we present a robust predictive modeling framework that leverages a rich set of historical flight data to accurately forecast potential delays. Our dataset includes features such as scheduled and actual departure times, carrier information, weather conditions, flight routes, and historical delay patterns. We employ the XGBoost regression algorithm, which is well-suited for handling high-dimensional, tabular data and capturing complex nonlinear relationships. Compared to traditional statistical models and baseline machine learning approaches, our XGBoost-based model demonstrates significantly improved predictive accuracy, as measured by metrics such as Mean Absolute Error (MAE) and ROC AUC (for classification-based delay thresholds). A comprehensive feature importance analysis highlights key variables influencing delay likelihood, including departure time of day, weather at origin and destination airports, airline carrier performance, and specific high-traffic routes. These insights not only improve model interpretability but also offer practical guidance for stakeholders in airline scheduling, resource allocation, and proactive passenger communication. Ultimately, our predictive system enables airlines to anticipate disruptions more effectively, streamline operational planning, reduce costs associated with delays, and enhance the overall passenger experience through more reliable and timely air travel services.
Index Terms: Machine Learning (ML), Supervised Learning, Semi-supervised Learning, XGBoost, Decision Trees, Random Forest, Logistic Regression
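The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes that the regression output (predicted delay in minutes) is scored with MAE directly, and that the same predictions are used as ranking scores against a binary "delayed" label for ROC AUC. The 15-minute cutoff, the function names, and the toy numbers are assumptions made for illustration; they are not taken from the paper and are not its results.

```python
# Illustrative sketch only: toy numbers, not data or results from the paper.

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted delay (minutes)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def roc_auc(labels, scores):
    """ROC AUC in its pairwise-ranking form.

    Returns the fraction of (delayed, on-time) pairs in which the delayed
    flight received the higher score; ties count as half.
    """
    pos = [s for is_pos, s in zip(labels, scores) if is_pos]
    neg = [s for is_pos, s in zip(labels, scores) if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Assumed cutoff: a flight counts as "delayed" at 15 minutes or more.
DELAY_THRESHOLD_MIN = 15

# Hypothetical actual and predicted delays, in minutes.
actual_delay = [0, 5, 22, 40, 3, 18]
predicted_delay = [2, 9, 8, 35, 1, 30]

mae = mean_absolute_error(actual_delay, predicted_delay)

# Binary labels from the actual delays; predicted minutes serve as scores.
labels = [d >= DELAY_THRESHOLD_MIN for d in actual_delay]
auc = roc_auc(labels, predicted_delay)

print(f"MAE: {mae:.2f} minutes")
print(f"ROC AUC at {DELAY_THRESHOLD_MIN}-minute threshold: {auc:.3f}")
```

The pairwise form of ROC AUC is quadratic in the number of flights, which is acceptable for a small illustration; on a full flight dataset a library implementation would be the more practical choice.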