Disaster Response Systems and Data Integrity: Optimizing Crowdsourced Inputs with Simple Machine Learning
- Create Date 5 August 2025
- Last Updated 5 August 2025
Mohammed Bilal Ahmed¹
¹ Independent Researcher, Florida, USA
Abstract - In the chaotic aftermath of natural disasters, rapid and accurate situational awareness is essential for directing search-and-rescue efforts. Crowdsourced data (volunteered reports via social media, SMS, and ad hoc mobile applications) offers valuable ground-level insight but is often fraught with noise, duplication, and misinformation. High-complexity deep learning solutions can improve data quality but demand substantial computational resources and stable connectivity, constraints rarely met in field deployments. This paper introduces a lightweight pipeline that combines logistic regression and random forest classifiers with rule-based validation to improve the integrity of crowdsourced inputs at minimal computational cost. We evaluate the approach on the CrisisNLP tweet corpus and on a synthetic sensor dataset simulating flood-level readings. Results show a 22% increase in precision and a 0.18 improvement in F1 score over a heuristic baseline, while maintaining recall above 0.90. The offline-first design enables deployment on modest hardware and robust performance in connectivity-scarce environments. All code and datasets are publicly available for reproducibility and adaptation.
Keywords: disaster response, crowdsourcing, data integrity, logistic regression, random forest, offline computing
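As a rough illustration of the kind of pipeline the abstract describes, the sketch below chains a rule-based validation step with a small bag-of-words logistic regression. It is not the paper's released code: the toy reports, the function names (`rule_filter`, `train_logreg`, `predict_proba`), the duplicate and minimum-length rules, and the training settings are all assumptions made for this example, and the random forest stage is omitted for brevity. Only the Python standard library is used, in keeping with the offline-first constraint.

```python
import math
import re
from collections import Counter

# Hypothetical toy reports (NOT from CrisisNLP). Label 1 = actionable, 0 = noise.
TRAIN = [
    ("water rising fast on main street need rescue", 1),
    ("family trapped on roof need boat", 1),
    ("bridge flooded road closed need help", 1),
    ("thoughts and prayers for everyone", 0),
    ("win a free phone click here", 0),
    ("stay strong everyone sending love", 0),
]


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def rule_filter(reports, min_tokens=3):
    """Assumed validation rules: drop normalized duplicates and very short reports."""
    seen, kept = set(), []
    for text in reports:
        tokens = tokenize(text)
        key = " ".join(tokens)
        if key in seen or len(tokens) < min_tokens:
            continue
        seen.add(key)
        kept.append(text)
    return kept


def sigmoid(z):
    """Logistic function, with the input clamped to avoid overflow in exp."""
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, text):
    """Linear score of a report under bag-of-words token counts."""
    counts = Counter(tokenize(text))
    return bias + sum(weights[tok] * n for tok, n in counts.items())


def train_logreg(data, epochs=100, lr=0.5):
    """Fit logistic regression with plain stochastic gradient ascent on the log-likelihood."""
    weights, bias = Counter(), 0.0
    for _ in range(epochs):
        for text, label in data:
            err = label - sigmoid(score(weights, bias, text))
            bias += lr * err
            for tok, n in Counter(tokenize(text)).items():
                weights[tok] += lr * err * n
    return weights, bias


def predict_proba(model, text):
    """Estimated probability that a report is actionable."""
    weights, bias = model
    return sigmoid(score(weights, bias, text))


incoming = [
    "Need rescue, water rising on Main Street",
    "need rescue water rising on main street",  # duplicate after normalization
    "help",                                      # too short to validate
    "Click here to win a free phone",
]

validated = rule_filter(incoming)
model = train_logreg(TRAIN)
for report in validated:
    print(f"{predict_proba(model, report):.2f}  {report}")
```

Running the rules before the classifier means duplicates and fragments never reach the model, which keeps per-report cost low on modest hardware. In a fuller version, a tree ensemble could be scored alongside the linear model and the two outputs combined, but how the paper combines them is not stated in the abstract.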