Addressing Data Heterogeneity in Federated Learning: A Comparative Study of FedAvg and FedProx under IID and Non-IID Scenarios

Create Date 2 September 2025
Last Updated 2 September 2025

Mr. Santosh Kumar Metta 1, Mrs. A Tulasi 2

1 Department of CS&SE, AUCE, [Andhra University]

2 Department of CS&SE, AUCE, [Andhra University]

1 M.Tech Student, Computer Science and Systems Engineering, Andhra University College of Engineering (A), Andhra University, Visakhapatnam.

2 Assistant Professor, Computer Science and Systems Engineering, Andhra University College of Engineering (A), Andhra University, Visakhapatnam.

Abstract: Federated Learning (FL) has emerged as a promising paradigm for privacy-preserving machine learning, in which data remains on the clients while each client contributes to a shared global model. Two of the most widely studied algorithms in this field are Federated Averaging (FedAvg) and Federated Proximal (FedProx). This paper presents a comparative study of FedAvg and FedProx under both Independent and Identically Distributed (IID) and Non-IID data scenarios. We use the EMNIST dataset (balanced split, 47 classes) with 40 simulated clients under IID and Dirichlet-based Non-IID partitioning. Our experiments show that FedAvg converges quickly and reaches competitive accuracy in IID settings, whereas FedProx, which adds a proximal regularizer to each client's local objective, trains more stably and performs better in Non-IID environments. Performance is assessed using accuracy, communication overhead, area under the convergence curve (AUC), and training time. The results indicate that FedAvg is the better choice for homogeneous data distributions, while FedProx is more suitable for real-world heterogeneous federated systems.

Keywords: Federated Learning, FedAvg, FedProx, IID, Non-IID, Data Heterogeneity
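
The two mechanisms named in the abstract, Dirichlet-based Non-IID partitioning and the FedProx proximal regularizer, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`dirichlet_partition`, `fedprox_step`, `weighted_average`), the concentration parameter `alpha`, the proximal coefficient `mu`, the learning rate `lr`, and the synthetic labels are all assumptions made for illustration. The sketch assumes the standard FedProx local objective F_k(w) + (mu/2)||w - w_global||^2 and size-weighted averaging on the server.

```python
import random
from collections import defaultdict


def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients, one class at a time.

    For each class, client shares are drawn from a symmetric
    Dirichlet(alpha). Smaller alpha gives more skewed (more Non-IID)
    label distributions; large alpha approaches an IID split.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    clients = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        # A Dirichlet sample is a set of normalized Gamma(alpha, 1) draws.
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(gammas)
        start, cum = 0, 0.0
        for c in range(num_clients):
            cum += gammas[c] / total
            if c == num_clients - 1:
                end = len(idxs)  # last client takes the remainder
            else:
                end = min(len(idxs), int(round(cum * len(idxs))))
            clients[c].extend(idxs[start:end])
            start = end
    return clients


def fedprox_step(w, w_global, grad, lr, mu):
    """One local gradient step on F_k(w) + (mu/2) * ||w - w_global||^2.

    `grad` is the gradient of the local loss F_k at w. The proximal term
    adds mu * (w - w_global), pulling the client toward the global model.
    With mu = 0 this is the plain local SGD step used by FedAvg.
    """
    return [wi - lr * (gi + mu * (wi - wgi))
            for wi, wgi, gi in zip(w, w_global, grad)]


def weighted_average(client_weights, client_sizes):
    """Server aggregation: average client models weighted by sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
            for j in range(dim)]


# Synthetic stand-in for 47-class labels (not real EMNIST data).
labels = [i % 47 for i in range(4700)]
parts = dirichlet_partition(labels, num_clients=40, alpha=0.5)
sizes = [len(p) for p in parts]
```

In this sketch the only difference between the two algorithms is the `mu * (wi - wgi)` term in the local update. Under a skewed partition, that term limits how far each client drifts from the global model between communication rounds.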