Triple-E: Efficient, Emergent, and Explainable Debiasing for AI Ethics
Aahan Arora
Chitkara University
aahan1107.be22@chitkara.edu.in
Abstract. Algorithmic bias in machine learning models is a growing problem as AI systems are increasingly used in decision-making. In this paper, we introduce Triple-E, a debiasing algorithm for visual classifiers that does not require sensitive attribute labels. The algorithm combines subnetwork discovery through trimming, an emergent fairness signal, and post-hoc interpretability using Sparse Autoencoders. In experiments on the CelebA and UTKFace datasets, Triple-E achieves higher fairness than retraining-based methods, with comparable accuracy and computational cost.
Keywords: Algorithmic Fairness · Debiasing · Explainable AI · Sparse Autoencoders · Neural Network Trimming