Addressing Racial Bias in Facial Emotion Recognition
A Fan, X Xiao, P Washington - arXiv preprint arXiv:2308.04674, 2023 - arxiv.org
Fairness in deep learning models trained with high-dimensional inputs and subjective labels remains a complex and understudied area. In facial emotion recognition, datasets are often racially imbalanced, which can lead to models that yield disparate outcomes across racial groups. This study analyzes racial bias by sub-sampling training sets with varied racial distributions and assessing test performance across these simulations. Our findings indicate that smaller datasets with posed faces improve on both fairness and performance metrics as the simulations approach racial balance. Notably, both the F1-score and demographic parity increase on average across the simulations. However, in larger datasets with greater facial variation, fairness metrics generally remain constant, suggesting that racial balance by itself is insufficient to achieve parity in test performance across different racial groups.
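The sub-sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the helper names `subsample_by_group` and `demographic_parity_ratio`, the `"group"` key, and the min/max-ratio form of demographic parity (where 1.0 means equal prediction rates) are all assumptions made for illustration, and the abstract does not state which definition the paper uses.

```python
import random
from collections import defaultdict


def subsample_by_group(samples, proportions, n, seed=0):
    """Draw about n samples so each group's share follows `proportions`.

    `samples` is a list of dicts with a hypothetical "group" key;
    `proportions` maps group name -> target share (shares should sum to 1).
    Raises ValueError if a group has fewer samples than requested.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for sample in samples:
        by_group[sample["group"]].append(sample)

    subset = []
    for group, share in proportions.items():
        k = round(n * share)  # number of samples to take from this group
        subset.extend(rng.sample(by_group[group], k))
    rng.shuffle(subset)
    return subset


def demographic_parity_ratio(groups, predictions, positive_label):
    """Ratio of the lowest to the highest per-group rate of predicting
    `positive_label` (assumed definition; 1.0 means perfect parity)."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for group, pred in zip(groups, predictions):
        counts[group][0] += int(pred == positive_label)
        counts[group][1] += 1
    rates = [pos / total for pos, total in counts.values()]
    return min(rates) / max(rates) if max(rates) > 0 else 1.0
```

One simulation would then train a model on `subsample_by_group(train, {...}, n)` for a chosen racial distribution and evaluate F1 and the parity ratio on a fixed test set; sweeping the proportions toward balance gives the kind of comparison the abstract describes.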