Degendering Resumes for Fair Algorithmic Resume Screening

P Parasurama, J Sedoc - arXiv preprint arXiv:2112.08910, 2021 - arxiv.org
We investigate whether it is feasible to remove gendered information from resumes to mitigate potential bias in algorithmic resume screening. Using a corpus of 709k resumes from IT firms, we first train a series of models to classify the self-reported gender of the applicant, thereby measuring the extent and nature of gendered information encoded in resumes. We then conduct a series of gender obfuscation experiments, in which we iteratively remove gendered information from resumes. Finally, we train a resume screening algorithm and investigate the trade-off between gender obfuscation and screening performance.

Results show: (1) Resumes encode a significant amount of gendered information. (2) A lexicon-based gender obfuscation method (i.e., removing tokens that are predictive of gender) can remove much of this gendered information; however, beyond a certain point the performance of the resume screening algorithm begins to suffer. (3) General-purpose gender debiasing methods for NLP models, such as removing the gender subspace from embeddings, are not effective at obfuscating gender.
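The embedding-debiasing baseline that the paper finds ineffective can also be sketched: identify a gender direction in embedding space (e.g., from paired vectors such as "he" minus "she") and project it out of each word vector, in the style of hard debiasing. The vectors below are toy values chosen for the example, not real embeddings.

```python
# Hypothetical sketch of removing the gender subspace from embeddings
# (hard-debiasing style). Toy 3-d vectors stand in for real embeddings.
import numpy as np

def remove_subspace(v, g):
    """Project v onto the orthogonal complement of gender direction g."""
    g = g / np.linalg.norm(g)
    return v - np.dot(v, g) * g

# Toy gender direction from a difference of paired vectors ("he" - "she").
he = np.array([1.0, 0.2, 0.0])
she = np.array([-1.0, 0.2, 0.0])
g = he - she

w = np.array([0.5, 0.3, 0.7])
w_debiased = remove_subspace(w, g)
# w_debiased now has zero component along the gender direction,
# while its other components are unchanged.
```

The abstract's point is that zeroing out this linear subspace is not enough to obfuscate gender in resumes: a classifier can still recover gender from the remaining (non-linear or distributed) signal.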