Enabling Equal Opportunity in Logistic Regression Algorithm
Research Question: This paper adjusts the logistic regression algorithm to mitigate unwanted discrimination based on sensitive attributes such as race and gender.
Motivation: Decades of research in algorithm design have been dedicated to building better predictive models. Many algorithms have been designed and improved to the point where they outperform the judgments of people, including experts. In recent years, however, it has been discovered that predictive models can discriminate in unwanted ways, and such discrimination can lead to legal consequences. To mitigate this problem, we propose enforcing equal opportunity between privileged and discriminated groups in the logistic regression algorithm.
Idea: Our idea is to add a regularization term to the objective function of logistic regression. The resulting model therefore addresses both the social problem and the predictive problem: it aims to provide predictions that are both fair and accurate.
Data: We use U.S. census data that describe individuals by their personal characteristics, with the goal of building a binary classification model that predicts whether an individual has an annual salary above $50k. This dataset is known for disparate impact against female individuals. In addition, we use the COMPAS dataset, which is aimed at predicting recidivism and is known to be biased against African-Americans.
Tools: We developed a novel regularization technique for equal opportunity in the logistic regression algorithm. The proposed regularization is compared against classical logistic regression and fairness-constrained logistic regression using ten-fold cross-validation.
Findings: The results suggest that equal opportunity logistic regression produces a fair prediction model. More specifically, our model improves both disparate impact and equal opportunity compared to classical logistic regression, with a minor loss in prediction accuracy. Compared to disparate impact constrained logistic regression, our approach achieves higher prediction accuracy and equal opportunity, while having a lower disparate impact. Comparing the coefficients of our approach with those of classical logistic regression shows that the coefficients of proxy attributes are reduced to very low values.
Contribution: The main contribution of this paper is methodological. More specifically, we incorporated equal opportunity into the logistic regression algorithm.
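The abstract does not state the exact form of the regularization term, so the following is only a minimal sketch under assumptions, not the method used in the paper. It assumes that equal opportunity is approximated by a smooth surrogate: the squared difference between the mean predicted probabilities of the two groups, computed over truly positive instances only. The names `soft_tpr_gap`, `loss_and_grad`, `fit`, and the penalty weight `lam` are hypothetical, and plain gradient descent stands in for whatever optimizer the authors used.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_tpr_gap(p, y, s):
    """Difference in mean predicted probability among true positives
    (y == 1) between group s == 1 and group s == 0. This is an assumed
    smooth surrogate for the difference in true positive rates."""
    pos = y == 1
    return p[pos & (s == 1)].mean() - p[pos & (s == 0)].mean()

def loss_and_grad(w, X, y, s, lam):
    """Log-loss plus lam * (soft TPR gap)^2, with its gradient in w."""
    p = sigmoid(X @ w)
    eps = 1e-12  # guards against log(0)
    log_loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    grad_log_loss = X.T @ (p - y) / len(y)

    # Gradient of the gap: d sigmoid(x.w) / dw = p * (1 - p) * x
    a = (y == 1) & (s == 1)
    b = (y == 1) & (s == 0)
    dp = p * (1 - p)
    grad_gap = (X[a] * dp[a][:, None]).mean(axis=0) \
             - (X[b] * dp[b][:, None]).mean(axis=0)

    gap = soft_tpr_gap(p, y, s)
    loss = log_loss + lam * gap ** 2
    grad = grad_log_loss + 2.0 * lam * gap * grad_gap
    return loss, grad

def fit(X, y, s, lam=0.0, lr=0.5, n_iter=2000):
    """Plain gradient descent; lam = 0 gives classical logistic regression."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        _, grad = loss_and_grad(w, X, y, s, lam)
        w -= lr * grad
    return w
```

In this sketch, `X` is a feature matrix with an intercept column, `y` holds the binary labels, and `s` is a binary sensitive attribute that is used only inside the penalty, not as a feature. Increasing `lam` trades prediction accuracy for a smaller gap between the groups, which mirrors the accuracy and fairness trade-off reported in the findings.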
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.