Fairness and Explanation in Clustering and Outlier Detection

Ian Davidson

As machines move towards replacing humans in decision making, the need to make intelligent systems transparent (explainable and fair) becomes paramount. However, fairness and explanation remain understudied problems for unsupervised learning: a recent survey on explanation does not cover the topic, and the seminal papers on fairness appeared only in 2017. The work on outlier detection is even more recent, appearing only in the last year. The need for transparency in unsupervised learning is greater than in supervised learning, as the lack of supervision means there is no extrinsic measure to explain why a given model was chosen. Hence there is more room to be unfair and a greater demand for explanation. In this tutorial we will consider fairness and explanation for classic unsupervised learning methods that are used extensively in data mining. The majority of published work is for clustering, but we will also cover newer work on unsupervised outlier detection. We will cover both explanation and fairness from multiple perspectives. We begin with the philosophical, legal and ethical motivations of what we are trying to achieve with fairness and explanation. Then we move on to rigorous formal definitions of these problems and algorithmic solutions, along with their limitations. We then overview example applications and future work.