Department Lecture

Towards a Subgroup-Informed Approach to AI and ML

In ML and AI, model behavior is traditionally characterized in terms of overall performance, using metrics such as accuracy, false-positive or false-negative rates, rank correlation, and more. In this talk, we develop a subgroup-informed approach, providing tools and concepts for investigating model behavior and data behavior at the level of subgroups.

We present DivExplorer, a tool for discovering and characterizing data subgroups that behave anomalously, either in their data characteristics or in the performance of an ML model on that data. DivExplorer is implemented as a Python package that can analyze any Pandas dataframe, and it is simple enough to be used by researchers and practitioners alike. On this basis, we show how a subgroup-informed approach can be used to investigate ML model performance, to guide data acquisition for model improvement, to analyze model fairness, to detect distribution shifts at the subgroup level, and more.
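To make the idea of subgroup-level analysis concrete, the following is a minimal sketch in plain pandas. It is not DivExplorer's API and not the speaker's implementation: the function `subgroup_divergence`, its parameters, and the toy dataframe are all hypothetical, and the exhaustive `groupby` enumeration stands in for whatever pattern-mining strategy the actual tool uses. It assumes a per-row outcome column (here `error`, 1 when the model misclassified the row) and ranks attribute-value subgroups by how far their mean outcome departs from the overall mean.

```python
from itertools import combinations

import pandas as pd


def subgroup_divergence(df, attributes, outcome, min_support=0.1, max_len=2):
    """Rank attribute-value subgroups by divergence of their mean outcome.

    Hypothetical helper for illustration only. Assumes no attribute
    column is itself named "mean" or "size".
    """
    n = len(df)
    overall = df[outcome].mean()
    rows = []
    # Enumerate every combination of up to `max_len` attributes.
    for k in range(1, max_len + 1):
        for cols in combinations(attributes, k):
            grouped = (
                df.groupby(list(cols))[outcome]
                .agg(["mean", "size"])
                .reset_index()
            )
            for rec in grouped.to_dict("records"):
                support = rec["size"] / n
                if support < min_support:
                    continue  # skip subgroups too small to be meaningful
                rows.append(
                    {
                        "subgroup": ", ".join(f"{c}={rec[c]}" for c in cols),
                        "support": support,
                        "rate": rec["mean"],
                        "divergence": rec["mean"] - overall,
                    }
                )
    # Largest absolute divergence first.
    return (
        pd.DataFrame(rows)
        .sort_values("divergence", key=lambda s: s.abs(), ascending=False)
        .reset_index(drop=True)
    )


# Toy data (made up): one row per prediction, error=1 means misclassified.
df = pd.DataFrame(
    {
        "age": ["young"] * 4 + ["old"] * 4,
        "sex": ["F", "F", "M", "M", "F", "F", "M", "M"],
        "error": [1, 1, 0, 0, 0, 0, 0, 1],
    }
)

result = subgroup_divergence(df, ["age", "sex"], "error", min_support=0.25)
print(result)
```

In this toy example the overall error rate hides the fact that one intersectional subgroup is misclassified far more often than the rest, which is the kind of behavior that aggregate metrics alone cannot reveal.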