Heatmap for Confusion Matrix in Python

One image can be worth thousands of words.

Image by author

Introduction

A confusion matrix is a convenient way to present the types of mistakes a machine learning model makes. It is an N by N grid of numbers, where N is the number of classes and the value in the [n, m] cell is the number of examples annotated with the n-th class that the model recognized as the m-th class. Correct predictions therefore sit on the diagonal, and every off-diagonal cell counts one specific kind of mistake. In this tutorial, I will focus on creating a confusion matrix and displaying it as a heatmap. The color palette encodes the size of each group, making it easy to notice similarities or significant differences in group sizes. This kind of visualization is handy when you deal with numerous categories.
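To make the [n, m] definition concrete, here is a minimal sketch in plain Python that counts the cells by hand. The function name `build_confusion_matrix` and the label lists are hypothetical and made up for illustration. The sketch assumes classes are encoded as integers from 0 to N-1.

```python
def build_confusion_matrix(y_true, y_pred, n_classes):
    """Count how often each annotated class n was recognized as class m."""
    # Start with an N x N grid of zeros.
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for actual, predicted in zip(y_true, y_pred):
        # Row index = annotated class, column index = recognized class.
        matrix[actual][predicted] += 1
    return matrix


# Artificial labels for a 3-class problem (not from any real model).
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

cm = build_confusion_matrix(y_true, y_pred, n_classes=3)
for row in cm:
    print(row)
# [1, 1, 0]
# [0, 2, 0]
# [1, 0, 2]
```

Reading the result: the cell [2, 0] holds 1 because one example annotated as class 2 was recognized as class 0, while the diagonal (1, 2, 2) counts the correct predictions.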

Here is a visual explanation of the elements of the confusion matrix.

Image by author

Please remember that the data used to demonstrate confusion matrices is artificial and does not represent any real classification model.

Now, I will explain step by step how to generate such a confusion matrix and its heatmap using Python modules.

Python bare minimum