Feature Selection with Hierarchical Clustering for Interpretable Models

Create a short list of features using this statistical method (Python Tutorial)

In industry, a dataset can contain hundreds or even thousands of potential model features. Dimensionality reduction methods like PCA can shrink that number, but the resulting components are combinations of the original features, which makes them hard to explain. Thankfully, feature clustering can help you create a short list of the original features and, with it, an interpretable model.

We will:

  • Apply hierarchical clustering using Python
  • Explain the theory behind this method
  • Discuss its benefit over other clustering methods for feature selection

We end by building some intuition for how the method works using correlation heatmaps. You can also find the project on GitHub.
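To make the idea concrete before we start, here is a minimal sketch of feature clustering on synthetic data. It is not the project code. The feature names (a1, a2, a3, b1, b2), the use of 1 - |correlation| as the distance, the average linkage, the 0.5 cut-off and the rule of keeping the first feature in each cluster are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Synthetic data (assumption): two groups of strongly related features
rng = np.random.default_rng(0)
n = 500
base_a = rng.normal(size=n)
base_b = rng.normal(size=n)
X = pd.DataFrame({
    "a1": base_a + 0.1 * rng.normal(size=n),
    "a2": base_a + 0.1 * rng.normal(size=n),
    "a3": base_a + 0.1 * rng.normal(size=n),
    "b1": base_b + 0.1 * rng.normal(size=n),
    "b2": base_b + 0.1 * rng.normal(size=n),
})

# Turn absolute correlation into a distance: highly correlated -> close to 0
dist = 1 - X.corr().abs().to_numpy()
np.fill_diagonal(dist, 0.0)  # guard against floating point noise on the diagonal

# Hierarchical clustering expects a condensed (1-D) distance vector
condensed = squareform(dist, checks=False)
Z = linkage(condensed, method="average")

# Cut the tree at a distance threshold (0.5 is an arbitrary choice here)
labels = fcluster(Z, t=0.5, criterion="distance")

# Keep one representative per cluster (here simply the first member)
shortlist = []
for label in sorted(set(labels)):
    members = [col for col, lab in zip(X.columns, labels) if lab == label]
    shortlist.append(members[0])

print(dict(zip(X.columns, labels)))
print(shortlist)
```

In practice you would probably choose the representative with more care, for example by its relationship to the target variable, rather than taking the first feature in each group.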

You may also enjoy this video on the topic. And, if you want to learn more, check out my course, XAI with Python. You can get free access if you sign up to my newsletter.