How I Created a Data Science Project Following CRISP-DM Lifecycle

An end-to-end project using the CRISP-DM framework

22 min read

11 hours ago

Image created by AI. OpenAI. CRISP-DM Project Lifecycle Framework for Data Science. DALL-E, 2024.

Introduction

CRISP-DM stands for Cross-Industry Standard Process for Data Mining, a data mining framework open to anyone who wants to use it.

Its first version was created by SPSS, Daimler-Benz and NCR. Then, a group of companies developed and evolved it to CRISP-DM, which nowadays is one of the most known and adopted frameworks in data science.

The process consists of 6 phases, and it is flexible. It is more like a living organism where you can (and probably should) go back and forth between the phases, iterating and enhancing the results.

The phases are:

Business Understanding

Data Understanding

Data Preparation

Modeling

Evaluation

Deployment

The small arrows show a natural path from Business Understanding to Deployment—where the interactions occur directly—while the circle denotes a cyclic relationship between the phases. This means that the project does not end with Deployment but can be restarted due to new business questions triggered by the project or…