What to Do If the Logit Decision Boundary Fails?

Feature engineering for classification models using Bayesian Machine Learning

Logistic regression is arguably the most widely used machine learning model for binary classification problems. The model is relatively simple and rests on a key assumption: the existence of a linear decision boundary (a line, or a hyperplane in a higher-dimensional feature space) that separates the classes of the target variable y based on the features in the model.

In a nutshell, the decision boundary can be interpreted as a threshold at which the model assigns a data point to one class or the other, based on the predicted probability of belonging to each class.
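To make this concrete, here is a minimal sketch of the logistic model with two features. The weights and bias are made-up illustrative values, not fitted parameters: the point is that a data point lying exactly on the linear boundary (where the linear score is zero) receives a predicted probability of 0.5, the usual classification threshold.

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights and bias for a two-feature model (illustrative only).
w = np.array([1.5, -2.0])
b = 0.5

def predict_proba(x):
    """Predicted probability of class 1 for a feature vector x."""
    return sigmoid(w @ x + b)

# A point exactly on the decision boundary (w @ x + b == 0)
# receives probability 0.5 -- the classification threshold.
x_on_boundary = np.array([1.0, 1.0])  # 1.5*1 - 2.0*1 + 0.5 = 0
print(predict_proba(x_on_boundary))  # 0.5
```

Points on one side of the boundary get probabilities above 0.5 and are assigned to class 1; points on the other side fall below 0.5 and are assigned to class 0.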

The figure below presents a schematic representation of the decision boundary that separates the target variable into two classes. In this case the model uses two features (x1 and x2), and the target variable can be cleanly separated into two classes based on their values.
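A situation like the one in the figure can be reproduced with a few lines of scikit-learn on synthetic data (the dataset and cluster centers here are invented for illustration). Two well-separated clusters stand in for a cleanly separable target, and the fitted coefficients define the linear boundary:

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Synthetic two-feature dataset with two well-separated classes.
X, y = make_blobs(n_samples=200, centers=[[-2, -2], [2, 2]],
                  cluster_std=0.8, random_state=0)

clf = LogisticRegression().fit(X, y)

# The fitted line w1*x1 + w2*x2 + b = 0 is the decision boundary.
w1, w2 = clf.coef_[0]
b = clf.intercept_[0]
print(f"boundary: {w1:.2f}*x1 + {w2:.2f}*x2 + {b:.2f} = 0")
print("training accuracy:", clf.score(X, y))
```

On data this cleanly separable, the linear boundary classifies essentially every training point correctly, which is exactly the regime where logistic regression shines.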

However, in your daily modeling work, the situation might look more like the figure below.