Confidence Interval vs. Prediction Interval

A small but important difference that you should know

(Image by the author)

In many data science-related tasks, we want to know how certain we are about the result. Knowing how much we can trust a result helps us to make better decisions.

Once we have quantified the level of uncertainty that comes with a result, we can use it for:

  • scenario planning to evaluate a best-case and worst-case scenario
  • risk assessment to evaluate the impact on decisions
  • model evaluation to compare different models and model performance
  • communication with decision-makers about how much they should trust the results

Where does the uncertainty come from?

Let’s look at a simple example. We want to estimate the mean price of a 300-square-meter house in Germany. Collecting the data for all 300-square-meter houses is not feasible. Instead, we will calculate the mean price based on a representative subset.
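
To make the idea of estimating from a subset concrete, here is a minimal sketch in Python. It assumes a purely synthetic population of house prices: the normal distribution, the mean of 800,000, the spread, and the sample size of 200 are all made-up values for illustration, not real market data. The helper `sample_mean` is likewise hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical population: prices (in euros) of all 300-square-meter houses.
# The distribution and its parameters are invented for illustration only.
population = rng.normal(loc=800_000, scale=150_000, size=100_000)
population_mean = population.mean()


def sample_mean(population, sample_size, rng):
    """Mean price of a random subset drawn without replacement."""
    sample = rng.choice(population, size=sample_size, replace=False)
    return sample.mean()


# Draw several subsets; each one yields its own estimate of the mean.
estimates = [sample_mean(population, 200, rng) for _ in range(5)]

print(f"population mean: {population_mean:,.0f}")
for i, est in enumerate(estimates, start=1):
    print(f"sample {i} mean: {est:,.0f} (error: {est - population_mean:+,.0f})")
```

In this sketch every subset gives a slightly different mean, and none of them matches the population mean exactly. In practice we only ever see one such subset and never the population mean itself.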