Samy Baladram

AI

Gradient Boosting Regressor, Explained: A Visual Guide with Code Examples

ENSEMBLE LEARNING: Fitting to errors one booster stage at a time

Samy Baladram · Published in Towards Data Science · 11 min read · 9 hours ago

Related: Decision Tree Regressor, Explained: A Visual Guide with Code Examples. Trimming branches smartly with Cost-Complexity Pruning (towardsdatascience.com)

Of course, in machine learning, we want our predictions spot on. We started with simple decision trees; they worked okay. Then came Random Forests and AdaBoost, which did

AdaBoost Classifier, Explained: A Visual Guide with Code Examples

ENSEMBLE LEARNING: Putting the weight where weak learners need it most

Samy Baladram · Published in Towards Data Science · 11 min read · 10 hours ago

Related: Random Forest, Explained: A Visual Guide with Code Examples. Making tree-mendous predictions with random trees (towardsdatascience.com)

Everyone makes mistakes, even the simplest decision trees in machine learning. Instead of ignoring them, the AdaBoost (Adaptive Boosting) algorithm does something different: it learns (or adapts) from these mistakes

Least Squares Regression, Explained: A Visual Guide with Code Examples for Beginners

REGRESSION ALGORITHM: Gliding through points to minimize squares

Samy Baladram · Published in Towards Data Science · 11 min read · 6 hours ago

When people start learning about data analysis, they usually begin with linear regression. There’s a good reason for this: it’s one of the most useful and straightforward ways to understand how regression works. The most common approaches to linear regression are called “Least Squares Methods”; these work

Data Leakage in Preprocessing, Explained: A Visual Guide with Code Examples

DATA PREPROCESSING: 10 sneaky ways your preprocessing pipeline leaks

Samy Baladram · Published in Towards Data Science · 13 min read · 3 hours ago

In my experience teaching machine learning, students often come to me with this same problem: “My model was performing great, with over 90% accuracy! But when I submitted it for testing on the hidden dataset, it is not as good now. What went wrong?” This situation almost always

Oversampling and Undersampling, Explained: A Visual Guide with Mini 2D Dataset

DATA PREPROCESSING: Artificially generating and deleting data for the greater good

Samy Baladram · Published in Towards Data Science · 9 min read · 15 hours ago

⛳️ More DATA PREPROCESSING, explained: · Missing Value Imputation · Categorical Encoding · Data Scaling · Discretization ▶ Oversampling & Undersampling

Collecting a dataset in which every class to predict has exactly the same number of samples can be a challenge. In reality, things are rarely perfectly

Discretization, Explained: A Visual Guide with Code Examples for Beginners

DATA PREPROCESSING: 6 fun ways to categorize numbers into bins!

Samy Baladram · Published in Towards Data Science · 10 min read · 2 hours ago

⛳️ More DATA PREPROCESSING, explained: · Missing Value Imputation · Categorical Encoding · Data Scaling ▶ Discretization · Over- & Under-sampling (soon!)

Most machine learning models require the data to be numerical; all object or categorical data has to be in numerical format first. But, actually,

Gaussian Naive Bayes, Explained: A Visual Guide with Code Examples for Beginners

CLASSIFICATION ALGORITHM: Bell-shaped assumptions for better predictions

Samy Baladram · Published in Towards Data Science · 9 min read · 2 hours ago

⛳️ More CLASSIFICATION ALGORITHM, explained: · Dummy Classifier · K Nearest Neighbor Classifier · Bernoulli Naive Bayes ▶ Gaussian Naive Bayes · Decision Tree Classifier · Logistic Regression · Support Vector Classifier · Multilayer Perceptron (soon!)

Building on our previous article about Bernoulli Naive Bayes, which handles binary data, we

K Nearest Neighbor Regressor, Explained: A Visual Guide with Code Examples

REGRESSION ALGORITHM: Finding the neighbors FAST with KD Trees and Ball Trees

Samy Baladram · Published in Towards Data Science · 11 min read · 8 hours ago

Related: K Nearest Neighbor Classifier, Explained: A Visual Guide with Code Examples for Beginners. The friendly neighbor approach to machine learning (towardsdatascience.com)

Building on our exploration of the Nearest Neighbor Classifier, let’s turn to its sibling in the regression world. The Nearest Neighbor Regressor applies the

Dummy Regressor, Explained: A Visual Guide with Code Examples for Beginners

REGRESSION ALGORITHM: Naively choosing the best number for all of your predictions

Samy Baladram · Published in Towards Data Science · 7 min read · 2 hours ago

My students often come to me saying that they want to try the most sophisticated model out there for their machine learning tasks, and sometimes I jokingly ask, “Have you tried the best ever model first?” Especially in regression

Encoding Categorical Data, Explained: A Visual Guide with Code Example for Beginners

DATA PREPROCESSING: Six ways of matchmaking categories and numbers

Samy Baladram · Published in Towards Data Science · 10 min read · 12 hours ago

Ah, categorical data: the colorful characters in our datasets that machines just can’t seem to understand. This is where “red” becomes 1, “blue” 2, and data scientists turn into language translators (or more like matchmakers?). Now, I know what you’re thinking: “Encoding? Isn’t that just assigning numbers
