Aaron Birnbaum and Matthew Makansi

From Fuzzy to Precise: How a Morphological Feature Extractor Enhances AI’s Recognition Capabilities | Towards Data Science

Introduction: Can AI really distinguish dog breeds like human experts? One day while taking a walk, I saw a fluffy white puppy and wondered, “Is that a Bichon Frise or a Maltese?” No matter how closely I looked, they seemed almost identical. Huskies and Alaskan Malamutes, Shiba Inus and Akitas: I always found myself second-guessing. How do professional veterinarians and researchers spot the differences at a glance? What are they focusing on? 🤔 This question

Aaron Birnbaum and Matthew Makansi March 10, 2025

Experiments Illustrated: How Random Assignment Saved Us $1M in Marketing Spend | Towards Data Science

Running cool experiments is easily one of my favorite parts of working in data science. Most experiments don’t deliver big wins, so the winners make for fun stories. We’ve had a few of these at IntelyCare, and I’m sharing each story in a way that highlights a concept related to experimentation. In this post, I’ll share a story about how we avoided doing something stupid by running an experiment first, and using it to

Aaron Birnbaum and Matthew Makansi March 10, 2025

Experiments Illustrated: How We Optimized Premium Listings on Our Nursing Job Board | Towards Data Science

Running experiments is a task that often falls to data scientists. If that’s you, congrats! It can be a rewarding and high-impact area of work, but it also requires tools found outside the typical ML-heavy data science curriculum. Even with the best tools, only a small share of experiments deliver meaningful business value. I’ve been lucky to design and execute many experiments, and a few of them were winners. From these, I’m sharing some stories to

Aaron Birnbaum and Matthew Makansi March 10, 2025

LettuceDetect: A Hallucination Detection Framework for RAG Applications | Towards Data Science

Originally published on HuggingFace. TL;DR: We present LettuceDetect, a lightweight hallucination detector for Retrieval-Augmented Generation (RAG) pipelines. It is an encoder-based model built on ModernBERT, released under the MIT license with ready-to-use Python packages and pretrained models. What: LettuceDetect is a token-level detector that flags unsupported segments in LLM answers. 🥬 How: Trained on RAGTruth (18k examples), leveraging ModernBERT for context lengths up to 4k tokens. 🚀 Why: It addresses (1) the context-window limits in

Aaron Birnbaum and Matthew Makansi March 10, 2025

Custom Training Pipeline for Object Detection Models | Towards Data Science

What if you want to write the whole object detection training pipeline from scratch, so you can understand each step and customize it? That’s what I set out to do. I examined several well-known object detection pipelines and designed one that best suits my needs and tasks. I drew on the Ultralytics, YOLOx, DAMO-YOLO, RT-DETR and D-FINE repos to gain a deeper understanding of various design details. I ended up implementing SoTA