Building RAGs Without A Retrieval Model Is a Terrible Mistake

Here are my favorite techniques — one is faster, the other is more accurate.


I build RAG apps; it’s fun!

But the apps I build don’t do well in production. They’re promising prototypes, but they never go live!

The culprit is almost always the retrieval. Come on, this is the heart of RAGs. What are we supposed to build without this?

That was the story until I started indexing documents for faster or better retrieval.

Indexing helps us engineer solutions that retrieve data faster. It significantly reduces latency, improving the overall app experience. We use indexing in almost every app we build, and in that general sense it has nothing to do with LLMs or RAGs.

Almost all databases ship with indexing support. For instance, Postgres supports B-Tree, GiST, SP-GiST, BRIN, GIN, and Hash indexes. That list is long enough to deserve a separate post of its own.
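To make the idea of a plain database index concrete, here is a minimal sketch. It uses Python's built-in SQLite module as a stand-in because it runs without a server; it is not Postgres, and the `docs` table, its columns, and the `idx_docs_author` index name are made up for illustration. The point is only to show how the query plan changes once an index exists.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for a statement as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, notused, detail); the detail column is the readable part.
    return " | ".join(row[3] for row in rows)


# Hypothetical table of documents, held in memory for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, author TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO docs (author, body) VALUES (?, ?)",
    [(f"author_{i % 100}", f"document {i}") for i in range(10_000)],
)

lookup = "SELECT id FROM docs WHERE author = ?"

# Without an index, SQLite has to scan the whole table.
plan_before = query_plan(conn, lookup, ("author_7",))

# SQLite builds a B-tree index here; other databases offer more index types.
conn.execute("CREATE INDEX idx_docs_author ON docs (author)")

# With the index, the same query becomes an index search.
plan_after = query_plan(conn, lookup, ("author_7",))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan varies between SQLite versions, but the first plan should report a full scan and the second should mention `idx_docs_author`. In Postgres you would reach for `CREATE INDEX` and `EXPLAIN` in much the same way.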

In this post, I’ll discuss the popular indexing strategies I frequently use for better document retrieval. These techniques are, however, specific to RAG apps. You’ll see why in a moment.

My two go-to indexing techniques are multi-representation and ColBERT. These aren’t the only methods we have. And it’s…