How to Use HyDE for Better LLM RAG Retrieval

Building an advanced local LLM RAG pipeline with hypothetical document embeddings

Implementing HyDE is very simple in Python. Image by the author

Large Language Models (LLMs) can be improved by giving them access to external knowledge through documents.

The basic Retrieval Augmented Generation (RAG) pipeline consists of a user query, an embedding model that converts text into embeddings (high-dimensional numerical vectors), a retrieval step that searches the embedding space for documents similar to the user query, and a generator LLM that uses the retrieved documents to produce an answer [1].
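
To make these four stages concrete, here is a minimal sketch of the retrieval step in Python. The sentence-transformers library, the all-MiniLM-L6-v2 model, and the toy corpus are illustrative assumptions, not choices prescribed by the article:

```python
# Minimal sketch of the basic RAG retrieval step.
# Assumptions: sentence-transformers is installed and the
# all-MiniLM-L6-v2 embedding model is an acceptable stand-in.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "HyDE generates a hypothetical document from the query before retrieval.",
    "Retrieval Augmented Generation grounds LLM answers in external documents.",
    "Embedding models map text into high-dimensional numerical vectors.",
]

# Embed the document corpus once, then embed the incoming user query.
doc_embeddings = embedder.encode(documents, convert_to_tensor=True)
query = "what is retrieval augmented generation?"
query_embedding = embedder.encode(query, convert_to_tensor=True)

# Retrieve the document closest to the query in embedding space.
scores = util.cos_sim(query_embedding, doc_embeddings)[0]
best_doc = documents[int(scores.argmax())]

# A generator LLM would now receive best_doc as context to answer the query.
print(best_doc)
```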

In practice, the RAG retrieval part is crucial. If the retriever does not find the correct document in the document corpus, the LLM has no chance to generate a solid answer.

One problem in the retrieval step is that the user query is often a very short question — with imperfect grammar, spelling, and punctuation — while the corresponding document is a long passage of well-written text that contains the information we want.

A query and the corresponding passage from the MS MARCO dataset, illustrating that query and document typically differ in length and format. Image by the author

HyDE is a proposed technique to improve the RAG retrieval step by converting the user question into a hypothetical document that answers it, and embedding that document instead of the raw query. Because the hypothetical document resembles the real passages in length and style, its embedding tends to land closer to the relevant documents than the embedding of the short, noisy question does.
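
The following sketch shows the HyDE idea end to end. The function generate_with_local_llm is a hypothetical placeholder for whatever local LLM call your pipeline uses, and the prompt wording and embedding model are assumptions for illustration:

```python
# Minimal HyDE sketch: generate a hypothetical answer document,
# embed it, and retrieve the real document closest to it.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")


def generate_with_local_llm(prompt: str) -> str:
    # Hypothetical stand-in for a local LLM call (e.g. via
    # llama-cpp-python or an OpenAI-compatible server). Returns a
    # canned passage here so the sketch runs without a model.
    return (
        "Retrieval Augmented Generation (RAG) grounds a language model's "
        "answers in documents fetched from an external corpus."
    )


def hyde_retrieve(query: str, documents: list[str]) -> str:
    # 1. Ask the LLM to write a hypothetical passage answering the query.
    hypothetical_doc = generate_with_local_llm(
        f"Write a short passage that answers the question: {query}"
    )
    # 2. Embed the hypothetical document instead of the raw query.
    hyde_embedding = embedder.encode(hypothetical_doc, convert_to_tensor=True)
    doc_embeddings = embedder.encode(documents, convert_to_tensor=True)
    # 3. Retrieve the real document closest to the hypothetical one.
    scores = util.cos_sim(hyde_embedding, doc_embeddings)[0]
    return documents[int(scores.argmax())]
```

Note that only the retrieval side changes: the generator LLM still answers from the real retrieved document, so a flawed hypothetical passage can hurt recall but cannot inject wrong facts into the final answer directly.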
