Building a RAG chain using LangChain Expression Language (LCEL)

Learning the building blocks of LCEL to develop increasingly complex RAG chains

In this post, I will be going over the implementation of a self-evaluation RAG pipeline for question answering using LangChain Expression Language (LCEL). The focus of this post will be on the use of LCEL for building pipelines and not so much on the actual RAG and self-evaluation principles used, which are kept simple for ease of understanding.

I will be covering the following topics:

  1. Basic initialization steps
  2. Development of different variations of the RAG pipeline of increasing complexity using LCEL
  3. Methods for extracting intermediate variables from an LCEL-scripted pipeline
  4. Reasons for using LCEL

The Setup

Before we jump into the development of the RAG chain, there are some basic setup steps that we need to perform first. These include:

Data Ingestion

The data ingestion consists of two key steps (a short sketch follows the list):

  1. Reading the text from the PDF
  2. Splitting the PDF text into chunks for insertion into the vector database
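A minimal sketch of these two steps, assuming the PDF is read with PyPDFLoader and chunked with RecursiveCharacterTextSplitter (the file path and chunk sizes here are placeholders, not the post’s exact setup):

from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Read the text from the PDF (path is a placeholder)
docs = PyPDFLoader("document.pdf").load()

# 2. Split the PDF text into overlapping chunks for the vector database
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
splits = text_splitter.split_documents(docs)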

Prompt Templates

We will be using different prompts for the question-answering and self-evaluation tasks. There will be three different prompt templates (illustrative versions are sketched after this list):

  1. qa_prompt: Basic prompt for the question-answering task
  2. qa_eval_prompt: Prompt for the evaluator model that takes a question-answer pair as input
  3. qa_eval_prompt_with_context: Same as the above prompt, but additionally includes the context for the evaluation
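The templates could be defined along these lines; the wording below is a stand-in, not the post’s exact prompts:

from langchain_core.prompts import PromptTemplate

qa_prompt = PromptTemplate.from_template(
    "Answer the question using only the context below.\n"
    "Context: {context}\nQuestion: {question}\nAnswer:"
)

qa_eval_prompt = PromptTemplate.from_template(
    "Is this answer to the question correct? Reply with a JSON object, "
    'either {{"response": "Correct"}} or {{"response": "Incorrect"}}.\n'
    "Question: {question}\nAnswer: {answer}"
)

qa_eval_prompt_with_context = PromptTemplate.from_template(
    "Given the context, is this answer to the question correct? Reply with "
    'a JSON object, either {{"response": "Correct"}} or {{"response": "Incorrect"}}.\n'
    "Context: {context}\nQuestion: {question}\nAnswer: {answer}"
)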

Database Initialization

We initialize a simple vector database using FAISS and OpenAI embeddings. For retrieval, we set k to 3 (return the top 3 chunks for a given query).
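A minimal version of this initialization, reusing the splits produced during ingestion:

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Build the vector database from the PDF chunks
vectorstore = FAISS.from_documents(splits, OpenAIEmbeddings())

# Retriever that returns the top 3 chunks for a given query
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})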

RAG Development

Simple QA RAG

We start off with an example of a basic RAG chain that carries out the following steps:

  1. Retrieves the relevant chunks (splits of the PDF text) from the vector database based on the user’s question and merges them into a single string
  2. Passes the retrieved context text along with the question to the prompt template to generate the prompt
  3. Passes the generated prompt to the LLM to generate the final answer

Using LangChain Expression Language (LCEL), this RAG chain would be implemented as follows:

rag_chain = (
    RunnableParallel(context=retriever | format_docs, question=RunnablePassthrough())
    | qa_prompt
    | llm
)
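The chain reuses the retriever, qa_prompt, and llm objects defined during setup; the one new piece is a small format_docs helper that merges the retrieved chunks into a single string. A plausible definition, plus how the chain would be invoked:

def format_docs(docs):
    # Merge the retrieved Document objects into one context string
    return "\n\n".join(doc.page_content for doc in docs)

answer = rag_chain.invoke("your question about the PDF here")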

The above code primarily follows the pipe architecture, where the output of each element is used as the input to the next. The diagram below showcases the flow of data: starting from the user’s input, it passes first through the RunnableParallel block, then through qa_prompt to generate the prompt, which is then sent to the LLM to generate the final output.

Basic LCEL input/output flow

There are two key additions to this pipeline that are unique to LangChain:

  1. RunnableParallel: As the name suggests, this class provides the functionality to run multiple processes in parallel. As a result, the output of a RunnableParallel is a dict whose keys are the arguments provided during its initialization. In this case, the output has two keys: context and question.
    So why do we need this here? It’s required because the qa_prompt template expects two input values: the context and the question. We therefore compute these values individually and then pass them together to the qa_prompt template.
  2. RunnablePassthrough: This is a useful class when you want to pass the input through to the next stage without any modification. Essentially, it acts as an identity function that returns whatever is passed as its input. A tiny standalone example of both classes follows this list.
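The example below makes the behaviour concrete; note that the keys of the output dict match the keyword arguments used at initialization:

from langchain_core.runnables import RunnableParallel, RunnablePassthrough

demo = RunnableParallel(question=RunnablePassthrough(), shout=lambda x: x.upper())
print(demo.invoke("hi"))  # {'question': 'hi', 'shout': 'HI'}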

The flowchart for the above RAG chain would look like this:

QA RAG with Self Evaluation I

Building on the previous RAG chain, we now introduce new elements into the chain to implement the self-evaluation component.

The self-evaluation component is again a pretty straightforward implementation: we take the answer provided by the first LLM, pass it to the evaluator LLM along with the question, and ask it to provide a binary response (Correct/Incorrect).

from operator import itemgetter

rag_chain = (
    RunnableParallel(context=retriever | format_docs, question=RunnablePassthrough())
    | RunnableParallel(answer=qa_prompt | llm | retrieve_answer, question=itemgetter("question"))
    | qa_eval_prompt
    | llm_selfeval
    | json_parser
)
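Two helpers appear here for the first time: retrieve_answer, which presumably extracts the answer string from the chat model’s message, and json_parser, which parses the evaluator’s JSON reply. Minimal stand-ins could look like this:

from langchain_core.output_parsers import JsonOutputParser

def retrieve_answer(message):
    # Extract the plain answer text from the model's AIMessage
    return message.content

json_parser = JsonOutputParser()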

The first key difference is the addition of a second RunnableParallel component. This is required because, like the initial QA prompt, the self-eval prompt also requires two inputs: the base LLM’s answer as well as the user’s question.

So the output of the first RunnableParallel is the context text and the question, while the output of the second RunnableParallel is the LLM’s answer along with the question.

NOTE: For the second RunnableParallel, we use the itemgetter function to retain only the question value from the previous input and propagate it forward. This is done instead of using RunnablePassthrough, which would pass on the full input (a dict with two keys), whereas we are only interested in passing on the question, not the context. Additionally, there is an issue of formatting: qa_eval_prompt expects a dict with str -> str mappings, but using RunnablePassthrough would result in a str -> dict mapping.

The flowchart for this RAG implementation would look like this:

QA RAG with Self Evaluation II

For this variation, we make a change to the evaluation procedure. In addition to the question-answer pair, we also pass the retrieved context to the evaluator LLM.

To accomplish this, we add an additional itemgetter function in the second RunnableParallel to collect the context string and pass it to the new qa_eval_prompt_with_context prompt template.

rag_chain = (
    RunnableParallel(context=retriever | format_docs, question=RunnablePassthrough())
    | RunnableParallel(
        answer=qa_prompt | llm | retrieve_answer,
        question=itemgetter("question"),
        context=itemgetter("context"),
    )
    | qa_eval_prompt_with_context
    | llm_selfeval
    | json_parser
)

Implementation flowchart:

Retrieving intermediate variables

One of the common pain points with a chain implementation like LCEL is the difficulty of accessing intermediate variables, which is important for debugging pipelines. We look at a few options for accessing any intermediate variables we are interested in through manipulations of the LCEL chain.

Using RunnableParallel to carry forward intermediate outputs

As we saw earlier, RunnableParallel allows us to carry multiple arguments forward to the next step in the chain. We can use this ability to carry the required intermediate values all the way to the end of the chain.

In the example below, we modify the original self-eval RAG chain to output the retrieved context text along with the final self-evaluation output. The primary change is that we add a RunnableParallel object at every step of the process to carry forward the context variable.

Additionally, we use the itemgetter function to clearly specify the inputs for the subsequent steps. For example, in the last two RunnableParallel objects, we use itemgetter("input") to ensure that only the input argument from the previous step is passed on to the LLM/JSON parser objects.

rag_chain = (
    RunnableParallel(context=retriever | format_docs, question=RunnablePassthrough())
    | RunnableParallel(
        answer=qa_prompt | llm | retrieve_answer,
        question=itemgetter("question"),
        context=itemgetter("context"),
    )
    | RunnableParallel(input=qa_eval_prompt, context=itemgetter("context"))
    | RunnableParallel(input=itemgetter("input") | llm_selfeval, context=itemgetter("context"))
    | RunnableParallel(input=itemgetter("input") | json_parser, context=itemgetter("context"))
)

The output from this chain is a dict with two keys: input, which holds the final self-evaluation result from the JSON parser, and context, which holds the retrieved context text.

A more concise variation:

rag_chain = (
    RunnableParallel(context=retriever | format_docs, question=RunnablePassthrough())
    | RunnableParallel(
        answer=qa_prompt | llm | retrieve_answer,
        question=itemgetter("question"),
        context=itemgetter("context"),
    )
    | RunnableParallel(input=qa_eval_prompt | llm_selfeval | json_parser, context=itemgetter("context"))
)

Using global variables to save intermediate steps

This method essentially uses the principle of a logger: we introduce a function that saves its input to a global variable, thus giving us access to the intermediate value through that global variable.

context = None

def save_context(x):
    # Stash the intermediate value in the global variable, then pass it through unchanged
    global context
    context = x
    return x

rag_chain = (
    RunnableParallel(context=retriever | format_docs | save_context, question=RunnablePassthrough())
    | RunnableParallel(answer=qa_prompt | llm | retrieve_answer, question=itemgetter("question"))
    | qa_eval_prompt
    | llm_selfeval
    | json_parser
)

Here we define a global variable called context and a function called save_context that saves its input value to the global context variable before returning the same input. In the chain, we add the save_context function as the last element of the context retrieval step.

This option allows you to access any intermediate step without making major changes to the chain.

Accessing intermediate variables using global variables

Using callbacks

Attaching callbacks to your chain is another common method for logging intermediate variable values. There’s a lot to cover on the topic of callbacks in LangChain, so I will be covering it in detail in a different post.

Why use LCEL?

The reasons for using LCEL are best explained by the authors of LangChain themselves in their official documentation.

Of the points mentioned in the documentation, the following are some that I find especially useful:

  1. Input and output schemas: I will be covering this in detail in a different post
  2. Async support: As we move towards production applications, async functionality becomes more important. LCEL pipelines allow for a seamless transition to async operations (see the one-liner after this list)
  3. Optimized parallel execution
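For the async point, every LCEL chain exposes async counterparts of its methods (ainvoke, astream, abatch), so the same chain can be awaited without modification:

# Inside an async function; the chain itself is unchanged
result = await rag_chain.ainvoke("your question here")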

Beyond these reasons, as a matter of personal preference, I feel that using LCEL helps improve the readability of your code and allows for cleaner implementations.

Resources

Full code notebook

PDF text

Images: All images are created by the author

In addition to Medium, I share my thoughts, ideas and other updates on LinkedIn.