LLM Agents Demystified

Hands-on implementation with LightRAG library

Jul 14, 2024

Image source, credits to Growtika

LightRAG library: https://github.com/SylphAI-Inc/LightRAG
Colab notebook

“An autonomous agent is a system situated within and a part of an environment that senses that environment and acts on it, over time, in pursuit of its own agenda and so as to effect what it senses in the future.”

— Franklin and Graesser (1997)

Alongside the well-known RAG pipelines, agents [1] are another popular family of LLM applications. What makes agents stand out is their ability to reason, plan, and act via accessible tools. In LightRAG, the implementation comes down to a generator that can use tools and take multiple steps (sequential or parallel) to complete a user query.

What is a ReAct Agent?

We will first introduce ReAct [2], a general paradigm for building agents as a sequence of interleaved thought, action, and observation steps.

  • Thought: The reasoning behind taking an action.
  • Action: The action to take from a predefined set of actions. In particular, these are the tools (function tools) introduced in the tools section.
  • Observation: The simplest scenario is the execution result of the action in string format. To be more robust, this can be defined in any way that provides the right amount of execution information for the LLM to plan the next step.
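The thought, action, and observation cycle can be sketched in plain Python. This is a minimal illustration, not LightRAG's implementation: `scripted_planner` is a hypothetical stand-in for the LLM planner, and `TOOLS` and `run_react` are names invented for this sketch.

```python
from typing import Callable, Dict, List, Tuple

# One planned step: (thought, tool name, keyword arguments).
Step = Tuple[str, str, dict]


def add(a: int, b: int) -> int:
    return a + b


def finish(answer: str) -> str:
    return answer


TOOLS: Dict[str, Callable] = {"add": add, "finish": finish}


def scripted_planner(query: str, history: List[dict]) -> Step:
    """Hypothetical stand-in for an LLM: decides the next step from the history."""
    if not history:
        return ("I should add the two numbers.", "add", {"a": 2, "b": 3})
    last = history[-1]["observation"]
    return ("I have the result, so I can finish.", "finish", {"answer": f"The sum is {last}."})


def run_react(query: str, max_steps: int = 5) -> Tuple[str, List[dict]]:
    history: List[dict] = []
    for _ in range(max_steps):
        thought, action, kwargs = scripted_planner(query, history)  # Thought + Action
        observation = TOOLS[action](**kwargs)                       # Observation
        history.append({"thought": thought, "action": action, "observation": observation})
        if action == "finish":
            return observation, history
    return "Stopped: step limit reached.", history


answer, steps = run_react("What is 2 + 3?")
print(answer)
```

Each pass through the loop feeds the accumulated history back to the planner, which is what lets the next thought depend on the previous observation.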

Prompt and Data Models

DEFAULT_REACT_AGENT_SYSTEM_PROMPT is the default prompt for the ReAct agent’s LLM planner. We can categorize the prompt template into four parts:

  1. Task description

This part is the overall role setup and task description for the agent.

task_desc = r"""You are a helpful assistant.
Answer the user's query using the tools provided below with minimal steps and maximum accuracy.
Each step you will read the previous Thought, Action, and Observation(execution result of the action) and then provide the next Thought and Action."""

2. Tools, output format, and example

This part of the template mirrors how functions are called in the tools section. The output_format_str is generated from FunctionExpression via JsonOutputParser. It includes the output format itself and examples given as a list of FunctionExpression instances. The agent’s response uses the thought and action fields of FunctionExpression.

tools = r"""{% if tools %}
<TOOLS>
{% for tool in tools %}
{{ loop.index }}.
{{tool}}
------------------------
{% endfor %}
</TOOLS>
{% endif %}
{{output_format_str}}"""
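To make the thought/action response format concrete, here is a rough sketch of how such a response could be turned into a function name and keyword arguments. It relies only on Python's standard `json` and `ast` modules; `parse_action` is a hypothetical helper written for this illustration and is not how LightRAG parses a FunctionExpression.

```python
import ast
import json


def parse_action(action: str):
    """Parse 'name(k=v, ...)' into (name, kwargs) without executing any code."""
    node = ast.parse(action, mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError(f"Not a simple function call: {action!r}")
    if node.args:
        raise ValueError("This sketch only supports keyword arguments.")
    # literal_eval accepts only literals (numbers, strings, lists, ...), never calls.
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return node.func.id, kwargs


# A planner response in the thought/action shape described above.
raw = '{"thought": "Multiply first.", "action": "multiply(a=465, b=321)"}'
response = json.loads(raw)
name, kwargs = parse_action(response["action"])
print(name, kwargs)
```

Restricting arguments to literals is a deliberate choice in this sketch: the action string comes from a model, so it is treated as data to inspect rather than code to run.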

3. Task specification to teach the planner how to “think”.

We provide more detailed instructions to ensure the agent always ends with the ‘finish’ action to complete the task. Additionally, we teach it how to handle simple and complex queries.

  • For simple queries, we instruct the agent to finish with as few steps as possible.
  • For complex queries, we teach the agent a ‘divide-and-conquer’ strategy to solve the query step by step.
task_spec = r"""<TASK_SPEC>
- For simple queries: Directly call the ``finish`` action and provide the answer.
- For complex queries:
- Step 1: Read the user query and potentially divide it into subqueries. And get started with the first subquery.
- Call one available tool at a time to solve each subquery/subquestion.
- At step 'finish', join all subqueries answers and finish the task.
Remember:
- Action must call one of the above tools with name. It can not be empty.
- You will always end with 'finish' action to finish the task. The answer can be the final answer or failure message.
</TASK_SPEC>"""

We wrap these three parts together inside the <SYS></SYS> tags.

4. Agent step history.

We use StepOutput to record the agent’s step history, including:

  • action: This will be the FunctionExpression instance predicted by the agent.
  • observation: The execution result of the action.

In particular, we format the step history after the user query as follows:

step_history = r"""User query:
{{ input_str }}
{# Step History #}
{% if step_history %}
<STEPS>
{% for history in step_history %}
Step {{ loop.index }}.
"Thought": "{{history.action.thought}}",
"Action": "{{history.action.action}}",
"Observation": "{{history.observation}}"
------------------------
{% endfor %}
</STEPS>
{% endif %}
You:"""
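As a rough illustration of what the template above produces, the following sketch renders the same layout with plain Python string formatting instead of a template engine. The `Action` and `Step` dataclasses and `render_history` are simplified, hypothetical stand-ins for FunctionExpression and StepOutput, not the library's classes.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Action:
    thought: str
    action: str


@dataclass
class Step:
    action: Action
    observation: Any


def render_history(input_str: str, step_history: List[Step]) -> str:
    """Mirror the template: user query, optional <STEPS> block, then the 'You:' cue."""
    lines = ["User query:", input_str]
    if step_history:
        lines.append("<STEPS>")
        for index, history in enumerate(step_history, start=1):
            lines.append(f"Step {index}.")
            lines.append(f'"Thought": "{history.action.thought}",')
            lines.append(f'"Action": "{history.action.action}",')
            lines.append(f'"Observation": "{history.observation}"')
            lines.append("------------------------")
        lines.append("</STEPS>")
    lines.append("You:")
    return "\n".join(lines)


steps = [Step(Action("Multiply first.", "multiply(a=465, b=321)"), 149265)]
rendered = render_history("What is 465 times 321?", steps)
print(rendered)
```

On the first step the history is empty, so the planner sees only the user query followed by the "You:" cue.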

Tools

In addition to the tools provided by users, by default, we add a new tool named finish to allow the agent to stop and return the final answer.

def finish(answer: str) -> str:
    """Finish the task with answer."""
    return answer

Simply returning a string might not fit all scenarios, and we might consider allowing users to define their own finish function in the future for more complex cases.

Additionally, since the provided tools cannot always solve user queries, users can configure whether an LLM should be available as a tool to solve a subquery via the add_llm_as_fallback parameter. This LLM uses the same model client and model arguments as the agent’s planner. Here is our code to specify the fallback LLM tool:

_additional_llm_tool = (
    Generator(model_client=model_client, model_kwargs=model_kwargs)
    if self.add_llm_as_fallback
    else None
)

def llm_tool(input: str) -> str:
    """I answer any input query with llm's world knowledge. Use me as a fallback tool or when the query is simple."""
    # use the generator to answer the query
    try:
        output: GeneratorOutput = _additional_llm_tool(
            prompt_kwargs={"input_str": input}
        )
        response = output.data if output else None
        return response
    except Exception as e:
        log.error(f"Error using the generator: {e}")
        print(f"Error using the generator: {e}")
    return None

ReAct Agent

We define the class ReActAgent to put everything together. It will orchestrate two components:

  • planner: A Generator that works with a JsonOutputParser to parse the output format and examples of the function calls using FunctionExpression.
  • ToolManager: Manages a given list of tools, the finish function, and the LLM tool. It is responsible for parsing and executing the functions.

Additionally, it manages step_history as a list of StepOutput instances for the agent’s internal state.
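The tool-management role described above can be illustrated with a small registry. This is a hedged sketch only: `SimpleToolManager`, its `execute` method, and the choice to return errors as strings are assumptions made for illustration and do not reflect the actual ToolManager API.

```python
from typing import Any, Callable, Dict, List


def finish(answer: str) -> str:
    """Finish the task with answer."""
    return answer


class SimpleToolManager:
    """Hypothetical registry that maps tool names to callables and runs them."""

    def __init__(self, tools: List[Callable]):
        self.tools: Dict[str, Callable] = {fn.__name__: fn for fn in tools}
        # Always make a finish tool available so the agent can stop.
        self.tools.setdefault("finish", finish)

    def execute(self, name: str, kwargs: Dict[str, Any]) -> Any:
        """Run a tool by name; report failures as an observation instead of raising."""
        if name not in self.tools:
            return f"Error: unknown tool {name!r}"
        try:
            return self.tools[name](**kwargs)
        except Exception as e:
            return f"Error: {e}"


def divide(a: float, b: float) -> float:
    """Divide two numbers."""
    return float(a) / b


manager = SimpleToolManager([divide])
ok = manager.execute("divide", {"a": 1, "b": 4})
bad = manager.execute("divide", {"a": 1, "b": 0})
print(ok, bad)
```

Returning the error text as the observation, rather than raising, gives a planner the chance to read the failure and choose a different action on the next step.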

Its call method prompts the agent with an input query and processes the steps to generate a response.

Agent In Action

We will set up two models, llama3-70b-8192 served by Groq and gpt-3.5-turbo by OpenAI, and test each on two queries. For comparison, we also collect a vanilla LLM response to each query without the agent. Here are the code snippets:

from lightrag.components.agent import ReActAgent
from lightrag.core import Generator, ModelClientType, ModelClient
from lightrag.utils import setup_env

setup_env()

# Define tools
def multiply(a: int, b: int) -> int:
    """
    Multiply two numbers.
    """
    return a * b

def add(a: int, b: int) -> int:
    """
    Add two numbers.
    """
    return a + b

def divide(a: float, b: float) -> float:
    """
    Divide two numbers.
    """
    return float(a) / b

llama3_model_kwargs = {
    "model": "llama3-70b-8192",  # llama3 70b works better than 8b here.
    "temperature": 0.0,
}
gpt_model_kwargs = {
    "model": "gpt-3.5-turbo",
    "temperature": 0.0,
}

def test_react_agent(model_client: ModelClient, model_kwargs: dict):
    tools = [multiply, add, divide]
    queries = [
        "What is the capital of France? and what is 465 times 321 then add 95297 and then divide by 13.2?",
        "Give me 5 words rhyming with cool, and make a 4-sentence poem using them",
    ]
    # define a generator without tools for comparison
    generator = Generator(
        model_client=model_client,
        model_kwargs=model_kwargs,
    )
    react = ReActAgent(
        max_steps=6,
        add_llm_as_fallback=True,
        tools=tools,
        model_client=model_client,
        model_kwargs=model_kwargs,
    )
    # print(react)
    for query in queries:
        print(f"Query: {query}")
        agent_response = react.call(query)
        llm_response = generator.call(prompt_kwargs={"input_str": query})
        print(f"Agent response: {agent_response}")
        print(f"LLM response: {llm_response}")
        print("")

The structure of the ReActAgent, including its initialization arguments and its two major components, tool_manager and planner, is shown below.

ReActAgent(
max_steps=6, add_llm_as_fallback=True,
(tool_manager): ToolManager(Tools: [FunctionTool(fn: , async: False, definition: FunctionDefinition(func_name='multiply', func_desc='multiply(a: int, b: int) -> int\n\n Multiply two numbers.\n ', func_parameters={'type': 'object', 'properties': {'a': {'type': 'int'}, 'b': {'type': 'int'}}, 'required': ['a', 'b']})), FunctionTool(fn: , async: False, definition: FunctionDefinition(func_name='add', func_desc='add(a: int, b: int) -> int\n\n Add two numbers.\n ', func_parameters={'type': 'object', 'properties': {'a': {'type': 'int'}, 'b': {'type': 'int'}}, 'required': ['a', 'b']})), FunctionTool(fn: , async: False, definition: FunctionDefinition(func_name='divide', func_desc='divide(a: float, b: float) -> float\n\n Divide two numbers.\n ', func_parameters={'type': 'object', 'properties': {'a': {'type': 'float'}, 'b': {'type': 'float'}}, 'required': ['a', 'b']})), FunctionTool(fn: .llm_tool at 0x11384b740>, async: False, definition: FunctionDefinition(func_name='llm_tool', func_desc="llm_tool(input: str) -> str\nI answer any input query with llm's world knowledge. Use me as a fallback tool or when the query is simple.", func_parameters={'type': 'object', 'properties': {'input': {'type': 'str'}}, 'required': ['input']})), FunctionTool(fn: .finish at 0x11382fa60>, async: False, definition: FunctionDefinition(func_name='finish', func_desc='finish(answer: str) -> str\nFinish the task with answer.', func_parameters={'type': 'object', 'properties': {'answer': {'type': 'str'}}, 'required': ['answer']}))], Additional Context: {})
(planner): Generator(
model_kwargs={'model': 'llama3-70b-8192', 'temperature': 0.0},
(prompt): Prompt(
template:
{# role/task description #}
You are a helpful assistant.
Answer the user's query using the tools provided below with minimal steps and maximum accuracy.
{# REACT instructions #}
Each step you will read the previous Thought, Action, and Observation(execution result of the action) and then provide the next Thought and Action.
{# Tools #}
{% if tools %}

You available tools are:
{# tools #}
{% for tool in tools %}
{{ loop.index }}.
{{tool}}
------------------------
{% endfor %}

{% endif %}
{# output format and examples #}

{{output_format_str}}

{# Task specification to teach the agent how to think using 'divide and conquer' strategy #}
- For simple queries: Directly call the ``finish`` action and provide the answer.
- For complex queries:
- Step 1: Read the user query and potentially divide it into subqueries. And get started with the first subquery.
- Call one available tool at a time to solve each subquery/subquestion.
- At step 'finish', join all subqueries answers and finish the task.
Remember:
- Action must call one of the above tools with name. It can not be empty.
- You will always end with 'finish' action to finish the task. The answer can be the final answer or failure message.

-----------------
User query:
{{ input_str }}
{# Step History #}
{% if step_history %}

{% for history in step_history %}
Step {{ loop.index }}.
"Thought": "{{history.action.thought}}",
"Action": "{{history.action.action}}",
"Observation": "{{history.observation}}"
------------------------
{% endfor %}

{% endif %}
You:, prompt_kwargs: {'tools': ['func_name: multiply\nfunc_desc: "multiply(a: int, b: int) -> int\n\n Multiply two numbers.\n "\nfunc_parameters:\n type: object\n properties:\n a:\n type: int\n b:\n type: int\n required:\n - a\n - b\n', 'func_name: add\nfunc_desc: "add(a: int, b: int) -> int\n\n Add two numbers.\n "\nfunc_parameters:\n type: object\n properties:\n a:\n type: int\n b:\n type: int\n required:\n - a\n - b\n', 'func_name: divide\nfunc_desc: "divide(a: float, b: float) -> float\n\n Divide two numbers.\n "\nfunc_parameters:\n type: object\n properties:\n a:\n type: float\n b:\n type: float\n required:\n - a\n - b\n', "func_name: llm_tool\nfunc_desc: 'llm_tool(input: str) -> str\n\n I answer any input query with llm''s world knowledge. Use me as a fallback tool\n or when the query is simple.'\nfunc_parameters:\n type: object\n properties:\n input:\n type: str\n required:\n - input\n", "func_name: finish\nfunc_desc: 'finish(answer: str) -> str\n\n Finish the task with answer.'\nfunc_parameters:\n type: object\n properties:\n answer:\n type: str\n required:\n - answer\n"], 'output_format_str': 'Your output should be formatted as a standard JSON instance with the following schema:\n```\n{\n "thought": "Why the function is called (Optional[str]) (optional)",\n "action": "FuncName() Valid function call expression. Example: \"FuncName(a=1, b=2)\" Follow the data type specified in the function parameters.e.g. for Type object with x,y properties, use \"ObjectType(x=1, y=2) (str) (required)"\n}\n```\nExamples:\n```\n{\n "thought": "I have finished the task.",\n "action": "finish(answer=\"final answer: 'answer'\")"\n}\n________\n```\n-Make sure to always enclose the JSON output in triple backticks (```). Please do not add anything other than valid JSON output!\n-Use double quotes for the keys and string values.\n-DO NOT mistaken the "properties" and "type" in the schema as the actual fields in the JSON output.\n-Follow the JSON formatting conventions.'}, prompt_variables: ['input_str', 'tools', 'step_history', 'output_format_str']
)
(model_client): GroqAPIClient()
(output_processors): JsonOutputParser(
data_class=FunctionExpression, examples=[FunctionExpression(thought='I have finished the task.', action='finish(answer="final answer: 'answer'")')], exclude_fields=None, return_data_class=True
(output_format_prompt): Prompt(
template: Your output should be formatted as a standard JSON instance with the following schema:
```
{{schema}}
```
{% if example %}
Examples:
```
{{example}}
```
{% endif %}
-Make sure to always enclose the JSON output in triple backticks (```). Please do not add anything other than valid JSON output!
-Use double quotes for the keys and string values.
-DO NOT mistaken the "properties" and "type" in the schema as the actual fields in the JSON output.
-Follow the JSON formatting conventions., prompt_variables: ['example', 'schema']
)
(output_processors): JsonParser()
)
)
)

Now, let’s run the test function to see the agent in action.

test_react_agent(ModelClientType.GROQ(), llama3_model_kwargs)
test_react_agent(ModelClientType.OPENAI(), gpt_model_kwargs)

Our agent will show the core steps for developers via colored printout, including input_query, steps, and the final answer. The printout of the first query with llama3 is shown below (without the color here):

2024-07-10 16:48:47 - [react.py:287:call] - input_query: What is the capital of France? and what is 465 times 321 then add 95297 and then divide by 13.2

2024-07-10 16:48:48 - [react.py:266:_run_one_step] - Step 1:
StepOutput(step=1, action=FunctionExpression(thought="Let's break down the query into subqueries and start with the first one.", action='llm_tool(input="What is the capital of France?")'), function=Function(thought=None, name='llm_tool', args=[], kwargs={'input': 'What is the capital of France?'}), observation='The capital of France is Paris!')
_______
2024-07-10 16:48:49 - [react.py:266:_run_one_step] - Step 2:
StepOutput(step=2, action=FunctionExpression(thought="Now, let's move on to the second subquery.", action='multiply(a=465, b=321)'), function=Function(thought=None, name='multiply', args=[], kwargs={'a': 465, 'b': 321}), observation=149265)
_______
2024-07-10 16:48:49 - [react.py:266:_run_one_step] - Step 3:
StepOutput(step=3, action=FunctionExpression(thought="Now, let's add 95297 to the result.", action='add(a=149265, b=95297)'), function=Function(thought=None, name='add', args=[], kwargs={'a': 149265, 'b': 95297}), observation=244562)
_______
2024-07-10 16:48:50 - [react.py:266:_run_one_step] - Step 4:
StepOutput(step=4, action=FunctionExpression(thought="Now, let's divide the result by 13.2.", action='divide(a=244562, b=13.2)'), function=Function(thought=None, name='divide', args=[], kwargs={'a': 244562, 'b': 13.2}), observation=18527.424242424244)
_______
2024-07-10 16:48:50 - [react.py:266:_run_one_step] - Step 5:
StepOutput(step=5, action=FunctionExpression(thought="Now, let's combine the answers of both subqueries.", action='finish(answer="The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.")'), function=Function(thought=None, name='finish', args=[], kwargs={'answer': 'The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.'}), observation='The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.')
_______
2024-07-10 16:48:50 - [react.py:301:call] - answer:
The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.

The printout for the second query:

2024-07-10 16:48:51 - [react.py:287:call] - input_query: Give me 5 words rhyming with cool, and make a 4-sentence poem using them
2024-07-10 16:48:52 - [react.py:266:_run_one_step] - Step 1:
StepOutput(step=1, action=FunctionExpression(thought="I need to find 5 words that rhyme with 'cool'.", action='llm_tool(input="What are 5 words that rhyme with 'cool'?")'), function=Function(thought=None, name='llm_tool', args=[], kwargs={'input': "What are 5 words that rhyme with 'cool'?"}), observation='Here are 5 words that rhyme with "cool":\n\n1. Rule\n2. Tool\n3. Fool\n4. Pool\n5. School')
_______
2024-07-10 16:49:00 - [react.py:266:_run_one_step] - Step 2:
StepOutput(step=2, action=FunctionExpression(thought='Now that I have the rhyming words, I need to create a 4-sentence poem using them.', action='llm_tool(input="Create a 4-sentence poem using the words 'rule', 'tool', 'fool', 'pool', and 'school'.")'), function=Function(thought=None, name='llm_tool', args=[], kwargs={'input': "Create a 4-sentence poem using the words 'rule', 'tool', 'fool', 'pool', and 'school'."}), observation="Here is a 4-sentence poem using the words 'rule', 'tool', 'fool', 'pool', and 'school':\n\nIn the classroom, we learn to rule,\nWith a pencil as our trusty tool.\nBut if we're not careful, we can be a fool,\nAnd end up swimming in the school pool.")
_______
2024-07-10 16:49:12 - [react.py:266:_run_one_step] - Step 3:
StepOutput(step=3, action=FunctionExpression(thought='I have the poem, now I need to finish the task.', action='finish(answer="Here are 5 words that rhyme with 'cool': rule, tool, fool, pool, school. Here is a 4-sentence poem using the words: In the classroom, we learn to rule, With a pencil as our trusty tool. But if we're not careful, we can be a fool, And end up swimming in the school pool.")'), function=Function(thought=None, name='finish', args=[], kwargs={'answer': "Here are 5 words that rhyme with 'cool': rule, tool, fool, pool, school. Here is a 4-sentence poem using the words: In the classroom, we learn to rule, With a pencil as our trusty tool. But if we're not careful, we can be a fool, And end up swimming in the school pool."}), observation="Here are 5 words that rhyme with 'cool': rule, tool, fool, pool, school. Here is a 4-sentence poem using the words: In the classroom, we learn to rule, With a pencil as our trusty tool. But if we're not careful, we can be a fool, And end up swimming in the school pool.")
_______
2024-07-10 16:49:12 - [react.py:301:call] - answer:
Here are 5 words that rhyme with 'cool': rule, tool, fool, pool, school. Here is a 4-sentence poem using the words: In the classroom, we learn to rule, With a pencil as our trusty tool. But if we're not careful, we can be a fool, And end up swimming in the school pool.

The comparison between the agent and the vanilla LLM response is shown below:

Answer with agent: The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.
Answer without agent: GeneratorOutput(data="I'd be happy to help you with that!\n\nThe capital of France is Paris.\n\nNow, let's tackle the math problem:\n\n1. 465 × 321 = 149,485\n2. Add 95,297 to that result: 149,485 + 95,297 = 244,782\n3. Divide the result by 13.2: 244,782 ÷ 13.2 = 18,544.09\n\nSo, the answer is 18,544.09!", error=None, usage=None, raw_response="I'd be happy to help you with that!\n\nThe capital of France is Paris.\n\nNow, let's tackle the math problem:\n\n1. 465 × 321 = 149,485\n2. Add 95,297 to that result: 149,485 + 95,297 = 244,782\n3. Divide the result by 13.2: 244,782 ÷ 13.2 = 18,544.09\n\nSo, the answer is 18,544.09!", metadata=None)

For the second query, the comparison is shown below:

Answer with agent: Here are 5 words that rhyme with 'cool': rule, tool, fool, pool, school. Here is a 4-sentence poem using the words: In the classroom, we learn to rule, With a pencil as our trusty tool. But if we're not careful, we can be a fool, And end up swimming in the school pool.
Answer without agent: GeneratorOutput(data='Here are 5 words that rhyme with "cool":\n\n1. rule\n2. tool\n3. fool\n4. pool\n5. school\n\nAnd here's a 4-sentence poem using these words:\n\nIn the summer heat, I like to be cool,\nFollowing the rule, I take a dip in the pool.\nI'm not a fool, I know just what to do,\nI grab my tool and head back to school.', error=None, usage=None, raw_response='Here are 5 words that rhyme with "cool":\n\n1. rule\n2. tool\n3. fool\n4. pool\n5. school\n\nAnd here's a 4-sentence poem using these words:\n\nIn the summer heat, I like to be cool,\nFollowing the rule, I take a dip in the pool.\nI'm not a fool, I know just what to do,\nI grab my tool and head back to school.', metadata=None)

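The ReAct agent is particularly helpful for queries that require capabilities like computation or more complicated reasoning and planning. In the first query, for example, the vanilla LLM miscomputes 465 × 321 as 149,485 (the correct product is 149,265), so its final answer is wrong, while the agent's tool calls return the exact result. However, using the agent on general queries may be overkill, as it can take more steps than necessary to answer the query.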
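The arithmetic from the first query can be reproduced in a few lines of plain Python, which makes it easy to compare the agent's answer and the vanilla LLM's answer against the true values. The variable names here are only for illustration.

```python
# Recompute the chain from the first query: 465 times 321, plus 95297, divided by 13.2.
product = 465 * 321
total = product + 95297
result = total / 13.2

print(product)            # 149265
print(total)              # 244562
print(round(result, 4))   # 18527.4242
```

These values match the observations in the agent's step history (149265, 244562, and 18527.42...), whereas the vanilla response starts from a product of 149,485 and ends at 18,544.09.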

Customization

Template

The first thing you may want to customize is the template itself. You can do this by passing your own template to the agent’s constructor. We suggest starting from our default template, DEFAULT_REACT_AGENT_SYSTEM_PROMPT, and modifying it.

Examples for Better Output Format

Secondly, the examples argument of the constructor lets you provide more examples to enforce the correct output format. For instance, if we want the agent to learn how to call multiply correctly, we can pass in a list of FunctionExpression instances with the correct format. The class method from_function can be used to create a FunctionExpression instance from a function and its arguments.

from lightrag.core.types import FunctionExpression

# generate an example of calling multiply with keyword arguments
example_using_multiply = FunctionExpression.from_function(
    func=multiply,
    thought="Now, let's multiply two numbers.",
    a=3,
    b=4,
)
examples = [example_using_multiply]
# pass it to the agent through the examples argument of the ReActAgent constructor

We can visualize how this is passed to the planner prompt via:

react.planner.print_prompt()

The above example will be formatted as:

<OUTPUT_FORMAT>
Your output should be formatted as a standard JSON instance with the following schema:
```
{
"thought": "Why the function is called (Optional[str]) (optional)",
"action": "FuncName(<kwargs>) Valid function call expression. Example: \"FuncName(a=1, b=2)\" Follow the data type specified in the function parameters.e.g. for Type object with x,y properties, use \"ObjectType(x=1, y=2) (str) (required)"
}
```
Examples:
```
{
"thought": "Now, let's multiply two numbers.",
"action": "multiply(a=3, b=4)"
}
________
{
"thought": "I have finished the task.",
"action": "finish(answer=\"final answer: 'answer'\")"
}
________
```
-Make sure to always enclose the JSON output in triple backticks (```). Please do not add anything other than valid JSON output!
-Use double quotes for the keys and string values.
-DO NOT mistaken the "properties" and "type" in the schema as the actual fields in the JSON output.
-Follow the JSON formatting conventions.
</OUTPUT_FORMAT>

Subclass ReActAgent

If you want to customize the agent further, you can subclass ReActAgent and override the methods you want to change.

References

[1] A survey on large language model based autonomous agents: Paitesanshi/LLM-Agent-Survey

[2] ReAct: https://arxiv.org/abs/2210.03629