Building LangChain Agents to Automate Tasks in Python
LangChain's 90k GitHub stars are all the credibility it needs: right now, it is the hottest framework for building LLM-based applications. Its comprehensive set of tools and components lets you build end-to-end AI solutions with almost any LLM.
At the heart of LangChain's capabilities are LangChain agents: autonomous or semi-autonomous tools that can perform tasks, make decisions, and interact with other tools and APIs. They represent a significant leap forward in automating complex workflows with LLMs.
In this article, you will learn how to build your own LangChain agents that can perform tasks not strictly possible with today's chat applications like ChatGPT.
Before we get into anything, let’s set up our environment for the tutorial.
First, creating a new Conda environment:
$ conda create -n langchain python=3.9 -y
$ conda activate langchain
Installing LangChain’s packages and a few other necessary libraries:
$ pip install langchain langchain_openai langchain_community langgraph ipykernel python-dotenv
Adding the newly created Conda environment to Jupyter as a kernel:
$ ipython kernel install --user --name=langchain
Creating a .env file to store secrets such as API keys:
$ touch .env
$ vim .env  # Paste your OPENAI key
OPENAI_API_KEY='YOUR_KEY_HERE'
Retrieving your OpenAI API key from the .env file:
import os
from dotenv import load_dotenv

load_dotenv()
api_key = os.getenv('OPENAI_API_KEY')
Testing that everything is working correctly by querying GPT-3.5, OpenAI's default language model:
from langchain_openai import OpenAI

llm = OpenAI(openai_api_key=api_key)

question = "Is Messi the best footballer of all time?"
output = llm.invoke(question)

print(output[:75])
There is no definitive answer to this question, as it is subjective and de
Now, we are ready to get started.
Let’s spend some time thinking about the agent framework. Specifically, we will consider how it differs from the traditional chain paradigm and what the components of an agent are. Understanding why we need to choose a new way of building applications will prepare us for writing the code.
The defining trait of agents is their ability to choose the best order of actions to solve a problem given a set of tools.
For example, let's say we have the following tools at our disposal: a weather API, the Strava API for biking routes, a record of the user's past clothing preferences, and an outfit photo analyzer.
Traditional problem-solving would involve using a chain of select tools from the list:
Chain 1: Weather-based clothing recommender
Chain 2: Weather-based biking route suggester
Chain 3: Outfit photo analyzer
Each chain solves a specific problem using a predetermined sequence of steps and a subset of the available tools. They cannot adapt beyond their defined scope. They also require three separate branches of development, which is inefficient in terms of time and resources.
Now, imagine an agentic system with access to all of these tools. Instead of following fixed pipelines, it could decide which tools to call, in what order, and with what inputs.
For example, if a user asks, “What should I wear for my bike ride today?” the agent might check the weather API, analyze suitable biking routes through Strava, recommend appropriate clothing, considering the user's past preferences, and generate a personalized response.
In other words, the agent can combine the same tools on the fly, adapting its plan to each request instead of following a predetermined sequence.
One of LangChain's main applications is turning language models, which by themselves only produce text, into reasoning engines that use the resources at their disposal to take appropriate action. In short, LangChain enables the development of strong autonomous agents that interact with the outside world.
A LangChain agent is made up of several components, such as chat models, prompt templates, external tools, and other related constructs. To build successful agents, we need to review each component and understand their use.
There are a lot of moving parts involved in creating a LangChain agent. The first and most obvious is a language model.
from langchain_openai import OpenAI

llm = OpenAI(api_key=api_key, model="gpt-3.5-turbo-instruct")

question = "What is special about the number 73?"
output = llm.invoke(question)

print(output[:100])

1. Prime Number: 73 is a prime number, which means it is only divisible by 1 and itself. This make
Language models, like OpenAI's GPT-3.5 Turbo Instruct, take strings as input and generate strings as output. They are typically older models and work best for answering individual user queries.
Newer and more powerful models are usually chat models, which can take a sequence of messages as inputs and return chat messages as outputs (as opposed to using plain text):
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

# Initialize the model
chat_model = ChatOpenAI(api_key=api_key, model='gpt-4o-mini')

# Write the messages
messages = [SystemMessage(content='You are a grumpy pirate.'),
            HumanMessage(content="What's up?")]

output = chat_model.invoke(messages)
Put differently, chat models allow us to have conversations in natural language. In the example above, we are initializing GPT-4o-mini with a system message followed by a user query. Note the use of SystemMessage and HumanMessage classes.
The output is a message object, which is the expected behavior of chat models:
type(output)

langchain_core.messages.ai.AIMessage

print(output.content)

Arrr, not much but the sound of waves and the creakin' of me ship. What do ye want? Make it quick, I've got treasure to hunt and rum to drink!
Besides, they return other useful metadata accessible with dot-notation:
output.dict()

{'content': "Arrr, not much but the sound of waves and the creakin' of me ship. What do ye want? Make it quick, I've got treasure to hunt and rum to drink!", 'additional_kwargs': {}, 'response_metadata': {'token_usage': {'completion_tokens': 38, 'prompt_tokens': 21, 'total_tokens': 59}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_48196bc67a', 'finish_reason': 'stop', 'logprobs': None}, 'type': 'ai', 'name': None, 'id': 'run-fde829bf-8f5f-4926-a1ed-ab53609ce03a-0', 'example': False, 'tool_calls': [], 'invalid_tool_calls': [], 'usage_metadata': {'input_tokens': 21, 'output_tokens': 38, 'total_tokens': 59}}
Most agents use chat models because of their updated knowledge base and conversational capabilities. However, for simple agents with no memory requirements, language models like GPT-3.5 will be enough.
The most efficient way to query language or chat models is by using prompt templates. They allow you to structure your queries consistently and dynamically insert variables, making your interactions with the model more flexible and reusable.
In LangChain, there are many types of prompt templates, the most basic being the PromptTemplate class. It can be used with language (plain text) models:
from langchain_core.prompts import PromptTemplate

query_template = "Tell me about {book_name} by {author}."
prompt = PromptTemplate(input_variables=["book_name", "author"], template=query_template)

prompt.invoke({"book_name": "Song of Ice and Fire", "author": "GRRM"})

StringPromptValue(text='Tell me about Song of Ice and Fire by GRRM.')
The class requires you to create a string with placeholders for the variables you want to replace, written in curly braces. Then, you pass this template string to the PromptTemplate class along with the names of the variables, thus constructing your prompt.
Calling .invoke() with values for variables will show how your prompt will be passed to a model.
Passing this prompt template to a language model requires us to chain it using the pipe operator:
from langchain_openai import OpenAI

llm = OpenAI(api_key=api_key)

# Create a chain
chain = prompt | llm

# Invoke the chain
output = chain.invoke({"book_name": "Deathly Hallows", "author": "J.K. Rowling"})

print(output[:100])

Deathly Hallows is the seventh and final book in the popular Harry Potter series, written by J.K. R
The pipe operator (|) is part of LangChain Expression Language (LCEL), designed to chain multiple LangChain components and tools.
type(chain)

langchain_core.runnables.base.RunnableSequence
When you use the pipe operator on LangChain objects, you create an instance of RunnableSequence class. A runnable sequence represents a chain of objects that support the .invoke() method, like prompt templates and language/chat models.
Now, let's look at another prompt template class designed for chat models.
We mentioned that chat models require a sequence of messages as inputs. The initial input is usually a system prompt telling the chat model how to behave. So, using the ChatPromptTemplate class, we can easily create chat models with different personalities:
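A minimal sketch looks like this (the persona and question are placeholders you can swap freely):

from langchain_core.prompts import ChatPromptTemplate

# Each element is a (role, message) tuple; curly braces mark variables
template = ChatPromptTemplate.from_messages([
    ("system", "You are a {character}."),
    ("human", "{question}"),
])

template.invoke({"character": "grumpy pirate", "question": "What's up?"})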
The class requires a list of role-based messages as input. Each member of the list must be a (role, message) tuple with the variable placeholders defined where needed.
After we have it ready, we can use the same pipe operator to create chat models with different behaviors:
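For example, a chain along these lines reuses the template above (a sketch, reusing the chat model initialized in the chat models section):

# chat_model is the GPT-4o-mini instance defined earlier
pirate_chain = template | chat_model

output = pirate_chain.invoke({"character": "grumpy pirate", "question": "What's up?"})
print(output.content)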
In a previous section, we mentioned that agents can choose a combination of tools at their disposal to solve a particular problem, with LLMs as reasoning engines under the hood.
LangChain offers integrations with dozens of popular APIs and services to let agents interact with the rest of the world. Most of them are available under the langchain_community package, while some are inside langchain_core.
For example, here is how you can use the ArXiv tool to retrieve paper summaries on various topics:
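A minimal sketch; the ArxivQueryRun tool also needs the arxiv package installed (pip install arxiv):

from langchain_community.tools import ArxivQueryRun

arxiv_tool = ArxivQueryRun()

# Query by paper title or topic; the tool returns summaries as a string
print(arxiv_tool.invoke("Attention Is All You Need")[:250])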
There is an alternative way to load tools rather than import them by their class name:
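One way to do it, assuming a recent langchain_community release (older versions exposed load_tools from langchain.agents):

from langchain_community.agent_toolkits.load_tools import load_tools

tools = load_tools(["arxiv", "dalle-image-generator"])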
Above, we are loading both the arXiv and Dall-E image generator tools at the same time using the load_tools() function. Tools loaded with this function have the same usage syntax:
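For instance, both tools accept a plain string through .invoke():

# Inspect the registered names, then call either tool with a plain string
print(tools[0].name, tools[1].name)

output = tools[0].invoke("Attention Is All You Need")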
The load_tools function requires you to know the string names of tool classes, like the example of ArxivQueryRun versus 'arxiv'. You can quickly check the string name of any tool by running the get_all_tool_names function:
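A quick check, assuming the helper lives in the same module as load_tools():

from langchain_community.agent_toolkits.load_tools import get_all_tool_names

# Print the first ten registered string names
print(get_all_tool_names()[:10])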
Note that load_tools() is only a shorthand function. When building agents, it is recommended to load tools using their class constructor, which allows you to configure them based on their specific behavior.
Finally, in this section, we will see how to create LangChain agents step-by-step using the knowledge we have gained in the previous sections.
In the coming examples, we will build an agent capable of explaining any topic via three mediums: text, image, or video. More specifically, based on the question asked, the agent will decide which format to use when explaining the topic.
Let’s start. Remember to check out how to set up the environment, which is covered at the beginning of the article.
The first step after configuring our environment is defining the tools we will give to our agent. Let’s import them:
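The import paths below assume a recent langchain_community version:

from langchain_community.tools import WikipediaQueryRun, YouTubeSearchTool
from langchain_community.utilities import WikipediaAPIWrapper
from langchain_community.tools.openai_dalle_image_generation import OpenAIDALLEImageGenerationTool
from langchain_community.utilities.dalle_image_generator import DallEAPIWrapper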
We are importing five classes: WikipediaQueryRun and WikipediaAPIWrapper for querying Wikipedia, OpenAIDALLEImageGenerationTool and DallEAPIWrapper for generating images with DALL-E, and YouTubeSearchTool for finding videos.
When a user queries our agent, it will decide whether to explain the topic using a Wikipedia article in text format, or by creating an image using Dall-E for visual understanding, or by suggesting YouTube videos for deeper comprehension.
Let’s initialize them, starting with the Wikipedia tool:
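A sketch with illustrative settings; top_k_results and doc_content_chars_max keep the retrieved summaries short, and the description wording is up to you:

api_wrapper = WikipediaAPIWrapper(top_k_results=1, doc_content_chars_max=250)

wikipedia = WikipediaQueryRun(
    api_wrapper=api_wrapper,
    description="Use this tool when the user's concept is best explained through text.",
)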
DallE image generator:
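Again, the description is illustrative:

dalle_api_wrapper = DallEAPIWrapper(model="dall-e-3")

dalle = OpenAIDALLEImageGenerationTool(
    api_wrapper=dalle_api_wrapper,
    description="Use this tool when the user's concept is best explained with an image.",
)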
YouTube search tool:
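Same pattern for the video tool (description again illustrative):

youtube = YouTubeSearchTool(
    description="Use this tool when the user's concept is best explained by watching a video.",
)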
Take special care with the tool descriptions. The agent decides which tool to use based on the description you provide.
Now, we will put the tools into a list:
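Using the names from the sketches above:

# Order doesn't matter; the agent selects tools by their descriptions
tools = [wikipedia, dalle, youtube]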
We can already bind this set of tools to a chat model without creating an agent:
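bind_tools() attaches the tools' schemas to every model call (a sketch, using GPT-4o-mini):

from langchain_openai import ChatOpenAI

model = ChatOpenAI(api_key=api_key, model="gpt-4o-mini")
model_with_tools = model.bind_tools(tools)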
Let’s try calling the model with a simple message:
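For example:

from langchain_core.messages import HumanMessage

response = model_with_tools.invoke([HumanMessage(content="Hello!")])

print(f"Text response: {response.content}")
print(f"Tool calls: {response.tool_calls}")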
The output shows that none of the bound tools were used when generating an answer. Now, let’s ask a specific question that would force the model to look beyond its training data:
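The query below is illustrative; anything that clearly calls for an image works:

query = "Could you generate an image of the Northern Lights?"
response = model_with_tools.invoke([HumanMessage(content=query)])

print(f"Text response: {response.content}")
print(f"Tool calls: {response.tool_calls}")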
We can see there is no text output, but OpenAI's DALL-E is mentioned. The tool isn't called yet; the model is simply suggesting we use it. To actually call it and take action, we need to create an agent.
After defining the model and the tools, we create the agent. LangChain offers a high-level create_react_agent() function from the langgraph package to quickly create ReAct (reason and act) agents:
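A sketch; note that the keyword for the system prompt has changed across langgraph releases (messages_modifier, then state_modifier, now prompt):

from langgraph.prebuilt import create_react_agent

system_prompt = "You are a helpful assistant."

agent = create_react_agent(model, tools, prompt=system_prompt)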
While initializing the agent with a chat model and a list of tools, we are passing a system prompt to tell the model how to behave in general. It is now ready to accept queries:
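Agents created with create_react_agent() take a dictionary with a messages key:

query = "What's up?"
response = agent.invoke({"messages": [HumanMessage(content=query)]})

print(response["messages"][-1].content)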
We have received a plain text answer with no tool calls. Now, let's ask something more to the point:
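query = "Explain how photosynthesis works."
response = agent.invoke({"messages": [HumanMessage(content=query)]})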
This time, there are four messages. Let’s see the message class names and their content:
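A quick loop does it:

for message in response["messages"]:
    # Print each message's class name and the first 100 characters of its content
    print(f"{message.__class__.__name__}: {str(message.content)[:100]}\n")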
Here we go! The third message is from a tool call, which is a summary of a Wikipedia page on photosynthesis. The last message is from the chat model, which is using the tool call’s contents when constructing its answer.
Let’s quickly create a function to modularize the last steps we took:
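A sketch:

def execute(agent, query):
    # Run the agent on a query and return only the final answer
    response = agent.invoke({"messages": [HumanMessage(content=query)]})
    return response["messages"][-1].content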
Now, let’s update our system prompt with detailed instructions on how the agent should behave:
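Something along these lines (the exact wording is up to you):

system_prompt = """
You explain concepts through exactly one of three mediums: text, image, or video.
- If the concept is best explained in text, look it up on Wikipedia and summarize the result.
- If the concept is best understood visually, generate an image of it.
- If the user would benefit more from watching a video, suggest relevant YouTube videos.
"""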
Let’s recreate our agent with the new system prompt:
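Recreating it takes one line; the query below is illustrative:

agent = create_react_agent(model, tools, prompt=system_prompt)

query = "I'm a visual learner. Explain how neural networks work."
print(execute(agent, query))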
Awesome, based on our message (which was very instructive :), the agent chose the correct tool for the job and generated an image with DALL-E.
Right now, our agent is stateless, which means it doesn’t remember previous interactions:
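For example:

# A fresh question; the agent has no recollection of earlier turns
print(execute(agent, "What did I ask you in my previous query?"))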
The easiest way to add chat message history to agents is by using langgraph's SqliteSaver class:
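A sketch; in recent releases SqliteSaver ships in the separate langgraph-checkpoint-sqlite package, where from_conn_string() is a context manager rather than a plain call:

from langgraph.checkpoint.sqlite import SqliteSaver

memory = SqliteSaver.from_conn_string("agent_history.db")

agent = create_react_agent(model, tools, prompt=system_prompt, checkpointer=memory)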
We initialize the memory using the .from_conn_string() method of SqliteSaver class, which creates a database file. Then, we pass the memory to the checkpointer parameter of create_react_agent() function.
Now, we need to create a configuration dictionary:
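The thread ID can be any string:

config = {"configurable": {"thread_id": "thread-1"}}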
The dictionary defines a thread ID to distinguish one conversation from another and it is passed to the .invoke() method of our agent. So, let's update our execute() function to include this behavior:
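A sketch:

def execute(agent, query, thread_id="thread-1"):
    # The thread ID separates one conversation's history from another's
    config = {"configurable": {"thread_id": thread_id}}
    response = agent.invoke({"messages": [HumanMessage(content=query)]}, config=config)
    return response["messages"][-1].content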
Now, let’s ask the agent about previous queries:
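print(execute(agent, "What did I ask you in my previous query?"))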
As expected, the agent is returning the previous messages! Now, we only need a chat UI like that of ChatGPT and we have got ourselves a custom chatbot.
Throughout the article, we have caught a glimpse of where LangChain is going in terms of agents. Until very recently, LangChain mainly used the AgentExecutor class, but it is slowly being replaced by LangGraph agents.
Pure LangChain agents are fine to get started, but they require more lines of code to build the same agent than LangGraph does. Also, past a certain point, the AgentExecutor framework won't give you the flexibility LangGraph has for building complex multi-tool agents.
That’s why now is a great time to ride the wave and start with LangGraph directly.
We also highly recommend starting to use LangSmith, which has become a core part of the LangChain ecosystem for building production-grade LLM applications. Its key benefits include tracing and debugging every model call, monitoring your application in production, and evaluating outputs.
Here is how you can get started with LangSmith:
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="..."
With these variables set, you are good to go! As you start querying language and chat models, LangSmith will log various metrics about each run.
In this article, we explored what makes LangChain agents distinct from chains and the important building blocks used in constructing them. We first introduced what agents are and how they differ from the more traditional chain constructs regarding flexibility and capability in making decisions.
Then we looked at the key components you need to know to build an agent: chat models, tools, and prompt templates. Finally, we ran through two examples demonstrating how to build simple and advanced agents. Natural language processing is developing continually, and LangChain agents are at the forefront of this progression, paving the way for an even more intelligent and versatile family of AI applications.
Thank you for reading!