LangChain’s 90k GitHub stars are all the credibility it needs—right now, it is the hottest framework to build LLM-based applications. Its comprehensive set of tools and components allows you to build end-to-end AI solutions using almost any LLM.
Perhaps at the heart of LangChain’s capabilities are LangChain agents. They are autonomous or semi-autonomous tools that can perform tasks, make decisions, and interact with other tools and APIs. They represent a significant leap forward in automating complex workflows with LLMs.
In this article, you will learn how to build your own LangChain agents that can perform tasks not strictly possible with today's chat applications like ChatGPT.
Setup
Before we get into anything, let’s set up our environment for the tutorial.
First, creating a new Conda environment:
$ conda create -n langchain python=3.9 -y
$ conda activate langchain
Installing LangChain’s packages and a few other necessary libraries:
$ pip install langchain langchain_openai langchain_community langgraph ipykernel python-dotenv
Adding the newly created Conda environment to Jupyter as a kernel:
$ ipython kernel install --user --name=langchain
Creating a .env file to store secrets such as API keys:
$ touch .env
$ vim .env  # Paste your OpenAI key
OPENAI_API_KEY='YOUR_KEY_HERE'
Retrieving your OpenAI API key from the .env file:
import os
from dotenv import load_dotenv

load_dotenv()
api_key = os.getenv('OPENAI_API_KEY')
Testing that everything is working correctly by querying GPT-3.5, OpenAI's default language model:
from langchain_openai import OpenAI

llm = OpenAI(openai_api_key=api_key)
question = "Is Messi the best footballer of all time?"
output = llm.invoke(question)

print(output[:75])
There is no definitive answer to this question, as it is subjective and de
Now, we are ready to get started.
What are LangChain Agents?
Let’s spend some time thinking about the agent framework. Specifically, we will consider how it differs from the traditional chain paradigm and what the components of an agent are. Understanding why we need to choose a new way of building applications will prepare us for writing the code.
Chains vs. Agents
The defining trait of agents is their ability to choose the best order of actions to solve a problem given a set of tools.
For example, let’s say we have the following:
- A weather API
- ML model for clothing recommendations
- Strava API for biking routes
- User preferences database
- Image recognition model
- Language model (text generation)
Traditional problem-solving would involve using a chain of select tools from the list:
Chain 1: Weather-based clothing recommender
- Call the weather API
- Input weather data into the ML clothing model
- Generate clothing recommendations
- Present results to the user
Chain 2: Weather-based biking route suggester
- Call the weather API
- Call the Strava API for popular routes
- Filter routes based on weather conditions
- Present suitable routes to the user
Chain 3: Outfit Photo Analyzer
- Receive the user's outfit photo
- Use an image recognition model to identify clothing items
- Compare with the user preference database
- Generate feedback using the text generation model
- Present the analysis to the user
Each chain solves a specific problem using a predetermined sequence of steps and a subset of the available tools. They cannot adapt beyond their defined scope. They also require three separate branches of development, which is inefficient in terms of time and resources.
Now, imagine an agentic system with access to all these tools. It would be able to:
- Understand the user's query or problem (through natural language with a language model)
- Assess which tools are relevant to the problem (reasoning)
- Dynamically create a workflow using the most appropriate tools
- Execute the workflow, making real-time adjustments if needed (acting)
- Evaluate the outcome and learn from past interactions
For example, if a user asks, “What should I wear for my bike ride today?” the agent might check the weather API, analyze suitable biking routes through Strava, recommend appropriate clothing, considering the user's past preferences, and generate a personalized response.
The agent can:
- Handle a wide variety of problems using the same set of tools
- Create custom workflows for each unique situation
- Adapt its approach based on the specific context and user needs
- Learn from interactions to improve future performance
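The contrast between a fixed chain and an agent's dynamic tool selection can be sketched in plain Python. Everything below is illustrative: the tool implementations are stubs, and a keyword-based planner stands in for the LLM's reasoning step.

```python
# Toy sketch: an "agent" that picks tools per query, vs. a fixed chain.
# All tools and the keyword-based planner are illustrative stand-ins.

def weather_api(city):
    return {"city": city, "temp_c": 14, "raining": False}

def clothing_model(weather):
    return "light jacket" if weather["temp_c"] < 18 else "t-shirt"

def strava_api(city):
    return [f"{city} riverside loop", f"{city} forest trail"]

def plan(query):
    """Stand-in for the LLM reasoning step: map a query to a tool workflow."""
    steps = ["weather_api"]  # both use cases start with the weather
    if "wear" in query.lower():
        steps.append("clothing_model")
    if "ride" in query.lower() or "route" in query.lower():
        steps.append("strava_api")
    return steps

def run_agent(query, city="Berlin"):
    results = {}
    weather = None
    for step in plan(query):
        if step == "weather_api":
            weather = weather_api(city)
            results["weather"] = weather
        elif step == "clothing_model":
            results["clothing"] = clothing_model(weather)
        elif step == "strava_api":
            results["routes"] = strava_api(city)
    return results

print(run_agent("What should I wear for my bike ride today?"))
```

A single entry point now handles both the clothing and the route use cases, instead of two separately developed chains; swapping the keyword planner for an LLM is what LangChain agents do.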
One of LangChain's main draws is its capacity to transform language models, which by themselves only produce text, into reasoning engines that can use the resources at their disposal to take appropriate action. In short, LangChain enables the development of strong autonomous agents that interact with the outside world.
Key components
A LangChain agent is made up of several components, such as chat models, prompt templates, external tools, and other related constructs. To build successful agents, we need to review each component and understand their use.
Language and chat models
There are a lot of moving parts involved in creating a LangChain agent. The first and most obvious is a language model.
from langchain_openai import OpenAI

llm = OpenAI(api_key=api_key, model="gpt-3.5-turbo-instruct")
question = "What is special about the number 73?"
output = llm.invoke(question)

print(output[:100])

1. Prime Number: 73 is a prime number, which means it is only divisible by 1 and itself. This make
Language models, like OpenAI’s GPT-3.5 Turbo, take and generate strings. They are typically older and work best to answer individual user queries.
Newer and more powerful models are usually chat models, which can take a sequence of messages as inputs and return chat messages as outputs (as opposed to using plain text):
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

# Initialize the model
chat_model = ChatOpenAI(api_key=api_key, model='gpt-4o-mini')

# Write the messages
messages = [
    SystemMessage(content='You are a grumpy pirate.'),
    HumanMessage(content="What's up?"),
]

output = chat_model.invoke(messages)
Put differently, chat models allow us to have conversations in natural language. In the example above, we are initializing GPT-4o-mini with a system message followed by a user query. Note the use of SystemMessage and HumanMessage classes.
The output is a message object, which is the expected behavior of chat models:
type(output)

langchain_core.messages.ai.AIMessage

print(output.content)

Arrr, not much but the sound of waves and the creakin' of me ship. What do ye want? Make it quick, I've got treasure to hunt and rum to drink!
Besides, they return other useful metadata accessible with dot-notation:
output.dict()

{'content': "Arrr, not much but the sound of waves and the creakin' of me ship. What do ye want? Make it quick, I've got treasure to hunt and rum to drink!", 'additional_kwargs': {}, 'response_metadata': {'token_usage': {'completion_tokens': 38, 'prompt_tokens': 21, 'total_tokens': 59}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_48196bc67a', 'finish_reason': 'stop', 'logprobs': None}, 'type': 'ai', 'name': None, 'id': 'run-fde829bf-8f5f-4926-a1ed-ab53609ce03a-0', 'example': False, 'tool_calls': [], 'invalid_tool_calls': [], 'usage_metadata': {'input_tokens': 21, 'output_tokens': 38, 'total_tokens': 59}}
Most agents use chat models because of their updated knowledge base and conversational capabilities. However, for simple agents with no memory requirements, language models like GPT-3.5 will be enough.
Prompt templates
The most efficient way to query language or chat models is by using prompt templates. They allow you to structure your queries consistently and dynamically insert variables, making your interactions with the model more flexible and reusable.
In LangChain, there are many types of prompt templates, with the most basic one being PromptTemplate class. It can be used with language (plain text) models:
from langchain_core.prompts import PromptTemplate

query_template = "Tell me about {book_name} by {author}."
prompt = PromptTemplate(input_variables=["book_name", "author"], template=query_template)
The class requires you to create a string with placeholders for variables you want to replace using the bracket notation. Then, you need to pass this template string to the PromptTemplate class along with the names of the variables, thus constructing your prompt.
Calling .invoke() with values for variables will show how your prompt will be passed to a model.
prompt.invoke({"book_name": "Song of Ice and Fire", "author": "GRRM"})

StringPromptValue(text='Tell me about Song of Ice and Fire by GRRM.')
Passing this prompt template to a language model requires us to chain it using the pipe operator:
from langchain_openai import OpenAI

llm = OpenAI(api_key=api_key)

# Create a chain
chain = prompt | llm

# Invoke the chain
output = chain.invoke({"book_name": "Deathly Hallows", "author": "J.K. Rowling"})

print(output[:100])

Deathly Hallows is the seventh and final book in the popular Harry Potter series, written by J.K. R
from langchain_openai import ChatOpenAI from langchain_core.messages import HumanMessage, AIMessageChunk, SystemMessage # Initialize the model chat_model = ChatOpenAI(api_key=api_key, model='gpt-4o-mini') # Write the messages messages = [SystemMessage(content='You are a grumpy pirate.'), HumanMessage(content="What's up?")] output = chat_model.invoke(messages)
The pipe operator (|) is part of LangChain Expression Language (LCEL), designed to chain multiple LangChain components and tools.
type(chain)

langchain_core.runnables.base.RunnableSequence
When you use the pipe operator on LangChain objects, you create an instance of RunnableSequence class. A runnable sequence represents a chain of objects that support the .invoke() method, like prompt templates and language/chat models.
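The pipe mechanics are easy to demystify in plain Python: any class can support | by implementing __or__. Here is a minimal re-creation of the idea; this is not LangChain's actual implementation, just a sketch of the pattern behind it.

```python
# Minimal sketch of how `prompt | llm` style chaining can work in Python.
# Mimics the idea behind LCEL's RunnableSequence; NOT LangChain source code.

class Runnable:
    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `self | other` returns a new Runnable that invokes both in order
        return Runnable(lambda value: other.invoke(self.invoke(value)))

# Two toy "components": a prompt formatter and a fake model
prompt = Runnable(lambda vars: f"Tell me about {vars['book_name']}.")
fake_llm = Runnable(lambda text: f"[model answering: {text}]")

chain = prompt | fake_llm
print(chain.invoke({"book_name": "Dune"}))  # [model answering: Tell me about Dune.]
```

The output of each component becomes the input of the next, which is exactly how a prompt template's formatted string flows into a model in a real LCEL chain.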
Now, let’s look at another prompt template class for chat models:
We mentioned that chat models require a sequence of messages as inputs. The initial input is usually a system prompt telling the chat model how to behave. So, using the ChatPromptTemplate class, we can easily create chat models with different personalities:
from langchain_core.prompts import ChatPromptTemplate

template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with a {personality} personality."),
    ("human", "{query}"),
])
The class requires a list of role-based messages as input. Each member of the list must be a (role, message) tuple with the variable placeholders defined where needed.
After we have it ready, we can use the same pipe operator to create chat models with different behaviors:
from langchain_openai import ChatOpenAI

chat_model = ChatOpenAI(api_key=api_key, model='gpt-4o-mini')
chain = template | chat_model

output = chain.invoke({"personality": "sarcastic", "query": "What's up?"})
print(output.content)
Tools
In a previous section, we mentioned that agents can choose a combination of tools at their disposal to solve a particular problem, with LLMs as reasoning engines under the hood.
LangChain offers integrations with dozens of popular APIs and services to let agents interact with the rest of the world. Most of them are available under the langchain_community package, while some are inside langchain_core.
For example, here is how you can use the ArXiv tool to retrieve paper summaries on various topics:
from langchain_community.tools.arxiv.tool import ArxivQueryRun
from langchain_community.utilities import ArxivAPIWrapper

arxiv = ArxivQueryRun(api_wrapper=ArxivAPIWrapper())
print(arxiv.invoke("Attention is all you need")[:250])
There is an alternative way to load tools rather than import them by their class name:
from langchain_community.agent_toolkits.load_tools import load_tools

tools = load_tools(["arxiv", "dalle-image-generator"])
Above, we are loading both the arXiv and Dall-E image generator tools at the same time using the load_tools() function. Tools loaded with this function have the same usage syntax:
arxiv_tool, dalle_tool = tools
print(arxiv_tool.invoke("Attention is all you need")[:250])
The load_tools function requires you to know the string names of tool classes, like the example of ArxivQueryRun versus 'arxiv'. You can quickly check the string name of any tool by running the get_all_tool_names function:
from langchain_community.agent_toolkits.load_tools import get_all_tool_names

get_all_tool_names()[:10]
Note that load_tools() is only a shorthand function. When building agents, it is recommended to load tools using their class constructor, which allows you to configure them based on their specific behavior.
Step-by-Step Workflow of How to Build LangChain Agents
Finally, in this section, we will see how to create LangChain agents step-by-step using the knowledge we have gained in the previous sections.
In the coming examples, we will build an agent capable of explaining any topic via three mediums: text, image, or video. More specifically, based on the question asked, the agent will decide the format in which to explain the topic.
Let’s start. Remember to check out how to set up the environment, which is covered at the beginning of the article.
1. Defining tools
The first step after configuring our environment is defining the tools we will give to our agent. Let’s import them:
from langchain_community.utilities import WikipediaAPIWrapper
from langchain_community.tools import WikipediaQueryRun, YouTubeSearchTool
from langchain_community.utilities.dalle_image_generator import DallEAPIWrapper
from langchain_community.tools.openai_dalle_image_generation import OpenAIDALLEImageGenerationTool
We are importing five classes:
- WikipediaAPIWrapper: to configure how to access the Wikipedia API
- WikipediaQueryRun: to generate Wikipedia page summaries
- YouTubeSearchTool: to search YouTube videos on topics
- DallEAPIWrapper: to configure how to access OpenAI's DallE endpoint
- OpenAIDALLEImageGenerationTool: to generate images using prompts
When a user queries our agent, it will decide whether to explain the topic using a Wikipedia article in text format, or by creating an image using Dall-E for visual understanding, or by suggesting YouTube videos for deeper comprehension.
Let’s initialize them, starting with the Wikipedia tool:
api_wrapper = WikipediaAPIWrapper(top_k_results=1, doc_content_chars_max=250)

wikipedia = WikipediaQueryRun(
    api_wrapper=api_wrapper,
    description="Use this tool to explain a topic through the summary of a Wikipedia article.",
)
DallE image generator:
dalle_api_wrapper = DallEAPIWrapper(model="dall-e-3", size="1792x1024")

dalle = OpenAIDALLEImageGenerationTool(
    api_wrapper=dalle_api_wrapper,
    description="Use this tool to explain a topic visually by generating an image of it.",
)
YouTube search tool:
youtube = YouTubeSearchTool(
    description="Use this tool to suggest YouTube videos that explain a topic in depth.",
)
Take special care with the tool descriptions. The agent will decide which tool to use based on the descriptions you provide.
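To see why descriptions matter, here is a toy, purely illustrative selector that matches a query against tool descriptions. A real agent sends the descriptions to the LLM, which reasons about them far more flexibly than this keyword overlap does:

```python
# Toy illustration of description-driven tool choice (no LLM involved).
# Tool names and descriptions mirror the agent we are building.

tool_descriptions = {
    "wikipedia": "Explain a topic in text format using a Wikipedia summary.",
    "dalle": "Explain a topic visually by generating an image of it.",
    "youtube": "Suggest YouTube videos that explain a topic in depth.",
}

def choose_tool(query):
    """Naive keyword overlap between the query and each description."""
    query_words = set(query.lower().split())
    scores = {
        name: len(query_words & set(desc.lower().replace(".", "").split()))
        for name, desc in tool_descriptions.items()
    }
    return max(scores, key=scores.get)

print(choose_tool("Draw an image of a black hole"))  # dalle
```

A vague or overlapping description would make the scores ambiguous here, and it confuses an LLM-based agent for the same reason.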
Now, we will put the tools into a list:
tools = [wikipedia, dalle, youtube]
We can already bind this set of tools to a chat model without creating an agent:
from langchain_openai import ChatOpenAI

chat_model = ChatOpenAI(api_key=api_key, model='gpt-4o-mini')
model_with_tools = chat_model.bind_tools(tools)
Let’s try calling the model with a simple message:
from langchain_core.messages import HumanMessage

response = model_with_tools.invoke([HumanMessage(content="Hi!")])

print(f"Text response: {response.content}")
print(f"Tool calls: {response.tool_calls}")
The output shows that none of the bound tools were used when generating an answer. Now, let’s ask a specific question that would force the model to look beyond its training data:
response = model_with_tools.invoke([HumanMessage(content="Generate an image of the solar system.")])

print(f"Text response: {response.content}")
print(f"Tool calls: {response.tool_calls}")
We can see there is no text output, but OpenAI's DallE is mentioned in the tool calls. The tool isn't called yet; the model is simply suggesting we use it. To actually call it and take action, we need to create an agent.
2. Creating a simple agent
After defining the model and the tools, we create the agent. LangChain offers a high-level create_react_agent() function interface from its langgraph package to quickly create ReAct (reason and act) agents:
from langgraph.prebuilt import create_react_agent

system_prompt = "You are a helpful assistant."
agent = create_react_agent(
    chat_model,
    tools,
    state_modifier=system_prompt,  # parameter name may vary across langgraph versions
)
While initializing the agent with a chat model and a list of tools, we are passing a system prompt to tell the model how to behave in general. It is now ready to accept queries:
response = agent.invoke({"messages": [("human", "What's up?")]})

print(response["messages"][-1].content)
We received a plain response: a simple text answer without tool calls. Now, let's ask something more to the point:
response = agent.invoke({"messages": [("human", "Explain how photosynthesis works.")]})

len(response["messages"])

4
This time, there are four messages. Let’s see the message class names and their content:
for message in response["messages"]:
    print(f"{type(message).__name__}: {str(message.content)[:150]}")
from langchain_core.prompts import PromptTemplate query_template = "Tell me about {book_name} by {author}." prompt = PromptTemplate(input_variables=["book_name", "author"], template=query_template) prompt.invoke({"book_name": "Song of Ice and Fire", "author": "GRRM"})
Here we go! The third message is from a tool call, which is a summary of a Wikipedia page on photosynthesis. The last message is from the chat model, which is using the tool call’s contents when constructing its answer.
Let’s quickly create a function to modularize the last steps we took:
def execute(agent, query):
    response = agent.invoke({"messages": [("human", query)]})
    return response["messages"][-1].content
3. Refining the system prompt
Now, let’s update our system prompt with detailed instructions on how the agent should behave:
system_prompt = """
You are a helpful assistant that explains topics in one of three formats:

- Text, through the summary of a Wikipedia article
- Image, by generating one with DallE
- Video, by suggesting YouTube videos on the topic

Choose the format based on the user's question.
"""
Let’s recreate our agent with the new system prompt:
agent = create_react_agent(chat_model, tools, state_modifier=system_prompt)

output = execute(agent, query="Draw me a picture of the solar system.")
Awesome! Based on our message (which was very instructive), the agent chose the correct tool for the job and generated an image.
4. Adding memory to agents
Right now, our agent is stateless, which means it doesn’t remember previous interactions:
execute(agent, query="What did I ask you in the previous query?")
The easiest way to add chat message history to agents is by using langgraph's SqliteSaver class:
from langgraph.checkpoint.sqlite import SqliteSaver

# Creates a SQLite database file to store the chat history
memory = SqliteSaver.from_conn_string("agent_history.db")

agent = create_react_agent(chat_model, tools, state_modifier=system_prompt, checkpointer=memory)
We initialize the memory using the .from_conn_string() method of SqliteSaver class, which creates a database file. Then, we pass the memory to the checkpointer parameter of create_react_agent() function.
Now, we need to create a configuration dictionary:
config = {"configurable": {"thread_id": "1"}}
The dictionary defines a thread ID to distinguish one conversation from another and it is passed to the .invoke() method of our agent. So, let's update our execute() function to include this behavior:
def execute(agent, query, thread_id="1"):
    config = {"configurable": {"thread_id": thread_id}}
    response = agent.invoke({"messages": [("human", query)]}, config=config)
    return response["messages"][-1].content
Now, let’s ask the agent about previous queries:
execute(agent, query="What did I ask you in the previous query?")
As expected, the agent is returning the previous messages! Now, we only need a chat UI like that of ChatGPT and we have got ourselves a custom chatbot.
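The thread-scoped bookkeeping a checkpointer provides can be pictured with a plain dictionary. The class below is only an in-memory illustration; the real SqliteSaver persists the same idea to a SQLite database:

```python
# Plain-Python picture of thread-scoped memory (what a checkpointer provides).
# Illustrative only; real agents persist this state to SQLite.

from collections import defaultdict

class InMemoryHistory:
    def __init__(self):
        self._threads = defaultdict(list)

    def append(self, thread_id, role, content):
        self._threads[thread_id].append((role, content))

    def get(self, thread_id):
        return list(self._threads[thread_id])

history = InMemoryHistory()
history.append("1", "human", "Draw the solar system for me.")
history.append("1", "ai", "Here is your image!")
history.append("2", "human", "What is photosynthesis?")

# Conversation "1" does not see messages from conversation "2"
print(history.get("1"))
```

This is why each invocation needs a thread_id: it is the key under which the agent's past messages are stored and retrieved.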
Future Trends And Developments
Throughout the article, we have caught a glimpse of where LangChain is going in terms of agents. Up until very recently, LangChain had mainly used the AgentExecutor class but it is slowly being replaced by langgraph agents.
Pure LangChain agents are fine to get started, but building the same agent requires more lines of code than it does in LangGraph. Also, past a certain point, the AgentExecutor framework won't provide the flexibility LangGraph offers for building complex multi-tool agents.
That’s why now is a great time to ride the wave and start with LangGraph directly.
We highly recommend starting to use LangSmith as well, which has become a core part of the LangChain ecosystem for building production-grade LLM applications. Here are some of its key benefits:
- Debugging: LangSmith provides detailed traces of your agent’s execution, making it easier to identify and fix issues.
- Performance Optimization: With LangSmith, you can analyze token usage, latency, and other performance metrics to optimize your agent’s efficiency.
- Testing and Evaluation: LangSmith facilitates the creation and management of test datasets, enabling you to rigorously evaluate your agent’s performance across a range of scenarios.
- Monitoring: In production environments, LangSmith offers real-time monitoring capabilities, allowing you to track your agent’s performance and detect anomalies quickly.
Here is how you can get started with LangSmith:
- Sign up for a free account on the LangSmith website.
- Set environment variables.
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="..."
and you are good to go! As you start querying language/chat models, LangSmith will log various metrics about each run.
Conclusion
In this article, we explored what makes LangChain agents distinct from chains and the important building blocks used in constructing them. We first introduced what agents are and how they differ from the more traditional chain constructs regarding flexibility and capability in making decisions.
Then we looked at the key components you need to know about in order to build an agent: chat models, tools, and prompt templates. Finally, we ran through two examples demonstrating how to build simple and advanced agents. Natural language processing is developing continually, and LangChain agents are at the forefront of this progression, paving the way for an even more intelligent and versatile family of AI.
Here are some related resources to deepen your LangChain knowledge:
- Developing LLM Applications with LangChain Course
- An Introduction to Prompt Engineering with LangChain
- How to Build LLM Applications with LangChain Tutorial
- Building a GPT Model with Browsing Capabilities Using LangChain Tools
- LangChain vs LlamaIndex: A Detailed Comparison
Thank you for reading!
The above is the detailed content of Building LangChain Agents to Automate Tasks in Python.
