
Building LangChain Agents to Automate Tasks in Python

William Shakespeare
2025-03-04 10:35:10

LangChain’s 90k GitHub stars are all the credibility it needs—right now, it is the hottest framework to build LLM-based applications. Its comprehensive set of tools and components allows you to build end-to-end AI solutions using almost any LLM.

Perhaps at the heart of LangChain’s capabilities are LangChain agents. They are autonomous or semi-autonomous tools that can perform tasks, make decisions, and interact with other tools and APIs. They represent a significant leap forward in automating complex workflows with LLMs.

In this article, you will learn how to build your own LangChain agents that can perform tasks that aren't possible with today's chat applications like ChatGPT alone.

Setup

Before we get into anything, let’s set up our environment for the tutorial.

First, creating a new Conda environment:

$ conda create -n langchain python=3.9 -y
$ conda activate langchain

Installing LangChain’s packages and a few other necessary libraries:

$ pip install langchain langchain_openai langchain_community langgraph ipykernel python-dotenv

Adding the newly created Conda environment to Jupyter as a kernel:

$ ipython kernel install --user --name=langchain

Creating a .env file to store secrets such as API keys:

$ touch .env
$ vim .env  # Paste your OPENAI key
OPENAI_API_KEY='YOUR_KEY_HERE'

Retrieving your OpenAI API key from the .env file:

import os
from dotenv import load_dotenv
load_dotenv()
api_key = os.getenv('OPENAI_API_KEY')

Testing that everything is working correctly by querying OpenAI's GPT-3.5 (the default language model):

from langchain_openai import OpenAI
llm = OpenAI(openai_api_key=api_key)
question = "Is Messi the best footballer of all time?"
output = llm.invoke(question)
print(output[:75])
There is no definitive answer to this question, as it is subjective and de

Now, we are ready to get started.

What are LangChain Agents?

Let’s spend some time thinking about the agent framework. Specifically, we will consider how it differs from the traditional chain paradigm and what the components of an agent are. Understanding why we need to choose a new way of building applications will prepare us for writing the code.

Chains vs. Agents

The defining trait of agents is their ability to choose the best order of actions to solve a problem given a set of tools.

For example, let’s say we have the following:

  • A weather API
  • ML model for clothing recommendations
  • Strava API for biking routes
  • User preferences database
  • Image recognition model
  • Language model (text generation)

Traditional problem-solving would involve using a chain of select tools from the list:

Chain 1: Weather-based clothing recommender

  1. Call the weather API
  2. Input weather data into the ML clothing model
  3. Generate clothing recommendations
  4. Present results to the user

Chain 2: Weather-based biking route suggester

  1. Call the weather API
  2. Call the Strava API for popular routes
  3. Filter routes based on weather conditions
  4. Present suitable routes to the user

Chain 3: Outfit Photo Analyzer

  1. Receive the user's outfit photo
  2. Use an image recognition model to identify clothing items
  3. Compare with the user preference database
  4. Generate feedback using the text generation model
  5. Present the analysis to the user

Each chain solves a specific problem using a predetermined sequence of steps and a subset of the available tools. They cannot adapt beyond their defined scope. They also require three separate branches of development, which is inefficient in terms of time and resources.

Now, imagine an agentic system with access to all these tools. It would be able to:

  1. Understand the user's query or problem (through natural language with a language model)
  2. Assess which tools are relevant to the problem (reasoning)
  3. Dynamically create a workflow using the most appropriate tools
  4. Execute the workflow, making real-time adjustments if needed (acting)
  5. Evaluate the outcome and learn from past interactions

For example, if a user asks, “What should I wear for my bike ride today?” the agent might check the weather API, analyze suitable biking routes through Strava, recommend appropriate clothing, considering the user's past preferences, and generate a personalized response.

The agent can:

  • Handle a wide variety of problems using the same set of tools
  • Create custom workflows for each unique situation
  • Adapt its approach based on the specific context and user needs
  • Learn from interactions to improve future performance

LangChain's capacity to transform language models, which by themselves only produce text, into reasoning engines that can use the resources at their disposal to take appropriate action is one of its main strengths. In short, LangChain enables the development of strong autonomous agents that interact with the outside world.

Key components

A LangChain agent is made up of several components, such as chat models, prompt templates, external tools, and other related constructs. To build successful agents, we need to review each component and understand their use.

Language and chat models

There are a lot of moving parts involved in creating a LangChain agent. The first and most obvious is a language model.

from langchain_openai import OpenAI

llm = OpenAI(api_key=api_key, model="gpt-3.5-turbo-instruct")
question = "What is special about the number 73?"
output = llm.invoke(question)
print(output[:100])
1. Prime Number: 73 is a prime number, which means it is only divisible by 1 and itself. This make

Language models, like OpenAI's GPT-3.5 Turbo Instruct used above, take and generate strings. They are typically older models and work best for answering individual user queries.

Newer and more powerful models are usually chat models, which can take a sequence of messages as inputs and return chat messages as outputs (as opposed to using plain text):

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

# Initialize the model
chat_model = ChatOpenAI(api_key=api_key, model='gpt-4o-mini')

# Write the messages
messages = [SystemMessage(content='You are a grumpy pirate.'),
            HumanMessage(content="What's up?")]

output = chat_model.invoke(messages)

Put differently, chat models allow us to have conversations in natural language. In the example above, we are initializing GPT-4o-mini with a system message followed by a user query. Note the use of SystemMessage and HumanMessage classes.

The output is a message object, which is the expected behavior of chat models:

type(output)
langchain_core.messages.ai.AIMessage

print(output.content)
Arrr, not much but the sound of waves and the creakin' of me ship. What do ye want? Make it quick, I've got treasure to hunt and rum to drink!

Besides, they return other useful metadata accessible with dot-notation:

output.dict()
{'content': "Arrr, not much but the sound of waves and the creakin' of me ship. What do ye want? Make it quick, I've got treasure to hunt and rum to drink!",
 'additional_kwargs': {},
 'response_metadata': {'token_usage': {'completion_tokens': 38,
   'prompt_tokens': 21,
   'total_tokens': 59},
  'model_name': 'gpt-4o-mini-2024-07-18',
  'system_fingerprint': 'fp_48196bc67a',
  'finish_reason': 'stop',
  'logprobs': None},
 'type': 'ai',
 'name': None,
 'id': 'run-fde829bf-8f5f-4926-a1ed-ab53609ce03a-0',
 'example': False,
 'tool_calls': [],
 'invalid_tool_calls': [],
 'usage_metadata': {'input_tokens': 21,
  'output_tokens': 38,
  'total_tokens': 59}}

Most agents use chat models because of their updated knowledge base and conversational capabilities. However, for simple agents with no memory requirements, language models like GPT-3.5 will be enough.

Prompt templates

The most efficient way to query language or chat models is by using prompt templates. They allow you to structure your queries consistently and dynamically insert variables, making your interactions with the model more flexible and reusable.

In LangChain, there are many types of prompt templates, the most basic being the PromptTemplate class. It can be used with language (plain text) models:

from langchain_core.prompts import PromptTemplate

query_template = "Tell me about {book_name} by {author}."
prompt = PromptTemplate(input_variables=["book_name", "author"], template=query_template)

prompt.invoke({"book_name": "Song of Ice and Fire", "author": "GRRM"})
StringPromptValue(text='Tell me about Song of Ice and Fire by GRRM.')

The class requires you to create a string with placeholders for variables you want to replace using the bracket notation. Then, you need to pass this template string to the PromptTemplate class along with the names of the variables, thus constructing your prompt.

Calling .invoke() with values for variables will show how your prompt will be passed to a model.

Passing this prompt template to a language model requires us to chain it using the pipe operator:

from langchain_openai import OpenAI

llm = OpenAI(api_key=api_key)

# Create a chain
chain = prompt | llm

# Invoke the chain
output = chain.invoke({"book_name": "Deathly Hallows", "author": "J.K. Rowling"})
print(output[:100])
Deathly Hallows is the seventh and final book in the popular Harry Potter series, written by J.K. R

The pipe operator (|) is part of LangChain Expression Language (LCEL), designed to chain multiple LangChain components and tools.

type(chain)
langchain_core.runnables.base.RunnableSequence

When you use the pipe operator on LangChain objects, you create an instance of RunnableSequence class. A runnable sequence represents a chain of objects that support the .invoke() method, like prompt templates and language/chat models.

Now, let's look at another prompt template class for chat models.

We mentioned that chat models require a sequence of messages as inputs. The initial input is usually a system prompt telling the chat model how to behave. So, using the ChatPromptTemplate class, we can easily create chat models with different personalities:

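A minimal sketch (the placeholder names and persona wording here are illustrative):

from langchain_core.prompts import ChatPromptTemplate

# (role, message) tuples with variable placeholders
template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with the personality of a {personality}."),
    ("human", "{question}"),
])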

The class requires a list of role-based messages as input. Each member of the list must be a (role, message) tuple with the variable placeholders defined where needed.

After we have it ready, we can use the same pipe operator to create chat models with different behaviors:

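Assuming the template sketched above, this could look like:

from langchain_openai import ChatOpenAI

chat_model = ChatOpenAI(api_key=api_key, model="gpt-4o-mini")
chain = template | chat_model

# Swap in any persona at invocation time
output = chain.invoke({"personality": "stoic philosopher", "question": "What's up?"})
print(output.content)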

Tools

In a previous section, we mentioned that agents can choose a combination of tools at their disposal to solve a particular problem, with LLMs as reasoning engines under the hood.

LangChain offers integrations with dozens of popular APIs and services to let agents interact with the rest of the world. Most of them are available under the langchain_community package, while some are inside langchain_core.

For example, here is how you can use the ArXiv tool to retrieve paper summaries on various topics:

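A minimal sketch (the query string is illustrative, and the tool needs the arxiv package installed):

from langchain_community.tools import ArxivQueryRun

arxiv = ArxivQueryRun()  # requires `pip install arxiv`
print(arxiv.invoke("photosynthesis")[:250])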

There is an alternative way to load tools rather than import them by their class name:

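Something along these lines (the string tool names are covered just below):

from langchain_community.agent_toolkits.load_tools import load_tools

tools = load_tools(["arxiv", "dalle-image-generator"])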

Above, we are loading both the arXiv and Dall-E image generator tools at the same time using the load_tools() function. Tools loaded with this function have the same usage syntax:

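For example (the inputs are illustrative):

# Both tools expose the same .invoke() interface
paper_summary = tools[0].invoke("photosynthesis")              # arXiv query
image_url = tools[1].invoke("A watercolor painting of a fox")  # DALL-E prompt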


The load_tools function requires you to know the string names of tool classes, like the example of ArxivQueryRun versus 'arxiv'. You can quickly check the string name of any tool by running the get_all_tool_names function:


Note that load_tools() is only a shorthand function. When building agents, it is recommended to load tools using their class constructor, which allows you to configure them based on their specific behavior.

Step-by-Step Workflow of How to Build LangChain Agents

Finally, in this section, we will see how to create LangChain agents step-by-step using the knowledge we have gained in the previous sections.

In the coming examples, we will build an agent capable of explaining any topic via one of three mediums: text, image, or video. More specifically, based on the question asked, the agent will decide which format to use when explaining the topic.

Let’s start. Remember to check out how to set up the environment, which is covered at the beginning of the article. 

1. Defining tools

The first step after configuring our environment is defining the tools we will give to our agent. Let’s import them:

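The import paths below assume a recent langchain_community release:

from langchain_community.tools import WikipediaQueryRun, YouTubeSearchTool
from langchain_community.tools.openai_dalle_image_generation import OpenAIDALLEImageGenerationTool
from langchain_community.utilities import WikipediaAPIWrapper
from langchain_community.utilities.dalle_image_generator import DallEAPIWrapper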

We are importing five classes:

  • WikipediaAPIWrapper: to configure how to access the Wikipedia API
  • WikipediaQueryRun: to generate Wikipedia page summaries
  • YouTubeSearchTool: to search YouTube videos on topics
  • DallEAPIWrapper: to configure how to access OpenAI's DallE endpoint
  • OpenAIDALLEImageGenerationTool: to generate images using prompts

When a user queries our agent, it will decide whether to explain the topic using a Wikipedia article in text format, or by creating an image using Dall-E for visual understanding, or by suggesting YouTube videos for deeper comprehension.

Let’s initialize them, starting with the Wikipedia tool:

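A minimal configuration (the wrapper parameters and the description wording are illustrative):

api_wrapper = WikipediaAPIWrapper(top_k_results=1, doc_content_chars_max=250)

wikipedia = WikipediaQueryRun(
    api_wrapper=api_wrapper,
    # The agent reads this description when deciding whether to use the tool
    description="Use this tool when the topic is best explained with a short text summary.",
)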

DallE image generator:

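Again, a sketch with illustrative parameters:

dalle_api_wrapper = DallEAPIWrapper(model="dall-e-3", size="1024x1024")

dalle = OpenAIDALLEImageGenerationTool(
    api_wrapper=dalle_api_wrapper,
    # The agent reads this description when deciding whether to use the tool
    description="Use this tool when the topic is best explained with a generated image.",
)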


YouTube search tool:

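And a sketch for the video tool (the description wording is, again, illustrative):

youtube = YouTubeSearchTool(
    # The agent reads this description when deciding whether to use the tool
    description="Use this tool when the topic is best explored through video tutorials.",
)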

Take special care with the tool descriptions. The agent will decide which tool to use based on the descriptions you provide.

Now, we will put the tools into a list:

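Given the three tools defined above:

tools = [wikipedia, dalle, youtube]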

We can already bind this set of tools to a chat model without creating an agent:

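A sketch using the .bind_tools() method of chat models:

from langchain_openai import ChatOpenAI

chat_model = ChatOpenAI(api_key=api_key, model="gpt-4o-mini")
model_with_tools = chat_model.bind_tools(tools)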

Let’s try calling the model with a simple message:

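For instance:

from langchain_core.messages import HumanMessage

response = model_with_tools.invoke([HumanMessage("Hi, how are you?")])

print(f"Text response: {response.content}")
print(f"Tool calls: {response.tool_calls}")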

The output shows that none of the bound tools were used when generating an answer. Now, let’s ask a specific question that would force the model to look beyond its training data:

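For example, a request like the following (the exact query is illustrative):

query = "Draw me a picture that explains photosynthesis."
response = model_with_tools.invoke([HumanMessage(query)])

print(f"Text response: {response.content}")  # empty string
print(f"Tool calls: {response.tool_calls}")  # references the DALL-E tool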

We can see there is no text output, but OpenAI's DallE is mentioned. The tool isn't called yet; the model is simply suggesting we use it. To actually call it and take action, we need to create an agent.

2. Creating a simple agent

After defining the model and the tools, we create the agent. LangChain offers the high-level create_react_agent() function from its langgraph package to quickly create ReAct (reason and act) agents:

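A sketch (note that the system-prompt keyword has changed across langgraph releases: state_modifier in older versions, prompt in newer ones):

from langgraph.prebuilt import create_react_agent

system_prompt = "You are a helpful assistant."  # refined in a later step

agent = create_react_agent(chat_model, tools, state_modifier=system_prompt)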

While initializing the agent with a chat model and a list of tools, we are passing a system prompt to tell the model how to behave in general. It is now ready to accept queries:

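For example:

response = agent.invoke({"messages": [HumanMessage("Hi, how's it going?")]})

print(response["messages"][-1].content)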

We have received a simple text answer without any tool calls. Now, let's ask something more to the point:

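Something along these lines:

response = agent.invoke({"messages": [HumanMessage("Explain how photosynthesis works.")]})

print(len(response["messages"]))  # 4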

This time, there are four messages. Let’s see the message class names and their content:

{'content': "Arrr, not much but the sound of waves and the creakin' of me ship. What do ye want? Make it quick, I've got treasure to hunt and rum to drink!",
'additional_kwargs': {},
'response_metadata': {'token_usage': {'completion_tokens': 38,
  'prompt_tokens': 21,
  'total_tokens': 59},
 'model_name': 'gpt-4o-mini-2024-07-18',
 'system_fingerprint': 'fp_48196bc67a',
 'finish_reason': 'stop',
 'logprobs': None},
'type': 'ai',
'name': None,
'id': 'run-fde829bf-8f5f-4926-a1ed-ab53609ce03a-0',
'example': False,
'tool_calls': [],
'invalid_tool_calls': [],
'usage_metadata': {'input_tokens': 21,
 'output_tokens': 38,
 'total_tokens': 59}}
from langchain_core.prompts import PromptTemplate
query_template = "Tell me about {book_name} by {author}."
prompt = PromptTemplate(input_variables=["book_name", "author"], template=query_template)
prompt.invoke({"book_name": "Song of Ice and Fire", "author": "GRRM"})
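One way to inspect them (the printed output below is a rough sketch of what you should see):

for message in response["messages"]:
    print(f"{type(message).__name__}: {str(message.content)[:80]}")
HumanMessage: Explain how photosynthesis works.
AIMessage:
ToolMessage: Page: Photosynthesis Summary: Photosynthesis is a biological process ...
AIMessage: Photosynthesis is the process by which plants, algae, and some bacteria ...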

Here we go! The third message is from a tool call, which is a summary of a Wikipedia page on photosynthesis. The last message is from the chat model, which is using the tool call’s contents when constructing its answer.

Let’s quickly create a function to modularize the last steps we took:

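A minimal version:

def execute(agent, query):
    """Send a query to the agent and return its final answer."""
    response = agent.invoke({"messages": [HumanMessage(query)]})
    return response["messages"][-1].content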

3. Refining the system prompt

Now, let’s update our system prompt with detailed instructions on how the agent should behave:

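The exact wording below is illustrative; the key point is to tell the agent when to reach for each tool:

system_prompt = """
You are a helpful assistant that explains topics through one of three mediums: text, image, or video.
- If the user asks for a textual explanation, use the Wikipedia tool.
- If the user asks for a visual explanation, use the DALL-E image generator.
- If the user wants to dig deeper, suggest relevant YouTube videos.
"""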

Let’s recreate our agent with the new system prompt:

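For example:

agent = create_react_agent(chat_model, tools, state_modifier=system_prompt)

execute(agent, "Show me a visual of how photosynthesis works.")  # illustrative query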

Awesome! Based on our (very instructive) system prompt, the agent chose the correct tool for the job. Here is the generated image:

[Image: the DALL-E image generated by the agent]

4. Adding memory to agents

Right now, our agent is stateless, which means it doesn’t remember previous interactions:

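For example:

execute(agent, "What did I ask you in my previous query?")
# Typical reply: "I'm sorry, but I don't have access to previous interactions."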

The easiest way to add chat message history to agents is by using langgraph's SqliteSaver class:

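A sketch (recent langgraph versions ship the SQLite checkpointer as a separate langgraph-checkpoint-sqlite package, and the newest ones turn .from_conn_string() into a context manager, so adjust to your version):

from langgraph.checkpoint.sqlite import SqliteSaver

memory = SqliteSaver.from_conn_string("agent_history.db")

agent = create_react_agent(
    chat_model, tools, state_modifier=system_prompt, checkpointer=memory
)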

We initialize the memory using the .from_conn_string() method of the SqliteSaver class, which creates a database file. Then, we pass the memory to the checkpointer parameter of the create_react_agent() function.

Now, we need to create a configuration dictionary:

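For example:

config = {"configurable": {"thread_id": "a1b2c3"}}  # any stable string works as the ID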

The dictionary defines a thread ID to distinguish one conversation from another and it is passed to the .invoke() method of our agent. So, let's update our execute() function to include this behavior:

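A minimal version:

def execute(agent, query, config):
    """Query the agent inside the conversation thread identified by config."""
    response = agent.invoke({"messages": [HumanMessage(query)]}, config=config)
    return response["messages"][-1].content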

Now, let’s ask the agent about previous queries:

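For example:

execute(agent, "What did I ask you in my previous query?", config)
# This time, the agent recalls the earlier exchange from the SQLite checkpoint.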

As expected, the agent is returning the previous messages! Now, we only need a chat UI like that of ChatGPT and we have got ourselves a custom chatbot.

Future Trends And Developments

Throughout the article, we have caught a glimpse of where LangChain is going in terms of agents. Until very recently, LangChain mainly used the AgentExecutor class, but it is slowly being replaced by LangGraph agents.

Pure LangChain agents are fine to get started, but they require more lines of code than LangGraph to build the same agent. Also, past a certain point, the AgentExecutor framework won't provide the flexibility LangGraph offers for building complex multi-tool agents.

That’s why now is a great time to ride the wave and start with LangGraph directly.

We also highly recommend starting to use LangSmith, which has become a core part of the LangChain ecosystem for building production-grade LLM applications. Here are some of its key benefits:

  • Debugging: LangSmith provides detailed traces of your agent’s execution, making it easier to identify and fix issues.
  • Performance Optimization: With LangSmith, you can analyze token usage, latency, and other performance metrics to optimize your agent’s efficiency.
  • Testing and Evaluation: LangSmith facilitates the creation and management of test datasets, enabling you to rigorously evaluate your agent’s performance across a range of scenarios.
  • Monitoring: In production environments, LangSmith offers real-time monitoring capabilities, allowing you to track your agent’s performance and detect anomalies quickly.

Here is how you can get started with LangSmith:

  1. Sign up for a free account on the LangSmith website.
  2. Set environment variables.

export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="..."

and you are good to go! As you start querying language/chat models, LangSmith starts logging various metrics about each run:

[Image: a LangSmith dashboard logging metrics for each run]

Conclusion

In this article, we explored what makes LangChain agents distinct from chains and the important building blocks used in constructing them. We first introduced what agents are and how they differ from the more traditional chain constructs regarding flexibility and capability in making decisions.

Then we looked at the key components you need to know to build an agent: chat models, tools, and prompt templates. Finally, we ran through two examples demonstrating how to build simple and advanced agents. Natural language processing is developing continually, and LangChain agents are at the forefront of this progression, paving the way for an even more intelligent and versatile family of AI applications.

Here are some related resources to deepen your LangChain knowledge:

  • Developing LLM Applications with LangChain Course 
  • An Introduction to Prompt Engineering with LangChain 
  • How to Build LLM Applications with LangChain Tutorial
  • Building a GPT Model with Browsing Capabilities Using LangChain Tools
  • LangChain vs LlamaIndex: A Detailed Comparison

Thank you for reading!

