How to Build AI Agents that can Use any Website

Susan Sarandon
2025-01-08 00:02:40

Connecting AI Agents to the Web: A Developer's Journey and the Rise of Computer Use

One major hurdle in AI agent development over the past two years has been giving agents reliable access to the web. Consider an AI agent designed to send emails: how do you connect it to Gmail or Outlook? Do you call an API, drive the website, or hand the task to an autonomous web agent? This article compares these approaches.

APIs and SDKs: A Limited Approach

Many developers start with APIs and SDKs. These offer low latency and robust authentication, but they have limitations:

  • API Unavailability: Not all web services provide APIs.
  • Documentation Challenges: Outdated or poorly written documentation is common.
  • Feature Gaps: APIs often lack the full functionality of their corresponding websites, hindering specific tasks.

Fortunately, several services offer API call libraries:

  • Composio: Provides tools for AI agents with strong authentication.
  • LangChain tools: Ready-made tools for LangChain and LangGraph agents.
  • Apify: A vast community-driven API library.
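To make the API route concrete for the email example, here is a minimal sketch of how an API call is typically exposed to an agent as a "tool". It is not the interface of Composio, LangChain, or Apify. The tool schema, `handle_tool_call`, and the `transport` callable are hypothetical names chosen for illustration, and only the Python standard library is used.

```python
import json
from email.message import EmailMessage

# Hypothetical tool description handed to the model. The layout follows the
# JSON-Schema style that many tool-calling APIs use, but it is an assumption.
SEND_EMAIL_TOOL = {
    "name": "send_email",
    "description": "Send a plain-text email.",
    "parameters": {
        "type": "object",
        "properties": {
            "to": {"type": "string"},
            "subject": {"type": "string"},
            "body": {"type": "string"},
        },
        "required": ["to", "subject", "body"],
    },
}


def build_email(sender, to, subject, body):
    """Build the message; delivery is left to whatever transport is passed in."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = to
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def handle_tool_call(raw_call, sender, transport):
    """Validate a model-issued tool call (JSON text) and dispatch it."""
    call = json.loads(raw_call)
    if call.get("name") != SEND_EMAIL_TOOL["name"]:
        return "error: unknown tool"
    args = call.get("arguments", {})
    required = SEND_EMAIL_TOOL["parameters"]["required"]
    missing = [key for key in required if key not in args]
    if missing:
        return f"error: missing arguments {missing}"
    transport(build_email(sender, args["to"], args["subject"], args["body"]))
    return "ok"


if __name__ == "__main__":
    outbox = []  # stand-in transport; a real agent would call a mail API here
    call = json.dumps({
        "name": "send_email",
        "arguments": {"to": "bob@example.com", "subject": "Hi", "body": "Hello Bob"},
    })
    print(handle_tool_call(call, "agent@example.com", outbox.append))
    print(outbox[0]["To"])
```

The point of the sketch is the shape of the integration: every capability needs its own schema, validation, and transport, which is exactly the per-service work that becomes a bottleneck when no API exists.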

However, for universal web service access, we must move beyond APIs.

Website Interaction: The Human Approach

If an AI agent can reliably operate websites, it can automate any task a human performs on the web. But how?

Many developers initially use browser testing frameworks like Selenium or Playwright. This approach, however, faces challenges:

  • Fragility: Website changes (e.g., A/B testing) easily break scripts.
  • Detectability: Test browsers are easily identified and blocked.
  • Production Deployment: Hosting browsers, managing authentication, and rotating proxies are complex in production.
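The fragility problem can be illustrated without a browser. The sketch below uses only Python's standard `html.parser` and two made-up page variants (both are assumptions for illustration, not real sites) to show how a class-based lookup breaks when an A/B test restyles a button, while a lookup based on the visible label survives.

```python
from html.parser import HTMLParser


class ButtonCollector(HTMLParser):
    """Collect (class attribute, visible text) for every <button>."""

    def __init__(self):
        super().__init__()
        self.buttons = []
        self._current = None  # class string of the open <button>, else None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "button":
            self._current = dict(attrs).get("class") or ""
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "button" and self._current is not None:
            self.buttons.append((self._current, "".join(self._text).strip()))
            self._current = None


def find_buttons(html):
    parser = ButtonCollector()
    parser.feed(html)
    parser.close()
    return parser.buttons


def by_class(html, class_name):
    """Brittle lookup: depends on a styling detail the site may change."""
    return [text for cls, text in find_buttons(html) if class_name in cls.split()]


def by_text(html, label):
    """Sturdier lookup: depends on what the user actually sees."""
    return [text for _, text in find_buttons(html) if label.lower() in text.lower()]


# Hypothetical A/B variants: same button, different generated class name.
VARIANT_A = '<form><button class="btn-login">Log in</button></form>'
VARIANT_B = '<form><button class="css-1x9f3k">Log in</button></form>'

if __name__ == "__main__":
    print(by_class(VARIANT_A, "btn-login"))  # ['Log in']
    print(by_class(VARIANT_B, "btn-login"))  # []
    print(by_text(VARIANT_B, "log in"))      # ['Log in']
```

Label matching is still a heuristic (it fails on a redesign that renames the button), but it shows why selectors tied to meaning tend to outlive selectors tied to markup.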

To address these issues, we experimented with a Browser SDK that:

  1. Employs natural language selectors (e.g., get_element("find the login button")) instead of brittle CSS selectors.
  2. Integrates built-in authentication.
  3. Offers pre-configured remote hosting with built-in rotating proxies to prevent blocking.
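One way to picture the natural language selector idea is the sketch below. It is not the Dendrite SDK's implementation or API: the `Element` type, `NaturalLanguageSelector`, and the keyword-overlap `keyword_resolver` (a stand-in for an LLM call) are all hypothetical. The sketch caches the concrete selector that worked last time and only re-resolves the description when that selector no longer matches the page.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    tag: str
    text: str
    selector: str  # concrete selector for this element on the current page


def _words(value):
    return set(re.findall(r"[a-z]+", value.lower()))


def keyword_resolver(description, elements):
    """Stand-in for an LLM call: pick the element sharing the most words
    with the description. Returns None when nothing overlaps."""
    wanted = _words(description)
    best, best_score = None, 0
    for el in elements:
        score = len(wanted & _words(f"{el.tag} {el.text}"))
        if score > best_score:
            best, best_score = el, score
    return best


class NaturalLanguageSelector:
    def __init__(self, resolver=keyword_resolver):
        self.resolver = resolver
        self._cache = {}  # description -> selector that worked last time

    def get_element(self, description, elements):
        # Fast path: reuse the cached selector while it still matches.
        cached = self._cache.get(description)
        if cached is not None:
            for el in elements:
                if el.selector == cached:
                    return el
        # Slow path: first call, or the page changed, so resolve again.
        found = self.resolver(description, elements)
        if found is not None:
            self._cache[description] = found.selector
        return found


# Hypothetical snapshots of the same page before and after a restyle.
PAGE_V1 = [Element("a", "Pricing", "a.nav-pricing"),
           Element("button", "Log in", "button.btn-login")]
PAGE_V2 = [Element("a", "Pricing", "a.nav-pricing"),
           Element("button", "Log in", "button.css-1x9f3k")]

if __name__ == "__main__":
    selector = NaturalLanguageSelector()
    print(selector.get_element("find the login button", PAGE_V1).selector)
    print(selector.get_element("find the login button", PAGE_V2).selector)
```

The cache is the design choice worth noting: the expensive resolution step runs only when the page actually changes, so the common case stays about as fast as a hard-coded selector.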

This work, now open-source (Dendrite SDK), is no longer under active development but remains available for study and adaptation. Similar alternatives include:

  • AgentQL: A Python library.
  • Stagehand: A JavaScript/TypeScript library.

Computer Use: The Future of Web AI Agents?

Rich Sutton's "Bitter Lesson" argues that general methods which scale with compute eventually outperform hand-engineered, task-specific ones. Anthropic's Computer Use follows this principle: it lets an LLM control a computer or browser directly through mouse and keyboard input, with no scripts or API calls in between, and it emphasizes general computer skills over task-specific tools. If the Bitter Lesson holds here, the most versatile AI agents will interact with the web the way humans do. Early results suggest high reliability on complex tasks when prompts are well crafted, often with help from Anthropic's prompt improver.
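The control flow behind this style of agent is an observe, decide, act loop. The sketch below shows only that loop and makes no claim about Anthropic's actual API: `FakeScreen` replaces a real desktop, `scripted_policy` replaces the model, and the action names (`click`, `type`, `key`, `done`) are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class FakeScreen:
    """Toy stand-in for a real desktop: one search box that can be submitted."""
    focused: bool = False
    typed: str = ""
    submitted: bool = False
    log: list = field(default_factory=list)

    def screenshot(self):
        # A real agent would return pixels; a dict keeps the sketch runnable.
        return {"focused": self.focused, "typed": self.typed,
                "submitted": self.submitted}

    def perform(self, action):
        self.log.append(action["type"])
        if action["type"] == "click" and action["target"] == "search_box":
            self.focused = True
        elif action["type"] == "type" and self.focused:
            self.typed += action["text"]
        elif action["type"] == "key" and action["key"] == "Enter" and self.typed:
            self.submitted = True


def scripted_policy(goal, observation):
    """Stand-in for the model: map the latest observation to one action."""
    if observation["submitted"]:
        return {"type": "done"}
    if not observation["focused"]:
        return {"type": "click", "target": "search_box"}
    if observation["typed"] != goal:
        return {"type": "type", "text": goal}
    return {"type": "key", "key": "Enter"}


def run_agent(goal, screen, policy, max_steps=10):
    """Observe, decide, act; stop on 'done' or when the step budget runs out."""
    for _ in range(max_steps):
        action = policy(goal, screen.screenshot())
        if action["type"] == "done":
            return True
        screen.perform(action)
    return False


if __name__ == "__main__":
    screen = FakeScreen()
    print(run_agent("cheap flights", screen, scripted_policy))  # True
    print(screen.log)  # ['click', 'type', 'key']
```

Nothing in `run_agent` is specific to search boxes or to any website. All task knowledge lives in the policy, which is the property the Bitter Lesson argument relies on. The `max_steps` budget matters in practice because a model-driven policy can loop indefinitely.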

Conclusion: Embracing the Future

While APIs remain valuable, the future likely favors Computer Use-like approaches for most AI agents. If an agent can log in, use a website's search function, and draw conclusions from the top results, why would it need access to the entire database through an API? The question for AI developers is whether to embrace this generalizable approach or accept the limitations of more specialized methods.

Note: This is my first dev.to post. Feedback on improving future posts is welcome. Questions on AI agents or AI-driven task automation are also encouraged.
