
Scraping New Telegram Channels

Barbara Streisand
2024-11-09 22:12:02


Scraping New Telegram Channels Daily with Python and GroupFind API

New Telegram channels appear every day, and tracking the newest ones can surface trending communities and popular topics. Using the GroupFind API, we can pull in fresh channels daily and save them to a CSV for analysis or monitoring. In this tutorial, I’ll walk you through a simple Python script to automate this process.

The GroupFind API

The GroupFind API offers an endpoint for retrieving newly listed Telegram groups:

https://api.groupfind.org/api/groups?skip=0&sort=newest

This endpoint returns data in JSON format, with fields like groupTitle, category, memberCount, tags, and more. We’ll use this data to build our CSV, updating it daily with new listings.
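For reference, each item in the response is a JSON object whose field names match the ones used in the script below. The values here are purely illustrative, not real data:

[
  {
    "id": 12345,
    "groupTitle": "Example Channel",
    "groupDescription": "A short description of the channel.",
    "category": "Technology",
    "memberCount": 4821,
    "isNsfw": false,
    "tags": ["python", "scraping"],
    "profilePhoto": "https://example.com/photo.jpg",
    "addedDate": "2024-11-09T22:12:02Z"
  }
]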

Setting Up the Python Script

Let’s start by importing the necessary libraries and setting up a function to pull the latest data and save it to a CSV file.

Step 1: Import Required Libraries

import requests
import csv
from datetime import datetime
import time

Step 2: Define the Function to Fetch and Save Data

Here, we’ll set up a function that:

  1. Makes a GET request to the API.
  2. Extracts relevant data.
  3. Writes or appends to a CSV file.

def fetch_and_save_new_telegram_channels():
    """Fetch the newest GroupFind listings and append them to a CSV file."""
    url = "https://api.groupfind.org/api/groups?skip=0&sort=newest"
    response = requests.get(url, timeout=30)  # Fail fast instead of hanging if the API is unreachable

    if response.status_code == 200:
        channels = response.json()

        filename = "new_telegram_channels.csv"
        fieldnames = [
            "ID", "Title", "Category", "Member Count", "NSFW", 
            "Description", "Tags", "Profile Photo URL", "Added Date"
        ]

        with open(filename, mode="a", newline="", encoding="utf-8") as file:
            writer = csv.DictWriter(file, fieldnames=fieldnames)

            if file.tell() == 0:
                writer.writeheader()  # Write header only once

            for channel in channels:
                writer.writerow({
                    "ID": channel["id"],
                    "Title": channel["groupTitle"],
                    "Category": channel["category"],
                    "Member Count": channel["memberCount"],
                    "NSFW": channel["isNsfw"],
                    "Description": channel["groupDescription"],
                    "Tags": ", ".join(channel["tags"]),
                    "Profile Photo URL": channel["profilePhoto"],
                    "Added Date": channel["addedDate"]
                })

        print(f"Successfully added {len(channels)} new channels to {filename}.")
    else:
        print("Failed to fetch data. Status code:", response.status_code)

Step 3: Automate Daily Fetching with a Scheduler

To run the script automatically every day, we can use Python’s built-in time module for simplicity, or set it up as a cron job on a server.

def run_daily():
    while True:
        print(f"Running script at {datetime.now()}")
        fetch_and_save_new_telegram_channels()
        time.sleep(86400)  # Wait for 24 hours
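If you would rather use cron than keep a Python process running, drop the while True loop, call fetch_and_save_new_telegram_channels() once from the __main__ block, and add a crontab entry along these lines (the interpreter and script path are placeholders for your own setup):

0 9 * * * /usr/bin/python3 /path/to/new_telegram_channels.py

This runs the script every day at 09:00 server time.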

Running the Script

Simply run the script, and it will fetch new Telegram channels each day, appending them to new_telegram_channels.csv. The file will accumulate data over time, providing a growing record of fresh Telegram communities.

if __name__ == "__main__":
    run_daily()

