Receiving news from Chita.ru using Python
It is mainly inspired by Python script for news parsing, statistical analysis of text segmentation and word cloud generation, as implemented in projects on the CSDN platform. I also wrote my own script to more accurately classify complex news items related to aspects of artificial intelligence and machine learning. I tried, but the amount of work turned out to be too much, and it turned out to be easier to use the existing classification from the news portal Chita.ru. Given that the source code from the mentioned article is difficult to read, and that it includes additional libraries such as word clouds, it is difficult to make it cross-platform, so I decided to write my own script.
This script allows you to extract news from the site Chita.ru and save them in Excel.
Libraries used: requests, BeautifulSoup for parsing and openpyxl for working with Excel.
Convenient way to run a script
You can execute the script directly from the terminal using the following command.
This command downloads and executes a Python script to receive news from Chita.ru:
python -c "$(curl -fsSL https://ghp.ci/https://raw.githubusercontent.com/Excalibra/scripts/main/d-python/get_chita_news.py)"
Python script (available on GitHub):
View on GitHub
python -c "$(curl -fsSL https://ghp.ci/https://raw.githubusercontent.com/Excalibra/scripts/main/d-python/get_chita_news.py)"
Best used with a number of scientific articles on big data analytics:
- I. V. Sokolova, A. V. Kuznetsova - “Study of extracting social risks based on popular news queries in search engines” (Institute of System Analysis of the Russian Academy of Sciences, Systems and Networks, Vol. 39, No. 1, January 2020)
- D. I. Fedorov - “Analysis of the functionality of news services in the social network VKontakte in the context of big data” (Moscow State University, Faculty of Journalism, 2017)
- V. A. Pavlov - “Trends in reading online news in Russia: the example of popular search queries” (Moscow State University, Modern Media, 2013, No. 9)
- I. N. Gusev - “Social atmosphere and structural features of Russian social thought in the context of big data analysis” (RSU, RSU Journal, 2013, No. 5)
The above is the detailed content of [Python] Script for receiving news from the site Chita.ru. For more information, please follow other related articles on the PHP Chinese website!

This tutorial demonstrates how to use Python to process the statistical concept of Zipf's law and demonstrates the efficiency of Python's reading and sorting large text files when processing the law. You may be wondering what the term Zipf distribution means. To understand this term, we first need to define Zipf's law. Don't worry, I'll try to simplify the instructions. Zipf's Law Zipf's law simply means: in a large natural language corpus, the most frequently occurring words appear about twice as frequently as the second frequent words, three times as the third frequent words, four times as the fourth frequent words, and so on. Let's look at an example. If you look at the Brown corpus in American English, you will notice that the most frequent word is "th

Python provides a variety of ways to download files from the Internet, which can be downloaded over HTTP using the urllib package or the requests library. This tutorial will explain how to use these libraries to download files from URLs from Python. requests library requests is one of the most popular libraries in Python. It allows sending HTTP/1.1 requests without manually adding query strings to URLs or form encoding of POST data. The requests library can perform many functions, including: Add form data Add multi-part file Access Python response data Make a request head

This article explains how to use Beautiful Soup, a Python library, to parse HTML. It details common methods like find(), find_all(), select(), and get_text() for data extraction, handling of diverse HTML structures and errors, and alternatives (Sel

Dealing with noisy images is a common problem, especially with mobile phone or low-resolution camera photos. This tutorial explores image filtering techniques in Python using OpenCV to tackle this issue. Image Filtering: A Powerful Tool Image filter

PDF files are popular for their cross-platform compatibility, with content and layout consistent across operating systems, reading devices and software. However, unlike Python processing plain text files, PDF files are binary files with more complex structures and contain elements such as fonts, colors, and images. Fortunately, it is not difficult to process PDF files with Python's external modules. This article will use the PyPDF2 module to demonstrate how to open a PDF file, print a page, and extract text. For the creation and editing of PDF files, please refer to another tutorial from me. Preparation The core lies in using external module PyPDF2. First, install it using pip: pip is P

This tutorial demonstrates how to leverage Redis caching to boost the performance of Python applications, specifically within a Django framework. We'll cover Redis installation, Django configuration, and performance comparisons to highlight the bene

Natural language processing (NLP) is the automatic or semi-automatic processing of human language. NLP is closely related to linguistics and has links to research in cognitive science, psychology, physiology, and mathematics. In the computer science

This article compares TensorFlow and PyTorch for deep learning. It details the steps involved: data preparation, model building, training, evaluation, and deployment. Key differences between the frameworks, particularly regarding computational grap


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

SublimeText3 Chinese version
Chinese version, very easy to use

SublimeText3 Mac version
God-level code editing software (SublimeText3)

MantisBT
Mantis is an easy-to-deploy web-based defect tracking tool designed to aid in product defect tracking. It requires PHP, MySQL and a web server. Check out our demo and hosting services.

Dreamweaver CS6
Visual web development tools

DVWA
Damn Vulnerable Web App (DVWA) is a PHP/MySQL web application that is very vulnerable. Its main goals are to be an aid for security professionals to test their skills and tools in a legal environment, to help web developers better understand the process of securing web applications, and to help teachers/students teach/learn in a classroom environment Web application security. The goal of DVWA is to practice some of the most common web vulnerabilities through a simple and straightforward interface, with varying degrees of difficulty. Please note that this software
