The code below imports the libraries needed to scrape and parse web data and load it into a database. It works in five steps: fetch the web page with the Python requests library; parse the page with the BeautifulSoup library and extract the required data; open a database connection and create the table with the sqlite3 library; write the extracted data into the table; commit the changes and close the database connection.
Use Python and SQL to scrape and parse web data
import requests
from bs4 import BeautifulSoup
import sqlite3

url = 'https://example.com/page/'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
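The fetch step assumes the request succeeds. A minimal sketch of a more defensive fetch could look like the following; the helper name `fetch_html` and the 10-second timeout are assumptions made for illustration, not part of the original code.

```python
import requests


def fetch_html(url, timeout=10.0):
    """Return the page body as text, or None if the request fails.

    fetch_html is a hypothetical helper, not part of the original code.
    """
    try:
        response = requests.get(url, timeout=timeout)
        # Raise requests.HTTPError for 4xx/5xx responses.
        response.raise_for_status()
    except requests.RequestException as exc:
        # Covers connection errors, timeouts, invalid URLs, and HTTP errors.
        print(f"Request failed: {exc}")
        return None
    return response.text
```

Returning None keeps the caller simple: the parsing step can be skipped when no page was retrieved.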
Use the find_all() and get_text() methods to extract the required data from the page.
titles = soup.find_all('h1')
titles = [title.get_text() for title in titles]
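To see what find_all() and get_text() return without making a network request, the sketch below parses a small inline HTML string. The sample markup is invented for illustration only.

```python
from bs4 import BeautifulSoup

# Inline sample markup (an assumption) so the example runs offline.
html = """
<html><body>
  <h1> First heading </h1>
  <p>Some text</p>
  <h1>Second <em>heading</em></h1>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns every matching tag; get_text() joins the text of a tag
# and its nested tags, and strip() trims the surrounding whitespace.
titles = [h1.get_text().strip() for h1 in soup.find_all("h1")]
print(titles)
```

Calling strip() on the result of get_text() removes only leading and trailing whitespace, so the space between "Second" and the nested "heading" is preserved.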
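# Open (or create) the SQLite database file and get a cursor.
conn = sqlite3.connect('database.db')
c = conn.cursor()

# Create the table if it does not exist yet; otherwise the INSERT fails.
c.execute('CREATE TABLE IF NOT EXISTS titles (title TEXT)')

# Insert each title with a parameterized query.
for title in titles:
    c.execute('INSERT INTO titles (title) VALUES (?)', (title,))

conn.commit()
conn.close()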
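The storage step can also be tried in isolation. The following sketch uses an in-memory SQLite database and a hypothetical helper named `save_titles`; the single-column schema is an assumption inferred from the INSERT statement above.

```python
import sqlite3


def save_titles(conn, titles):
    """Insert titles into a `titles` table, creating it if needed.

    save_titles is a hypothetical helper; the schema is an assumption.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS titles (title TEXT)")
    # executemany() runs the parameterized INSERT once per tuple.
    conn.executemany(
        "INSERT INTO titles (title) VALUES (?)",
        [(title,) for title in titles],
    )
    conn.commit()


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
save_titles(conn, ["First heading", "Second heading"])
rows = conn.execute("SELECT title FROM titles ORDER BY rowid").fetchall()
print(rows)
conn.close()
```

Passing the connection in as a parameter lets the same function write to a file-backed database such as 'database.db' or to an in-memory one during testing.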
The following demo code applies the same steps to Amazon's home page: it scrapes the top product titles and stores them in a SQLite database.
import requests
from bs4 import BeautifulSoup
import sqlite3

url = 'https://amazon.com'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

# Select the product title headings by their CSS classes.
titles = soup.find_all('h2', {'class': 'a-size-medium s-inline s-access-title'})
titles = [title.get_text().strip() for title in titles]

conn = sqlite3.connect('amazon_titles.db')
c = conn.cursor()
# Create the table first; on a new database the INSERT would otherwise fail.
c.execute('CREATE TABLE IF NOT EXISTS titles (title TEXT)')
for title in titles:
    c.execute('INSERT INTO titles (title) VALUES (?)', (title,))
conn.commit()
conn.close()