How can BeautifulSoup be used to extract data from a HTML table in Python?

Patricia Arquette
2024-10-28 10:00:29

BeautifulSoup Parsing Table

In Python, BeautifulSoup provides methods for parsing HTML documents and searching the resulting tree, which makes it well suited to retrieving specific data from a table.

To locate the line items table, call soup.find() with the tag name and the attributes that identify it. For a table whose class is lineItemsTable:

<code class="python">table = soup.find("table", {"class": "lineItemsTable"})</code>
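The lookup above assumes a soup object already exists. As a minimal sketch of that setup, the following builds one from an inline HTML string using the standard library's html.parser backend. The sample markup is hypothetical; a real page would come from a file or an HTTP response.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page.
html = """
<table class="lineItemsTable">
  <tbody>
    <tr><td>1</td><td>Widget</td><td>9.99</td></tr>
    <tr><td>2</td><td>Gadget</td><td>19.50</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching element, or None when nothing matches.
table = soup.find("table", {"class": "lineItemsTable"})
if table is None:
    raise ValueError("no table with class 'lineItemsTable' found")

# class_ is an equivalent keyword form of the same lookup.
same_table = soup.find("table", class_="lineItemsTable")
```

Checking for None right after find() avoids an AttributeError later if the page layout changes and the table is no longer present.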

Next, you can iterate over each row in the table using table.find_all("tr"). Within each row, you can access the table cells (td) using row.find_all("td"). The camel-case spelling findAll is an older alias for the same method.

Here's a fuller snippet that collects the cell text of every row:

<code class="python">data = []
# Fall back to the table itself when the markup has no tbody element
table_body = table.find('tbody') or table

rows = table_body.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele])  # Drop empty cell values</code>

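This code produces a list of lists, with each sublist holding the stripped text of the td cells in one row. Two details are worth knowing: a row that contains only th cells (a header row) yields an empty sublist, and because empty strings are dropped, a row with a blank cell ends up shorter than the others, so column positions may no longer line up across rows.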
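If column positions matter, one option is to keep empty cells and pair each row with the header names. The sketch below is one way to do that under some assumptions: the sample markup, the extract_rows helper, and the th header row are all hypothetical and not part of the original page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup with a header row and one blank cell.
html = """
<table class="lineItemsTable">
  <thead>
    <tr><th>Item</th><th>Description</th><th>Price</th></tr>
  </thead>
  <tbody>
    <tr><td>1</td><td> Widget </td><td>9.99</td></tr>
    <tr><td>2</td><td></td><td>19.50</td></tr>
  </tbody>
</table>
"""


def extract_rows(table):
    """Return the text of every td cell, one list per row that has cells."""
    body = table.find("tbody") or table  # fall back when there is no tbody
    rows = []
    for tr in body.find_all("tr"):
        # Keep empty strings so every row has the same number of columns.
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # skip rows that contain only th cells
            rows.append(cells)
    return rows


soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", {"class": "lineItemsTable"})

headers = [th.get_text(strip=True) for th in table.find_all("th")]
rows = extract_rows(table)

# Pair each cell with its column name.
records = [dict(zip(headers, row)) for row in rows]
```

With this sample, rows holds two three-element lists, and the second record has an empty string for Description instead of shifting the price one column to the left.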
