
How to use Python regular expressions for CSV file processing

WBOY
2023-06-23 08:36:09

CSV is one of the most widely used formats for exchanging data. In day-to-day data processing we often need to apply custom processing to CSV files, such as filtering out rows or replacing keywords. In Python, the csv module handles the parsing, and regular expressions from the re module handle pattern matching inside each field. This article shows how to combine the two for CSV file processing.

  1. Read CSV file

First, we need to read the CSV file. Python's built-in csv module makes this straightforward.

import csv

# newline='' lets the csv module handle line endings itself
with open('data.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in reader:
        print(', '.join(row))

The code above reads the file data.csv and prints its contents row by row. The delimiter parameter specifies the field separator, and quotechar specifies the character used to quote fields that contain special characters such as the delimiter.
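To see what quotechar does in practice, here is a minimal sketch that parses an in-memory sample instead of data.csv. The sample text and the names sample, rows, and naive are made up for illustration; they do not come from the original example.

```python
import csv
import io

# Hypothetical sample: the name field in the second line contains a comma.
sample = 'id,name\n1,"Smith, John"\n2,Lee\n'

# csv.reader respects the quotes and keeps "Smith, John" as one field.
rows = list(csv.reader(io.StringIO(sample, newline=''), delimiter=',', quotechar='"'))
for row in rows:
    print(row)

# A naive split on the delimiter breaks the quoted field into two pieces.
naive = sample.splitlines()[1].split(',')
print(naive)
```

This is why the examples in this article parse the file with the csv module first and apply regular expressions only to individual fields.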

  2. Use regular expressions to filter data

Next, we can use regular expressions to filter the data in the CSV file. For example, we can select only the rows whose first column begins with one or more digits.

import csv
import re

with open('data.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in reader:
        # Skip blank lines (empty rows), then test the first column
        if row and re.match(r'[0-9]+', row[0]):
            print(', '.join(row))

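The code above uses the re module's match function to print every row whose first column starts with a digit. Because re.match anchors only at the beginning of the string, a value such as 12abc also passes; use re.fullmatch instead to require that the entire field be numeric.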
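The difference between matching at the start of a field and matching the whole field can be checked on a few values. This is a small sketch; the list values stands in for hypothetical first-column contents.

```python
import re

# Hypothetical first-column values
values = ['123', '12abc', 'abc12', '']

# re.match only requires the pattern to match at the start of the string.
starts_with_digit = [v for v in values if re.match(r'[0-9]+', v)]

# re.fullmatch requires the pattern to cover the entire string.
all_digits = [v for v in values if re.fullmatch(r'[0-9]+', v)]

print(starts_with_digit)
print(all_digits)
```

Which of the two is appropriate depends on whether partially numeric identifiers should be kept.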

  3. Replace keywords

In addition to filtering data, we can also use regular expressions to replace keywords in CSV files. For example, we can replace apple with orange whenever the first column begins with apple.

import csv
import re

with open('data.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in reader:
        if row:  # skip blank lines, which yield empty rows
            # ^ anchors the pattern to the start of the field
            row[0] = re.sub(r'^apple', 'orange', row[0])
        print(', '.join(row))

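The code above uses the re module's sub function to replace apple with orange at the start of the first column. The ^ anchor restricts the match to the beginning of the field, so an apple that appears later in the field, or in any other column, is left unchanged.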
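If the goal is instead to replace every word that starts with apple, in any column, one possible approach is a word-boundary pattern applied to each field. The following is a sketch under that assumption; the helper name replace_apple_words and the sample row are hypothetical.

```python
import re

# \b marks a word boundary; \w* consumes the rest of the word.
pattern = re.compile(r'\bapple\w*')

def replace_apple_words(row):
    """Return a new row with every word starting with 'apple' replaced."""
    return [pattern.sub('orange', field) for field in row]

# Hypothetical sample row
row = ['apple pie', 'green apples', 'pineapple']
print(replace_apple_words(row))
```

Note that pineapple is not changed, because there is no word boundary before the apple inside it.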

  4. Write to CSV file

Finally, we need to write the processed data to a CSV file. The csv module handles writing as well.

import csv

data = [
    ['apple', 'banana', 'cherry'],
    ['dog', 'cat', 'mouse'],
    ['sun', 'moon', 'star']
]

# newline='' prevents extra blank lines in the output on Windows
with open('output.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    for row in data:
        writer.writerow(row)

The code above writes the data list to a CSV file named output.csv. The delimiter and quotechar parameters have the same meaning as when reading, and the quoting parameter controls when fields are quoted: csv.QUOTE_MINIMAL quotes only the fields that contain special characters such as the delimiter, the quotechar, or a line break.
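The reading, filtering, replacing, and writing steps can be combined into one pass. The sketch below uses in-memory streams so it runs without any files; the function name process, the sample data, and the choice to apply the replacement to the columns after the first are assumptions made for illustration.

```python
import csv
import io
import re

def process(infile, outfile):
    """Copy rows whose first column starts with a digit, replacing a leading 'apple'."""
    reader = csv.reader(infile, delimiter=',', quotechar='"')
    writer = csv.writer(outfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    for row in reader:
        if row and re.match(r'[0-9]+', row[0]):
            # Assumption: the keyword lives in the columns after the first one.
            row[1:] = [re.sub(r'^apple', 'orange', field) for field in row[1:]]
            writer.writerow(row)

# Hypothetical input; the last row has a field containing the delimiter.
source = io.StringIO('1,apple\nx,apple\n2,"pear, ripe"\n', newline='')
target = io.StringIO(newline='')
process(source, target)
print(target.getvalue())
```

With real files, the same function could be called with two handles opened with newline='', as in the earlier examples. The field containing a comma is quoted again on output because of csv.QUOTE_MINIMAL.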

To sum up, combining the csv module with regular expressions makes CSV processing in Python simple and convenient. With well-chosen patterns, even fairly complex filtering and replacement tasks can be implemented easily.
