How to use Python regular expressions for CSV file processing
CSV is one of the most common formats for exchanging tabular data. In everyday data processing we often need to customize CSV files, for example by filtering out certain rows or replacing certain keywords. In Python, the csv module handles the parsing and regular expressions handle the pattern matching. This article shows how to combine the two for CSV file processing.
First, we need to read the CSV file. In Python, reading CSV files can be easily achieved using the csv module.
import csv

with open('data.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in reader:
        print(', '.join(row))
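The above code reads the CSV file named data.csv and prints its contents row by row. The delimiter parameter specifies the field separator, and quotechar specifies the character used to quote fields that contain special characters. Opening the file with newline='' lets the csv module handle line endings itself.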
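To see why the csv module is preferable to splitting each line on commas, consider a field that itself contains a comma. The following is a minimal sketch: the sample text and the names sample, naive_rows and parsed_rows are made up for illustration, and io.StringIO stands in for a real file.

```python
import csv
import io

# Hypothetical sample data: the address field contains a comma inside quotes.
sample = 'name,address\n"Smith","12 Main St, Springfield"\n'

# Naive splitting cuts the quoted address into two pieces and keeps the quotes.
naive_rows = [line.split(',') for line in sample.splitlines()]

# csv.reader honours quotechar and returns the address as a single field.
parsed_rows = list(csv.reader(io.StringIO(sample), delimiter=',', quotechar='"'))

print(naive_rows[1])   # ['"Smith"', '"12 Main St', ' Springfield"']
print(parsed_rows[1])  # ['Smith', '12 Main St, Springfield']
```

Regular expressions are best applied to the individual fields that csv.reader returns, not to the raw lines of the file.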
Next, we can use regular expressions to filter the data in the CSV file. For example, we can select only the rows whose first column starts with one or more digits.
import csv
import re

with open('data.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in reader:
        # csv.reader yields an empty list for a blank line, so check the row first.
        if row and re.match(r'[0-9]+', row[0]):
            print(', '.join(row))
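The above code uses the match function of the re module to print every row whose first column starts with one or more digits. Note that re.match only anchors at the beginning of the string, so a value such as 42abc also passes. Use re.fullmatch to require that the whole field is a number, or re.search to find digits anywhere in the field.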
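The difference between the three matching functions is easy to check on a handful of values. This is a small sketch; the list fields holds hypothetical first-column values chosen only to illustrate each case.

```python
import re

# Hypothetical first-column values used only for illustration.
fields = ['123', '42abc', 'abc42', '']

pattern = re.compile(r'[0-9]+')

# match: digits must appear at the start of the field.
starts_with_digits = [f for f in fields if pattern.match(f)]
# search: digits may appear anywhere in the field.
contains_digits = [f for f in fields if pattern.search(f)]
# fullmatch: the entire field must consist of digits.
only_digits = [f for f in fields if pattern.fullmatch(f)]

print(starts_with_digits)  # ['123', '42abc']
print(contains_digits)     # ['123', '42abc', 'abc42']
print(only_digits)         # ['123']
```

Which function is appropriate depends on what "contains numbers" should mean for the data at hand.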
In addition to filtering data, we can also use regular expressions to replace keywords in CSV files. For example, we can replace apple with orange wherever the first column starts with apple.
import csv
import re

with open('data.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in reader:
        if not row:
            continue  # skip blank lines, which csv.reader yields as empty lists
        row[0] = re.sub(r'^apple', 'orange', row[0])
        print(', '.join(row))
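The above code uses the sub function of the re module. Because the pattern is anchored with ^, it replaces apple with orange only when apple appears at the very start of the first column; other columns and later occurrences are left unchanged.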
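If the goal is instead to replace every word that starts with apple, in any column, a word-boundary pattern is one way to do it. The sketch below is an illustration under that assumption; the helper replace_apple_words and the sample row are hypothetical names and data, not part of the original code.

```python
import re

# \b marks a word boundary and \w* consumes the rest of the word,
# so "apple" and "apples" match but "pineapple" does not.
APPLE_WORD = re.compile(r'\bapple\w*')

def replace_apple_words(row):
    """Return a new row with every word starting with 'apple' replaced by 'orange'."""
    return [APPLE_WORD.sub('orange', field) for field in row]

row = ['apple pie', 'green apples', 'pineapple', 'banana']
print(replace_apple_words(row))
# ['orange pie', 'green orange', 'pineapple', 'banana']
```

Compiling the pattern once and reusing it avoids re-parsing the expression for every field in a large file.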
Finally, we need to write the processed data to a CSV file. The csv module can also be used to write CSV files.
import csv

data = [
    ['apple', 'banana', 'cherry'],
    ['dog', 'cat', 'mouse'],
    ['sun', 'moon', 'star'],
]

with open('output.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    for row in data:
        writer.writerow(row)
The above code writes the data list to a CSV file named output.csv. The delimiter and quotechar parameters have the same meaning as when reading a CSV file, and the quoting parameter controls when fields are quoted: csv.QUOTE_MINIMAL quotes only the fields that contain special characters such as the delimiter, the quote character, or a line break.
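The reading, filtering, replacing and writing steps can be chained into one pass. The following is a minimal sketch, not a definitive implementation: the function transform, the sample input and the use of io.StringIO in place of real files are assumptions made so the example runs on its own.

```python
import csv
import io
import re

def transform(rows):
    """Yield rows whose first field is all digits, with 'apple' words replaced."""
    for row in rows:
        # Skip blank lines and rows whose first field is not purely numeric.
        if row and re.fullmatch(r'[0-9]+', row[0]):
            yield [re.sub(r'\bapple\w*', 'orange', field) for field in row]

# In-memory stand-ins for data.csv and output.csv (hypothetical content).
source = io.StringIO('1,apple pie\nx,apple tart\n2,"plum, apple"\n')
target = io.StringIO(newline='')

writer = csv.writer(target, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
writer.writerows(transform(csv.reader(source)))

result = target.getvalue()
print(result)
```

The row starting with x is dropped, and the last output row is written as 2,"plum, orange" because QUOTE_MINIMAL quotes the field that contains a comma. Writing transform as a generator means rows are processed one at a time, so the same code works for files too large to hold in memory.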
To sum up, combining the csv module with regular expressions makes CSV file processing straightforward: csv takes care of parsing and quoting, while re handles filtering and replacement within individual fields. Used appropriately, the two together cover many complex CSV processing tasks.