Summary of the ten most commonly used file operations in Python

coldplay.xixi
2020-12-28 17:45:07

This Python tutorial introduces the ten most commonly used file operations.

Batch processing of files is a common everyday need, and writing scripts in Python makes it very convenient. Along the way, though, you inevitably have to work with files. The first time around, many file operations leave you with no idea where to start, so you end up searching the web (Baidu).

In this article, Brother Dong has compiled 10 of the most commonly used file operations in Python, which come up in both batch processing and reading files. I hope this review proves helpful.

1. Display the current directory

When we want to know the current working directory, we can simply use the getcwd() function of the os module, or the cwd() method of pathlib's Path class, as shown below.

>>> # Method 1: display the current directory with os
... import os
... print("Current Work Directory:", os.getcwd())
...
Current Work Directory: /Users/ycui1/PycharmProjects/Medium_Python_Tutorials

>>> # Method 2: alternatively, use pathlib
... from pathlib import Path
... print("Current Work Directory:", Path.cwd())
...
Current Work Directory: /Users/ycui1/PycharmProjects/Medium_Python_Tutorials
If you are using an older version of Python (<3.4), the pathlib module is not available, so you have to use the os module.

2. Create a new directory

To create a directory, you can use the mkdir() function of the os module. This function creates a directory at the specified path. If you pass only a directory name, the folder is created in the current directory. This is the difference between relative and absolute paths.

>>> # Create a new directory in the current folder
... os.mkdir("test_folder")
... print("Is the directory there:", os.path.exists("test_folder"))
...
Is the directory there: True
>>> # Create a new directory in a specific folder
... os.mkdir('/Users/ycui1/PycharmProjects/tmp_folder')
... print("Is the directory there:", os.path.exists('/Users/ycui1/PycharmProjects/tmp_folder'))
...
Is the directory there: True

However, if you want to create a multi-level directory (such as a folder inside a folder), you need to use the makedirs() function.

>>> # Create a directory that contains a subdirectory
... os.makedirs('tmp_level0/tmp_level1')
... print("Is the directory there:", os.path.exists("tmp_level0/tmp_level1"))
...
Is the directory there: True

If you are using a recent version of Python (≥3.4), consider using the pathlib module to create a new directory. With parents=True, it not only creates the final directory but also any missing parent directories in the path.

# Using pathlib
from pathlib import Path
Path("test_folder").mkdir(parents=True, exist_ok=True)

One thing to note: if you run some of the code above more than once, you will hit an error because a directory that already exists cannot be created again. We can handle this by setting the exist_ok parameter to True. With the default value of False, mkdir() raises FileExistsError when the directory already exists.

>>> # Using pathlib
... from pathlib import Path
... Path("test_folder").mkdir(parents=True, exist_ok=False)
... 
Traceback (most recent call last):
  File "", line 3, in 
  File "/Users/ycui1/.conda/envs/Medium/lib/python3.8/pathlib.py", line 1284, in mkdir
    self._accessor.mkdir(self, mode)
FileExistsError: [Errno 17] File exists: 'test_folder'
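
The os.makedirs() function accepts the same exist_ok flag. The following is a minimal sketch rather than part of the original walkthrough: it assumes a throwaway directory from tempfile so nothing in your project is touched, and the folder names are purely illustrative.

```python
import os
import tempfile
from pathlib import Path

# Work inside a scratch directory; the folder names are illustrative.
with tempfile.TemporaryDirectory() as scratch:
    nested = os.path.join(scratch, "tmp_level0", "tmp_level1")

    # With exist_ok=True, repeated calls are harmless.
    os.makedirs(nested, exist_ok=True)
    os.makedirs(nested, exist_ok=True)

    # Without exist_ok, creating an existing directory raises FileExistsError.
    try:
        Path(nested).mkdir()
        second_call_failed = False
    except FileExistsError:
        second_call_failed = True

    created = os.path.isdir(nested)

print("Directory created:", created)
print("Plain mkdir() on an existing directory failed:", second_call_failed)
```

Catching FileExistsError is an alternative when you want to react to the existing directory instead of silently accepting it.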

3. Delete directories and files

After we finish working with some files or folders, we may want to delete them. To delete a file, we can use the remove() function in the os module. To delete a folder, we should use rmdir() instead.

>>> # Delete a file
... print(f"* Before deleting the file {os.path.isfile('tmp.txt')}")
... os.remove('tmp.txt')
... print(f"* After deleting the file {os.path.exists('tmp.txt')}")
...
* Before deleting the file True
* After deleting the file False
>>> # Delete a folder
... print(f"* Before deleting the folder {os.path.isdir('tmp_folder')}")
... os.rmdir('tmp_folder')
... print(f"* After deleting the folder {os.path.exists('tmp_folder')}")
...
* Before deleting the folder True
* After deleting the folder False

If you use the pathlib module, you can delete a file with the unlink() method and a directory with the rmdir() method. Note that both are instance methods of the Path object.
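
As a rough illustration of the pathlib variant, the sketch below creates a file and a folder in a scratch directory and then removes them. The names tmp.txt and tmp_folder mirror the os example, and the scratch directory from tempfile is an assumption made so the snippet is safe to run anywhere.

```python
import tempfile
from pathlib import Path

# Use a throwaway directory so nothing real gets deleted.
with tempfile.TemporaryDirectory() as scratch:
    base = Path(scratch)

    tmp_file = base / "tmp.txt"
    tmp_file.write_text("temporary data")
    tmp_folder = base / "tmp_folder"
    tmp_folder.mkdir()

    # unlink() removes a file; rmdir() removes an empty directory.
    tmp_file.unlink()
    tmp_folder.rmdir()

    file_exists_after = tmp_file.exists()
    folder_exists_after = tmp_folder.exists()

print("File exists after unlink():", file_exists_after)
print("Folder exists after rmdir():", folder_exists_after)
```

Both os.rmdir() and Path.rmdir() only remove empty directories. For a directory that still has content, shutil.rmtree() from the standard library is the usual choice.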

4. Get the file list

When we process data for an analysis job or a machine learning project, we often need the list of files in a specific directory.

Typically, the file names follow a pattern. Suppose we want to find all .txt files in a directory: we can use the glob() method of the Path object. The glob() method returns a generator that we can iterate over.

>>> txt_files = list(Path('.').glob("*.txt"))
... print("Txt files:", txt_files)
... 
Txt files: [PosixPath('hello_world.txt'), PosixPath('hello.txt')]

Alternatively, it is also convenient to use the glob module directly, as shown below. It offers similar functionality but returns a list of file names as strings. In most cases, such as reading and writing files, either approach works.

>>> from glob import glob
... files = glob('h*')
... print("Files starting with h:", files)
...
Files starting with h: ['hello_world.txt', 'hello.txt']

5. Moving and Copying Files

Moving Files

Moving and copying files are among the most common file management tasks, and both are easy to do in Python. To move a file, simply rename it, replacing its old directory with the target directory. Suppose we need to move all .txt files to another folder. We can use Path to achieve this.

>>> target_folder = Path("target_folder")
... target_folder.mkdir(parents=True, exist_ok=True)
... source_folder = Path('.')
...
... txt_files = source_folder.glob('*.txt')
... for txt_file in txt_files:
...     filename = txt_file.name
...     target_path = target_folder.joinpath(filename)
...     print(f"** Moving file {filename}")
...     print("Target file exists:", target_path.exists())
...     txt_file.rename(target_path)
...     print("Target file exists:", target_path.exists(), '\n')
...
** Moving file hello_world.txt
Target file exists: False
Target file exists: True

** Moving file hello.txt
Target file exists: False
Target file exists: True

Copying Files

We can use the functions available in the shutil module, another useful standard-library module for file operations. Its copy() function takes the source and destination paths as strings. A simple example is shown below. Of course, you can combine copy() with glob() to process a batch of files that match the same pattern.

>>> import shutil
...
... source_file = "target_folder/hello.txt"
... target_file = "hello2.txt"
... target_file_path = Path(target_file)
... print("* Before copying, file exists:", target_file_path.exists())
... shutil.copy(source_file, target_file)
... print("* After copying, file exists:", target_file_path.exists())
...
* Before copying, file exists: False
* After copying, file exists: True
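
Combining copy() with glob() might look like the sketch below. It is an illustration under stated assumptions: the source and backup folders and the sample file names are hypothetical, and everything lives in a scratch directory created with tempfile.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as scratch:
    # Hypothetical folder layout for the demonstration.
    source_folder = Path(scratch) / "source"
    backup_folder = Path(scratch) / "backup"
    source_folder.mkdir()
    backup_folder.mkdir()

    # Create a few sample files; only the .txt files should be copied.
    for name in ("hello.txt", "hello_world.txt", "notes.md"):
        (source_folder / name).write_text(f"content of {name}")

    # sorted() materializes the generator before any copying starts.
    for txt_file in sorted(source_folder.glob("*.txt")):
        shutil.copy(txt_file, backup_folder / txt_file.name)

    copied = sorted(p.name for p in backup_folder.iterdir())

print("Copied files:", copied)
```

Unlike rename(), copy() leaves the originals in place, so the source folder still holds all three files afterwards.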

6. Check directory/file

The examples above have used the exists() method to check whether a specific path exists: it returns True if the path exists and False otherwise. This functionality is available in both the os and pathlib modules, used as follows.

# exists() in the os module
os.path.exists('path_to_check')

# exists() in the pathlib module
Path('directory_path').exists()

With either module, we can also check whether a path is a directory or a file.

# Check whether the path is a directory
os.path.isdir('path_to_check')
Path('path_to_check').is_dir()

# Check whether the path is a file
os.path.isfile('path_to_check')
Path('path_to_check').is_file()

7. Get file information

File name

When processing files, we often need to extract the file name. With Path this is very simple: read the name attribute (path.name) of the Path object. If you want the name without its suffix, read the stem attribute (path.stem).

>>> for py_file in Path().glob('c*.py'):
...     print('Name with extension:', py_file.name)
...     print('Name only:', py_file.stem)
...
Name with extension: closures.py
Name only: closures
Name with extension: counter.py
Name only: counter
Name with extension: context_management.py
Name only: context_management

File suffix

If you want to extract only the file's suffix, read the suffix attribute of the Path object.

>>> file_path = Path('closures.py')
... print("File Extension:", file_path.suffix)
...
File Extension: .py

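More file information

If you want more information about a file, such as its size and modification time, you can use the stat() method, which works the same way as os.stat().

>>> # Path object
... current_file_path = Path('iterable_usages.py')
... file_stat = current_file_path.stat()
...
>>> # Get the file size
... print("File size (bytes):", file_stat.st_size)
File size (bytes): 3531
>>> # Get the last access time
... print("Last access time:", file_stat.st_atime)
Last access time: 1595435202.310935
>>> # Get the last modification time
... print("Last modification time:", file_stat.st_mtime)
Last modification time: 1594127561.3204417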
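
The st_atime and st_mtime values are seconds since the epoch, which is hard to read. One way to make them readable, sketched here with a hypothetical sample.txt in a scratch directory, is datetime.fromtimestamp():

```python
import tempfile
from datetime import datetime
from pathlib import Path

with tempfile.TemporaryDirectory() as scratch:
    sample = Path(scratch) / "sample.txt"  # illustrative file name
    sample.write_text("Hello World!")

    file_stat = sample.stat()
    size_in_bytes = file_stat.st_size

    # st_mtime is seconds since the epoch; convert it to local time.
    modified = datetime.fromtimestamp(file_stat.st_mtime)

print("Size (bytes):", size_in_bytes)
print("Last modified:", modified.strftime("%Y-%m-%d %H:%M:%S"))
```

The printed timestamp depends on when the snippet runs, so no fixed output is shown for it.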

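8. Read files

One of the most important file operations is reading data from a file. The most common way to read a file is to create a file object with the built-in open() function. By default, the function opens the file in read mode and treats its data as text.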

>>> # Read all of the text
... with open("hello2.txt", 'r') as file:
...     print(file.read())
...
Hello World!
Hello Python!
>>> # Read line by line
... with open("hello2.txt", 'r') as file:
...     for i, line in enumerate(file, 1):
...         print(f"* Reading line #{i}: {line}")
...
* Reading line #1: Hello World!

* Reading line #2: Hello Python!

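If the file does not hold much data, you can read everything at once with the read() method. If the file is large, consider iterating over the file object instead, which yields the data lazily, one line at a time, like a generator.

File contents are treated as text by default. If you want to work with binary files, you should explicitly specify the mode: r for text or rb for binary.

Another tricky issue is file encoding. The default encoding used by open() depends on the platform. It is usually UTF-8 on macOS and Linux, but not necessarily elsewhere. To be safe, or to handle a file in another encoding, set the encoding parameter explicitly.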
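
To make the difference between text mode, binary mode, and the encoding parameter concrete, here is a small sketch. The file name greeting.txt and the sample string are assumptions chosen only for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as scratch:
    path = Path(scratch) / "greeting.txt"  # illustrative file name

    # Write text with an explicit encoding instead of relying on the default.
    with open(path, "w", encoding="utf-8") as file:
        file.write("héllo")

    # Text mode: bytes are decoded to str using the given encoding.
    with open(path, "r", encoding="utf-8") as file:
        text = file.read()

    # Binary mode: raw bytes, with no decoding at all.
    with open(path, "rb") as file:
        raw = file.read()

print(repr(text))
print(raw)
```

The text has five characters while the raw data has six bytes, because "é" takes two bytes in UTF-8. Reading the same file with a different encoding would produce different text or an error.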

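9. Write files

We still use the open() function, but open the file in w or a mode to create the file object. In w mode, new data overwrites the old data. In a mode, new data is appended to the existing data.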

>>> # Write new data to the file
... with open("hello3.txt", 'w') as file:
...     text_to_write = "Hello Files From Writing"
...     file.write(text_to_write)
...
>>> # Append some data
... with open("hello3.txt", 'a') as file:
...     text_to_write = "\nHello Files From Appending"
...     file.write(text_to_write)
...
>>> # Check that the file data is correct
... with open("hello3.txt") as file:
...     print(file.read())
...
Hello Files From Writing
Hello Files From Appending

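Every time we opened a file above, we used the with statement.

The with statement creates a context for working with the file and closes the file object once we are done with it. This matters: if we do not close open file objects promptly, buffered data may never be written to disk and the file may end up corrupted.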
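
To show what the with statement saves us from writing, the sketch below compares it with a manual try/finally version. It reuses the hello3.txt name inside a scratch directory, which is an assumption made to keep the example self-contained.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as scratch:
    path = Path(scratch) / "hello3.txt"

    # Manual version: close() must run even if write() raises.
    file = open(path, "w")
    try:
        file.write("Hello Files From Writing")
    finally:
        file.close()
    closed_manually = file.closed

    # Equivalent version: the with statement closes the file for us.
    with open(path, "a") as file:
        file.write("\nHello Files From Appending")
    closed_by_with = file.closed

    content = path.read_text()

print(closed_manually, closed_by_with)
print(content)
```

Both approaches leave the file closed, but the with version is shorter and harder to get wrong.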

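10. Compress and decompress files

Compressing files

The zipfile module provides file compression. We use the ZipFile class to create a zip file object, much as we did with the open() function: both produce a file object managed by a context manager.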

>>> from zipfile import ZipFile
...
... # Create the zip archive
... with ZipFile('text_files.zip', 'w') as file:
...     for txt_file in Path().glob('*.txt'):
...         print(f"* Adding file: {txt_file.name} to the archive")
...         file.write(txt_file)
...
* Adding file: hello3.txt to the archive
* Adding file: hello2.txt to the archive

Decompressing files

>>> # Extract the archive
... with ZipFile('text_files.zip') as zip_file:
...     zip_file.printdir()
...     zip_file.extractall()
...
File Name                                             Modified             Size
hello3.txt                                     2020-07-30 20:29:50           51
hello2.txt                                     2020-07-30 18:29:52           26
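
By default, extractall() unpacks into the current working directory. A possible variation, sketched below with hypothetical file and folder names in a scratch directory, is to pass a target directory and use namelist() to inspect the archive from code:

```python
import tempfile
from pathlib import Path
from zipfile import ZipFile

with tempfile.TemporaryDirectory() as scratch:
    base = Path(scratch)
    source = base / "hello.txt"  # illustrative file name
    source.write_text("Hello World!")

    archive = base / "text_files.zip"
    with ZipFile(archive, "w") as zip_file:
        # arcname stores just the file name instead of the full path.
        zip_file.write(source, arcname=source.name)

    extract_dir = base / "extracted"
    with ZipFile(archive) as zip_file:
        names = zip_file.namelist()
        # Extract into a separate folder rather than the current directory.
        zip_file.extractall(extract_dir)

    restored = (extract_dir / "hello.txt").read_text()

print("Archive contents:", names)
print("Restored text:", restored)
```

Extracting into a dedicated folder avoids overwriting files of the same name that already sit next to the archive.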

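Conclusion

These are the commonly used Python file operations compiled here, all implemented with the standard library. Of course, you can also use libraries such as pandas for some of these operations.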

Statement:
This article is reproduced from segmentfault.com. If there is any infringement, please contact admin@php.cn to have it removed.