Summary of the ten most commonly used file operations in Python

Batch processing of files is a common everyday need, and Python scripts make it convenient. Along the way, you inevitably have to work with files and directories. The first time around, there are many file operations you will not know how to approach, so you end up searching online (for example, on Baidu).

In this article, Brother Dong has compiled 10 of the most commonly used file operations in Python, covering both batch processing and reading and writing files. I hope this review is helpful.

1. Display the current directory

To find the current working directory, use the getcwd() function of the os module or the cwd() class method of pathlib's Path, as shown below.

>>> # Method 1: show the current directory with the os module
... import os
... print("Current working directory:", os.getcwd())
Current working directory: /Users/ycui1/PycharmProjects/Medium_Python_Tutorials

>>> # Method 2: alternatively, use pathlib
... from pathlib import Path
... print("Current working directory:", Path.cwd())
Current working directory: /Users/ycui1/PycharmProjects/Medium_Python_Tutorials

If you are using an older version of Python (< 3.4), pathlib is not available, so use the os module instead.

2. Create a new directory

To create a directory, use the mkdir() function of the os module. It creates a directory at the specified path. If you pass only a directory name (a relative path), the folder is created in the current working directory; if you pass an absolute path, it is created at exactly that location.

>>> # Create a new directory in the current folder
... os.mkdir("test_folder")
... print("Directory exists:", os.path.exists("test_folder"))
Directory exists: True
>>> # Create a new directory in a specific folder
... os.mkdir('/Users/ycui1/PycharmProjects/tmp_folder')
... print("Directory exists:", os.path.exists('/Users/ycui1/PycharmProjects/tmp_folder'))
Directory exists: True

However, to create a multi-level directory (a folder inside a folder that does not exist yet), you need the makedirs() function.

>>> # Create a directory together with a subdirectory
... os.makedirs('tmp_level0/tmp_level1')
... print("Directory exists:", os.path.exists("tmp_level0/tmp_level1"))
Directory exists: True

If you are using Python 3.4 or later, consider using the pathlib module to create directories. With parents=True, mkdir() creates not only the target directory but also any missing parent directories in the path.

# Using pathlib
from pathlib import Path
Path("test_folder").mkdir(parents=True, exist_ok=True)

One thing to note: if you run some of the code above more than once, you may hit an error saying that the directory cannot be created because it already exists. With pathlib, we can avoid this by setting the exist_ok parameter to True. With the default value, False, mkdir() raises FileExistsError when the directory already exists, as shown below.

>>> # Using pathlib with exist_ok=False
... from pathlib import Path
... Path("test_folder").mkdir(parents=True, exist_ok=False)
Traceback (most recent call last):
  File "", line 3, in 
  File "/Users/ycui1/.conda/envs/Medium/lib/python3.8/pathlib.py", line 1284, in mkdir
    self._accessor.mkdir(self, mode)
FileExistsError: [Errno 17] File exists: 'test_folder'
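
The same problem applies to the os functions used earlier. As a minimal sketch (it runs in a temporary directory, which is an assumption made here so that nothing is left behind), os.makedirs() also accepts exist_ok, while os.mkdir() does not and has to be guarded explicitly:

```python
import os
import tempfile

# Work inside a throwaway directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as base:
    nested = os.path.join(base, "tmp_level0", "tmp_level1")

    # The first call creates both levels; the second does nothing
    # instead of raising FileExistsError, because exist_ok=True.
    os.makedirs(nested, exist_ok=True)
    os.makedirs(nested, exist_ok=True)
    created = os.path.isdir(nested)

    # os.mkdir() has no exist_ok parameter, so catch the error explicitly.
    try:
        os.mkdir(nested)
        mkdir_raised = False
    except FileExistsError:
        mkdir_raised = True

print("Directory created:", created)
print("os.mkdir raised FileExistsError:", mkdir_raised)
```

This makes a script safe to run repeatedly, which matters for batch jobs that set up their output folders on every run.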

3. Delete directories and files

After we finish working with some files or folders, we may want to delete them. To delete a file, use the remove() function in the os module. To delete a folder, use rmdir() instead; note that rmdir() only removes empty directories.

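>>> # Delete a file
... print(f"* Before deleting the file: {os.path.isfile('tmp.txt')}")
... os.remove('tmp.txt')
... print(f"* After deleting the file: {os.path.exists('tmp.txt')}")
* Before deleting the file: True
* After deleting the file: False
>>> # Delete a folder
... print(f"* Before deleting the folder: {os.path.isdir('tmp_folder')}")
... os.rmdir('tmp_folder')
... print(f"* After deleting the folder: {os.path.exists('tmp_folder')}")
* Before deleting the folder: True
* After deleting the folder: False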

If you use the pathlib module, call the unlink() method to delete a file and the rmdir() method to delete a directory. Both are instance methods of the Path object.
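
A minimal sketch of the pathlib variant follows. The folder and file names mirror the os example above, and the temporary directory is an assumption added so the sketch cleans up after itself:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as base:
    folder = Path(base) / "tmp_folder"
    folder.mkdir()
    file_path = folder / "tmp.txt"
    file_path.write_text("temporary data")

    # Delete the file first: rmdir() only works on an empty directory.
    file_path.unlink()
    file_gone = not file_path.exists()

    # Now the directory is empty and can be removed.
    folder.rmdir()
    folder_gone = not folder.exists()

print("File deleted:", file_gone)
print("Folder deleted:", folder_gone)
```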

4. Get the file list

When we process data for an analysis job or a machine learning project, we often need the list of files in a specific directory.

Typically, the file names follow a pattern. Suppose we want to find all .txt files in a directory: we can use the glob() method of the Path object. glob() returns a generator that we can iterate over, or pass to list() to collect all matches at once.

>>> txt_files = list(Path('.').glob("*.txt"))
... print("Txt files:", txt_files)
Txt files: [PosixPath('hello_world.txt'), PosixPath('hello.txt')]

Alternatively, it is also convenient to use the glob module directly, as shown below. Its glob() function works similarly but returns a list of file names as strings. For most purposes, such as reading and writing files, either approach works.

>>> from glob import glob
... files = glob('h*')
... print("Files starting with h:", files)
Files starting with h: ['hello_world.txt', 'hello.txt']
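
The practical difference between the two approaches is the type of each match. The sketch below illustrates it under a few assumptions: it runs in a temporary directory, and the file names are made up for the example. Results are sorted because neither function guarantees an order.

```python
import os
import tempfile
from glob import glob
from pathlib import Path

with tempfile.TemporaryDirectory() as base:
    # Hypothetical sample files for the demonstration.
    for name in ("hello.txt", "hello_world.txt", "notes.md"):
        Path(base, name).touch()

    # Path.glob() yields Path objects lazily.
    path_matches = sorted(Path(base).glob("*.txt"))
    # glob.glob() returns a list of plain strings.
    str_matches = sorted(glob(os.path.join(base, "h*")))

    path_names = [p.name for p in path_matches]
    str_names = [os.path.basename(s) for s in str_matches]

print("Path.glob matches:", path_names)
print("glob.glob matches:", str_names)
print("Types:", type(path_matches[0]).__name__, "and", type(str_matches[0]).__name__)
```

Path objects carry attributes such as name and suffix, whereas the strings from glob.glob() need os.path helpers for the same information.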

5. Move and copy files

Move files

Moving and copying files are common file management tasks, and both are easy in Python. To move a file, rename it so that its path points to the target directory instead of the old one. Suppose we need to move all .txt files to another folder; we can do this with Path.

>>> target_folder = Path("target_folder")
... target_folder.mkdir(parents=True, exist_ok=True)
... source_folder = Path('.')
... txt_files = source_folder.glob('*.txt')
... for txt_file in txt_files:
...     filename = txt_file.name
...     target_path = target_folder.joinpath(filename)
...     print(f"** Moving file {filename}")
...     print("Target file exists:", target_path.exists())
...     txt_file.rename(target_path)
...     print("Target file exists:", target_path.exists(), '\n')
** Moving file hello_world.txt
Target file exists: False
Target file exists: True

** Moving file hello.txt
Target file exists: False
Target file exists: True

Copy files

We can use the shutil module, another useful standard-library module for file operations. Its copy() function copies a file given the source and destination paths as strings. A simple example is shown below. You can also combine copy() with glob() to process a batch of files that match the same pattern.

>>> import shutil
... source_file = "target_folder/hello.txt"
... target_file = "hello2.txt"
... target_file_path = Path(target_file)
... print("* Before copying, file exists:", target_file_path.exists())
... shutil.copy(source_file, target_file)
... print("* After copying, file exists:", target_file_path.exists())
* Before copying, file exists: False
* After copying, file exists: True
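
Combining copy() with glob(), as mentioned above, could look like the following sketch. The source and backup folder names and the sample files are hypothetical, and everything runs in a temporary directory:

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as base:
    source_folder = Path(base, "source")   # hypothetical folder names
    backup_folder = Path(base, "backup")
    source_folder.mkdir()
    backup_folder.mkdir()
    for name in ("hello.txt", "hello_world.txt", "data.csv"):
        (source_folder / name).write_text(f"contents of {name}")

    # Copy every .txt file. When the destination is a directory,
    # shutil.copy() keeps the original file name inside it.
    for txt_file in source_folder.glob("*.txt"):
        shutil.copy(txt_file, backup_folder)

    copied = sorted(p.name for p in backup_folder.iterdir())
    originals_kept = sorted(p.name for p in source_folder.iterdir())

print("Copied files:", copied)
print("Source folder still contains:", originals_kept)
```

Unlike the rename-based move, the originals stay in place, and the .csv file is skipped because it does not match the pattern.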

6. Check directory/file

The examples above have been using the exists() method to check whether a specific path exists. It returns True if the path exists and False otherwise. This check is available in both the os and pathlib modules, used as follows.

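# exists() in the os module ('path_to_check' is a placeholder)
os.path.exists('path_to_check')

# exists() in the pathlib module
Path('path_to_check').exists()

Using pathlib, we can also check whether the path is a directory or a file.

# Check whether the path is a directory
Path('path_to_check').is_dir()

# Check whether the path is a file
Path('path_to_check').is_file()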

7. Get file information

File name

When processing files, we often need to extract the file name. With Path this is simple: read the name attribute (path.name) of the Path object. If you do not want the suffix, read the stem attribute (path.stem) instead.

>>> for py_file in Path().glob('c*.py'):
...     print('Name with extension:', py_file.name)
...     print('Name only:', py_file.stem)
Name with extension: closures.py
Name only: closures
Name with extension: counter.py
Name only: counter
Name with extension: context_management.py
Name only: context_management

File suffix

To get the file extension, read the suffix attribute of the Path object.

>>> file_path = Path('closures.py')
... print("File extension:", file_path.suffix)
File extension: .py



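The stat() method of a Path object returns further information about the file, such as its size and its last access and modification times.

>>> # Create the Path object
... current_file_path = Path('iterable_usages.py')
... file_stat = current_file_path.stat()
>>> # Get the file size
... print("File size (bytes):", file_stat.st_size)
File size (bytes): 3531
>>> # Get the most recent access time
... print("Last access time:", file_stat.st_atime)
Last access time: 1595435202.310935
>>> # Get the most recent modification time
... print("Last modification time:", file_stat.st_mtime)
Last modification time: 1594127561.3204417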
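
The access and modification times above are raw timestamps in seconds since the epoch. One way to make them readable is datetime.fromtimestamp(), sketched below on a hypothetical sample file in a temporary directory:

```python
import tempfile
from datetime import datetime
from pathlib import Path

with tempfile.TemporaryDirectory() as base:
    sample = Path(base, "sample.txt")   # hypothetical file for the demo
    sample.write_bytes(b"Hello World!\n")
    file_stat = sample.stat()

    size_in_bytes = file_stat.st_size
    # st_mtime is seconds since the epoch; convert it to a local datetime.
    modified = datetime.fromtimestamp(file_stat.st_mtime)

print("File size (bytes):", size_in_bytes)
print("Last modified:", modified.strftime("%Y-%m-%d %H:%M:%S"))
```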

8. Read files


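To read a file, open it with the built-in open() function inside a with statement, which closes the file automatically. You can read the whole content at once with read(), or iterate over the file object to read it line by line.

>>> # Read all the text at once
... with open("hello2.txt", 'r') as file:
...     print(file.read())
Hello World!
Hello Python!
>>> # Read line by line
... with open("hello2.txt", 'r') as file:
...     for i, line in enumerate(file, 1):
...         print(f"* Reading line #{i}: {line}")
* Reading line #1: Hello World!

* Reading line #2: Hello Python!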
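
The blank lines in the line-by-line output appear because each line read from the file keeps its trailing newline and print() adds another one. A minimal sketch of one way to avoid that, assuming a UTF-8 sample file recreated in a temporary directory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as base:
    sample = Path(base, "hello2.txt")
    sample.write_text("Hello World!\nHello Python!\n", encoding="utf-8")

    lines = []
    with open(sample, "r", encoding="utf-8") as file:
        for i, line in enumerate(file, 1):
            # Strip the trailing newline that each line carries.
            lines.append(f"* Reading line #{i}: {line.rstrip()}")

    # Path.read_text() opens, reads, and closes the file in one call.
    whole_text = sample.read_text(encoding="utf-8")

print("\n".join(lines))
print(repr(whole_text))
```

Passing encoding explicitly avoids depending on the platform default, which can differ between systems.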




9. Write files


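To write to a file, open it in 'w' mode, which creates the file or overwrites its existing content. To add data to the end of an existing file, open it in 'a' (append) mode.

>>> # Write new data to a file
... with open("hello3.txt", 'w') as file:
...     text_to_write = "Hello Files From Writing"
...     file.write(text_to_write)
>>> # Append some data
... with open("hello3.txt", 'a') as file:
...     text_to_write = "\nHello Files From Appending"
...     file.write(text_to_write)
>>> # Check that the file content is correct
... with open("hello3.txt") as file:
...     print(file.read())
Hello Files From Writing
Hello Files From Appending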
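
The difference between the two modes is easy to get wrong, so here is a small sketch of it. The "first version" text is made up for the demonstration, and the file lives in a temporary directory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as base:
    target = Path(base, "hello3.txt")

    with open(target, "w", encoding="utf-8") as file:
        file.write("first version")

    # Opening with 'w' again truncates the file, so "first version" is lost.
    with open(target, "w", encoding="utf-8") as file:
        file.write("Hello Files From Writing")

    # Opening with 'a' keeps the existing content and writes at the end.
    with open(target, "a", encoding="utf-8") as file:
        file.write("\nHello Files From Appending")

    content = target.read_text(encoding="utf-8")

print(content)
```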



10. Compress and decompress files



The zipfile module in the standard library lets us create a zip archive and extract the files from it.

>>> from zipfile import ZipFile
... # Create a zip archive
... with ZipFile('text_files.zip', 'w') as file:
...     for txt_file in Path().glob('*.txt'):
...         print(f"* Adding file: {txt_file.name} to the archive")
...         file.write(txt_file)
* Adding file: hello3.txt to the archive
* Adding file: hello2.txt to the archive

>>> # Extract the archive
... with ZipFile('text_files.zip') as zip_file:
...     zip_file.printdir()
...     zip_file.extractall()
File Name                                             Modified             Size
hello3.txt                                     2020-07-30 20:29:50           51
hello2.txt                                     2020-07-30 18:29:52           26
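
Two details are worth a sketch. ZipFile stores files without compression unless a compression method is passed, and extractall() accepts a target directory. The folder names below are hypothetical, and the example runs in a temporary directory:

```python
import tempfile
from pathlib import Path
from zipfile import ZIP_DEFLATED, ZipFile

with tempfile.TemporaryDirectory() as base:
    source_folder = Path(base, "source")     # hypothetical folder names
    output_folder = Path(base, "unzipped")
    source_folder.mkdir()
    (source_folder / "hello2.txt").write_text("Hello World!\nHello Python!\n")
    (source_folder / "hello3.txt").write_text("Hello Files From Writing")

    archive_path = Path(base, "text_files.zip")
    # ZIP_DEFLATED turns on compression; the default only stores the files.
    with ZipFile(archive_path, "w", compression=ZIP_DEFLATED) as archive:
        for txt_file in sorted(source_folder.glob("*.txt")):
            # arcname keeps only the file name inside the archive.
            archive.write(txt_file, arcname=txt_file.name)

    with ZipFile(archive_path) as archive:
        names = archive.namelist()
        # Extract into a separate folder instead of the current directory.
        archive.extractall(path=output_folder)

    extracted = sorted(p.name for p in output_folder.iterdir())

print("Files in archive:", names)
print("Extracted files:", extracted)
```

Extracting into a dedicated folder avoids overwriting files of the same name in the working directory.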



