Summary of the ten most commonly used file operations in Python
This Python tutorial introduces the ten most commonly used file operations, with plenty of practical examples.
Batch-processing files is a common everyday need. Writing scripts in Python makes this convenient, but it inevitably means working with files. If you are new to this, many file operations can leave you unsure where to start, and you end up searching Baidu for answers.
In this article, Brother Dong has compiled 10 of the most commonly used file operations in Python. They come up both in batch processing and in reading files, so this review should be helpful.
1. Display the current directory
When we want to know the current working directory, we can use the getcwd() function of the os module, or the cwd() method of pathlib's Path class, as shown below.

>>> # Method 1: show the current directory
... import os
... print("Current working directory:", os.getcwd())
...
Current working directory: /Users/ycui1/PycharmProjects/Medium_Python_Tutorials
>>> # Method 2: alternatively, use pathlib
... from pathlib import Path
... print("Current working directory:", Path.cwd())
...
Current working directory: /Users/ycui1/PycharmProjects/Medium_Python_Tutorials
If you are using an older version of Python (before 3.4, when pathlib was added to the standard library), use the os module approach.

2. Create a new directory
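The calls covered in this section (os.mkdir(), os.makedirs(), and Path.mkdir()) can be tried safely in a temporary directory. The following is a minimal sketch rather than the article's own code: the folder names are placeholders, and it assumes Python 3.4 or later so that pathlib is available.

```python
import os
import tempfile
from pathlib import Path

# Work inside a temporary directory so the demonstration leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, "tmp_level0", "tmp_level1")

    # makedirs() creates missing parent folders; exist_ok=True makes the
    # call safe to repeat.
    os.makedirs(nested, exist_ok=True)
    os.makedirs(nested, exist_ok=True)   # the second call raises nothing
    created_with_os = os.path.isdir(nested)

    # The pathlib equivalent ("a" and "b" are hypothetical folder names).
    other = Path(tmp) / "a" / "b"
    other.mkdir(parents=True, exist_ok=True)
    other.mkdir(parents=True, exist_ok=True)
    created_with_pathlib = other.is_dir()

    # Without exist_ok, creating an existing directory raises FileExistsError.
    try:
        os.mkdir(nested)
        raised = False
    except FileExistsError:
        raised = True

print(created_with_os, created_with_pathlib, raised)
```

Running the sketch twice gives the same result, which is the point of exist_ok=True in a batch script that may be re-run.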
To create a directory, you can use the mkdir() function of the os module. This function creates a directory at the specified path. If you pass only a directory name, the folder is created in the current directory; this is the difference between relative and absolute paths.

>>> # Create a new directory in the current folder
... os.mkdir("test_folder")
... print("Directory exists:", os.path.exists("test_folder"))
...
Directory exists: True
>>> # Create a new directory in a specific folder
... os.mkdir('/Users/ycui1/PycharmProjects/tmp_folder')
... print("Directory exists:", os.path.exists('/Users/ycui1/PycharmProjects/tmp_folder'))
...
Directory exists: True

However, if you want to create a multi-level directory, such as a folder inside another folder, you need to use the makedirs() function.

>>> # Create a directory that contains a subdirectory
... os.makedirs('tmp_level0/tmp_level1')
... print("Directory exists:", os.path.exists("tmp_level0/tmp_level1"))
...
Directory exists: True

If you are using a recent version of Python (≥3.4), consider using the pathlib module to create a new directory. With parents=True, it creates not only the final directory but also any missing directories along the path.

# Using pathlib
from pathlib import Path
Path("test_folder").mkdir(parents=True, exist_ok=True)

One thing to note: if you run some of the code above more than once, you will get an error saying that the directory already exists. We can handle this by setting the exist_ok parameter to True (with the default value, False, mkdir() raises FileExistsError when the directory already exists).

>>> # Using pathlib
... from pathlib import Path
... Path("test_folder").mkdir(parents=True, exist_ok=False)
...
Traceback (most recent call last):
  File "<input>", line 3, in <module>
  File "/Users/ycui1/.conda/envs/Medium/lib/python3.8/pathlib.py", line 1284, in mkdir
    self._accessor.mkdir(self, mode)
FileExistsError: [Errno 17] File exists: 'test_folder'

3. Delete directories and files
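This section mentions the Path methods unlink() and rmdir() without showing them in use. As a minimal sketch of that pathlib variant (not the article's own code), the example below creates and then deletes a file and a folder inside a temporary directory; the names tmp_folder and tmp.txt are placeholders.

```python
import tempfile
from pathlib import Path

# Work inside a throwaway directory so nothing real is deleted.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "tmp_folder"      # hypothetical directory name
    folder.mkdir()
    file_path = folder / "tmp.txt"         # hypothetical file name
    file_path.write_text("temporary data")

    file_path.unlink()                     # delete the file
    file_deleted = not file_path.exists()

    folder.rmdir()                         # delete the now-empty directory
    folder_deleted = not folder.exists()

print("File deleted:", file_deleted)
print("Folder deleted:", folder_deleted)
```

Both os.rmdir() and Path.rmdir() only remove empty directories, which is why the file is deleted first. For a directory that still has contents, shutil.rmtree() from the standard library is one option, at the cost of removing everything beneath it.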
After we finish working with some files or folders, we may want to delete them. To delete a file, use the remove() function in the os module. To delete a folder, use rmdir() instead.

>>> # Delete a file
... print(f"* Before deleting the file: {os.path.isfile('tmp.txt')}")
... os.remove('tmp.txt')
... print(f"* After deleting the file: {os.path.exists('tmp.txt')}")
...
* Before deleting the file: True
* After deleting the file: False
>>> # Delete a folder
... print(f"* Before deleting the folder: {os.path.isdir('tmp_folder')}")
... os.rmdir('tmp_folder')
... print(f"* After deleting the folder: {os.path.exists('tmp_folder')}")
...
* Before deleting the folder: True
* After deleting the folder: False

If you use the pathlib module, you can delete a file with the unlink() method and a directory with the rmdir() method. Note that both are instance methods of the Path object.

4. Get the file list
When processing data for an analysis job or a machine learning project, we often need the list of files in a specific directory.
Typically, the file names follow a pattern. Suppose we want to find all .txt files in a directory; we can use the glob() method of the Path object. The glob() method returns a generator that we can iterate over.

>>> txt_files = list(Path('.').glob("*.txt"))
... print("Txt files:", txt_files)
...
Txt files: [PosixPath('hello_world.txt'), PosixPath('hello.txt')]

Alternatively, it is also convenient to use the glob module directly, as shown below. It offers similar functionality and returns a list of file names as strings. In most cases, such as reading and writing files, either approach works.

>>> from glob import glob
... files = list(glob('h*'))
... print("Files starting with h:", files)
...
Files starting with h: ['hello_world.txt', 'hello.txt']

5. Moving and Copying Files
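This section notes that copy() can be combined with glob() to handle a batch of files, but shows no example of it. The sketch below is one possible way to do that, under stated assumptions: copy_matching is a hypothetical helper, not part of shutil, and the folder and file names are placeholders created in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path

def copy_matching(source_folder, target_folder, pattern="*.txt"):
    """Copy every file in source_folder that matches pattern into target_folder."""
    source_folder = Path(source_folder)
    target_folder = Path(target_folder)
    target_folder.mkdir(parents=True, exist_ok=True)
    copied = []
    # sorted() gives a stable order; glob() itself makes no ordering promise.
    for path in sorted(source_folder.glob(pattern)):
        if path.is_file():
            shutil.copy(path, target_folder / path.name)
            copied.append(path.name)
    return copied

# Demonstration in a temporary directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    src.mkdir()
    for name in ("hello.txt", "hello_world.txt", "notes.md"):
        (src / name).write_text("Hello")
    copied_names = copy_matching(src, Path(tmp) / "backup")
    backup_listing = sorted(p.name for p in (Path(tmp) / "backup").iterdir())

print(copied_names)
```

Only the two .txt files are copied; notes.md does not match the pattern and stays behind. Unlike rename(), copy() leaves the source files in place.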
Moving Files
One common file management task is moving and copying files. In Python, both are easy to do. To move a file, simply rename it, replacing its old directory with the target directory. Suppose we need to move all .txt files to another folder; we can use Path to do this.

>>> target_folder = Path("target_folder")
... target_folder.mkdir(parents=True, exist_ok=True)
... source_folder = Path('.')
...
... txt_files = source_folder.glob('*.txt')
... for txt_file in txt_files:
...     filename = txt_file.name
...     target_path = target_folder.joinpath(filename)
...     print(f"** Moving file {filename}")
...     print("Target file exists:", target_path.exists())
...     txt_file.rename(target_path)
...     print("Target file exists:", target_path.exists(), '\n')
...
** Moving file hello_world.txt
Target file exists: False
Target file exists: True

** Moving file hello.txt
Target file exists: False
Target file exists: True

Copy files
We can use the functions available in the shutil module, another standard-library module that is useful for file operations. We call its copy() function, specifying the source and destination files as strings. A simple example is shown below. Of course, you can combine the copy() function with the glob() function to process a batch of files that match the same pattern.

>>> import shutil
...
... source_file = "target_folder/hello.txt"
... target_file = "hello2.txt"
... target_file_path = Path(target_file)
... print("* Before copying, file exists:", target_file_path.exists())
... shutil.copy(source_file, target_file)
... print("* After copying, file exists:", target_file_path.exists())
...
* Before copying, file exists: False
* After copying, file exists: True

6. Check directory/file
The examples above have been using the exists() method to check whether a specific path exists. It returns True if the path exists and False if it does not. This check is available in both the os and pathlib modules, used as follows.

# exists() in the os module
os.path.exists('path_to_check')
# exists() in the pathlib module
Path('directory_path').exists()

Using os.path or pathlib, we can also check whether the path is a directory or a file.

# Check whether the path is a directory
os.path.isdir('path_to_check')
Path('path_to_check').is_dir()
# Check whether the path is a file
os.path.isfile('path_to_check')
Path('path_to_check').is_file()

7. Get file information
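The attributes discussed in this section (name, stem, suffix, and the result of stat()) can be gathered in one place. The sketch below is an illustration under stated assumptions: example.py is a hypothetical file created in a temporary directory, and the conversion of st_mtime with datetime.fromtimestamp() is an addition that the article does not cover.

```python
import tempfile
from datetime import datetime
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    file_path = Path(tmp) / "example.py"          # hypothetical file
    file_path.write_text("print('hi')")           # 11 ASCII characters

    name, stem, suffix = file_path.name, file_path.stem, file_path.suffix
    file_stat = file_path.stat()
    size_in_bytes = file_stat.st_size
    # st_mtime is seconds since the epoch; convert it to local time.
    modified = datetime.fromtimestamp(file_stat.st_mtime)

print(name, stem, suffix)
print("Size (bytes):", size_in_bytes)
print("Last modified:", modified.strftime("%Y-%m-%d %H:%M:%S"))
```

Converting the raw timestamp makes values such as 1594127561.3204417 readable as a calendar date and time.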
File name
When processing files, we often need to extract the file name. With Path this is simple: read the name attribute, path.name, of the Path object. If you do not want the suffix, read the stem attribute, path.stem.

>>> for py_file in Path().glob('c*.py'):
...     print('Name with extension:', py_file.name)
...     print('Name only:', py_file.stem)
...
Name with extension: closures.py
Name only: closures
Name with extension: counter.py
Name only: counter
Name with extension: context_management.py
Name only: context_management

File suffix
To extract only the file's suffix, read the suffix attribute of the Path object.

>>> file_path = Path('closures.py')
... print("File extension:", file_path.suffix)
...
File extension: .py

More file information
To get more information about a file, such as its size and modification time, use the stat() method, which works the same way as os.stat().

>>> # A Path object
... current_file_path = Path('iterable_usages.py')
... file_stat = current_file_path.stat()
...
>>> # Get the file size
... print("File size (bytes):", file_stat.st_size)
File size (bytes): 3531
>>> # Get the last access time
... print("Last access time:", file_stat.st_atime)
Last access time: 1595435202.310935
>>> # Get the last modification time
... print("Last modification time:", file_stat.st_mtime)
Last modification time: 1594127561.3204417

8. Reading files
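This section refers to the rb mode and the encoding parameter of open() without an example. The sketch below is a minimal illustration of both, assuming a hypothetical file greeting.txt that is deliberately written in Latin-1 rather than UTF-8.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "greeting.txt"             # hypothetical file
    # Write Latin-1 bytes so the file is deliberately not valid UTF-8.
    path.write_bytes("café".encode("latin-1"))

    # Text mode: pass the encoding explicitly so the bytes decode correctly.
    with open(path, "r", encoding="latin-1") as file:
        text = file.read()

    # Binary mode: no decoding happens, so read() returns bytes.
    with open(path, "rb") as file:
        raw = file.read()

    # Decoding the same bytes with the wrong encoding fails.
    try:
        raw.decode("utf-8")
        utf8_ok = True
    except UnicodeDecodeError:
        utf8_ok = False

print(text)
print(type(raw).__name__, len(raw))
print("Valid UTF-8:", utf8_ok)
```

Passing encoding explicitly also keeps a script's behavior the same across machines whose default encodings differ.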
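One of the most important file operations is reading data from a file. The most common way to read a file is to create a file object with the built-in open() function. By default, the function opens the file in read mode and treats its contents as text.

>>> # Read all the text
... with open("hello2.txt", 'r') as file:
...     print(file.read())
...
Hello World!
Hello Python!
>>> # Read line by line
... with open("hello2.txt", 'r') as file:
...     for i, line in enumerate(file, 1):
...         print(f"* Reading line #{i}: {line}")
...
* Reading line #1: Hello World!

* Reading line #2: Hello Python!

If the file does not contain much data, you can use the read() method to read everything at once. If the file is large, consider iterating over it lazily, as in the line-by-line example, so that only one line is processed at a time.
File contents are treated as text by default. To work with binary files, state the mode explicitly: r for text or rb for binary.
Another tricky issue is the file's encoding. When no encoding is given, open() uses the platform's default encoding, which is UTF-8 on many systems but not all. To read a file in a specific encoding, set the encoding parameter.

9. Writing files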
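Two points from this section can be checked directly: w overwrites while a appends, and the with statement closes the file when its block ends. The sketch below is a minimal illustration in a temporary directory; the line contents are placeholders.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "hello3.txt"

    with open(path, "w") as file:
        file.write("first line\n")       # "w" truncates any existing content
    closed_after_with = file.closed       # the with block has closed the file

    with open(path, "a") as file:
        file.write("second line\n")      # "a" adds to the end of the file

    with open(path) as file:
        lines = file.read().splitlines()

print(closed_after_with)
print(lines)
```

The closed attribute is True as soon as the with block exits, even if an exception is raised inside the block.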
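We still use the open() function, but open the file in w or a mode to create the file object. In w mode, new data overwrites the old data; in a mode, new data is appended after the existing data.

>>> # Write new data to the file
... with open("hello3.txt", 'w') as file:
...     text_to_write = "Hello Files From Writing"
...     file.write(text_to_write)
...
>>> # Append some data
... with open("hello3.txt", 'a') as file:
...     text_to_write = "\nHello Files From Appending"
...     file.write(text_to_write)
...
>>> # Check that the file contents are correct
... with open("hello3.txt") as file:
...     print(file.read())
...
Hello Files From Writing
Hello Files From Appending

Every time we opened a file above, we used the with statement. The with statement creates a context for working with the file and closes the file object once we have finished. This matters: if an open file object is not closed promptly, the file may end up corrupted.

10. Compressing and decompressing files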
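The zipfile workflow in this section can be tried end to end without touching real files. The sketch below is an illustration under stated assumptions: the file names mirror the article's but are created in a temporary directory, the extracted folder is hypothetical, and the compression=ZIP_DEFLATED argument is an addition, since ZipFile stores files uncompressed by default.

```python
import tempfile
from pathlib import Path
from zipfile import ZIP_DEFLATED, ZipFile

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    for name in ("hello2.txt", "hello3.txt"):
        (base / name).write_text("Hello Files")

    archive = base / "text_files.zip"
    # ZIP_DEFLATED enables compression; the default, ZIP_STORED, only stores.
    with ZipFile(archive, "w", compression=ZIP_DEFLATED) as zip_file:
        for txt_file in sorted(base.glob("*.txt")):
            # arcname keeps only the file name inside the archive.
            zip_file.write(txt_file, arcname=txt_file.name)

    out_dir = base / "extracted"          # hypothetical output folder
    with ZipFile(archive) as zip_file:
        archived_names = zip_file.namelist()
        zip_file.extractall(out_dir)

    extracted_names = sorted(p.name for p in out_dir.iterdir())

print(archived_names)
print(extracted_names)
```

Passing a target folder to extractall() keeps the extracted files separate from the originals instead of writing them into the current directory.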
Compressing files
The zipfile module provides file compression. Call ZipFile() to create a zip file object, much as we did with the open() function: both create a file object that is managed by a context manager.

>>> from zipfile import ZipFile
...
... # Create the zip file
... with ZipFile('text_files.zip', 'w') as file:
...     for txt_file in Path().glob('*.txt'):
...         print(f"* Adding file: {txt_file.name} to the archive")
...         file.write(txt_file)
...
* Adding file: hello3.txt to the archive
* Adding file: hello2.txt to the archive

Decompressing files

>>> # Extract the zip file
... with ZipFile('text_files.zip') as zip_file:
...     zip_file.printdir()
...     zip_file.extractall()
...
File Name                                             Modified             Size
hello3.txt                                     2020-07-30 20:29:50           51
hello2.txt                                     2020-07-30 18:29:52           26

Conclusion
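That covers the common Python file operations compiled here, all implemented with Python's built-in standard library. Of course, libraries such as pandas can also be used for some of these operations.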