Home  >  Article  >  Backend Development  >  Python3. Example of traversing a folder to extract specific file names

Python3. Example of traversing a folder to extract specific file names

不言
不言Original
2018-04-27 10:49:032282browse

The following is an example of Python3 traversing a folder to extract a specific file name. It has a good reference value and I hope it will be helpful to everyone. Let’s take a look together

When processing files in batches, it is often necessary to first traverse a certain path to extract file names under specific conditions. This article writes a violent traversal but very concise method. It is really very concise but very violent.

The example goal is: Obtain the contents of the folder whose name ends with "_BAD" under the folder where remote sensing data is stored. Because there are many levels under this file (year/month/product type/), there are many target folders and they exist at the last level, so it is very annoying to check manually.

The code is as follows (after summarizing the knowledge points and code):

# -*- coding: utf-8 -*-
"""
遍历某路径下所有文件夹,获得特定文件夹下所有文件
很暴力,真的遍历了所有的文件夹
20180124
@author: 墨大宝
"""
import os
TARGETPATH = r'F:\MODIS_DATA'
records = []
for currentDir, _, includedFiles in os.walk(TARGETPATH):
 if not currentDir.endswith('_BAD'): continue
 else:
  records.append(currentDir) # 将以“_BAD”结尾的文件夹名加入records
  records.extend(includedFiles) # 将该文件夹内的文件名列表扩展到records
# 将records写入.txt
txtFile = open(os.path.join(TARGETPATH, '02_04_BAD.txt'), 'w')
txtFile.write(os.linesep.join(records))
txtFile.close()
# 将排序后的records写入.txt
with open(os.path.join(TARGETPATH, '02_04_BAD_SORTED.txt'), 'w') as txtFile:
 txtFile.write('\n'.join(sorted(records)))

os.walk () Return Directory tree generator. Each time a tuple in the format of (dirpath, dirnames, filenames) is generated, the elements are the current path, the folder list under the current path, and the file name list under the current path.

List's .append(), .extend() and .sort() methods are all modified in place, but the sorted() function is not.

When writing a list to a .txt file, you need to convert the list to str. Directly using the str() function to force the conversion will be ugly. Using newlines to connect each element of the list will look much better.

os.path represents the system newline character, which is "\r\n" under Windows and "\n" in other systems. However, no matter whether you use os.path or "\n" to connect the list elements, and finally open it with Windows Notepad, the line breaks will be the same. However, if you open it with vs code, os.path will break one more line, which means it will look like one line between lines. What exactly is here? What I mentioned is that it may be related to Python's write mechanism, so I won't delve into it for the time being (leaving a hole).

Regarding file reading and writing, most information recommends the with as form, which is indeed more concise.

PS:

The reason why os.walk() is violent is that it really traverses all the files in the given path according to the directory tree. Folders and files, it will be slower if the file size is large and the file names you are looking for are few (in fact, I don’t think it is much slower). If you use os.listdir() to write a recursive function, the execution efficiency may be higher, but os.walk () The logic is simple and easy to write, feel free to do whatever you want, I did it!

Related recommendations:

Python example of deleting a non-empty folder

Python copies files to the specified directory

Python's method of obtaining the program execution file path

The above is the detailed content of Python3. Example of traversing a folder to extract specific file names. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn