
Recommended collection | 1 Python library, 4 awesome functions!

Python当打之年
2023-08-10 15:34:52

Today I would like to introduce a Python library, filestools, developed by an expert whom many of you will already know.

The filestools library currently contains four tools. I really like all four of them:
  • Ⅰ Tree-style directory display;
  • Ⅱ Text file difference comparison;
  • Ⅲ Picture watermarking;
  • Ⅳ Converting a curl network request command into requests library code.
First, here is the documentation page for the filestools library, so that you can study it on your own later (open the page to see who the author is):
https://pypi.org/project/filestools/
The library needs to be installed before use. A single command does it:
pip install filestools -i https://pypi.org/simple/ -U
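If you want to confirm that the installation worked, a small check like the one below may help. It is only a sketch using the standard library's importlib.metadata; the helper name installed_version is made up for illustration and is not part of filestools.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version number if the pip command above succeeded, otherwise None.
print(installed_version("filestools"))
```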
1. Tree directory display

This feature recursively displays all files and folders in the specified directory, showing the size of each file and folder at a glance.
Let's use Windows as an example.
Everything is done in the CMD window (the black console window). First you need to know how to switch to a given drive and directory.
# This switches from drive C: to drive D:
C:\Users\Administrator>D:

# Use the cd command to switch to a specified directory on a specified drive
C:\Users\Administrator>cd C:\Users\Administrator\Desktop\python三剑客\加盟店爬虫
Two commands are available here: tree and tree2.
  • If the local Python environment takes priority over the system environment on your machine, run the tree command directly;
  • If the system environment takes priority over the local Python environment, running tree still invokes the system's built-in command and you will not see the library's output. Besides reordering the environment variables to change the priority, you can use the tree2 command, which behaves the same as the library's tree.
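To find out which case applies on your machine, you can ask Python which executable a command name resolves to. This is a rough sketch with the standard library's shutil.which; the helper name resolve_commands is made up for illustration, and a result of None simply means the command was not found on PATH.

```python
import shutil


def resolve_commands(names):
    """Map each command name to the executable found first on PATH (or None)."""
    return {name: shutil.which(name) for name in names}


# If "tree" points into a system directory rather than your Python
# installation, the system command takes priority and tree2 is the one to use.
for name, path in resolve_commands(["tree", "tree2"]).items():
    print(name, "->", path)
```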
Here is a demonstration on my computer:
[Image: running the tree command in the CMD window]
As you can see, when I run the tree command here, I get the same system output as before the library was installed.
This happens because the system environment has a higher priority than the local Python environment.
In that case, we can run the tree2 command directly.
[Image: running the tree2 command in the CMD window]
Of course, not everyone likes running commands in the CMD window. Here we do the same thing directly in Jupyter Notebook:
from treedir.tree import tree_dir
tree_dir(r"C:\Users\Administrator\Desktop\python三剑客\加盟店爬虫", m_level=7, no_calc=False)
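The result is as follows:
[Image: output of tree_dir()]
The tree_dir() function takes the following 3 parameters:
  • path: the directory to display recursively, defaulting to the current directory;
  • m_level: the maximum number of levels to display recursively, defaulting to 7;
  • no_calc: when set, folders beyond the maximum display depth are no longer traversed recursively to calculate their size;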
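To make the idea behind m_level more concrete, here is a minimal sketch of a recursive directory listing with sizes, written with the standard library only. It is not the implementation used by filestools; the function names show_tree and dir_size and the output format are assumptions made for illustration.

```python
import os
import tempfile


def dir_size(path):
    """Total size in bytes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def show_tree(path, max_level=7, level=0, lines=None):
    """Collect an indented listing of path, descending at most max_level levels."""
    if lines is None:
        lines = []
    indent = "    " * level
    for entry in sorted(os.listdir(path)):
        full = os.path.join(path, entry)
        if os.path.isdir(full):
            lines.append(f"{indent}{entry}/ ({dir_size(full)} bytes)")
            # Stop descending once the maximum depth has been reached.
            if level + 1 < max_level:
                show_tree(full, max_level, level + 1, lines)
        else:
            lines.append(f"{indent}{entry} ({os.path.getsize(full)} bytes)")
    return lines


# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as demo:
    os.mkdir(os.path.join(demo, "data"))
    with open(os.path.join(demo, "data", "notes.txt"), "w") as f:
        f.write("hello")
    print("\n".join(show_tree(demo, max_level=7)))
```

Note that this sketch always computes folder sizes in full, so it has no counterpart to no_calc.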

2. Text file difference comparison

This feature compares two files and writes the differences to an HTML page. For example, suppose we wrote some code and changed it later; with a lot of code it is hard to remember what was changed, and this feature makes the comparison easy.
Consider this example: I once had a file a.txt, and some time later I modified its contents, ending up with b.txt.
Goal: find out where the changes were made (especially useful when the files are long).
from filediff.diff import file_diff_compare
file_diff_compare("a.txt", "b.txt")
This generates an HTML file in the current working directory.
Double-click the file to open it and look at its contents:
[Image: the diff page shown in the browser]
In the page, yellow marks content that was changed, green marks content that was added, and red marks content that was deleted.
The file_diff_compare() function takes the following 7 parameters:
from filediff.diff import file_diff_compare
file_diff_compare(file1, file2, diff_out='diff_result.html', max_width=70, numlines=0, show_all=False, no_browser=False)
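These 7 parameters are described below:
  • file1 / file2: the two files to compare; both must be text files;
  • diff_out: the file name under which the diff result is saved (HTML format), default diff_result.html;
  • max_width: lines longer than this many characters are wrapped automatically, default 70;
  • numlines: how many lines of context to show before and after each differing line, default 0;
  • show_all: setting this parameter displays all of the original data, in which case the -n (numlines) parameter has no effect; by default not everything is displayed;
  • no_browser: setting this parameter prevents the browser from opening automatically after the result is generated; when it is False (the default), the browser opens automatically;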
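For readers curious how such an HTML comparison can be produced, Python's standard library has difflib.HtmlDiff, which can generate a similar side-by-side page. The sketch below is not the filediff implementation; html_diff is a hypothetical helper whose parameter names merely mirror the ones listed above.

```python
import difflib


def html_diff(old_text, new_text, max_width=70, numlines=0, show_all=False):
    """Return an HTML page comparing two texts side by side."""
    differ = difflib.HtmlDiff(wrapcolumn=max_width)
    return differ.make_file(
        old_text.splitlines(),
        new_text.splitlines(),
        fromdesc="a.txt",
        todesc="b.txt",
        context=not show_all,  # show_all=True keeps every line
        numlines=numlines,     # context lines around each change
    )


page = html_diff("alpha\nbeta\n", "alpha\ngamma\n")
print(len(page), "characters of HTML generated")
```

Writing the returned string to a file and opening it in a browser shows the comparison.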

3. Add watermark to pictures

This is probably the best picture-watermarking code I have ever seen. Call the add_mark() function to add a watermark to a picture.
from watermarker.marker import add_mark

# Note: several parameters have default values, which you are free to change
add_mark(file, mark, out='output', color='#8B8B1B', size=30, opacity=0.15, space=75, angle=30)
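The add_mark() function takes the following 8 parameters:
  • file: the picture to which the watermark is added;
  • mark: the text to use as the watermark;
  • out: the location where the watermarked picture is saved;
  • color: the color of the watermark text, default #8B8B1B;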
  • size: the font size of the watermark text (the signature above gives a default of 30);
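  • opacity: the opacity of the watermark text, default 0.15;
  • space: the spacing between watermark texts, default 75 spaces;
  • angle: the rotation angle of the watermark text, default 30 degrees;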
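To make the color and opacity parameters a little more concrete: image libraries commonly expect a semi-transparent color as an RGBA tuple with channels from 0 to 255. The sketch below shows one way the values #8B8B1B and 0.15 could be combined into such a tuple. The helper hex_to_rgba is hypothetical and not part of watermarker, and it is only an assumption that the library does something along these lines.

```python
def hex_to_rgba(color, opacity):
    """Convert '#RRGGBB' plus an opacity in [0, 1] into an (R, G, B, A) tuple."""
    if not 0 <= opacity <= 1:
        raise ValueError("opacity must be between 0 and 1")
    digits = color.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected a color like '#8B8B1B'")
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    # The alpha channel scales the opacity to the 0-255 range.
    return red, green, blue, round(255 * opacity)


print(hex_to_rgba("#8B8B1B", 0.15))  # (139, 139, 27, 38)
```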
For example, we run the following:
from watermarker.marker import add_mark
add_mark(file=r"C:\Users\Administrator\Desktop\大学.jpg", out=r"C:\Users\Administrator\Desktop\python三剑客\加盟店爬虫", mark="黄同学", opacity=0.2, angle=30, space=30)
Here we add the watermark 黄同学 to 大学.jpg and save the result in the 加盟店爬虫 folder, with an opacity of 0.2, a rotation angle of 30° and a spacing of 30 between watermark texts.
[Image: the original picture]
[Image: the final result with the watermark]
4. Converting curl network requests into requests library code

When writing crawlers, we often need a lot of request parameter information. Copying each item by hand would be tedious.
This feature solves the problem: it converts a cURL command into Python code, so all we have to do is copy it.
The general steps are as follows:
  • Ⅰ First, in the Google Chrome browser, copy the captured network request as cURL (bash);
  • Ⅱ Then convert it into Python code with the curl2py command;
Take the Python positions on the Internship Network site as an example.
http://www.shixi.com/search/index?key=python
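Following the steps in the picture below, we copy the cURL command of a single request.
As you can see, there are many different request URLs here, and what follows each -H is one of the parameters of that request. Copy the cURL command of whichever link you need to request.
[Image: copying a request as cURL (bash) in the browser]
After copying the cURL command, you can paste it somewhere to see what it contains.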
curl 'http://www.shixi.com/search/index?key=python' \
  -H 'Connection: keep-alive' \
  -H 'Cache-Control: max-age=0' \
  -H 'Upgrade-Insecure-Requests: 1' \
  -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
  -H 'Referer: http://www.shixi.com/' \
  -H 'Accept-Language: zh-CN,zh;q=0.9' \
  -H 'Cookie: UM_distinctid=17a50a2c8ea537-046c01e944e72f-6373267-100200-17a50a2c8eb4ff; PHPSESSID=rpprvtdrcrvt54fkr7msgcde17; CNZZDATA1261027457=1711789791-1624850487-https%253A%252F%252Fwww.baidu.com%252F%7C1627741311; Hm_lvt_536f42de0bcce9241264ac5d50172db7=1627741268; Hm_lpvt_536f42de0bcce9241264ac5d50172db7=1627741334' \
  --compressed \
  --insecure
With the cURL command above in hand, we can use curl2py to convert it into Python code.
from curl2py.curlParseTool import curlCmdGenPyScript

curl_cmd = """curl 'http://www.shixi.com/search/index?key=python' \
  -H 'Connection: keep-alive' \
  -H 'Cache-Control: max-age=0' \
  -H 'Upgrade-Insecure-Requests: 1' \
  -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
  -H 'Referer: http://www.shixi.com/' \
  -H 'Accept-Language: zh-CN,zh;q=0.9' \
  -H 'Cookie: UM_distinctid=17a50a2c8ea537-046c01e944e72f-6373267-100200-17a50a2c8eb4ff; PHPSESSID=rpprvtdrcrvt54fkr7msgcde17; CNZZDATA1261027457=1711789791-1624850487-https%253A%252F%252Fwww.baidu.com%252F%7C1627741311; Hm_lvt_536f42de0bcce9241264ac5d50172db7=1627741268; Hm_lpvt_536f42de0bcce9241264ac5d50172db7=1627741334' \
  --compressed \
  --insecure"""

output = curlCmdGenPyScript(curl_cmd)
print(output)
The final result is as follows:
[Image: the generated Python code]
As you can see, the parameters have been converted into well-formed Python code that we can use directly. Convenient, isn't it?
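To get a feel for what such a conversion involves, here is a much simplified sketch built on the standard library's shlex module. It is not how curl2py works internally; parse_curl and to_requests_code are hypothetical helpers that only handle the URL and -H headers and skip every other curl option.

```python
import shlex


def parse_curl(command):
    """Split a curl command into its URL and a dict of -H headers."""
    # Join lines that end with a backslash before tokenizing.
    tokens = shlex.split(command.replace("\\\n", " "))
    url, headers = None, {}
    it = iter(tokens[1:])  # skip the leading "curl"
    for tok in it:
        if tok in ("-H", "--header"):
            name, _, value = next(it, "").partition(":")
            headers[name.strip()] = value.strip()
        elif not tok.startswith("-") and url is None:
            url = tok
    return url, headers


def to_requests_code(url, headers):
    """Render the parsed request as source code that uses requests.get()."""
    lines = ["import requests", "", "headers = {"]
    for name, value in headers.items():
        lines.append(f"    {name!r}: {value!r},")
    lines += ["}", f"response = requests.get({url!r}, headers=headers)"]
    return "\n".join(lines)


sample = (
    "curl 'http://www.shixi.com/search/index?key=python' \\\n"
    "  -H 'Connection: keep-alive' \\\n"
    "  -H 'Referer: http://www.shixi.com/' \\\n"
    "  --compressed"
)
print(to_requests_code(*parse_curl(sample)))
```

A real converter also has to deal with request bodies, cookies, HTTP methods and many more options, which is exactly the work the library saves us.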


Statement:
This article is reproduced from Python当打之年. In case of any infringement, please contact admin@php.cn to have it removed.