PyAutoGUI: a handy Python tool for automating the GUI


PHPz

Apr 11, 2023 pm 10:13 PM

python gui pyautogui


We have previously covered how to use Python to automate page operations in the browser. Whichever method is used, the operations are performed by locating elements on the page.

Today we look at automating operations on the desktop. As with browser automation, desktop automation works by locating a position on the desktop and then performing an operation at that position.

GUI Control Artifact

Our protagonist today is pyautogui, a pure Python GUI automation tool. It lets a program drive a series of mouse and keyboard operations automatically, which makes it useful for goals such as automated testing.

The module is installed with pip as usual:

pip3 install pyautogui

It can be used directly after installation.
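Before running the examples, it can help to confirm that the package is visible to the interpreter in use. The sketch below relies only on the standard library; `is_installed` is a hypothetical helper name, not part of PyAutoGUI.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("pyautogui available:", is_installed("pyautogui"))
```

If this prints `False`, pip has probably installed the package for a different Python interpreter than the one running the script.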

Mouse operation

Mouse movement

The most basic desktop operation is mouse operation. We can control the movement of the mouse:

import pyautogui

# Move the mouse
pyautogui.moveTo(200, 400, duration=2)
pyautogui.moveRel(200, 500, duration=2)

The desktop uses the upper-left corner as the origin of its coordinate system, with x increasing to the right and y increasing downward. All operations are positioned relative to this origin.

The first line of code moves the mouse to the pixel at (200, 400). The second line moves the mouse 200px to the right and 500px down from its current position.

Both lines of code have a common parameter duration. This parameter represents the movement time, that is, the movement operation is completed within the specified time, and the unit is seconds.

Run these two lines of code and observe the changes of the screen mouse. Isn’t it amazing?
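The arithmetic behind a relative move can be sketched without touching the real mouse. The helper below, `relative_target`, is a hypothetical name used for illustration; it adds an offset to a starting point and clamps the result to an assumed screen size of 1920x1080.

```python
def relative_target(current, offset, screen_size):
    """Return the absolute (x, y) a relative move would reach, kept on screen."""
    x = current[0] + offset[0]
    y = current[1] + offset[1]
    width, height = screen_size
    # Valid pixel coordinates run from 0 to width - 1 and from 0 to height - 1.
    x = max(0, min(x, width - 1))
    y = max(0, min(y, height - 1))
    return x, y


# Starting at (200, 400), an offset of (200, 500) lands on (400, 900).
print(relative_target((200, 400), (200, 500), (1920, 1080)))
```

With PyAutoGUI, `current` could come from `pyautogui.position()` and `screen_size` from `pyautogui.size()`.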

We can also get the mouse position:

print(pyautogui.position())

This returns the coordinates of the mouse on the current screen. Running this line prints something like the following:

Point(x=400, y=900)

Mouse click

A mouse usually has two buttons, left and right. More advanced mice also have a button in the middle.

My mouse only has two buttons, and there is no button in the middle. Alas~


pyautogui supports all three buttons:

# Mouse click, left button by default
pyautogui.click(100, 100)
# Click the left button
pyautogui.click(100, 100, button='left')
# Click the right button
pyautogui.click(100, 300, button='right')
# Click the middle button
pyautogui.click(100, 300, button='middle')

If the button parameter is not specified, click uses the left button. The first two parameters are the x and y coordinates of the click.

Run this code and see what happens on your desktop.

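Besides click, there are shortcut functions for a double-click and for single clicks with the right and middle buttons:

# Double-click the left button
pyautogui.doubleClick(10, 10)
# Single click with the right button
pyautogui.rightClick(10, 10)
# Single click with the middle button
pyautogui.middleClick(10, 10)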

The operation function is also very simple. I believe everyone can understand it at a glance. If you can’t understand it at a glance, please take a few more glances!

Readers familiar with front-end development will know that a mouse click consists of a press followed by a release. pyautogui exposes both steps separately:

# Press the mouse button
pyautogui.mouseDown()
# Release the mouse button
pyautogui.mouseUp()

Mouse drag

We can control the mouse to drag to the specified coordinate position and set the operation time:

pyautogui.dragTo(100,300,duration=1)

The effect is similar to the earlier move, except that the button is held down while the pointer travels.

As with moving, we can also drag the mouse by an offset relative to its current position:

pyautogui.dragRel(100,300,duration=4)
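Relative drags compose naturally into shapes. As a minimal sketch, the hypothetical helper `square_offsets` below produces the four offsets that trace a square; each pair could be passed in turn to `pyautogui.dragRel`.

```python
def square_offsets(side):
    """Return the (dx, dy) offsets that trace a square from its top-left corner."""
    return [
        (side, 0),    # right along the top edge
        (0, side),    # down the right edge (y grows downward on screen)
        (-side, 0),   # left along the bottom edge
        (0, -side),   # up the left edge, back to the start
    ]


offsets = square_offsets(100)
print(offsets)
# The offsets sum to zero, so the pointer ends where it started.
assert sum(dx for dx, _ in offsets) == 0 and sum(dy for _, dy in offsets) == 0
```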

Mouse scrolling

In desktop operations, we sometimes need to scroll the mouse wheel to move up or down a page. For this we can use the scroll function:

pyautogui.scroll(30000)

The parameter is an integer, indicating how many units to scroll up or down. This unit may be different depending on different operating systems. If you scroll up, pass in a positive integer, and if you scroll down, pass in a negative integer.
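The sign convention is easy to get backwards, so a tiny wrapper can make call sites more readable. This is only a sketch; `scroll_clicks` is a hypothetical helper, and its result is the value that would be handed to `pyautogui.scroll`.

```python
def scroll_clicks(direction, amount):
    """Translate a direction name into the signed value scroll() expects."""
    if amount < 0:
        raise ValueError("amount must be non-negative")
    if direction == "up":
        return amount       # positive values scroll up
    if direction == "down":
        return -amount      # negative values scroll down
    raise ValueError(f"unknown direction: {direction!r}")


print(scroll_clicks("up", 300))
print(scroll_clicks("down", 300))
```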

Screen processing

Get screenshots

Consider a scenario: you want to find a red dot on the screen. How would you do it? The usual approach is to take the color value of the red dot and compare it with the points on the screen one by one until a match is found.

pyautogui supports this scenario with three calls: take a screenshot, read a pixel, and compare a pixel with a color.

# Take a screenshot and save it to disk
im = pyautogui.screenshot()
im.save('screenshot.png')
# Read the RGB color of the pixel at (100, 500) in the screenshot
rgb = im.getpixel((100, 500))
print(rgb)
# Compare the screen pixel at (500, 500) with a target color.
# Each RGB channel must be in the range 0-255 (the value here is illustrative).
match = pyautogui.pixelMatchesColor(500, 500, (12, 120, 200))
print(match)

The first call takes a screenshot and returns a Pillow image object; the second reads the RGB color at a given coordinate in that screenshot; the third compares the color at a screen coordinate with a target color and returns a Boolean.
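The "compare the points one by one" idea can be sketched in plain Python. The example below uses a small nested list as a stand-in for a screenshot, and `color_matches` and `find_color` are hypothetical helper names; the tolerance check is an assumption about how a near match might reasonably be defined.

```python
def color_matches(actual, expected, tolerance=0):
    """True if every RGB channel of `actual` is within `tolerance` of `expected`."""
    return all(abs(a - e) <= tolerance for a, e in zip(actual[:3], expected[:3]))


def find_color(pixels, target, tolerance=0):
    """Scan a row-major grid of RGB tuples; return the first matching (x, y) or None."""
    for y, row in enumerate(pixels):
        for x, rgb in enumerate(row):
            if color_matches(rgb, target, tolerance):
                return x, y
    return None


# A tiny 3x2 "screen": white everywhere except one nearly red pixel.
white, reddish = (255, 255, 255), (250, 4, 0)
screen = [
    [white, white, white],
    [white, reddish, white],
]
print(find_color(screen, (255, 0, 0)))                # None: not an exact match
print(find_color(screen, (255, 0, 0), tolerance=10))  # (1, 1)
```

`pyautogui.pixelMatchesColor` accepts a similar `tolerance` argument, and with a real Pillow screenshot the pixel values would come from `im.getpixel((x, y))` instead of a list.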

Now let's raise the bar:

Suppose you want to find the Edge browser icon on the screen. What would you do?

The usual approach is to first know what the Edge icon looks like: its color, its shape, and so on. Then we compare regions of the screen against it until we find one that is the same as the target icon.

So our code is as follows:

# Image recognition (first match)
oneLocation = pyautogui.locateOnScreen('1.png')
print(oneLocation)
# Image recognition (all matches)
allLocation = pyautogui.locateAllOnScreen('1.png')
print(list(allLocation))

You can take a screenshot of an application icon on the desktop, save it as an image file (1.png here), and run the code above to locate it. If the match succeeds, the output looks like this:

Box(left=20, top=89, width=33, height=34)
[Box(left=20, top=89, width=33, height=34)]

This is the position of the image on the desktop. If the image cannot be found, the result depends on the PyAutoGUI version: older releases return None, while newer ones raise an ImageNotFoundException.
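A located region is usually a means to an end, most often clicking its middle. The sketch below computes that point from a box; `Box` here is a local stand-in with the same field names as the output above, and `box_center` is a hypothetical helper.

```python
from collections import namedtuple

# Stand-in with the same field names as the Box shown in the output above.
Box = namedtuple("Box", "left top width height")


def box_center(box):
    """Return the integer (x, y) at the middle of a located region, or None."""
    if box is None:
        return None
    return box.left + box.width // 2, box.top + box.height // 2


print(box_center(Box(left=20, top=89, width=33, height=34)))  # (36, 106)
print(box_center(None))                                        # None
```

The resulting coordinates could be passed to `pyautogui.click`. PyAutoGUI also ships `pyautogui.center` and `pyautogui.locateCenterOnScreen` for the same purpose.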

Keyboard input

Keyboard functions

Keyboard input has the following commonly used functions:

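keyDown(): simulates pressing a key down
keyUp(): simulates releasing a key
press(): simulates one complete key press, that is, keyDown followed by keyUp
typewrite(): simulates typing text on the keyboard

For example, how do you normally type an exclamation mark (!) on the keyboard?

You hold down the shift key and then press the 1 key. With pyautogui, that is:

pyautogui.keyDown('shift')
pyautogui.press('1')
pyautogui.keyUp('shift')

Run the code above with the text cursor in an edit box and you will get an exclamation mark!

We can also type text directly:

pyautogui.typewrite('python', 1)

The first parameter is the text to type, and the second is the interval between keystrokes, in seconds.

Run the code above and the six letters of python will appear in your editor in order, one per second.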

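Special keys

Sometimes we need to press special keys such as Enter or the arrow keys. Each of these has a corresponding key name string:

pyautogui.typewrite(['p','y','t','h','o','n','enter'])

Run the code above and the editor will show python followed by a line break.

For the strings that correspond to other special keys, see the official documentation.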
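Writing out every character by hand gets tedious. As a small sketch, the hypothetical helper `key_sequence` below splits ordinary text into single-character keys and appends any named special keys, producing a list in the form `typewrite` accepts.

```python
def key_sequence(text, *special_keys):
    """Split text into single-character keys and append named special keys."""
    return list(text) + list(special_keys)


print(key_sequence("python", "enter"))
# ['p', 'y', 't', 'h', 'o', 'n', 'enter']
```

The helper does not validate key names; a name that PyAutoGUI does not recognize would need to be checked against the official key list.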

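Hotkeys

To copy something, most of the time we use the shortcut ctrl + c. Following what we covered above, we would implement it like this:

pyautogui.keyDown('ctrl')
pyautogui.keyDown('c')
pyautogui.keyUp('c')
pyautogui.keyUp('ctrl')

This is tedious to write, and you have to keep track of the order in which keys are pressed and released.

pyautogui provides a shortcut function for this:

pyautogui.hotkey('ctrl','c')

The effect is the same as the four lines of code above.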
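The ordering rule behind a hotkey is that keys go down in the order given and come up in reverse. The sketch below models that rule as a list of events without pressing any real keys; `hotkey_events` is a hypothetical name used for illustration.

```python
def hotkey_events(*keys):
    """List the press/release events for a key combination, in order."""
    downs = [("down", key) for key in keys]
    ups = [("up", key) for key in reversed(keys)]  # release in reverse order
    return downs + ups


for event in hotkey_events("ctrl", "c"):
    print(event)
# ('down', 'ctrl')
# ('down', 'c')
# ('up', 'c')
# ('up', 'ctrl')
```

The same rule extends to longer combinations such as `hotkey_events("ctrl", "shift", "esc")`, where ctrl is pressed first and released last.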

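Message boxes

When you are simulating a desktop operation and a branch depends on the actual situation, you need somewhere to choose which branch to take.

pyautogui has thought of this: you can pop up a selection box to pause the current operation and choose a branch.

way = pyautogui.confirm('Boss, which route should we take?', buttons=['Country road', 'Waterway', 'Land route'])
print(way)

This works like the confirm dialog on an HTML page. After an option is chosen, we receive the selected option and can branch on it to enter the corresponding operation.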
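Branching on the returned option can be kept tidy with a dictionary of handlers. In the sketch below, `run_branch`, the handler functions, and the button labels are all hypothetical; the choice is passed in as a plain string so the example runs without opening a dialog.

```python
def run_branch(choice, handlers, default=None):
    """Call the handler registered for `choice`, falling back to `default`."""
    handler = handlers.get(choice, default)
    if handler is None:
        raise KeyError(f"no handler for choice: {choice!r}")
    return handler()


# Hypothetical labels; they should match the buttons passed to confirm().
handlers = {
    "Country road": lambda: "taking the country road",
    "Waterway": lambda: "taking the waterway",
    "Land route": lambda: "taking the land route",
}

print(run_branch("Waterway", handlers))
print(run_branch("unexpected", handlers, default=lambda: "staying put"))
```

In a real script, the first argument would be the value returned by `pyautogui.confirm`, and the default handler covers any value that is not one of the listed buttons.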

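Besides the confirm box, there are several other message boxes:

# Alert box
alert = pyautogui.alert(text='Warning! Enemy incoming!', title='Warning')
print(alert)
# Password box
password = pyautogui.password('Please enter the password')
print(password)
# Plain input box (named command to avoid shadowing the built-in input)
command = pyautogui.prompt('Please enter a command:')
print(command)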

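Summary

That concludes this introduction to the basics of pyautogui. The module is powerful, its functions are simple, and it is friendly to Python beginners. With these basics in hand, you can combine them to build some interesting desktop automation. Go and give it a try!

Knowledge is for sharing: forward this article so more people can see it.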


Statement

This article is reproduced from 51CTO.COM. In case of infringement, please contact admin@php.cn to have it removed.
