How to call Python scripts in Excel to automate data processing!
Excel is arguably the king of data processing. Although it was released more than 30 years ago, it still has about 750 million loyal users around the world, while Python, for all its popularity, has only about 7 million developers.
By that measure, Excel is the most popular programming language in the world. Yes, you read that right. Since Microsoft introduced the LAMBDA function for defining custom functions, Excel formulas can implement arbitrary algorithms, which makes Excel Turing complete, just like JavaScript, Java, and Python.
Although Excel is indispensable for small-scale data, it struggles when facing big data.
An Excel worksheet holds at most 1,048,576 rows and 16,384 columns, and a table with hundreds of thousands of rows can already cause some lag. Of course, you can use VBA for data processing, or you can use Python to operate Excel.
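As a rough illustration of those limits, the sketch below checks whether a table fits on one worksheet before you try to write it. The function names and the one-row header default are assumptions made for this example; only the two limit values come from the text above.

```python
# Worksheet limits quoted in the text above.
EXCEL_MAX_ROWS = 1_048_576
EXCEL_MAX_COLS = 16_384


def fits_in_worksheet(n_rows: int, n_cols: int, header_rows: int = 1) -> bool:
    """Return True if n_rows x n_cols of data plus the header rows fit on one sheet."""
    return (n_rows + header_rows) <= EXCEL_MAX_ROWS and n_cols <= EXCEL_MAX_COLS


def sheets_needed(n_rows: int, header_rows: int = 1) -> int:
    """Number of sheets needed when every sheet repeats the header rows."""
    rows_per_sheet = EXCEL_MAX_ROWS - header_rows
    # Ceiling division, with at least one sheet even for an empty table.
    return max(1, -(-n_rows // rows_per_sheet))
```

For instance, `sheets_needed(2_000_000)` reports that two million data rows would have to be split across two sheets, which is usually a sign that the data belongs in Python or a database rather than in a workbook.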
That is the topic of this article: xlwings, a third-party Python library that acts as a bridge between Python and Excel and lets you call Python scripts from VBA to carry out complex data analysis.
For example, you can automatically import data or randomly match text.
VBA, as the built-in macro language of Excel, can do almost anything, including automation, data processing, analysis, and modeling. So why integrate Python with Excel VBA?
A key reason is that Python enthusiasts are probably already familiar with data science libraries such as pandas and numpy. Being able to use them for Excel data analysis makes the combination even more powerful.
There are many libraries in Python that can operate Excel, such as xlsxwriter, openpyxl, pandas, xlwings, etc.
Compared with the other libraries, however, xlwings offers arguably the best all-round capabilities, and it can call Python code from Excel macros.
The picture comes from Early Python
I won’t explain much about the introductory use of xlwings here.
Installing xlwings is simple. Run pip on the command line:
pip install xlwings
After installing xlwings, you need to install the xlwings Excel add-in. Close all Excel windows before installing; otherwise an error is reported.
Enter the following command on the command line:
xlwings addin install
The following prompt indicates that the integrated plug-in is installed successfully.
After xlwings and the add-in are installed, open Excel and you will find an xlwings tab on the ribbon, which means the add-in was installed successfully. It acts as the bridge that lets VBA call Python scripts.
In addition, if the ribbon does not yet show the "Developer" tab, you need to add it, because we will be using macros.
The steps are simple:
1. On the "File" tab, go to "Options > Customize Ribbon".
2. Under "Customize the Ribbon" and "Main Tabs", select the "Developer" check box.
Once the Developer tab is displayed, you can start using macros.
If you still don’t know what a macro is, you can temporarily understand it as a tool for automation and batch processing.
At this point the preparation is complete, and it is time for hands-on practice!
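Calling a Python script from Excel normally requires writing a VBA program, which is a hassle for anyone who does not know VBA.
xlwings solves this problem: you can call Python scripts directly from Excel and write the results back to the worksheet without writing any VBA code yourself.
xlwings helps you create two files, a .xlsm file and a .py file. You write the Python code in the .py file and click to run it from the .xlsm file, and that completes the interaction between Excel and Python.
How do you create these two files? It is very simple. Just enter the following command on the command line: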
xlwings quickstart ProjectName
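Here ProjectName can be customized; it is the name given to the created files.
If you want the files created in a particular folder, navigate to that directory on the command line first.
Once created, the two files mentioned earlier, the .xlsm file and the .py file, appear in that folder.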
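To confirm that quickstart produced both files, a small check like the following may help. It is only a sketch: the helper names are invented for this example, and it assumes that `xlwings quickstart ProjectName` places both files in a new folder named after the project, which is worth verifying against your installed version.

```python
from pathlib import Path


def quickstart_files(project_name: str, base_dir: str = ".") -> dict:
    """Paths that `xlwings quickstart <project_name>` is assumed to create."""
    # Assumption: quickstart creates a folder with the same name as the project.
    project_dir = Path(base_dir) / project_name
    return {
        "workbook": project_dir / f"{project_name}.xlsm",
        "script": project_dir / f"{project_name}.py",
    }


def missing_files(project_name: str, base_dir: str = ".") -> list:
    """Return the labels of expected files that do not exist yet."""
    expected = quickstart_files(project_name, base_dir)
    return [label for label, path in expected.items() if not path.exists()]
```

An empty list from `missing_files("ProjectName")` means both the workbook and the script are where the macro expects them.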
Open the .xlsm file. It is a macro-enabled Excel workbook in which xlwings has already written the VBA code that calls Python.
Press Alt + F11 to open the VBA editor.
Sub SampleCall()
    mymodule = Left(ThisWorkbook.Name, (InStrRev(ThisWorkbook.Name, ".", -1, vbTextCompare) - 1))
    RunPython "import " & mymodule & ";" & mymodule & ".main()"
End Sub
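This code mainly performs two steps:
1. It looks for a .py file with the same name in the same location as the .xlsm file.
2. It calls the main() function in that .py script.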
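To make the two steps concrete, here is a Python sketch of what the VBA expression computes. The helper names are invented for this illustration; the macro itself stays in VBA.

```python
from pathlib import Path


def module_name(workbook_name: str) -> str:
    """Mirror the VBA Left/InStrRev expression: keep everything before the last dot."""
    return Path(workbook_name).stem


def run_python_command(workbook_name: str) -> str:
    """Build the string that SampleCall passes to RunPython."""
    mod = module_name(workbook_name)
    return f"import {mod};{mod}.main()"


if __name__ == "__main__":
    print(run_python_command("PythonExcelTest.xlsm"))
```

This is why the .py file must keep the same base name as the workbook: if you rename one, the import built by the macro no longer matches the other.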
Let's start with a simple example that automatically enters ['a','b','c','d','e'] into the Excel sheet.
Step 1: change the code in the .py file to the following.
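import xlwings as xw
import pandas as pd

def main():
    # The workbook whose macro called this script
    wb = xw.Book.caller()
    values = ['a','b','c','d','e']
    # A flat list is written across one row, starting at A1
    wb.sheets[0].range('A1').value = values

@xw.func
def hello(name):
    return f"Hello {name}!"

if __name__ == "__main__":
    # Lets you run the script directly from Python for debugging
    xw.Book("PythonExcelTest.xlsm").set_mock_caller()
    main()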
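Then create a button on Sheet1 of the .xlsm file and assign the default macro (SampleCall) to it, turning it into a trigger button.
Once the trigger button is set up, click it, and ['a','b','c','d','e'] appears in the first row.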
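The values land in a row because xlwings writes a flat list horizontally. If you would rather fill a column, one option is to pass a nested list with one item per row. The helpers below are a hypothetical sketch; `sheet` is expected to be an xlwings sheet such as `wb.sheets[0]`.

```python
def as_column(values):
    """Turn a flat list into one-item rows, e.g. ['a', 'b'] -> [['a'], ['b']]."""
    return [[v] for v in values]


def write_column(sheet, top_cell, values):
    """Write values downward from top_cell on an xlwings sheet."""
    # xlwings treats each inner list as one worksheet row.
    sheet.range(top_cell).value = as_column(values)
```

Inside main(), calling `write_column(wb.sheets[0], 'A1', values)` would place the five letters in A1:A5 instead of A1:E1.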
In the same way, we can automatically import the iris dataset into Excel. Only the code in the .py file needs to change, as follows:
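import xlwings as xw
import pandas as pd

def main():
    # The workbook whose macro called this script
    wb = xw.Book.caller()
    # Load the iris dataset and add a derived column
    df = pd.read_csv(r"E:\test\PythonExcelTest\iris.csv")
    df['total_length'] = df['sepal_length'] + df['petal_length']
    # Write the whole DataFrame to the sheet, starting at A1
    wb.sheets[0].range('A1').value = df

@xw.func
def hello(name):
    return f"Hello {name}!"

if __name__ == "__main__":
    # Lets you run the script directly from Python for debugging
    xw.Book("PythonExcelTest.xlsm").set_mock_caller()
    main()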
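The script above reads the CSV from a hard-coded absolute path, so it only works on a machine that has the same folder. A possible alternative, assuming you keep iris.csv next to the script, is to build the path from the script's own location. The helper name is invented for this sketch.

```python
from pathlib import Path


def data_file(script_path: str, filename: str = "iris.csv") -> Path:
    """Return the path of `filename` in the same folder as the given script."""
    return Path(script_path).parent / filename
```

Inside main(), the read would then become `pd.read_csv(data_file(__file__))`, and the project folder can be moved or shared without editing the code.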
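That is the whole process of calling a Python script from Excel. You can try other interesting uses, such as implementing machine learning algorithms, text cleaning, data matching, automated reports, and so on.
Excel + Python: the possibilities are practically limitless.
Reference: a Medium article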