How to work with Excel files in Python using openpyxl

PHPz
2023-05-12 10:01:05

Excel xlsx

xlsx is the file extension for the Open XML spreadsheet file format used by Microsoft Excel. xlsm files support macros. xls is a proprietary binary format, while xlsx is based on the Office Open XML format.

$ sudo pip3 install openpyxl

We use the pip3 tool to install openpyxl.

Openpyxl create new file

In the first example, we use openpyxl to create a new xlsx file.

write_xlsx.py

#!/usr/bin/env python
 
from openpyxl import Workbook
import time
 
book = Workbook()
sheet = book.active
 
sheet['A1'] = 56
sheet['A2'] = 43
 
now = time.strftime("%x")
sheet['A3'] = now
 
book.save("sample.xlsx")

In the example, we create a new xlsx file. We write data to three cells.

from openpyxl import Workbook

From the openpyxl module, we import the Workbook class. The workbook is a container for all other parts of the document.

book = Workbook()

We create a new workbook. A workbook is always created with at least one worksheet.

sheet = book.active

We get a reference to the active worksheet.

sheet['A1'] = 56
sheet['A2'] = 43

We write numerical data into cells A1 and A2.

now = time.strftime("%x")
sheet['A3'] = now

We write the current date into cell A3.

book.save("sample.xlsx")

We use the save() method to write the content to the sample.xlsx file.
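The example stores the date as a locale-formatted string. As a minimal sketch, assuming a real date value is preferred so that a spreadsheet application can sort and reformat it, a datetime.date object can be assigned directly. This variant is not part of the original example.

```python
import datetime

from openpyxl import Workbook

book = Workbook()
sheet = book.active

# Assign a date object instead of a preformatted string (an assumption
# about what is wanted; the original example writes a string).
sheet['A3'] = datetime.date.today()

# openpyxl marks the cell as a date and assigns a date number format.
print(sheet['A3'].is_date)
print(sheet['A3'].number_format)
```

Saving works the same way as before, with book.save().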

Openpyxl Writing to Cells

There are two basic ways to write to a cell: using a key of the worksheet (such as A1 or D3), or using the cell() method with row and column notation.

write2cell.py

#!/usr/bin/env python
 
from openpyxl import Workbook
 
book = Workbook()
sheet = book.active
 
sheet['A1'] = 1
sheet.cell(row=2, column=2).value = 2
 
book.save('write2cell.xlsx')

In the example, we write two values to two cells.

sheet['A1'] = 1

Here we assign the value to the A1 cell.

sheet.cell(row=2, column=2).value = 2

In this line, we write to cell B2 using row and column notation.
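As a small sketch of how the two notations relate, the cell() method also accepts an optional value argument, and the cell it returns is the same one reached through the key notation.

```python
from openpyxl import Workbook

book = Workbook()
sheet = book.active

# cell() takes an optional value argument, so the write fits in one call.
sheet.cell(row=2, column=2, value=2)

# Both notations address the same cell.
print(sheet['B2'].value)
print(sheet.cell(row=2, column=2).coordinate)
```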

Openpyxl append values

With the append() method, we can append a group of values at the bottom of the current sheet.

appending_values.py

#!/usr/bin/env python
 
from openpyxl import Workbook
 
book = Workbook()
sheet = book.active
 
rows = (
    (88, 46, 57),
    (89, 38, 12),
    (23, 59, 78),
    (56, 21, 98),
    (24, 18, 43),
    (34, 15, 67)
)
 
for row in rows:
    sheet.append(row)
 
book.save('appending.xlsx')

In the example, we append three columns of data to the current worksheet.

rows = (
    (88, 46, 57),
    (89, 38, 12),
    (23, 59, 78),
    (56, 21, 98),
    (24, 18, 43),
    (34, 15, 67)
)

The data is stored in a tuple of tuples.

for row in rows:
    sheet.append(row)

We go through the container row by row and insert rows of data using the append() method.
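Besides sequences, append() also accepts a dictionary whose keys are column letters or numbers. The sketch below, with made-up values, shows how this leaves the unmentioned columns of the new row empty.

```python
from openpyxl import Workbook

book = Workbook()
sheet = book.active

sheet.append([88, 46, 57])          # fills A1, B1 and C1
sheet.append({'A': 89, 'C': 12})    # fills A2 and C2, leaves B2 empty

print(sheet.max_row)
print(sheet['A2'].value, sheet['B2'].value, sheet['C2'].value)
```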

OpenPyXL Reading Cells

In the example below, we read previously written data from the sample.xlsx file.

read_cells.py

#!/usr/bin/env python
 
import openpyxl
 
book = openpyxl.load_workbook('sample.xlsx')
 
sheet = book.active
 
a1 = sheet['A1']
a2 = sheet['A2']
a3 = sheet.cell(row=3, column=1)
 
print(a1.value)
print(a2.value) 
print(a3.value)

This example loads an existing xlsx file and reads three cells.

book = openpyxl.load_workbook('sample.xlsx')

Use the load_workbook() method to open the file.

a1 = sheet['A1']
a2 = sheet['A2']
a3 = sheet.cell(row=3, column=1)

We read the contents of cells A1, A2, and A3. In the third line, we use the cell() method to get cell A3.

$ ./read_cells.py 
56
43
10/26/16

This is the output of the example.

OpenPyXL reading multiple cells

We have the following data table:

We use the range operator to read the data.

read_cells2.py

#!/usr/bin/env python
 
import openpyxl
 
book = openpyxl.load_workbook('items.xlsx')
 
sheet = book.active
 
cells = sheet['A1': 'B6']
 
for c1, c2 in cells:
    print("{0:8} {1:8}".format(c1.value, c2.value))

In the example, we read data from two columns using range operations.

cells = sheet['A1': 'B6']

In this line, we read data from cells A1 to B6.

for c1, c2 in cells:
    print("{0:8} {1:8}".format(c1.value, c2.value))

The format() method is used to output the data neatly on the console.

$ ./read_cells2.py 
Items    Quantity
coins          23
chairs          3
pencils         5
bottles         8
books          30
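The same range operator can feed ordinary Python data structures. The sketch below rebuilds a stand-in for items.xlsx in memory from the output above (the file itself is not reproduced here) and collects the rows into a dictionary, starting the range at row 2 to skip the header.

```python
from openpyxl import Workbook

book = Workbook()
sheet = book.active

# Stand-in for items.xlsx, rebuilt from the output shown above.
table = (("Items", "Quantity"), ("coins", 23), ("chairs", 3),
         ("pencils", 5), ("bottles", 8), ("books", 30))
for row in table:
    sheet.append(row)

# Start the range at row 2 to skip the header, then pair names with quantities.
quantities = {name.value: qty.value for name, qty in sheet['A2':'B6']}
print(quantities)
```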

Openpyxl Iterate by rows

The iter_rows() method returns cells from the worksheet as rows.

iterating_by_rows.py

#!/usr/bin/env python
 
from openpyxl import Workbook
 
book = Workbook()
sheet = book.active
 
rows = (
    (88, 46, 57),
    (89, 38, 12),
    (23, 59, 78),
    (56, 21, 98),
    (24, 18, 43),
    (34, 15, 67)
)
 
for row in rows:
    sheet.append(row)
 
for row in sheet.iter_rows(min_row=1, min_col=1, max_row=6, max_col=3):
    for cell in row:
        print(cell.value, end=" ")
    print()    
 
book.save('iterbyrows.xlsx')

This example iterates through the data row by row.

for row in sheet.iter_rows(min_row=1, min_col=1, max_row=6, max_col=3):

We provide the boundaries for iteration.

$ ./iterating_by_rows.py 
88 46 57 
89 38 12 
23 59 78 
56 21 98 
24 18 43 
34 15 67
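When only the values are needed, iter_rows() can be asked for plain tuples instead of cell objects through its values_only argument. A minimal sketch on a shortened copy of the data:

```python
from openpyxl import Workbook

book = Workbook()
sheet = book.active

for row in ((88, 46, 57), (89, 38, 12), (23, 59, 78)):
    sheet.append(row)

# values_only=True yields tuples of values rather than Cell objects.
first_two = list(sheet.iter_rows(min_row=1, max_row=2, values_only=True))
print(first_two)
```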

Openpyxl Iterate by columns

The iter_cols() method returns cells from the worksheet as columns.

iterating_by_columns.py

#!/usr/bin/env python
 
from openpyxl import Workbook
 
book = Workbook()
sheet = book.active
 
rows = (
    (88, 46, 57),
    (89, 38, 12),
    (23, 59, 78),
    (56, 21, 98),
    (24, 18, 43),
    (34, 15, 67)
)
 
for row in rows:
    sheet.append(row)
 
for col in sheet.iter_cols(min_row=1, min_col=1, max_row=6, max_col=3):
    for cell in col:
        print(cell.value, end=" ")
    print()    
 
book.save('iterbycols.xlsx')

This example iterates through the data column by column.

$ ./iterating_by_columns.py 
88 89 23 56 24 34 
46 38 59 21 18 15 
57 12 78 98 43 67
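Column-wise iteration is convenient for per-column aggregates. As a sketch using the same data, the column totals can be computed by combining iter_cols() with values_only=True:

```python
from openpyxl import Workbook

book = Workbook()
sheet = book.active

rows = (
    (88, 46, 57),
    (89, 38, 12),
    (23, 59, 78),
    (56, 21, 98),
    (24, 18, 43),
    (34, 15, 67)
)

for row in rows:
    sheet.append(row)

# Each column arrives as a tuple of values, so sum() applies directly.
totals = [sum(col) for col in sheet.iter_cols(values_only=True)]
print(totals)
```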

Statistics

For the next example, we need to create an xlsx file containing numbers. For example, we created 25 rows of numbers in 10 columns using the RANDBETWEEN() function.

mystats.py

#!/usr/bin/env python
 
import openpyxl
import statistics as stats
 
book = openpyxl.load_workbook('numbers.xlsx', data_only=True)
 
sheet = book.active
 
rows = sheet.rows
 
values = []
 
for row in rows:
    for cell in row:
        values.append(cell.value)
 
print("Number of values: {0}".format(len(values)))
print("Sum of values: {0}".format(sum(values)))
print("Minimum value: {0}".format(min(values)))
print("Maximum value: {0}".format(max(values)))
print("Mean: {0}".format(stats.mean(values)))
print("Median: {0}".format(stats.median(values)))
print("Standard deviation: {0}".format(stats.stdev(values)))
print("Variance: {0}".format(stats.variance(values)))

In the example, we read all the values from the sheet and calculate some basic statistics.

import statistics as stats

Import the statistics module to provide some statistical functions such as median and variance.

book = openpyxl.load_workbook('numbers.xlsx', data_only=True)

Using the data_only option we get the value from the cell instead of the formula.

rows = sheet.rows

We get all rows of cells that are not empty.

for row in rows:
    for cell in row:
        values.append(cell.value)

In two for loops, we form a list of integer values from the cells.

print("Number of values: {0}".format(len(values)))
print("Sum of values: {0}".format(sum(values)))
print("Minimum value: {0}".format(min(values)))
print("Maximum value: {0}".format(max(values)))
print("Mean: {0}".format(stats.mean(values)))
print("Median: {0}".format(stats.median(values)))
print("Standard deviation: {0}".format(stats.stdev(values)))
print("Variance: {0}".format(stats.variance(values)))

We calculate and print mathematical statistics about values. Some functionality is built-in, others are imported via the statistics module.

$ ./mystats.py 
Number of values: 312
Sum of values: 15877
Minimum value: 0
Maximum value: 100
Mean: 50.88782051282051
Median: 54.0
Standard deviation: 28.459203819700967
Variance: 809.9262820512821
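The statistics functions fail on empty or text cells, which real sheets often contain. The sketch below, built on a hypothetical sheet with a gap and a text cell, keeps only numeric values before computing anything. The filtering rule is an assumption about what should count as data.

```python
import statistics as stats

from openpyxl import Workbook

book = Workbook()
sheet = book.active

# Hypothetical sheet with a gap and a text cell mixed in with the numbers.
sheet.append([1, 2, None])
sheet.append(["n/a", 3, 4])

values = [
    cell.value
    for row in sheet.rows
    for cell in row
    # bool is a subclass of int, so it is excluded explicitly
    if isinstance(cell.value, (int, float)) and not isinstance(cell.value, bool)
]

print("Number of values: {0}".format(len(values)))
print("Mean: {0}".format(stats.mean(values)))
```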

Openpyxl Filter & Sort Data

Worksheets have the auto_filter attribute, which allows setting filtering and sorting conditions.

Please note that Openpyxl sets the conditions, but we must apply them in the spreadsheet application.

filter_sort.py

#!/usr/bin/env python
 
from openpyxl import Workbook
 
wb = Workbook()
sheet = wb.active
 
data = [
    ['Item', 'Colour'],
    ['pen', 'brown'],
    ['book', 'black'],
    ['plate', 'white'],
    ['chair', 'brown'],
    ['coin', 'gold'],
    ['bed', 'brown'],
    ['notebook', 'white'],
]
 
for r in data:
    sheet.append(r)
 
sheet.auto_filter.ref = 'A1:B8'
sheet.auto_filter.add_filter_column(1, ['brown', 'white'])
sheet.auto_filter.add_sort_condition('B2:B8')
 
wb.save('filtered.xlsx')

In the example, we create a worksheet that contains items and their colors. We set up a filter and a sorting condition.
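Because openpyxl only records the conditions, the saved rows are neither filtered nor reordered. If the file itself should contain filtered, sorted rows, one option is to do that work in plain Python before writing. The following is a sketch of that alternative, not a feature of auto_filter.

```python
from openpyxl import Workbook

data = [
    ['pen', 'brown'],
    ['book', 'black'],
    ['plate', 'white'],
    ['chair', 'brown'],
    ['coin', 'gold'],
    ['bed', 'brown'],
    ['notebook', 'white'],
]

wanted = {'brown', 'white'}

# Keep the wanted colours, then sort by colour (sorted() is stable).
rows = sorted((r for r in data if r[1] in wanted), key=lambda r: r[1])

wb = Workbook()
sheet = wb.active
sheet.append(['Item', 'Colour'])

for r in rows:
    sheet.append(r)

print([r[0] for r in rows])
```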

Openpyxl Dimensions

To get those cells that actually contain data, we can use dimensions.

dimensions.py

#!/usr/bin/env python
 
from openpyxl import Workbook
 
book = Workbook()
sheet = book.active
 
sheet['A3'] = 39
sheet['B3'] = 19
 
rows = [
    (88, 46),
    (89, 38),
    (23, 59),
    (56, 21),
    (24, 18),
    (34, 15)
]
 
for row in rows:
    sheet.append(row)
 
print(sheet.dimensions)
print("Minimum row: {0}".format(sheet.min_row))
print("Maximum row: {0}".format(sheet.max_row))
print("Minimum column: {0}".format(sheet.min_column))
print("Maximum column: {0}".format(sheet.max_column))
 
for c1, c2 in sheet[sheet.dimensions]:
    print(c1.value, c2.value)
 
book.save('dimensions.xlsx')

This example calculates the dimensions of two columns of data.

sheet['A3'] = 39
sheet['B3'] = 19
 
rows = [
    (88, 46),
    (89, 38),
    (23, 59),
    (56, 21),
    (24, 18),
    (34, 15)
]
 
for row in rows:
    sheet.append(row)

We add data to the worksheet. Note that we start adding from the third row.

print(sheet.dimensions)

The dimensions property returns the top-left and bottom-right cells of the area of non-empty cells.

print("Minimum row: {0}".format(sheet.min_row))
print("Maximum row: {0}".format(sheet.max_row))

Using the min_row and max_row properties we can get the minimum and maximum row containing data.

print("Minimum column: {0}".format(sheet.min_column))
print("Maximum column: {0}".format(sheet.max_column))

With the min_column and max_column properties, we get the minimum and maximum column containing data.

for c1, c2 in sheet[sheet.dimensions]:
    print(c1.value, c2.value)

We iterate through the data and print it to the console.

$ ./dimensions.py 
A3:B9
Minimum row: 3
Maximum row: 9
Minimum column: 1
Maximum column: 2
39 19
88 46
89 38
23 59
56 21
24 18
34 15
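The dimensions string can also be turned back into numeric bounds. As a sketch on a shortened copy of the data, the range_boundaries() helper from openpyxl.utils splits it into column and row limits, which should agree with the min_row, max_row, min_column, and max_column properties.

```python
from openpyxl import Workbook
from openpyxl.utils import range_boundaries

book = Workbook()
sheet = book.active

sheet['A3'] = 39
sheet['B3'] = 19
sheet.append((88, 46))

# range_boundaries returns (min_col, min_row, max_col, max_row).
min_col, min_row, max_col, max_row = range_boundaries(sheet.dimensions)

print(sheet.dimensions)
print(min_col, min_row, max_col, max_row)
```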

Sheets

Each workbook can have multiple sheets.

Figure: Sheets

Let's have a workbook with these three sheets.

sheets.py

#!/usr/bin/env python
 
import openpyxl
 
book = openpyxl.load_workbook('sheets.xlsx')
 
print(book.get_sheet_names())
 
active_sheet = book.active
print(type(active_sheet))
 
sheet = book.get_sheet_by_name("March")
print(sheet.title)

The program works with Excel sheets.

print(book.get_sheet_names())

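The get_sheet_names() method returns the names of the sheets available in a workbook. Recent openpyxl releases deprecate it in favor of the sheetnames attribute.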

active_sheet = book.active
print(type(active_sheet))

We get the active sheet and print its type to the terminal.

sheet = book.get_sheet_by_name("March")

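We get a reference to a sheet with the get_sheet_by_name() method. Recent openpyxl releases deprecate it in favor of indexing the workbook by name, as in book["March"].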

print(sheet.title)

The title of the retrieved sheet is printed to the terminal.

$ ./sheets.py 
['January', 'February', 'March']
<class 'openpyxl.worksheet.worksheet.Worksheet'>
March

This is the output of the program.
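For current openpyxl releases, the same lookups can be written without the deprecated methods. The sketch below builds a three-sheet workbook in memory as a stand-in for sheets.xlsx, then uses the sheetnames attribute and name-based indexing.

```python
from openpyxl import Workbook

# In-memory stand-in for sheets.xlsx with the same three sheet names.
book = Workbook()
book.active.title = "January"
book.create_sheet("February")
book.create_sheet("March")

print(book.sheetnames)

# Indexing the workbook by name replaces get_sheet_by_name().
sheet = book["March"]
print(sheet.title)
```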

sheets2.py

#!/usr/bin/env python
 
import openpyxl
 
book = openpyxl.load_workbook('sheets.xlsx')
 
book.create_sheet("April")
 
print(book.sheetnames)
 
sheet1 = book.get_sheet_by_name("January")
book.remove_sheet(sheet1)
 
print(book.sheetnames)
 
book.create_sheet("January", 0)
print(book.sheetnames)
 
book.save('sheets2.xlsx')

In this example, we create a new sheet.

book.create_sheet("April")

A new sheet is created with the create_sheet() method.

print(book.sheetnames)

The sheet names can also be shown with the sheetnames attribute.

book.remove_sheet(sheet1)

A sheet can be removed with the remove_sheet() method. Recent openpyxl releases deprecate it in favor of the remove() method.

book.create_sheet("January", 0)

A new sheet can be created at a specified position. In our case, we create a new sheet at the position with index 0.

$ ./sheets2.py 
['January', 'February', 'March', 'April']
['February', 'March', 'April']
['January', 'February', 'March', 'April']
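The same sequence can be sketched with the non-deprecated remove() method. The workbook is built in memory here as a stand-in for sheets.xlsx, so the snippet runs without the file.

```python
from openpyxl import Workbook

# In-memory stand-in for sheets.xlsx.
book = Workbook()
book.active.title = "January"
book.create_sheet("February")
book.create_sheet("March")

book.create_sheet("April")
print(book.sheetnames)

# remove() takes the worksheet object itself.
book.remove(book["January"])
print(book.sheetnames)

# The second argument of create_sheet() is the insertion index.
book.create_sheet("January", 0)
print(book.sheetnames)
```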

The background color of a worksheet tab can be changed.

sheets3.py

#!/usr/bin/env python
 
import openpyxl
 
book = openpyxl.load_workbook('sheets.xlsx')
 
sheet = book.get_sheet_by_name("March")
sheet.sheet_properties.tabColor = "0072BA"
 
book.save('sheets3.xlsx')

The example modifies the background color of the sheet titled "March".

sheet.sheet_properties.tabColor = "0072BA"

We change the tabColor property to a new color.

The background color of the third worksheet has been changed to a shade of blue.

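Merging cells

Cells can be merged with the merge_cells() method and unmerged with the unmerge_cells() method. When we merge cells, all cells except the top-left one are removed from the worksheet.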

merging_cells.py

#!/usr/bin/env python
 
from openpyxl import Workbook
from openpyxl.styles import Alignment
 
book = Workbook()
sheet = book.active
 
sheet.merge_cells('A1:B2')
 
cell = sheet.cell(row=1, column=1)
cell.value = 'Sunny day'
cell.alignment = Alignment(horizontal='center', vertical='center')
 
book.save('merging.xlsx')

In the example, we merge four cells: A1, B1, A2, and B2. The text in the resulting cell is centered.

from openpyxl.styles import Alignment

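To center the text in the resulting cell, we use the Alignment class from the openpyxl.styles module.

sheet.merge_cells('A1:B2')

We merge four cells with the merge_cells() method.

cell = sheet.cell(row=1, column=1)

We get the top-left cell of the merged range, which holds the content of the merged cell.

cell.value = 'Sunny day'
cell.alignment = Alignment(horizontal='center', vertical='center')

We set the text of the merged cell and update its alignment.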
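The reverse operation can be sketched as follows. The worksheet keeps track of merged ranges in its merged_cells attribute, and unmerge_cells() takes the same range string that was used for merging.

```python
from openpyxl import Workbook

book = Workbook()
sheet = book.active

sheet.merge_cells('A1:B2')
sheet['A1'] = 'Sunny day'

# List the coordinates of the currently merged ranges.
merged = [r.coord for r in sheet.merged_cells.ranges]
print(merged)

# Unmerging uses the same range string as merging.
sheet.unmerge_cells('A1:B2')
remaining = [r.coord for r in sheet.merged_cells.ranges]
print(remaining)
```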

Openpyxl freeze panes

When we freeze panes, we keep an area of a worksheet visible while scrolling to another area of the worksheet.

freezing.py

#!/usr/bin/env python
 
from openpyxl import Workbook
 
book = Workbook()
sheet = book.active
 
sheet.freeze_panes = 'B2'
 
book.save('freezing.xlsx')

The example freezes panes at cell B2.

sheet.freeze_panes = 'B2'

To freeze panes, we use the freeze_panes property.
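The rows above and the columns to the left of the given cell are the ones that stay visible. As a sketch with a hypothetical header row, setting the property to 'A2' therefore freezes only the first row and leaves every column scrollable.

```python
from openpyxl import Workbook

book = Workbook()
sheet = book.active

# Hypothetical header row that should stay visible while scrolling.
sheet.append(['Item', 'Quantity'])

# Everything above row 2 is frozen; no column is to the left of A.
sheet.freeze_panes = 'A2'
print(sheet.freeze_panes)
```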

Openpyxl formulas

The next example shows how to use formulas. openpyxl does not perform calculations; it writes formulas into cells.

formulas.py

#!/usr/bin/env python
 
from openpyxl import Workbook
 
book = Workbook()
sheet = book.active
 
rows = (
    (34, 26),
    (88, 36),
    (24, 29),
    (15, 22),
    (56, 13),
    (76, 18)
)
 
for row in rows:
    sheet.append(row)
 
cell = sheet.cell(row=7, column=2)
cell.value = "=SUM(A1:B6)"
cell.font = cell.font.copy(bold=True)
 
book.save('formulas.xlsx')

In the example, we calculate the sum of all values with the SUM() function and style the output in bold font.

rows = (
    (34, 26),
    (88, 36),
    (24, 29),
    (15, 22),
    (56, 13),
    (76, 18)
)
 
for row in rows:
    sheet.append(row)

We create two columns of data.

cell = sheet.cell(row=7, column=2)

We get the cell where the result of the calculation is shown.

cell.value = "=SUM(A1:B6)"

We write a formula into the cell.

cell.font = cell.font.copy(bold=True)

We change the font style. The copy() method of style objects is deprecated in recent openpyxl releases.
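A sketch of the same idea without the deprecated copy() call: a new Font object from openpyxl.styles is assigned to the cell. The data is shortened here, and the formula range is adjusted to match.

```python
from openpyxl import Workbook
from openpyxl.styles import Font

book = Workbook()
sheet = book.active

for row in ((34, 26), (88, 36)):
    sheet.append(row)

cell = sheet.cell(row=3, column=2)
cell.value = "=SUM(A1:B2)"

# Assign a fresh Font object; this replaces the font rather than copying it.
cell.font = Font(bold=True)

# The formula is stored as text; openpyxl does not evaluate it.
print(cell.value)
print(cell.font.bold)
```

Note that Font(bold=True) resets the other font settings to their defaults, which is fine for a cell that had no custom font.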

OpenPyXL images

In the following example, we show how to insert an image into a sheet.

write_image.py

#!/usr/bin/env python
 
from openpyxl import Workbook
from openpyxl.drawing.image import Image
 
book = Workbook()
sheet = book.active
 
img = Image("icesid.png")
sheet['A1'] = 'This is Sid'
 
sheet.add_image(img, 'B2')
 
book.save("sheet_image.xlsx")

In the example, we write an image into a sheet.

from openpyxl.drawing.image import Image

We work with the Image class from the openpyxl.drawing.image module.

img = Image("icesid.png")

A new Image object is created. The icesid.png image is located in the current working directory.

sheet.add_image(img, 'B2')

We add a new image with the add_image() method.

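Openpyxl charts

The openpyxl library supports the creation of various charts, including bar charts, line charts, area charts, bubble charts, scatter charts, and pie charts.

According to the documentation, openpyxl supports chart creation within a worksheet only. Charts in existing workbooks will be lost.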

create_bar_chart.py

#!/usr/bin/env python
 
from openpyxl import Workbook
from openpyxl.chart import (
    Reference,
    Series,
    BarChart
)
 
book = Workbook()
sheet = book.active
 
rows = [
    ("USA", 46),
    ("China", 38),
    ("UK", 29),
    ("Russia", 22),
    ("South Korea", 13),
    ("Germany", 11)
]
 
for row in rows:
    sheet.append(row)
 
data = Reference(sheet, min_col=2, min_row=1, max_col=2, max_row=6)
categs = Reference(sheet, min_col=1, min_row=1, max_row=6)
 
chart = BarChart()
chart.add_data(data=data)
chart.set_categories(categs)
 
chart.legend = None
chart.y_axis.majorGridlines = None
chart.varyColors = True
chart.title = "Olympic Gold medals in London"
 
sheet.add_chart(chart, "A8")    
 
book.save("bar_chart.xlsx")

In the example, we create a bar chart to show the number of Olympic gold medals per country in London 2012.

from openpyxl.chart import (
    Reference,
    Series,
    BarChart
)

The openpyxl.chart module has tools to work with charts.

book = Workbook()
sheet = book.active

A new workbook is created.

rows = [
    ("USA", 46),
    ("China", 38),
    ("UK", 29),
    ("Russia", 22),
    ("South Korea", 13),
    ("Germany", 11)
]
 
for row in rows:
    sheet.append(row)

We create some data and add it to the cells of the active sheet.

data = Reference(sheet, min_col=2, min_row=1, max_col=2, max_row=6)

With the Reference class, we refer to the rows in the sheet that represent the data. In our case, these are the numbers of Olympic gold medals.

categs = Reference(sheet, min_col=1, min_row=1, max_row=6)

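We create a category axis. A category axis is an axis on which the data is treated as a sequence of non-numerical text labels. In our case, the text labels are the names of countries.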

chart = BarChart()
chart.add_data(data=data)
chart.set_categories(categs)

We create a bar chart and set its data and categories.

chart.legend = None
chart.y_axis.majorGridlines = None

Using the legend and majorGridlines attributes, we turn off the legend and the major grid lines.

chart.varyColors = True

With varyColors set to True, each bar has a different color.

chart.title = "Olympic Gold medals in London"

A title is set for the chart.

sheet.add_chart(chart, "A8")

The created chart is added to the sheet with the add_chart() method.
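A variation worth sketching: if the sheet has a header row, add_data() can take the series name from it through the titles_from_data argument, and the axes can be labeled. The header row and the shortened data here are assumptions made for illustration; the original example has no header.

```python
from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

book = Workbook()
sheet = book.active

# Hypothetical header row followed by a shortened copy of the data.
rows = [
    ("Country", "Gold medals"),
    ("USA", 46),
    ("China", 38),
    ("UK", 29)
]

for row in rows:
    sheet.append(row)

# The data reference includes the header; the categories skip it.
data = Reference(sheet, min_col=2, min_row=1, max_row=4)
categs = Reference(sheet, min_col=1, min_row=2, max_row=4)

chart = BarChart()
chart.add_data(data, titles_from_data=True)
chart.set_categories(categs)

chart.x_axis.title = "Country"
chart.y_axis.title = "Gold medals"

sheet.add_chart(chart, "A6")
print(len(chart.series))
```

Saving the workbook with book.save() writes the chart along with the data.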

In this tutorial, we have worked with the openpyxl library. We have read data from Excel files and written data to Excel files.


Statement:
This article is reproduced from yisu.com. If there is any infringement, please contact admin@php.cn to have it removed.