What are the commonly used Python data visualization libraries?

WBOY

Apr 22, 2023 pm 04:16 PM

python, data visualization, altair

What library would you use for data visualization in Python?

Today I will introduce a powerful member of the Python data visualization family: Altair.

It is simple, friendly, and built on the Vega-Lite JSON specification. A few lines of code are enough to generate attractive and effective visualizations.

What is Altair

Altair is a statistical visualization library for Python that currently has more than 3,000 stars on GitHub.

With Altair, we can spend more time and energy on understanding the data and its meaning, and less on the mechanics of drawing charts.

Simply put, Altair is a visualization grammar: a declarative language for creating, saving, and sharing interactive visual designs. It describes a chart's appearance and interactions in JSON and renders the result as web-based graphics.

Let’s take a look at the visualization effects made using Altair!

[Figure: examples of visualizations made with Altair]

Altair's advantages

Altair helps us explore and analyze data through aggregation, data transformation, interaction, chart composition, and similar methods. These steps deepen our understanding of the data and its meaning, and help build an intuitive way of thinking about data analysis.

In general, the characteristics of Altair include the following aspects.

Altair's Python API is generated from Vega-Lite's JSON syntax rules.
Charts are displayed directly in a running Jupyter Notebook, JupyterLab, or nteract session.
A visualization can be exported as a PNG or SVG image, saved as a standalone HTML page, or opened in the online Vega-Lite editor to view the result.

In Altair, the dataset should be loaded in a "clean format". The Pandas DataFrame is one of the main data structures Altair works with, and passing a DataFrame to Altair is simple and efficient. For example, the following code reads an Excel dataset with Pandas and passes the resulting DataFrame to Altair:

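import altair as alt
import pandas as pd

# Read the "Sales" sheet and parse the "Year" column as dates
data = pd.read_excel("Index_Chart_Altair.xlsx", sheet_name="Sales", parse_dates=["Year"])
# Pass the DataFrame to Altair as the chart's data source
alt.Chart(data)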

Quick test - make a bar chart

Altair places great emphasis on distinguishing variable types and combining them. The values of a variable are data, and they differ in kind: they can be numbers, strings, dates, and so on. A variable is a container for data, and the data are the contents stored in it.

From the perspective of statistical sampling, the variable corresponds to the population and the data to the sample: we study and analyze the population through samples. Combining different variable types produces statistical charts that give a more intuitive understanding of the data.

The combinations of variable types can be divided into the following kinds.

Temporal variable + quantitative variable.
Temporal variable + nominal variable.
Quantitative variable + quantitative variable.

Among these, the temporal variable is a special kind of quantitative variable. A temporal variable can be declared as a nominal variable (N) or an ordinal variable (O), which discretizes it so that it can be combined with a quantitative variable.

Here we look at the combination of a nominal variable and a quantitative variable.

If you map the quantitative variable to the x-axis and the nominal variable to the y-axis, and still use bars as the mark that encodes the data, you get a bar chart. A bar chart uses differences in length to compare the profit from sales of different products, as shown in the figure below.

[Figure: bar chart comparing sales profit by product]

Compared with the code for a column chart, the code for the bar chart changes as follows.

chart = alt.Chart(df).mark_bar().encode(x="profit:Q",y="product:N")

Complex graphs are also very simple

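Let's show the average monthly rainfall in different years, one panel per year.

We can use an area chart to describe the average precipitation for each month in Seattle from 2012 to 2015. Next, we split the average precipitation further, using the year as the facet criterion, and use a step chart to show each year's monthly average precipitation in its own panel, as shown in the figure below.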

[Figure: Seattle monthly precipitation from 2012 to 2015, one step-area panel per year]

The core implementation code is shown below.

…
chart = alt.Chart(df).mark_area(
    color="lightblue",
    interpolate="step",  # draw the area as steps
    line=True,           # draw a line along the top edge of the area
    opacity=0.8
).encode(
    # x: month extracted from the date column
    alt.X("month(date):T",
          axis=alt.Axis(format="%b",
                        formatType="time",
                        labelAngle=-15,
                        labelBaseline="top",
                        labelPadding=5,
                        title="month")),
    # y: mean precipitation for each month
    y="mean(precipitation):Q",
    # one panel per year, four panels per row
    facet=alt.Facet("year(date):Q",
                    columns=4,
                    header=alt.Header(
                        labelColor="red",
                        labelFontSize=15,
                        title="Seattle Monthly Precipitation from 2012 to 2015",
                        titleFont="Calibri",
                        titleFontSize=25,
                        titlePadding=15)
                    )
)
…

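In the class alt.X(), month extracts the month from the temporal variable date and maps it to the x-axis position channel, and the aggregate function mean() computes the average precipitation. The data is encoded with a stepped area mark whose top edge is drawn as a line (line=True).

In the instance method encode(), the facet channel sets up the panels: year extracts the year from the temporal variable date and serves as the criterion for splitting the monthly average precipitation from 2012 to 2015, so that each year's monthly averages appear in their own panel. The keyword argument columns sets the number of panel columns, and the keyword argument header sets the text of the panel labels and the facet title.

Specifically, the text is configured through the Header schema wrapper, that is, through the keyword arguments of the class alt.Header(), whose meanings are as follows.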

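labelColor: color of the panel labels.
labelFontSize: font size of the panel labels.
title: facet title.
titleFont: font of the facet title.
titleFontSize: font size of the facet title.
titlePadding: spacing between the facet title and the panel labels.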

Statement

This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to have it removed.