
Wonderful! This Python data visualization tool is powerful!

PHPz (reposted), 2023-04-15 09:28:02


With Altair, you can spend more time focusing on your data and its meaning. The details follow below.

Example

Here is an example of using Altair in JupyterLab to quickly visualize and display a data set:

import altair as alt
# load a simple dataset as a pandas DataFrame
from vega_datasets import data

cars = data.cars()

alt.Chart(cars).mark_point().encode(
    x='Horsepower',
    y='Miles_per_Gallon',
    color='Origin',
)


One of Altair's distinctive features, inherited from Vega-Lite, is a declarative grammar that covers not only visualization but also interaction. With a few modifications to the example above, we can create a linked histogram that is filtered by the selection made in the scatter plot.

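import altair as alt
from vega_datasets import data

source = data.cars()

# interval selection: drag on the scatter plot to select a region
brush = alt.selection_interval()

points = alt.Chart(source).mark_point().encode(
    x='Horsepower',
    y='Miles_per_Gallon',
    # selected points keep their Origin color; the rest turn light gray
    color=alt.condition(brush, 'Origin', alt.value('lightgray'))
).add_selection(  # in Altair 5 and later, use add_params instead
    brush
)

# bar chart counting cars per Origin, restricted to the brushed points
bars = alt.Chart(source).mark_bar().encode(
    y='Origin',
    color='Origin',
    x='count(Origin)'
).transform_filter(
    brush
)

# stack the two charts vertically
points & bars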


Installation method

Altair requires the following dependencies:

  • pandas
  • traitlets
  • IPython

If you have cloned the repository, run the following command from the root of the repository:

pip install -e .[dev]

If you do not want to clone the repository, you can install Altair directly with the following command:

pip install git+https://github.com/altair-viz/altair

For more details, see the GitHub repository:

https://github.com/altair-viz/altair
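
After installing, it can help to confirm which versions ended up in the environment. The following is a minimal sketch using only the standard library; the helper name installed_version is made up for this illustration, and the package names are assumed to be the PyPI distribution names of Altair and the dependencies listed above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Assumed distribution names: Altair plus the dependencies listed above.
for name in ["altair", "pandas", "traitlets", "ipython"]:
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Any package reported as "not installed" can then be installed with pip before running the examples.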

Three major operations

Next, I will show in detail how to use Altair to create visualizations that involve filtering, grouping, and merging operations, which can be used as part of an exploratory data analysis process.

We construct two data frames of simulated data. The first contains restaurant orders, and the second contains the price of each item that can appear in an order.

# import libraries
import numpy as np
import pandas as pd
import altair as alt
import random

# mock data: 100 orders, each with an item id, a quantity, and a tip
orders = pd.DataFrame({
    "order_id": np.arange(1, 101),
    "item": np.random.randint(1, 50, size=100),  # upper bound is exclusive
    "qty": np.random.randint(1, 10, size=100),
    "tip": (np.random.random(100) * 10).round(2)
})

# one price per item id (1 to 50)
prices = pd.DataFrame({
    "item": np.arange(1, 51),
    "price": (np.random.random(50) * 50).round(2)
})

# assign 50 lunch and 50 dinner orders in random order
order_type = ["lunch", "dinner"] * 50
random.shuffle(order_type)
orders["order_type"] = order_type
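
Because the mock data is random, every run produces different charts. The sketch below is one possible reproducible variant, assuming NumPy's seeded default_rng generator (the seed value 42 is an arbitrary choice, not part of the original article). It also checks that every ordered item has exactly one price, which the lookup used later relies on.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # arbitrary seed, chosen only for repeatability

orders = pd.DataFrame({
    "order_id": np.arange(1, 101),
    "item": rng.integers(1, 50, size=100),   # upper bound is exclusive: 1..49
    "qty": rng.integers(1, 10, size=100),
    "tip": (rng.random(100) * 10).round(2),
})
prices = pd.DataFrame({
    "item": np.arange(1, 51),
    "price": (rng.random(50) * 50).round(2),
})
# 50 lunch and 50 dinner labels in shuffled order
orders["order_type"] = rng.permutation(["lunch", "dinner"] * 50)

# every ordered item should match exactly one row in prices
assert prices["item"].is_unique
assert orders["item"].isin(prices["item"]).all()

print(orders.head())
```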


First, we create a simple chart to illustrate the structure of Altair's syntax.

alt.Chart(orders).mark_circle(size=50).encode(
    x="qty", y="tip", color="order_type"
).properties(
    title="Tip vs Quantity"
)


Altair's basic syntax has four steps:

  • Pass the data to the Chart object. The data can be a Pandas data frame or a URL string pointing to a JSON or CSV file.
  • Select the type of visualization (such as mark_circle, mark_line, etc.).
  • Use the encode function to specify what to plot from the given data frame. Anything we write in the encode function must therefore be linked to the data frame.
  • Use the properties function to specify certain properties of the chart.

Consider a situation where we need to create a scatter plot of price and tip values, which are stored in different data frames. One option is to merge the two data frames and use these two columns in a scatter plot.

Altair offers a more practical approach: it can look up columns in another data frame, much like the merge function in Pandas.

alt.Chart(orders).mark_circle(size=50).encode(
    x="tip", y="price:Q", color="order_type"
).transform_lookup(
    lookup="item",
    from_=alt.LookupData(data=prices, key="item", fields=["price"])
).properties(
    title="Price vs Tip"
)


The transform_lookup function is similar to the merge function in Pandas. The column used to match observations in the chart's data is passed to the lookup parameter, and the matching column in the other data frame is passed as key to LookupData. The fields parameter selects the columns to bring in from the other data frame.
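
For comparison, here is roughly what the same lookup looks like as an explicit Pandas merge. This is only a sketch on a tiny hand-made data set (the values are invented for illustration), and it assumes that a lookup behaves like a left join, where orders without a matching item simply get a missing price.

```python
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2, 3],
    "item": [10, 11, 99],      # item 99 has no entry in prices
    "tip": [2.5, 1.0, 4.0],
})
prices = pd.DataFrame({
    "item": [10, 11, 12],
    "price": [8.0, 25.0, 14.0],
})

# lookup="item" and key="item" correspond to on="item";
# fields=["price"] corresponds to selecting only the price column
merged = orders.merge(prices[["item", "price"]], on="item", how="left")

print(merged)
```

All three orders are kept, and the order for item 99 ends up with a missing price. The difference with Altair is that the merged data frame never has to be created by hand; the lookup is declared as part of the chart.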

We can also integrate a filter component into the plot, so that only data points with a price above $10 are plotted.

alt.Chart(orders).mark_circle(size=50).encode(
    x="tip", y="price:Q", color="order_type"
).transform_lookup(
    lookup="item",
    from_=alt.LookupData(data=prices, key="item", fields=["price"])
).transform_filter(
    alt.FieldGTPredicate(field='price', gt=10)
).properties(
    title="Price vs Tip"
)


The transform_filter function is used for filtering. FieldGTPredicate handles "greater than" conditions.

In addition to filtering and merging, Altair also allows data points to be grouped before plotting. For example, we can create a bar chart that displays the average item price for each order type, and restrict it to items priced under $20.

alt.Chart(orders).mark_bar().encode(
    y="order_type", x="avg_price:Q"
).transform_lookup(
    lookup="item",
    from_=alt.LookupData(data=prices, key="item", fields=["price"])
).transform_filter(
    alt.FieldLTPredicate(field='price', lt=20)
).transform_aggregate(
    avg_price="mean(price)", groupby=["order_type"]
).properties(
    height=200, width=300
)


Let us explain each step in detail:

  • transform_lookup: Look up the price of each item in the prices data frame.
  • transform_filter: Keep only the rows with a price below $20.
  • transform_aggregate: Group the remaining rows by order type and calculate the mean price.
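
The three steps above map closely onto a familiar Pandas pipeline. The sketch below reproduces them on a small hand-made data set (the values are invented for illustration), which can be handy for checking the numbers a chart displays.

```python
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5],
    "item": [1, 2, 3, 1, 1],
    "order_type": ["lunch", "lunch", "dinner", "dinner", "dinner"],
})
prices = pd.DataFrame({"item": [1, 2, 3], "price": [5.0, 15.0, 30.0]})

avg_price = (
    orders.merge(prices, on="item", how="left")  # transform_lookup
    .query("price < 20")                         # transform_filter
    .groupby("order_type", as_index=False)       # transform_aggregate, groupby
    .agg(avg_price=("price", "mean"))            # transform_aggregate, mean(price)
)

print(avg_price)
```

Here the dinner order for item 3 (price 30.0) is filtered out, so the dinner average is computed from the two remaining orders only.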

Conclusion

What sets Altair apart from other common visualization libraries is that it integrates data analysis components seamlessly into the visualization, which makes it a very practical tool for data exploration.

Filtering, merging, and grouping are critical to the exploratory data analysis process. Altair allows you to perform all these operations when creating data visualizations. In this sense, Altair can also be considered a data analysis tool. If you are interested, try it now.


Statement:
This article is reproduced from 51cto.com. If there is any infringement, please contact admin@php.cn to have it removed.