
Visualization | Python analyzes Mid-Autumn mooncakes, these flavors are the yyds!

Python当打之年
2023-08-10 15:25:06

The Mid-Autumn Festival, also known as the Moon Festival, Moonlight Festival, Moon Eve, Autumn Festival, Moon Worship Festival, Moon Niang Festival, or Reunion Festival, is a traditional Chinese folk festival. Since ancient times its customs have included worshiping the moon, admiring the moon, eating mooncakes, playing with lanterns, admiring osmanthus blossoms, and drinking osmanthus wine, and they have been passed down to this day. In this issue we analyze the sales of Mid-Autumn mooncakes on Moubao to see which flavors sell well and which regions sell the most. I hope it is helpful to my friends.
Libraries involved:

  • Pandas: data processing
  • Pyecharts: data visualization
  • jieba: word segmentation
  • collections: data statistics

Visualization part:

  • Bar: bar chart
  • Pie: pie chart
  • Map: map
  • Stylecloud: word cloud

1. Import modules

import re
import jieba
import stylecloud
import numpy as np
import pandas as pd
from collections import Counter
from pyecharts.charts import Bar
from pyecharts.charts import Map 
from pyecharts.charts import Pie
from pyecharts.charts import Grid
from pyecharts.charts import Page
from pyecharts.components import Image
from pyecharts.charts import WordCloud
from pyecharts import options as opts 
from pyecharts.globals import SymbolType
from pyecharts.commons.utils import JsCode
2. Pandas data processing

2.1 Read data
df = pd.read_excel("月饼.xlsx")
df.head(10)
Result:

[Image: output of df.head(10)]
2.2 Remove duplicates
print(df.shape)
df.drop_duplicates(inplace=True)
print(df.shape)

(4520, 5)
(1885, 5)
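There are 4,520 records in total, and 1,885 remain after deduplication (on Moubao (某宝) a single shop is recommended on several different pages, which produces a lot of duplicate records).
2.3 Handle missing values
Fill in the records where the number of buyers is empty:
df['付款情况'] = df['付款情况'].replace(np.nan, '0人付款')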

2.4 Process the 付款情况 (payment status) field

df[df['付款情况'].str.contains("万")]
[Image: rows whose 付款情况 value contains "万"]

Once the number of buyers exceeds 10,000, the count is abbreviated with "万" (10,000), so we need to restore the full number:

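# Extract the numeric part
df['num'] = [re.findall(r'(\d+\.?\d*)', i)[0] for i in df['付款情况']]
df['num'] = df['num'].astype('float')

# Extract the unit (万 = 10,000)
df['unit'] = [''.join(re.findall(r'(万)', i)) for i in df['付款情况']]
df['unit'] = df['unit'].apply(lambda x: 10000 if x == '万' else 1)

# Compute sales (销量)
df['销量'] = df['num'] * df['unit']
# Keep only rows with an address (地址); .copy() avoids SettingWithCopyWarning
df = df[df['地址'].notna()].copy()
# The province (省份) is the first space-separated part of the address
df['省份'] = df['地址'].str.split(' ').apply(lambda x: x[0])

# Drop the columns that are no longer needed
df.drop(['付款情况', 'num', 'unit'], axis=1, inplace=True)

# Reset the index
df = df.reset_index(drop=True)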
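
To make the conversion rule concrete, here is a minimal standalone sketch of the same number-times-unit logic applied to single strings. The function name parse_sales and the sample strings are assumptions for illustration (the excerpt only shows '0人付款' and that some values contain "万"), and it uses plain Python rather than the DataFrame columns above.

```python
import re

def parse_sales(text: str) -> int:
    """Convert a payment string such as '1.5万+人付款' into a buyer count."""
    match = re.search(r'\d+\.?\d*', text)
    if match is None:
        return 0  # no digits at all: treat as zero buyers
    number = float(match.group())
    unit = 10000 if '万' in text else 1  # 万 means "ten thousand"
    return round(number * unit)

# Hypothetical sample strings, not taken from the dataset
samples = ['0人付款', '3000+人付款', '1.5万+人付款']
parsed = [parse_sales(s) for s in samples]
print(parsed)
```

Unlike the list comprehension above, this version returns 0 instead of raising an IndexError when a string contains no digits, which may or may not be the behavior you want.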

Result:

[Image: the cleaned DataFrame]


3. Pyecharts data visualization

3.1 Top 10 mooncake products by sales

Code:

shop_top10 = df.groupby('商品名称')['销量'].sum().sort_values(ascending=False).head(10)
bar0 = (
    Bar()
        .add_xaxis(shop_top10.index.tolist()[::-1])
        .add_yaxis('sales_num', shop_top10.values.tolist()[::-1])
        .reversal_axis()
        .set_global_opts(title_opts=opts.TitleOpts(title='月饼商品销量Top10'),
                         xaxis_opts=opts.AxisOpts(axislabel_opts=opts.LabelOpts(rotate=-30))) 
        .set_series_opts(label_opts=opts.LabelOpts(position='right'))
)

Result:

[Image: bar chart of the top 10 mooncake products by sales]

The product names are too long to be displayed in full, so let's adjust the margins:

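# Note: color_js is not defined in this excerpt; JsCode expects a string holding a JavaScript color expression
bar1 = (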
    Bar()
        .add_xaxis(shop_top10.index.tolist()[::-1])
        .add_yaxis('sales_num', shop_top10.values.tolist()[::-1],itemstyle_opts=opts.ItemStyleOpts(color=JsCode(color_js)))
        .reversal_axis()
        .set_global_opts(title_opts=opts.TitleOpts(title='月饼商品销量Top10'),
             xaxis_opts=opts.AxisOpts(axislabel_opts=opts.LabelOpts(rotate=-30)),
             ) 
        .set_series_opts(label_opts=opts.LabelOpts(position='right'))
)
# Shift the whole chart to the right
grid = (
    Grid()
        .add(bar1, grid_opts=opts.GridOpts(pos_left='45%', pos_right='10%')) 
)
[Image: the same bar chart with a wider left margin]
That looks much better, doesn't it?
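
The bar1 code passes JsCode(color_js), but the definition of color_js is not part of the excerpt. One plausible form, shown here purely as an assumption, is an ECharts linear gradient written as a JavaScript expression in a Python string. The two colors are placeholders rather than the author's values.

```python
# Hypothetical definition of color_js: a left-to-right ECharts linear gradient.
# Arguments: x0, y0, x1, y1 (gradient direction), color stops, globalCoord flag.
color_js = """new echarts.graphic.LinearGradient(
    0, 0, 1, 0,
    [{offset: 0, color: '#FFE4E1'}, {offset: 1, color: '#DC364C'}],
    false
)"""

# With pyecharts installed, the string would be passed exactly as in bar1:
#   itemstyle_opts=opts.ItemStyleOpts(color=JsCode(color_js))
print(color_js)
```

Because the bars are horizontal after reversal_axis(), a gradient running from (0, 0) to (1, 0) follows the length of each bar.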

Other settings, such as the bar shape, can be changed as well:

[Image: bar chart with a different bar shape]

3.2 Top 10 shops by mooncake sales

Code:

shop_top10 = df.groupby('店铺名称')['销量'].sum().sort_values(ascending=False).head(10)
bar3 = (
    Bar(init_opts=opts.InitOpts(
        width='800px', height='600px',))
    .add_xaxis(shop_top10.index.tolist())
    .add_yaxis('', shop_top10.values.tolist(),
               category_gap='30%',
              )

    .set_global_opts(
        xaxis_opts=opts.AxisOpts(axislabel_opts=opts.LabelOpts(rotate=-30)),
        title_opts=opts.TitleOpts(
            title='月饼销量排名TOP10店铺',
            pos_left='center',
            pos_top='4%',
            title_textstyle_opts=opts.TextStyleOpts(
                color='#ed1941', font_size=16)
        ),
        visualmap_opts=opts.VisualMapOpts(
            is_show=False,
            max_=600000,
            range_color=["#CCD3D9", "#E6B6C2", "#D4587A","#FF69B4", "#DC364C"]
        ),
     )
)
bar3.render_notebook()
Result:

[Image: bar chart of the top 10 shops by mooncake sales]

Daoxiangcun (稻香村) is far ahead of the other shops in mooncake sales.

3.3 Mooncake sales by region across the country

province_num = df.groupby('省份')['销量'].sum().sort_values(ascending=False) 
map_chart = Map(init_opts=opts.InitOpts(theme='light',
                                        width='800px',
                                        height='600px'))
map_chart.add('',
              [list(z) for z in zip(province_num.index.tolist(), province_num.values.tolist())],
              maptype='china',
              is_map_symbol_show=False,
              itemstyle_opts={
                  'normal': {
                      'shadowColor': 'rgba(0, 0, 0, .5)', # shadow color
                      'shadowBlur': 5, # shadow blur size
                      'shadowOffsetY': 0, # shadow offset along the y-axis
                      'shadowOffsetX': 0, # shadow offset along the x-axis
                      'borderColor': '#fff'
                  }
              }
              )
map_chart.set_global_opts(
    visualmap_opts=opts.VisualMapOpts(
        is_show=True,
        is_piecewise=True,
        min_ = 0,
        max_ = 1,
        split_number = 5,
        series_index=0,
        pos_top='70%',
        pos_left='10%',
        range_text=['销量(份):', ''],
        pieces=[
            {'max':2000000, 'min':200000, 'label':'> 200000', 'color': '#990000'},
            {'max':200000, 'min':100000, 'label':'100000-200000', 'color': '#CD5C5C'},
            {'max':100000, 'min':50000, 'label':'50000-100000', 'color': '#F08080'},
            {'max':50000, 'min':10000, 'label':'10000-50000', 'color': '#FFCC99'},
            {'max':10000, 'min':0, 'label':'0-10000', 'color': '#FFE4E1'},
           ],
    ),
    legend_opts=opts.LegendOpts(is_show=False), 
    tooltip_opts=opts.TooltipOpts(
        is_show=True,
        trigger='item',
        formatter='{b}:{c}'
    ),
    title_opts=dict(
        text='全国各地区月饼销量',
        left='center',
        top='5%',
        textStyle=dict(
            color='#DC143C'))
)
map_chart.render_notebook()

Result:

[Image: map of mooncake sales by province]

Judging from the geographical distribution, the shops are mainly located in Beijing, Shandong, Zhejiang, Guangdong, Yunnan, and other regions, mostly in the southeast.
3.4 Proportion of mooncake sales in different price ranges

[Image: chart of mooncake sales share by price range]

Mooncakes priced below 50 yuan account for 52% of sales, so more than half of the mooncakes sold cost less than 50 yuan, and mooncakes priced below 100 yuan account for as much as 85%. Although some are priced above 1,000 yuan, prices are on the whole still fairly affordable.
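
The code for this chart is not included in the excerpt. As a rough, dependency-free sketch of how such shares could be computed, the snippet below bins (price, sales) pairs and converts the per-bin sales into percentages. The bin edges other than 50, 100, and 1000 are assumptions, as are the function name and the sample rows. With the DataFrame from section 2, pd.cut on the price column would do the same binning.

```python
from bisect import bisect_right
from collections import defaultdict

# Assumed bin edges in yuan; only 50, 100 and 1000 are mentioned in the text.
EDGES = [50, 100, 300, 500, 1000]
LABELS = ['0-50', '50-100', '100-300', '300-500', '500-1000', '1000+']

def sales_share_by_price(rows):
    """rows: iterable of (price, sales). Returns {label: percent of total sales}."""
    totals = defaultdict(float)
    for price, sales in rows:
        # bisect_right puts a price equal to an edge into the higher bin
        totals[LABELS[bisect_right(EDGES, price)]] += sales
    grand_total = sum(totals.values())
    return {label: round(100 * totals[label] / grand_total, 1)
            for label in LABELS if label in totals}

# Made-up sample rows, not the article's data
rows = [(29.9, 500), (45.0, 300), (88.0, 150), (1280.0, 50)]
shares = sales_share_by_price(rows)
print(shares)
```

The resulting label/percentage pairs have the shape that a Pie chart's data list expects. Note that the function assumes at least one row with non-zero sales.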
3.5 Mooncake flavor distribution

[Image: chart of mooncake flavor distribution]

Lava custard (流心), five-kernel, egg yolk with lotus paste, and red bean paste are the yyds!
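
The flavor-counting code is also not shown. One simple way to approximate it, sketched here under stated assumptions, is to look for flavor keywords in each product name and add up the sales of every match. The keyword list is back-translated from the flavors named above, and the product names and sales figures are invented for the example.

```python
from collections import Counter

# Assumed keyword list: lava custard, five-kernel, egg yolk lotus paste, bean paste.
# The author's actual list is not shown in the article.
FLAVORS = ['流心', '五仁', '蛋黄莲蓉', '豆沙']

def flavor_sales(rows):
    """rows: iterable of (product_name, sales). Sums sales per flavor keyword."""
    counts = Counter()
    for name, sales in rows:
        for flavor in FLAVORS:
            if flavor in name:  # a name may mention several flavors
                counts[flavor] += sales
    return counts

# Made-up product names and sales figures
rows = [
    ('流心奶黄月饼礼盒', 1200),
    ('五仁月饼 豆沙月饼 组合装', 800),
    ('蛋黄莲蓉月饼', 500),
]
result = flavor_sales(rows)
print(result.most_common())
```

A product that lists several flavors is counted once per flavor, so the totals can add up to more than the overall sales.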
3.6 Word Cloud
[Image: word cloud]
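
The word cloud code is not shown either. Given the imports at the top (jieba, Counter, stylecloud), a likely pipeline is: segment the product titles, count the words, and render the counts. The sketch below covers only the counting step, with the tokenizer passed in as a parameter so that it runs without third-party packages. The helper name word_frequencies and the sample titles are assumptions. In the real pipeline jieba.lcut would presumably take the place of str.split.

```python
import re
from collections import Counter

def word_frequencies(titles, tokenize, min_len=2):
    """Count tokens across titles; tokenize is any callable str -> list of str."""
    counter = Counter()
    for title in titles:
        # Replace punctuation with spaces before tokenizing
        cleaned = re.sub(r'[^\w\s]', ' ', title)
        # Skip single-character tokens, which are rarely informative
        counter.update(t for t in tokenize(cleaned) if len(t) >= min_len)
    return counter

# Made-up, pre-spaced titles so that str.split can stand in for jieba.lcut
titles = ['月饼 礼盒 中秋', '流心 月饼 礼盒', '中秋 送礼 月饼']
freq = word_frequencies(titles, str.split)
print(freq.most_common(3))
```

The resulting word counts could then be handed to a word cloud renderer such as stylecloud or the pyecharts WordCloud chart.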
The full code is fairly long, so for reasons of space part of it is not shown here. The complete code can be obtained and run online at:
https://www.heywhale.com/mw/project/61404e0ff0de6200174ada20


Statement:
This article is reproduced from Python当打之年. If there is any infringement, please contact admin@php.cn to have it removed.