Pandas+Pyecharts | Hospital drug sales data visualization

Python当打之年
2023-08-10 14:43:56

In this issue, we analyze half a year of drug sales data from a hospital to see which drugs sell the most, which days see the most purchases, and so on. I hope it will be helpful.
Libraries involved:

  • Pandas: data processing
  • Pyecharts: data visualization
  • collections: data statistics

Visualization part:

  • Line: line chart
  • Bar: bar chart
  • Calendar: calendar chart
  • stylecloud: word cloud

Let's get to the point.

1. Import modules

import jieba
import stylecloud
import pandas as pd
from PIL import Image
from collections import Counter
from pyecharts.charts import Geo
from pyecharts.charts import Bar
from pyecharts.charts import Line
from pyecharts.charts import Pie
from pyecharts.charts import Calendar
from pyecharts.charts import WordCloud
from pyecharts import options as opts
from pyecharts.commons.utils import JsCode
from pyecharts.globals import ThemeType,SymbolType,ChartType

2. Pandas data processing

2.1 Read the data

df = pd.read_excel("医院药品销售数据.xlsx")

2.2 Data size

df.shape
(6578, 7)

There are 6,578 drug purchase records in total.

2.3 View index, data types, and memory information

df.info()

Some columns have missing data.

2.4 Count null values

df.isnull().sum()

2.5 Output rows containing null values

df[df.isnull().T.any()]
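
Because the purchase time (购药时间) is used in the later analysis, we delete the rows where it is empty and fill missing social security card numbers (社保卡号) with "0000". The social security card number and the product code (商品编码) are strings of digits and should be of type str, while the sales quantity (销售数量) should be of type int: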
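df1 = df.copy()
# Drop rows that have no purchase time
df1 = df1.dropna(subset=['购药时间'])
# Inspect the rows that still contain null values
df1[df1.isnull().T.any()]
# Fill missing card numbers by assignment (avoids in-place fillna on a column selection)
df1['社保卡号'] = df1['社保卡号'].fillna('0000')
df1['社保卡号'] = df1['社保卡号'].astype(str)
df1['商品编码'] = df1['商品编码'].astype(str)
df1['销售数量'] = df1['销售数量'].astype(int)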
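
As a minimal sketch of the cleaning steps above, the following applies them to a small invented DataFrame. The rows and values are made up for illustration and are not taken from the hospital file; only the column names follow the article.

```python
import pandas as pd

# Toy stand-in for the hospital data (invented values, not the real file)
toy = pd.DataFrame({
    "购药时间": ["2018-01-01 星期一", None, "2018-01-02 星期二"],
    "社保卡号": ["001616528", "001616528", None],
    "商品编码": [236701, 236701, 236702],
    "销售数量": [6.0, 1.0, 2.0],
})

# Drop rows without a purchase time, then fix the remaining columns
cleaned = toy.dropna(subset=["购药时间"]).copy()
cleaned["社保卡号"] = cleaned["社保卡号"].fillna("0000").astype(str)
cleaned["商品编码"] = cleaned["商品编码"].astype(str)
cleaned["销售数量"] = cleaned["销售数量"].astype(int)

print(cleaned.dtypes)
```

Assigning the result of fillna back to the column, rather than calling it with inplace=True on a column selection, keeps the update explicit and avoids chained-assignment warnings in recent pandas versions.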

2.6 Summary statistics for the 销售数量 (sales quantity), 应收金额 (amount receivable), and 实收金额 (amount received) columns

df1[['销售数量','应收金额','实收金额']].describe()

The data contains negative values, which is clearly unreasonable, so we convert them to positive values:
df2 = df1.copy()
df2['销售数量'] = df2['销售数量'].abs()
df2['应收金额'] = df2['应收金额'].abs()
df2['实收金额'] = df2['实收金额'].abs()
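
Before taking absolute values, it can help to count how many negative entries each column has. The sketch below does this on invented numbers; the DataFrame is hypothetical and only mirrors two of the article's column names.

```python
import pandas as pd

# Invented values for illustration only
toy = pd.DataFrame({
    "销售数量": [6, -1, 2],
    "应收金额": [82.8, -28.0, 16.8],
})

# Number of negative entries per column before the fix
negatives_before = (toy < 0).sum()

# Convert every value to its absolute value, as in the step above
fixed = toy.abs()
negatives_after = int((fixed < 0).sum().sum())

print(negatives_before)
print(negatives_after)
```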

2.7 Column split (split the 购药时间 column into two columns)

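df3 = df2.copy()
# Split "date weekday" at the first space; n must be passed by keyword in pandas 2.x
df3[['购药日期', '星期']] = df3['购药时间'].str.split(' ', n=1, expand=True)
# Keep only the columns used later, in a convenient order
df3 = df3[['购药日期', '星期','社保卡号','商品编码', '商品名称', '销售数量', '应收金额', '实收金额' ]]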
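
To illustrate the split on its own, here is a small sketch using an invented Series. It assumes each value has the form "date, space, weekday", which is what the step above relies on.

```python
import pandas as pd

# Invented sample values in the assumed "date weekday" format
purchase_time = pd.Series(["2018-01-01 星期一", "2018-01-02 星期二"])

# n=1 splits at the first space only, so the result always has two columns
parts = purchase_time.str.split(" ", n=1, expand=True)
parts.columns = ["购药日期", "星期"]

print(parts)
```

Using n=1 guarantees at most two pieces per row, so the two-column assignment does not fail if a value happens to contain an extra space.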

3. Pyecharts data visualization

3.1 Bar chart of drug sales by day of the week

Code:

# Vertical gradient fill for the bars (white at the bottom, red at the top)
color_js = """new echarts.graphic.LinearGradient(0, 1, 0, 0,
    [{offset: 0, color: '#FFFFFF'}, {offset: 1, color: '#ed1941'}], false)"""

# Total sales quantity for each day of the week (sum only the quantity column)
g1 = df3.groupby('星期')[['销售数量']].sum()
x_data = list(g1.index)
y_data = g1['销售数量'].values.tolist()
b1 = (
    Bar()
    .add_xaxis(x_data)
    .add_yaxis('', y_data, itemstyle_opts=opts.ItemStyleOpts(color=JsCode(color_js)))
    .set_global_opts(
        title_opts=opts.TitleOpts(title='Drug sales by day of the week', pos_top='2%', pos_left='center'),
        legend_opts=opts.LegendOpts(is_show=False),
        xaxis_opts=opts.AxisOpts(axislabel_opts=opts.LabelOpts(rotate=-15)),
        yaxis_opts=opts.AxisOpts(name="Sales volume", name_location='middle', name_gap=50,
                                 name_textstyle_opts=opts.TextStyleOpts(font_size=16)),
    )
)
b1.render_notebook()

Overall, daily sales volume does not differ much; Friday and Saturday tend to be the peak for drug purchases.
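
One caveat, offered as a suggestion rather than part of the original code: groupby sorts its keys as strings, so the weekday bars may not come out in Monday-to-Sunday order. The sketch below reorders the totals with reindex. It assumes the weekday labels are written 星期一 through 星期日 (the real file may use a different spelling such as 星期天), and the toy values are invented.

```python
import pandas as pd

# Assumed label spelling; adjust if the data uses other weekday names
weekday_order = ["星期一", "星期二", "星期三", "星期四", "星期五", "星期六", "星期日"]

# Invented sample rows
toy = pd.DataFrame({
    "星期": ["星期日", "星期一", "星期五", "星期一"],
    "销售数量": [3, 2, 5, 1],
})

# Sum per weekday, then force calendar order; missing days become 0
weekly = toy.groupby("星期")["销售数量"].sum().reindex(weekday_order, fill_value=0)
x_data = weekly.index.tolist()
y_data = weekly.tolist()

print(list(zip(x_data, y_data)))
```

The resulting x_data and y_data can be passed to add_xaxis and add_yaxis in the same way as in the chart code above.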

3.2 Bar chart of the top ten drugs by sales volume

Code:

# Vertical gradient fill for the bars (white at the bottom, blue at the top)
color_js = """new echarts.graphic.LinearGradient(0, 1, 0, 0,
    [{offset: 0, color: '#FFFFFF'}, {offset: 1, color: '#08519c'}], false)"""

# Total sales quantity per drug, largest first; keep the top ten
g2 = df3.groupby('商品名称')[['销售数量']].sum().sort_values(by='销售数量', ascending=False)
x_data = list(g2.index)[:10]
y_data = g2['销售数量'].values.tolist()[:10]
b2 = (
    Bar()
    .add_xaxis(x_data)
    .add_yaxis('', y_data, itemstyle_opts=opts.ItemStyleOpts(color=JsCode(color_js)))
    .set_global_opts(
        title_opts=opts.TitleOpts(title='Top ten drugs by sales volume', pos_top='2%', pos_left='center'),
        legend_opts=opts.LegendOpts(is_show=False),
        xaxis_opts=opts.AxisOpts(axislabel_opts=opts.LabelOpts(rotate=-15)),
        yaxis_opts=opts.AxisOpts(name="Sales volume", name_location='middle', name_gap=50,
                                 name_textstyle_opts=opts.TextStyleOpts(font_size=16)),
    )
)
b2.render_notebook()
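
As can be seen, drugs for treating hypertension and angina, such as amlodipine besylate tablets (苯磺酸氨氯地平片, 安内真), 开博通 (Capoten), and metoprolol tartrate tablets (酒石酸美托洛尔片, 倍他乐克), are purchased in relatively large quantities.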

3.3 Bar chart of the top ten drugs by sales revenue

Sales revenue is basically proportional to sales volume.
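
The code for this chart is not shown in the article. A minimal sketch of the data preparation, assuming revenue is taken from the 实收金额 (amount received) column and using invented rows, might look like this:

```python
import pandas as pd

# Invented sample rows; the drug names are placeholders
toy = pd.DataFrame({
    "商品名称": ["A", "B", "A", "C"],
    "实收金额": [10.0, 5.5, 2.5, 8.0],
})

# Total revenue per drug, keeping the ten largest (assumption: 实收金额 is the revenue column)
top = toy.groupby("商品名称")["实收金额"].sum().nlargest(10)
x_data = top.index.tolist()
y_data = top.round(2).tolist()

print(list(zip(x_data, y_data)))
```

These two lists could then be fed into the same Bar() chain used in section 3.2, with the title and axis name changed accordingly.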

3.4 Order volume by day of the week

Looking at the distribution across the days of the week, daily sales volume does not differ much; Friday and Saturday tend to be the peak for drug purchases.
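
The article does not show how orders are counted. One possible sketch, assuming each row of the table is one order and using collections.Counter (listed among the libraries above), is given below. The weekday list is invented and the label spelling is an assumption.

```python
from collections import Counter

# Invented weekday labels, one per order row (stand-in for df3['星期'])
weekdays = ["星期五", "星期六", "星期五", "星期一"]

# Assumed label spelling, in calendar order
weekday_order = ["星期一", "星期二", "星期三", "星期四", "星期五", "星期六", "星期日"]

# Count rows per weekday; days with no orders count as 0
counts = Counter(weekdays)
x_data = weekday_order
y_data = [counts.get(day, 0) for day in weekday_order]

print(list(zip(x_data, y_data)))
```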

3.5 Number of orders by day of the calendar month

It can be seen that the 5th, 15th, and 25th are the peak days for drug sales, especially the 15th of each month.
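
As a rough sketch of how the per-day counts behind such a chart could be computed (the article's own code is not shown), the example below extracts the day of the month from invented date strings and counts rows per day. It assumes the dates are valid and written as YYYY-MM-DD.

```python
import pandas as pd

# Invented purchase dates (stand-in for df3['购药日期'])
dates = pd.Series(["2018-01-05", "2018-02-05", "2018-01-15", "2018-03-25"])

# Day of month (1-31) for each row
day_of_month = pd.to_datetime(dates).dt.day

# Count rows per day and make sure every day from 1 to 31 is present
per_day = day_of_month.value_counts().reindex(range(1, 32), fill_value=0)
x_data = [str(d) for d in per_day.index]
y_data = per_day.tolist()

print(y_data)
```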

3.6 Calendar chart

The calendar chart gives a more intuitive view of sales volume for each day and week within a month.
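
The calendar chart code is not included in the article. The sketch below shows one way it might be built with Pyecharts: daily totals are turned into [date, value] pairs and handed to Calendar. The sample rows, the chart title, and the helper name build_calendar are all hypothetical, and the date range is simply derived from the data.

```python
import pandas as pd

# Invented sample rows (stand-in for df3)
toy = pd.DataFrame({
    "购药日期": ["2018-01-01", "2018-01-01", "2018-01-02"],
    "销售数量": [2, 3, 1],
})

# Total quantity per date as [date, value] pairs, the shape Calendar expects
daily = toy.groupby("购药日期")["销售数量"].sum()
calendar_data = [[day, int(qty)] for day, qty in daily.items()]


def build_calendar(data):
    """Hypothetical helper: build a Pyecharts calendar heatmap from [date, value] pairs."""
    # Imported here so the data preparation above can run without pyecharts installed
    from pyecharts import options as opts
    from pyecharts.charts import Calendar

    dates = [d for d, _ in data]
    values = [v for _, v in data]
    return (
        Calendar()
        .add("", data, calendar_opts=opts.CalendarOpts(range_=[min(dates), max(dates)]))
        .set_global_opts(
            title_opts=opts.TitleOpts(title="Daily drug sales"),
            visualmap_opts=opts.VisualMapOpts(
                min_=min(values), max_=max(values),
                orient="horizontal", is_piecewise=False,
            ),
        )
    )


print(calendar_data)
```

In a notebook, build_calendar(calendar_data).render_notebook() would display the chart, in the same way as the bar charts above.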

3.7 Drug name word cloud
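
The word cloud code is also omitted from the article. A word cloud needs a frequency for each word, and a minimal sketch of that step with collections.Counter is shown below. It treats each drug name as a single word weighted by quantity sold, which is an assumption (the author may instead segment names with jieba), and the sample data is invented.

```python
from collections import Counter

# Invented drug names and quantities (stand-ins for df3['商品名称'] and df3['销售数量'])
names = ["开博通", "感康", "开博通"]
quantities = [2, 1, 3]

# Accumulate the quantity sold for each drug name
freq = Counter()
for name, qty in zip(names, quantities):
    freq[name] += qty

# Most frequent names first
top_words = freq.most_common(2)

print(top_words)
```

The counts could then be passed to a word cloud tool such as stylecloud. That step is left out here because the author's settings (icon, font, palette) are not shown.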


Due to space limitations, some of the code is not shown in full. The complete code and data files are available at the link below, where the project can also be run online:

https://www.heywhale.com/mw/project/61b83bd9c63c620017c629bc


Statement:
This article is reproduced from Python当打之年. If there is any infringement, please contact admin@php.cn to have it removed.