How to implement pivot() in pandas.DataFrame to convert rows to columns (code)

不言
2018-10-13 14:34:00

This article shows, with code, how pivot() in pandas.DataFrame converts rows to columns. It may be a useful reference for anyone who needs to do the same.

Example:

The source table TEST, with the columns UserName, Subject and Score, needs to be converted from rows to columns:

The code is as follows:

# -*- coding:utf-8 -*-
import sys
from warnings import filterwarnings

import pandas as pd
import MySQLdb
from sqlalchemy import create_engine

# "create table if not exists" always raises a warning, so filterwarnings is used to suppress it
filterwarnings('ignore', category=MySQLdb.Warning)

# This script is intended to run on both Python 2 and Python 3
if sys.version_info.major < 3:
  reload(sys)
  sys.setdefaultencoding("utf-8")

host, port, user, passwd, db, charset = "192.168.1.193", 3306, "leo", "mysql", "test", "utf8"

def get_df():
  # Read the source rows from the TEST table into a DataFrame
  conn_config = {"host": host, "port": port, "user": user, "passwd": passwd, "db": db, "charset": charset}
  conn = MySQLdb.connect(**conn_config)
  try:
    result_df = pd.read_sql('select UserName,Subject,Score from TEST', conn)
  finally:
    # Close the connection even if the query fails
    conn.close()
  return result_df

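def pivot(result_df):
  # Keyword arguments are used because pandas 2.0 and later no longer accept positional arguments here
  df_pivoted_init = result_df.pivot(index='UserName', columns='Subject', values='Score')
  # Turn the row index into a regular column so that it is stored in the database as well
  df_pivoted = df_pivoted_init.reset_index()
  # Two DataFrames are returned: the first is indexed by user name and is used by unpivot,
  # the second has a numeric index and is used by save_to_mysql
  return df_pivoted_init, df_pivoted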

def unpivot(df_pivoted_init):
  # unpivot walks the row and column indexes of the pivoted table and inserts one row per cell,
  # so it cannot use save_to_mysql; the rows are written through the MySQLdb interface instead.
  # Query parameters replace string concatenation so that names containing quotes cannot break the SQL.
  insert_sql = "insert into test_unpivot(UserName,Subject,Score) values (%s,%s,%s)"
  # Handle NaN cells (user/subject pairs that have no score) by replacing them with 0
  df_filled = df_pivoted_init.fillna(0)
  rows = []
  for col in df_filled.columns:
    for index in df_filled.index:
      value = df_filled.at[index, col]
      # Note: a genuine score of 0 is skipped here as well
      if value != 0:
        rows.append((index, col, float(value)))
  conn_config = {"host": host, "port": port, "user": user, "passwd": passwd, "db": db, "charset": charset}
  conn = MySQLdb.connect(**conn_config)
  cur = conn.cursor()
  cur.execute("create table if not exists test_unpivot like TEST")
  if rows:
    cur.executemany(insert_sql, rows)
  conn.commit()
  conn.close()

def save_to_mysql(df_pivoted, tablename):
  """
  A plain connection object can be passed as con= only when using SQLite; other databases need an engine created with SQLAlchemy. A ?query string can be appended to the engine URL to set the character set and other options.
  """
  conn = "mysql://%s:%s@%s:%d/%s?charset=%s" % (user, passwd, host, port, db, charset)
  mysql_engine = create_engine(conn)
  df_pivoted.to_sql(name=tablename, con=mysql_engine, if_exists='replace', index=False)

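# Read the source data from the TEST table into a DataFrame
result_df = get_df()
# Pivot the source rows into a two-dimensional table
df_pivoted_init, df_pivoted = pivot(result_df)
# Save the two-dimensional table to a new table named test
# (this assumes a server with case-sensitive table names, so that test and TEST are different tables)
save_to_mysql(df_pivoted, 'test')
# Unpivot the pivoted data and store it in the test_unpivot table
unpivot(df_pivoted_init)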

The result is as follows:

About the pivot method that comes with the Pandas DataFrame class:

DataFrame.pivot(index=None, columns=None, values=None):

Return reshaped DataFrame organized by given index / column values.
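
To make the three parameters concrete without a database, here is a minimal sketch that runs pivot() on a small in-memory DataFrame. The sample rows are hypothetical; they only mimic the UserName, Subject and Score columns read from TEST. The second half shows, under the same assumptions, that pivot() refuses input in which an (index, columns) pair occurs more than once.

```python
import pandas as pd

# Hypothetical sample rows shaped like the TEST table (not the author's data).
result_df = pd.DataFrame({
    "UserName": ["Tom", "Tom", "Ann"],
    "Subject": ["Chinese", "Math", "Chinese"],
    "Score": [80, 70, 90],
})

# index -> row labels, columns -> new column labels, values -> cell contents.
df_wide = result_df.pivot(index="UserName", columns="Subject", values="Score")
print(df_wide)

# Ann has no Math row, so that cell is NaN in the pivoted table.
missing_cell = df_wide.loc["Ann", "Math"]

# A repeated (UserName, Subject) pair cannot be placed in a single cell,
# so pivot() raises ValueError instead of choosing one of the values.
dup_df = pd.concat([result_df, result_df.iloc[[0]]], ignore_index=True)
try:
    dup_df.pivot(index="UserName", columns="Subject", values="Score")
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)
print(duplicate_error)
```

If duplicates are expected, DataFrame.pivot_table with an aggregation function is the usual alternative, since it combines the repeated values instead of raising.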

There are only three parameters because the result of pivot must be a two-dimensional table, which needs nothing more than row labels, column labels and the values that go with them. For the same reason, a column such as is_pass would inevitably be lost after unpivoting, so I did not select that column in the first place.
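
The loss of is_pass can be seen in a short round trip. The sketch below uses hypothetical rows with an extra is_pass column, pivots them on Score, and then unpivots with DataFrame.melt, which is a swapped-in technique here and not the loop used in the script above.

```python
import pandas as pd

# Hypothetical rows; is_pass is the extra column discussed above.
source = pd.DataFrame({
    "UserName": ["Tom", "Tom", "Ann", "Ann"],
    "Subject": ["Chinese", "Math", "Chinese", "Math"],
    "Score": [80, 50, 90, 75],
    "is_pass": [1, 0, 1, 1],
})

# Pivoting keeps only the row labels, column labels and one value per cell.
wide = source.pivot(index="UserName", columns="Subject", values="Score")

# melt turns each (UserName, Subject) cell back into one row.
restored = wide.reset_index().melt(
    id_vars="UserName", var_name="Subject", value_name="Score"
)

# is_pass never entered the pivoted table, so it cannot come back.
print(sorted(restored.columns))
```

The restored table has the same number of rows as the source, but only the three columns that took part in the pivot.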

Statement:
This article is reproduced from cnblogs.com. If there is any infringement, please contact admin@php.cn to have it removed.