Home >Database >Mysql Tutorial >MySQL逗号分割字段的行列转换测试改进

MySQL逗号分割字段的行列转换测试改进

WBOY
WBOYOriginal
2016-06-07 14:51:141646browse

p由于很多业务表因为历史原因或者性能原因,都使用了违反第一范式的设计模式。即同一个列中存储了多个属性值(具体结构见下表)。/pp这种模式下,应用常常需要将这个列依据分隔符进行分割,并得到列转行的结果。/p span class=cnblogs_code_copy/spanp style


<p>由于很多业务表因为历史原因或者性能原因,都使用了违反第一范式的设计模式。即同一个列中存储了多个属性值(具体结构见下表)。</p><p>这种模式下,应用常常需要将这个列依据分隔符进行分割,并得到列转行的结果。</p>
<span class="cnblogs_code_copy"></span><p style="margin: 10px auto; line-height: 19px; font-family: verdana, sans-serif; font-size: 13px;">表数据:</p>
ID Value
1 tiny,small,big
2 small,medium
3 tiny,big

 

期望得到结果:

ID Value
1 tiny
1 small
1 big
2 small
2 medium
3 tiny
3 big

 

<span style="line-height: 1.5;">#需要处理的表
</span><span style="color: rgb(0, 0, 255); line-height: 1.5;">create</span> <span style="color: rgb(0, 0, 255); line-height: 1.5;">table</span> tbl_name (ID <span style="color: rgb(0, 0, 255); line-height: 1.5;">int</span> ,mSize <span style="color: rgb(0, 0, 255); line-height: 1.5;">varchar</span>(<span style="color: rgb(128, 0, 0); line-height: 1.5; font-weight: bold;">100</span><span style="line-height: 1.5;">));
</span><span style="color: rgb(0, 0, 255); line-height: 1.5;">insert</span> <span style="color: rgb(0, 0, 255); line-height: 1.5;">into</span> tbl_name <span style="color: rgb(0, 0, 255); line-height: 1.5;">values</span> (<span style="color: rgb(128, 0, 0); line-height: 1.5; font-weight: bold;">1</span>,<span style="color: rgb(255, 0, 0); line-height: 1.5;">'</span><span style="color: rgb(255, 0, 0); line-height: 1.5;">tiny,small,big</span><span style="color: rgb(255, 0, 0); line-height: 1.5;">'</span><span style="line-height: 1.5;">);
</span><span style="color: rgb(0, 0, 255); line-height: 1.5;">insert</span> <span style="color: rgb(0, 0, 255); line-height: 1.5;">into</span> tbl_name <span style="color: rgb(0, 0, 255); line-height: 1.5;">values</span> (<span style="color: rgb(128, 0, 0); line-height: 1.5; font-weight: bold;">2</span>,<span style="color: rgb(255, 0, 0); line-height: 1.5;">'</span><span style="color: rgb(255, 0, 0); line-height: 1.5;">small,medium</span><span style="color: rgb(255, 0, 0); line-height: 1.5;">'</span><span style="line-height: 1.5;">);
</span><span style="color: rgb(0, 0, 255); line-height: 1.5;">insert</span> <span style="color: rgb(0, 0, 255); line-height: 1.5;">into</span> tbl_name <span style="color: rgb(0, 0, 255); line-height: 1.5;">values</span> (<span style="color: rgb(128, 0, 0); line-height: 1.5; font-weight: bold;">3</span>,<span style="color: rgb(255, 0, 0); line-height: 1.5;">'</span><span style="color: rgb(255, 0, 0); line-height: 1.5;">tiny,big</span><span style="color: rgb(255, 0, 0); line-height: 1.5;">'</span><span style="line-height: 1.5;">);

#用于循环的自增表
</span><span style="color: rgb(0, 0, 255); line-height: 1.5;">create</span> <span style="color: rgb(0, 0, 255); line-height: 1.5;">table</span> incre_table (AutoIncreID <span style="color: rgb(0, 0, 255); line-height: 1.5;">int</span><span style="line-height: 1.5;">);
</span><span style="color: rgb(0, 0, 255); line-height: 1.5;">insert</span> <span style="color: rgb(0, 0, 255); line-height: 1.5;">into</span> incre_table <span style="color: rgb(0, 0, 255); line-height: 1.5;">values</span> (<span style="color: rgb(128, 0, 0); line-height: 1.5; font-weight: bold;">1</span><span style="line-height: 1.5;">);
</span><span style="color: rgb(0, 0, 255); line-height: 1.5;">insert</span> <span style="color: rgb(0, 0, 255); line-height: 1.5;">into</span> incre_table <span style="color: rgb(0, 0, 255); line-height: 1.5;">values</span> (<span style="color: rgb(128, 0, 0); line-height: 1.5; font-weight: bold;">2</span><span style="line-height: 1.5;">);
</span><span style="color: rgb(0, 0, 255); line-height: 1.5;">insert</span> <span style="color: rgb(0, 0, 255); line-height: 1.5;">into</span> incre_table <span style="color: rgb(0, 0, 255); line-height: 1.5;">values</span> (<span style="color: rgb(128, 0, 0); line-height: 1.5; font-weight: bold;">3</span>);
复制代码

 

复制代码
<span style="color:rgb(0,0,255); line-height:1.5">select</span> a.ID,substring_index(substring_index(a.mSize,<span style="color:rgb(255,0,0); line-height:1.5">'</span><span style="color:rgb(255,0,0); line-height:1.5">,</span><span style="color:rgb(255,0,0); line-height:1.5">'</span>,b.AutoIncreID),<span style="color:rgb(255,0,0); line-height:1.5">'</span><span style="color:rgb(255,0,0); line-height:1.5">,</span><span style="color:rgb(255,0,0); line-height:1.5">'</span>,<span style="color:rgb(128,128,128); line-height:1.5">-</span><span style="color:rgb(128,0,0); line-height:1.5; font-weight:bold">1</span><span style="line-height:1.5">) 
</span><span style="color:rgb(0,0,255); line-height:1.5">from</span><span style="line-height:1.5"> 
tbl_name a
</span><span style="color:rgb(128,128,128); line-height:1.5">join</span><span style="line-height:1.5">
incre_table b
</span><span style="color:rgb(0,0,255); line-height:1.5">on</span> b.AutoIncreID <span style="color:rgb(128,128,128); line-height:1.5"> (length(a.mSize) <span style="color:rgb(128,128,128); line-height:1.5">-</span> length(<span style="color:rgb(255,0,255); line-height:1.5">replace</span>(a.mSize,<span style="color:rgb(255,0,0); line-height:1.5">'</span><span style="color:rgb(255,0,0); line-height:1.5">,</span><span style="color:rgb(255,0,0); line-height:1.5">'</span>,<span style="color:rgb(255,0,0); line-height:1.5">''</span>))<span style="color:rgb(128,128,128); line-height:1.5">+</span><span style="color:rgb(128,0,0); line-height:1.5; font-weight:bold">1</span><span style="line-height:1.5">)
</span><span style="color:rgb(0,0,255); line-height:1.5">order</span> <span style="color:rgb(0,0,255); line-height:1.5">by</span> a.ID;</span>



原理分析:

这个join最基本原理是笛卡尔积。通过这个方式来实现循环。

以下是具体问题分析:

length(a.Size) - length(replace(a.mSize,',',''))+1  表示了,按照逗号分割后,改列拥有的数值数量,下面简称n

复制代码
<span style="color:#00ff">select</span> a.ID,substring_index(substring_index(a.mSize,<span style="color:#ff00">'</span><span style="color:#ff00">,</span><span style="color:#ff00">'</span>,b.AutoIncreID),<span style="color:#ff00">'</span><span style="color:#ff00">,</span><span style="color:#ff00">'</span>,<span style="color:#808080">-</span><span style="color:#8000; font-weight:bold">1</span><span style="color:#000000">) 
</span><span style="color:#00ff">from</span><span style="color:#000000"> 
tbl_name a
</span><span style="color:#808080">join</span><span style="color:#000000">
incre_table b
</span><span style="color:#00ff">on</span> b.AutoIncreID <span style="color:#808080"> (length(a.mSize) <span style="color:#808080">-</span> length(<span style="color:#ff0ff">replace</span>(a.mSize,<span style="color:#ff00">'</span><span style="color:#ff00">,</span><span style="color:#ff00">'</span>,<span style="color:#ff00">''</span>))<span style="color:#808080">+</span><span style="color:#8000; font-weight:bold">1</span><span style="color:#000000">)
</span><span style="color:#00ff">order</span> <span style="color:#00ff">by</span> a.ID;</span>
复制代码

原理分析:

这个join最基本原理是笛卡尔积。通过这个方式来实现循环。

以下是具体问题分析:

length(a.Size) - length(replace(a.mSize,',',''))+1  表示了,按照逗号分割后,改列拥有的数值数量,下面简称n

join过程的伪代码:

根据ID进行循环

{

判断:i 是否

{

获取最靠近第 i 个逗号之前的数据, 即 substring_index(substring_index(a.mSize,',',b.ID),',',-1)

i = i +1 

}

ID = ID +1 

}

总结:

这种方法的缺点在于,我们需要一个拥有连续数列的独立表(这里是incre_table)。并且连续数列的最大值一定要大于符合分割的值的个数。

例如有一行的mSize 有100个逗号分割的值,那么我们的incre_table 就需要有至少100个连续行。

当然,mysql内部也有现成的连续数列表可用。如mysql.help_topic: help_topic_id 共有504个数值,一般能满足于大部分需求了。

改写后如下:

复制代码
<span style="color:#00ff">select</span> a.ID,substring_index(substring_index(a.mSize,<span style="color:#ff00">'</span><span style="color:#ff00">,</span><span style="color:#ff00">'</span>,b.help_topic_id<span style="color:#808080">+</span><span style="color:#8000; font-weight:bold">1</span>),<span style="color:#ff00">'</span><span style="color:#ff00">,</span><span style="color:#ff00">'</span>,<span style="color:#808080">-</span><span style="color:#8000; font-weight:bold">1</span><span style="color:#000000">) 
</span><span style="color:#00ff">from</span><span style="color:#000000"> 
tbl_name a
</span><span style="color:#808080">join</span><span style="color:#000000">
mysql.help_topic b
</span><span style="color:#00ff">on</span> b.help_topic_id <span style="color:#808080"> (length(a.mSize) <span style="color:#808080">-</span> length(<span style="color:#ff0ff">replace</span>(a.mSize,<span style="color:#ff00">'</span><span style="color:#ff00">,</span><span style="color:#ff00">'</span>,<span style="color:#ff00">''</span>))<span style="color:#808080">+</span><span style="color:#8000; font-weight:bold">1</span><span style="color:#000000">)
</span><span style="color:#00ff">order</span> <span style="color:#00ff">by</span> a.ID;</span>
复制代码 测试实例:


-- SELECT  help_topic_id  FROM mysql.help_topic
-- eg.把一个字段用“,”分隔开组合
select group_concat(user_id ORDER BY user_id ASC) as nids from admin_user
SELECT b.did,GROUP_CONCAT(b.sid ORDER BY adjustment DESC,similar DESC) FROM test b GROUP BY b.did 


-- 1.如果多个导购同1张单的先分解
-- 加时间段
select a.DJBH,a.je,substring_index(substring_index(a.dgy_list_id,&#39;,&#39;,b.help_topic_id+1),&#39;,&#39;,-1) 
from 
ipos_qtlsd  a
join
mysql.help_topic b
on b.help_topic_id < (length(a.dgy_list_id) - length(replace(a.dgy_list_id,&#39;,&#39;,&#39;&#39;))+1) and a.djbh=&#39;BP0102_qtsy000070&#39;
order by a.DJBH;

-- 2.取平均值

-- SELECT  help_topic_id  FROM mysql.help_topic
-- 1.如果多个导购同1张单的先分解
-- @zddm
-- @ rq
select a.DJBH,
substring_index(substring_index(a.dgy_list_id,&#39;,&#39;,b.help_topic_id+1),&#39;,&#39;,-1) AS FJID,
substring_index(substring_index(a.dgy_list_mc,&#39;,&#39;,b.help_topic_id+1),&#39;,&#39;,-1) AS FJMC,
FORMAT(a.je/(length(a.dgy_list_id) - length(replace(a.dgy_list_id,&#39;,&#39;,&#39;&#39;))+1),2) AS FJJE,
je 
from 
ipos_qtlsd  a
join
mysql.help_topic b
on b.help_topic_id < (length(a.dgy_list_id) - length(replace(a.dgy_list_id,&#39;,&#39;,&#39;&#39;))+1) 
and a.rq BETWEEN UNIX_TIMESTAMP(&#39;2016-04-01&#39;) and UNIX_TIMESTAMP(&#39;2016-05-01&#39;)
and a.djbh=&#39;gd_151125000001&#39;
order by a.DJBH;

-- gd_151125000001 
--3.分解后的指标
-- SELECT  help_topic_id  FROM mysql.help_topic
-- 1.如果多个导购同1张单的先分解
-- @khdm_change 终端代码
-- @start_time 开始时间
-- @end_time  结束时间
-- SELECT * FROM ipos_qtlsd WHERE djbh=&#39;gd_151125000001&#39;
set @khdm_change =&#39;BP0102&#39;;
set @start_time=UNIX_TIMESTAMP(&#39;2016-04-01&#39;);
set @end_time=UNIX_TIMESTAMP(&#39;2016-05-01&#39;);
SELECT FJID,FJMC,SUM(FJJE) 
FROM(
select a.zddm,a.zdmc,a.DJBH,
 substring_index(substring_index(a.dgy_list_id,&#39;,&#39;,b.help_topic_id+1),&#39;,&#39;,-1) AS FJID,
 substring_index(substring_index(a.dgy_list_mc,&#39;,&#39;,b.help_topic_id+1),&#39;,&#39;,-1) AS FJMC,
 FORMAT(a.je/(length(a.dgy_list_id) - length(replace(a.dgy_list_id,&#39;,&#39;,&#39;&#39;))+1),2) AS FJJE,
 je 
from ipos_qtlsd  a
 join mysql.help_topic b on b.help_topic_id < (length(a.dgy_list_id) - length(replace(a.dgy_list_id,&#39;,&#39;,&#39;&#39;))+1) 
and a.rq BETWEEN  @start_time and @end_time
and a.zd_id=(SELECT id from com_base_kehu where khdm=@khdm_change)
) AA
GROUP BY FJID,FJMC
-- and a.djbh=&#39;gd_151125000001&#39;
-- order by a.DJBH;


Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn