首頁 >資料庫 >mysql教程 >MySQL逗号分割字段的行列转换测试改进

MySQL逗号分割字段的行列转换测试改进

WBOY
WBOY原創
2016-06-07 14:51:141653瀏覽

p由于很多业务表因为历史原因或者性能原因,都使用了违反第一范式的设计模式。即同一个列中存储了多个属性值(具体结构见下表)。/pp这种模式下,应用常常需要将这个列依据分隔符进行分割,并得到列转行的结果。/p span class=cnblogs_code_copy/spanp style


<p>由于很多业务表因为历史原因或者性能原因,都使用了违反第一范式的设计模式。即同一个列中存储了多个属性值(具体结构见下表)。</p><p>这种模式下,应用常常需要将这个列依据分隔符进行分割,并得到列转行的结果。</p>
<span class="cnblogs_code_copy"></span><p style="margin: 10px auto; line-height: 19px; font-family: verdana, sans-serif; font-size: 13px;">表数据:</p>
ID Value
1 tiny,small,big
2 small,medium
3 tiny,big

 

期望得到结果:

ID Value
1 tiny
1 small
1 big
2 small
2 medium
3 tiny
3 big

 

<span style="line-height: 1.5;">#需要处理的表
</span><span style="color: rgb(0, 0, 255); line-height: 1.5;">create</span> <span style="color: rgb(0, 0, 255); line-height: 1.5;">table</span> tbl_name (ID <span style="color: rgb(0, 0, 255); line-height: 1.5;">int</span> ,mSize <span style="color: rgb(0, 0, 255); line-height: 1.5;">varchar</span>(<span style="color: rgb(128, 0, 0); line-height: 1.5; font-weight: bold;">100</span><span style="line-height: 1.5;">));
</span><span style="color: rgb(0, 0, 255); line-height: 1.5;">insert</span> <span style="color: rgb(0, 0, 255); line-height: 1.5;">into</span> tbl_name <span style="color: rgb(0, 0, 255); line-height: 1.5;">values</span> (<span style="color: rgb(128, 0, 0); line-height: 1.5; font-weight: bold;">1</span>,<span style="color: rgb(255, 0, 0); line-height: 1.5;">'</span><span style="color: rgb(255, 0, 0); line-height: 1.5;">tiny,small,big</span><span style="color: rgb(255, 0, 0); line-height: 1.5;">'</span><span style="line-height: 1.5;">);
</span><span style="color: rgb(0, 0, 255); line-height: 1.5;">insert</span> <span style="color: rgb(0, 0, 255); line-height: 1.5;">into</span> tbl_name <span style="color: rgb(0, 0, 255); line-height: 1.5;">values</span> (<span style="color: rgb(128, 0, 0); line-height: 1.5; font-weight: bold;">2</span>,<span style="color: rgb(255, 0, 0); line-height: 1.5;">'</span><span style="color: rgb(255, 0, 0); line-height: 1.5;">small,medium</span><span style="color: rgb(255, 0, 0); line-height: 1.5;">'</span><span style="line-height: 1.5;">);
</span><span style="color: rgb(0, 0, 255); line-height: 1.5;">insert</span> <span style="color: rgb(0, 0, 255); line-height: 1.5;">into</span> tbl_name <span style="color: rgb(0, 0, 255); line-height: 1.5;">values</span> (<span style="color: rgb(128, 0, 0); line-height: 1.5; font-weight: bold;">3</span>,<span style="color: rgb(255, 0, 0); line-height: 1.5;">'</span><span style="color: rgb(255, 0, 0); line-height: 1.5;">tiny,big</span><span style="color: rgb(255, 0, 0); line-height: 1.5;">'</span><span style="line-height: 1.5;">);

#用于循环的自增表
</span><span style="color: rgb(0, 0, 255); line-height: 1.5;">create</span> <span style="color: rgb(0, 0, 255); line-height: 1.5;">table</span> incre_table (AutoIncreID <span style="color: rgb(0, 0, 255); line-height: 1.5;">int</span><span style="line-height: 1.5;">);
</span><span style="color: rgb(0, 0, 255); line-height: 1.5;">insert</span> <span style="color: rgb(0, 0, 255); line-height: 1.5;">into</span> incre_table <span style="color: rgb(0, 0, 255); line-height: 1.5;">values</span> (<span style="color: rgb(128, 0, 0); line-height: 1.5; font-weight: bold;">1</span><span style="line-height: 1.5;">);
</span><span style="color: rgb(0, 0, 255); line-height: 1.5;">insert</span> <span style="color: rgb(0, 0, 255); line-height: 1.5;">into</span> incre_table <span style="color: rgb(0, 0, 255); line-height: 1.5;">values</span> (<span style="color: rgb(128, 0, 0); line-height: 1.5; font-weight: bold;">2</span><span style="line-height: 1.5;">);
</span><span style="color: rgb(0, 0, 255); line-height: 1.5;">insert</span> <span style="color: rgb(0, 0, 255); line-height: 1.5;">into</span> incre_table <span style="color: rgb(0, 0, 255); line-height: 1.5;">values</span> (<span style="color: rgb(128, 0, 0); line-height: 1.5; font-weight: bold;">3</span>);
复制代码

 

复制代码
<span style="color:rgb(0,0,255); line-height:1.5">select</span> a.ID,substring_index(substring_index(a.mSize,<span style="color:rgb(255,0,0); line-height:1.5">'</span><span style="color:rgb(255,0,0); line-height:1.5">,</span><span style="color:rgb(255,0,0); line-height:1.5">'</span>,b.AutoIncreID),<span style="color:rgb(255,0,0); line-height:1.5">'</span><span style="color:rgb(255,0,0); line-height:1.5">,</span><span style="color:rgb(255,0,0); line-height:1.5">'</span>,<span style="color:rgb(128,128,128); line-height:1.5">-</span><span style="color:rgb(128,0,0); line-height:1.5; font-weight:bold">1</span><span style="line-height:1.5">) 
</span><span style="color:rgb(0,0,255); line-height:1.5">from</span><span style="line-height:1.5"> 
tbl_name a
</span><span style="color:rgb(128,128,128); line-height:1.5">join</span><span style="line-height:1.5">
incre_table b
</span><span style="color:rgb(0,0,255); line-height:1.5">on</span> b.AutoIncreID <span style="color:rgb(128,128,128); line-height:1.5"> (length(a.mSize) <span style="color:rgb(128,128,128); line-height:1.5">-</span> length(<span style="color:rgb(255,0,255); line-height:1.5">replace</span>(a.mSize,<span style="color:rgb(255,0,0); line-height:1.5">'</span><span style="color:rgb(255,0,0); line-height:1.5">,</span><span style="color:rgb(255,0,0); line-height:1.5">'</span>,<span style="color:rgb(255,0,0); line-height:1.5">''</span>))<span style="color:rgb(128,128,128); line-height:1.5">+</span><span style="color:rgb(128,0,0); line-height:1.5; font-weight:bold">1</span><span style="line-height:1.5">)
</span><span style="color:rgb(0,0,255); line-height:1.5">order</span> <span style="color:rgb(0,0,255); line-height:1.5">by</span> a.ID;</span>



原理分析:

这个join最基本原理是笛卡尔积。通过这个方式来实现循环。

以下是具体问题分析:

length(a.Size) - length(replace(a.mSize,',',''))+1  表示了,按照逗号分割后,改列拥有的数值数量,下面简称n

复制代码
<span style="color:#00ff">select</span> a.ID,substring_index(substring_index(a.mSize,<span style="color:#ff00">'</span><span style="color:#ff00">,</span><span style="color:#ff00">'</span>,b.AutoIncreID),<span style="color:#ff00">'</span><span style="color:#ff00">,</span><span style="color:#ff00">'</span>,<span style="color:#808080">-</span><span style="color:#8000; font-weight:bold">1</span><span style="color:#000000">) 
</span><span style="color:#00ff">from</span><span style="color:#000000"> 
tbl_name a
</span><span style="color:#808080">join</span><span style="color:#000000">
incre_table b
</span><span style="color:#00ff">on</span> b.AutoIncreID <span style="color:#808080"> (length(a.mSize) <span style="color:#808080">-</span> length(<span style="color:#ff0ff">replace</span>(a.mSize,<span style="color:#ff00">'</span><span style="color:#ff00">,</span><span style="color:#ff00">'</span>,<span style="color:#ff00">''</span>))<span style="color:#808080">+</span><span style="color:#8000; font-weight:bold">1</span><span style="color:#000000">)
</span><span style="color:#00ff">order</span> <span style="color:#00ff">by</span> a.ID;</span>
复制代码

原理分析:

这个join最基本原理是笛卡尔积。通过这个方式来实现循环。

以下是具体问题分析:

length(a.Size) - length(replace(a.mSize,',',''))+1  表示了,按照逗号分割后,改列拥有的数值数量,下面简称n

join过程的伪代码:

根据ID进行循环

{

判断:i 是否

{

获取最靠近第 i 个逗号之前的数据, 即 substring_index(substring_index(a.mSize,',',b.ID),',',-1)

i = i +1 

}

ID = ID +1 

}

总结:

这种方法的缺点在于,我们需要一个拥有连续数列的独立表(这里是incre_table)。并且连续数列的最大值一定要大于符合分割的值的个数。

例如有一行的mSize 有100个逗号分割的值,那么我们的incre_table 就需要有至少100个连续行。

当然,mysql内部也有现成的连续数列表可用。如mysql.help_topic: help_topic_id 共有504个数值,一般能满足于大部分需求了。

改写后如下:

复制代码
<span style="color:#00ff">select</span> a.ID,substring_index(substring_index(a.mSize,<span style="color:#ff00">'</span><span style="color:#ff00">,</span><span style="color:#ff00">'</span>,b.help_topic_id<span style="color:#808080">+</span><span style="color:#8000; font-weight:bold">1</span>),<span style="color:#ff00">'</span><span style="color:#ff00">,</span><span style="color:#ff00">'</span>,<span style="color:#808080">-</span><span style="color:#8000; font-weight:bold">1</span><span style="color:#000000">) 
</span><span style="color:#00ff">from</span><span style="color:#000000"> 
tbl_name a
</span><span style="color:#808080">join</span><span style="color:#000000">
mysql.help_topic b
</span><span style="color:#00ff">on</span> b.help_topic_id <span style="color:#808080"> (length(a.mSize) <span style="color:#808080">-</span> length(<span style="color:#ff0ff">replace</span>(a.mSize,<span style="color:#ff00">'</span><span style="color:#ff00">,</span><span style="color:#ff00">'</span>,<span style="color:#ff00">''</span>))<span style="color:#808080">+</span><span style="color:#8000; font-weight:bold">1</span><span style="color:#000000">)
</span><span style="color:#00ff">order</span> <span style="color:#00ff">by</span> a.ID;</span>
复制代码 测试实例:


-- SELECT  help_topic_id  FROM mysql.help_topic
-- eg.把一个字段用“,”分隔开组合
select group_concat(user_id ORDER BY user_id ASC) as nids from admin_user
SELECT b.did,GROUP_CONCAT(b.sid ORDER BY adjustment DESC,similar DESC) FROM test b GROUP BY b.did 


-- 1.如果多个导购同1张单的先分解
-- 加时间段
select a.DJBH,a.je,substring_index(substring_index(a.dgy_list_id,&#39;,&#39;,b.help_topic_id+1),&#39;,&#39;,-1) 
from 
ipos_qtlsd  a
join
mysql.help_topic b
on b.help_topic_id < (length(a.dgy_list_id) - length(replace(a.dgy_list_id,&#39;,&#39;,&#39;&#39;))+1) and a.djbh=&#39;BP0102_qtsy000070&#39;
order by a.DJBH;

-- 2.取平均值

-- SELECT  help_topic_id  FROM mysql.help_topic
-- 1.如果多个导购同1张单的先分解
-- @zddm
-- @ rq
select a.DJBH,
substring_index(substring_index(a.dgy_list_id,&#39;,&#39;,b.help_topic_id+1),&#39;,&#39;,-1) AS FJID,
substring_index(substring_index(a.dgy_list_mc,&#39;,&#39;,b.help_topic_id+1),&#39;,&#39;,-1) AS FJMC,
FORMAT(a.je/(length(a.dgy_list_id) - length(replace(a.dgy_list_id,&#39;,&#39;,&#39;&#39;))+1),2) AS FJJE,
je 
from 
ipos_qtlsd  a
join
mysql.help_topic b
on b.help_topic_id < (length(a.dgy_list_id) - length(replace(a.dgy_list_id,&#39;,&#39;,&#39;&#39;))+1) 
and a.rq BETWEEN UNIX_TIMESTAMP(&#39;2016-04-01&#39;) and UNIX_TIMESTAMP(&#39;2016-05-01&#39;)
and a.djbh=&#39;gd_151125000001&#39;
order by a.DJBH;

-- gd_151125000001 
--3.分解后的指标
-- SELECT  help_topic_id  FROM mysql.help_topic
-- 1.如果多个导购同1张单的先分解
-- @khdm_change 终端代码
-- @start_time 开始时间
-- @end_time  结束时间
-- SELECT * FROM ipos_qtlsd WHERE djbh=&#39;gd_151125000001&#39;
set @khdm_change =&#39;BP0102&#39;;
set @start_time=UNIX_TIMESTAMP(&#39;2016-04-01&#39;);
set @end_time=UNIX_TIMESTAMP(&#39;2016-05-01&#39;);
SELECT FJID,FJMC,SUM(FJJE) 
FROM(
select a.zddm,a.zdmc,a.DJBH,
 substring_index(substring_index(a.dgy_list_id,&#39;,&#39;,b.help_topic_id+1),&#39;,&#39;,-1) AS FJID,
 substring_index(substring_index(a.dgy_list_mc,&#39;,&#39;,b.help_topic_id+1),&#39;,&#39;,-1) AS FJMC,
 FORMAT(a.je/(length(a.dgy_list_id) - length(replace(a.dgy_list_id,&#39;,&#39;,&#39;&#39;))+1),2) AS FJJE,
 je 
from ipos_qtlsd  a
 join mysql.help_topic b on b.help_topic_id < (length(a.dgy_list_id) - length(replace(a.dgy_list_id,&#39;,&#39;,&#39;&#39;))+1) 
and a.rq BETWEEN  @start_time and @end_time
and a.zd_id=(SELECT id from com_base_kehu where khdm=@khdm_change)
) AA
GROUP BY FJID,FJMC
-- and a.djbh=&#39;gd_151125000001&#39;
-- order by a.DJBH;


陳述:
本文內容由網友自願投稿,版權歸原作者所有。本站不承擔相應的法律責任。如發現涉嫌抄襲或侵權的內容,請聯絡admin@php.cn