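Background:
The project I have been working on recently has a fairly large database (roughly 1 billion rows per month), and some stored procedures in the project take a concatenated string of ids as a parameter, in the form 1,2,4,5,100,301. With a small amount of data this works fine, but once the data reaches the hundreds of millions of rows it becomes very slow. For example, when the id string contains 100,000 ids (we really do need to pass that many ids to the database in practice; they are read from the database and then filtered by the application, and these are the ids that remain), a statement like this: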
Declare @IDS nvarchar(max);
Set @IDS = '<100,000 ids joined with commas>';
Select T10.TEXT, T10.Name
From DX.M as T10
inner join dbo.StringToTable(@IDS, ',') as T11 on T10.ID = T11.ID;
ran for 18 hours without returning any data.
Note:
VM configuration: 64 GB of memory, 40 CPU cores.
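I tested it, and the performance is acceptable. Parsing a string with up to about 5,000 commas is fine; with many more than that, performance drops sharply.
The original version is actually quite commonly used, and it performs somewhat better than the rewritten one (when the string is very long). But it has the same problem: if the string is too long, performance drops sharply.
If the string really contains more than 50,000 commas, SQL Server spends a lot of time on it in the execution plan.
(You can test this yourself by comparing a string with 5,000 commas against one with 50,000: the second does not simply take 10 times as long as the first, so the relationship is not linear.)
So you should write a loop that processes one part of the string at a time.
For example, either of the following two approaches:
1. Each time, cut the first 10,000 or so entries off the string, parse them and insert the result into a temp table, then parse the next part and insert it into the temp table as well, and keep looping. Finally, join the temp table to the real table.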
-- @字符串1 and @字符串2 are consecutive chunks of the id string
insert into #t1
select id
from dbo.stringtotable(@字符串1, ',');
insert into #t1
select id
from dbo.stringtotable(@字符串2, ',');
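2. Use IN: each query filters with WHERE ... IN over one part of the ids, and the results are combined with UNION ALL.
Something like the following: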
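-- Pseudocode: "table" stands for the real table. IN (@variable) compares against
-- a single string value, so each id list has to be written into the statement text.
select id
from table a
where id in (@字符串1)
union all
select id
from table a
where id in (@字符串2)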
Both approaches work. For shorter strings the second one is probably better; for longer strings the first one is probably better.
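The second approach cannot be written with a plain variable inside IN, so here is a minimal sketch of one way it might be done with dynamic SQL. The temp table #DemoM_In, the variables @Chunk1, @Chunk2 and @Sql, and the sample rows are all hypothetical stand-ins, not part of the original project. The sketch also assumes the chunks contain only digits and commas, since the text is concatenated straight into the statement.

```sql
-- Stand-in for the real table (hypothetical name and rows, for illustration only).
CREATE TABLE #DemoM_In (ID int NOT NULL PRIMARY KEY);
INSERT INTO #DemoM_In (ID) VALUES (2), (3), (5), (100);

-- Two hypothetical chunks; in practice each would hold a few thousand ids.
DECLARE @Chunk1 nvarchar(max) = N'2,4,5';
DECLARE @Chunk2 nvarchar(max) = N'100,301';

-- IN (@Chunk1) would compare ID against the whole string, so the id lists
-- are concatenated into the statement text instead.
-- Assumption: the chunks hold digits and commas only (never untrusted input).
DECLARE @Sql nvarchar(max) =
      N'SELECT a.ID FROM #DemoM_In AS a WHERE a.ID IN (' + @Chunk1 + N')'
    + N' UNION ALL '
    + N'SELECT a.ID FROM #DemoM_In AS a WHERE a.ID IN (' + @Chunk2 + N')';

-- The temp table created above is visible inside the dynamic batch.
EXEC sys.sp_executesql @Sql;  -- returns ids 2, 5 and 100 for the sample rows

DROP TABLE #DemoM_In;
```

With many chunks the statement text itself gets long, which fits the remark above that this approach suits shorter strings.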
Declare @MRE_MROOIDS nvarchar(max);
Set @MRE_MROOIDS = '2,4,5,396009,';
--Set @MRE_MROOIDS='2,4,5,6,7,8,9,10,11,14,15,16,17,18,20,21,23,24,25,26,29,30';
Declare @SplitChar nvarchar(2);
Declare @EndIndex int;
Declare @Step int;
Declare @LastChars nvarchar(max);
Declare @CurrentTempChars nvarchar(max);
Set @LastChars = @MRE_MROOIDS;
Set @Step = 5000;
Set @EndIndex = 0;
Set @SplitChar = ',';

-- Drop the temp table if it is left over from an earlier run
If OBJECT_ID(N'tempdb..#StringToTableEntry_Temp10') Is Not Null
Begin
    Drop Table #StringToTableEntry_Temp10;
End
Create Table #StringToTableEntry_Temp10(ID int);

-- Peel off chunks of about @Step characters, always cutting at a separator
While (LEN(@LastChars) > @Step)
Begin
    Set @EndIndex = CHARINDEX(@SplitChar, @LastChars, @Step);
    -- No separator at or after position @Step: leave the rest to the final insert
    -- (without this check the loop would never end)
    If @EndIndex = 0 Break;
    Set @CurrentTempChars = LEFT(@LastChars, @EndIndex - 1);
    -- insert into temp table
    Insert Into #StringToTableEntry_Temp10
    Select Id From dbo.StringToTable2(@CurrentTempChars, ',');
    Set @LastChars = SUBSTRING(@LastChars, @EndIndex + 1, LEN(@LastChars) - @EndIndex);
    --Select @LastChars as LastChars;
End

-- Whatever remains is at most one chunk
If LEN(@LastChars) > 0
Begin
    Insert Into #StringToTableEntry_Temp10
    Select Id From dbo.StringToTable2(@LastChars, ',');
End
Select COUNT(0) From #StringToTableEntry_Temp10;
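The script above stops at counting the parsed ids. The last step of the first approach, joining the temp table to the real table, might look like the sketch below. To keep it runnable on its own it uses two hypothetical stand-in temp tables, #DemoM_Join in place of DX.M and #DemoIds in place of #StringToTableEntry_Temp10, with made-up sample rows.

```sql
-- Stand-in for the real table DX.M (hypothetical sample rows).
CREATE TABLE #DemoM_Join (
    ID     int           NOT NULL PRIMARY KEY,
    [TEXT] nvarchar(100) NULL,
    Name   nvarchar(100) NULL
);
INSERT INTO #DemoM_Join (ID, [TEXT], Name)
VALUES (2, N'text 2', N'name 2'),
       (3, N'text 3', N'name 3'),
       (5, N'text 5', N'name 5');

-- Stand-in for the temp table that the chunked loop fills.
CREATE TABLE #DemoIds (ID int NOT NULL);
INSERT INTO #DemoIds (ID) VALUES (2), (4), (5);

-- Same shape as the original query, with the temp table
-- taking the place of the dbo.StringToTable(@IDS, ',') call.
SELECT T10.[TEXT], T10.Name
FROM #DemoM_Join AS T10
INNER JOIN #DemoIds AS T11
    ON T10.ID = T11.ID;   -- returns the rows for ids 2 and 5

DROP TABLE #DemoIds;
DROP TABLE #DemoM_Join;
```

The join runs once against an ordinary temp table, so the string parsing cost is paid in the loop and not inside the main query.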
The StringToTable2 function (note that the listing itself is declared as dbo.StringToTable):
ALTER FUNCTION [dbo].[StringToTable]
(
    @ids [nvarchar](max),
    @separator [char](1)
)
RETURNS @IdsTable TABLE
(
    [Id] INT NOT NULL
)
AS
BEGIN
    -- Strip a trailing separator, if there is one
    IF (RIGHT(@ids, 1) = @separator)
    BEGIN
        SET @ids = SUBSTRING(@ids, 0, LEN(@ids));
    END
    -- The approach below performs better
    IF (LEN(@ids) > 0)
    BEGIN
        DECLARE @i int;
        SET @i = CHARINDEX(@separator, @ids);
        WHILE @i > 0
        BEGIN
            -- Take the id before the first separator, then drop it from the string
            INSERT @IdsTable VALUES (LEFT(@ids, @i - 1));
            SET @ids = SUBSTRING(@ids, @i + 1, LEN(@ids) - @i);
            SET @i = CHARINDEX(@separator, @ids);
        END
        -- The last id has no separator after it
        IF (LEN(@ids) > 0)
        BEGIN
            INSERT @IdsTable VALUES (@ids);
        END
    END
    RETURN;
END
| Ids in @MRE_MROOIDS | @Step length (characters) | Execution time |
| --- | --- | --- |
| 100,000 | 100000 | 00:09:15 |
| 100,000 | 20000 | 00:03:48 |
| 100,000 | 10000 | 00:01:57 |
| 100,000 | 5000 | 00:01:01 |
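A rough way to read these numbers is sketched below. The four times come from the table. The cost model is only an assumption on my part: the splitting function rebuilds the remaining string after every id, so parsing one chunk of length s plausibly costs on the order of s squared, and a string of total length n needs about n/s chunks. Here c is an unknown constant.

```latex
\begin{align*}
% Times from the table, converted to seconds
t_{5000} &= 61, \quad t_{10000} = 117, \quad t_{20000} = 228, \quad t_{100000} = 555 \\
% Ratios between neighbouring rows
\frac{t_{10000}}{t_{5000}} &= \frac{117}{61} \approx 1.92, \qquad
\frac{t_{20000}}{t_{10000}} = \frac{228}{117} \approx 1.95, \qquad
\frac{t_{100000}}{t_{20000}} = \frac{555}{228} \approx 2.43 \\
% Assumed model: n/s chunks, each costing about c s^2
T(s) &\approx c \cdot \frac{n}{s} \cdot s^{2} = c\,n\,s
\end{align*}
```

Under this model, doubling @Step should roughly double the run time, which matches the first two ratios. The last step multiplies @Step by 5 but the time only by about 2.4, so the model clearly does not hold over the whole range and should be treated as a rule of thumb at most.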