10 Things in MySQL That May Not Work the Way You Expect
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> * </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"> <span class="keyword"><strong><font color="#006699">WHERE</font></strong></span><span> a.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> = </span><span class="op"><font color="#808080">NULL</font></span><span> </span> </li> </ol>
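In SQL, NULL is never equal to anything, not even to another NULL: any comparison with NULL evaluates to NULL rather than true. This query returns no rows, and in fact the optimizer throws the condition away when it builds the plan.
To search for NULL values, use this query instead: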
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> * </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"> <span class="keyword"><strong><font color="#006699">WHERE</font></strong></span><span> a.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">IS</font></strong></span><span> </span><span class="op"><font color="#808080">NULL</font></span><span> </span> </li> </ol>
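The difference between the two predicates can be checked with a small script. This is only a sketch: SQLite (through Python's sqlite3 module) stands in for MySQL, and the table contents and the column name col (used in place of column) are made up for the illustration.

```python
import sqlite3

# SQLite stands in for MySQL; table a and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, col TEXT)")
conn.executemany("INSERT INTO a VALUES (?, ?)", [(1, "x"), (2, None), (3, None)])

# col = NULL evaluates to NULL for every row, so no row passes the filter.
equals_null = conn.execute("SELECT id FROM a WHERE col = NULL").fetchall()

# IS NULL is the predicate that actually tests for NULL.
is_null = conn.execute("SELECT id FROM a WHERE col IS NULL ORDER BY id").fetchall()

print(equals_null)  # []
print(is_null)      # [(2,), (3,)]
```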
#9. LEFT JOIN with additional conditions
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> * </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"> <span class="func"><font color="#ff1493">LEFT</font></span><span> </span><span class="op"><font color="#808080">JOIN</font></span><span> </span> </li> <li><span> b </span></li> <li class="alt"> <span class="keyword"><strong><font color="#006699">ON</font></strong></span><span> b.a = a.id </span> </li> <li> <span class="keyword"><strong><font color="#006699">WHERE</font></strong></span><span> b.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> = </span><span class="string"><font color="#0000ff">'something'</font></span><span> </span> </li> </ol>
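A LEFT JOIN behaves like an INNER JOIN, except that it returns every record from a at least once, substituting NULL for the missing fields when there is no actual match.
The WHERE condition, however, is evaluated after the LEFT JOIN, so the query above checks column only after the join. As we just saw, only a non-NULL value can satisfy an equality condition, so every record from a that has no counterpart in b is inevitably filtered out.
In essence, this query is an INNER JOIN, only less efficient.
To match only the records with b.column = 'something' while still returning every record from a (that is, without filtering out the records that have no counterpart in b), put the condition into the ON clause: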
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> * </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"> <span class="func"><font color="#ff1493">LEFT</font></span><span> </span><span class="op"><font color="#808080">JOIN</font></span><span> </span> </li> <li><span> b </span></li> <li class="alt"> <span class="keyword"><strong><font color="#006699">ON</font></strong></span><span> b.a = a.id </span> </li> <li> <span> </span><span class="op"><font color="#808080">AND</font></span><span> b.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> = </span><span class="string"><font color="#0000ff">'something'</font></span><span> </span> </li> </ol>
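The effect of moving the condition from WHERE to ON can be seen on a few rows. The sketch below uses SQLite as a stand-in for MySQL; the rows and the column name col (in place of column) are assumptions made for the example.

```python
import sqlite3

# SQLite stands in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY, a INTEGER, col TEXT)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,)])
# a.id = 3 has no counterpart in b at all.
conn.executemany("INSERT INTO b VALUES (?, ?, ?)", [(1, 1, "something"), (2, 2, "other")])

# Condition in WHERE: rows of a without a matching b row are filtered out.
in_where = conn.execute("""
    SELECT a.id, b.col
    FROM a
    LEFT JOIN b ON b.a = a.id
    WHERE b.col = 'something'
    ORDER BY a.id
""").fetchall()

# Condition in ON: every row of a survives, with NULL where nothing matched.
in_on = conn.execute("""
    SELECT a.id, b.col
    FROM a
    LEFT JOIN b ON b.a = a.id AND b.col = 'something'
    ORDER BY a.id
""").fetchall()

print(in_where)  # [(1, 'something')]
print(in_on)     # [(1, 'something'), (2, None), (3, None)]
```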
#8. Less than a value, but not NULL
I often see queries like this:
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> * </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> b </span> </li> <li class="alt"> <span class="keyword"><strong><font color="#006699">WHERE</font></strong></span><span> b.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> &lt; </span><span class="string"><font color="#0000ff">'something'</font></span><span> </span> </li> <li> <span> </span><span class="op"><font color="#808080">AND</font></span><span> b.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">IS</font></strong></span><span> </span><span class="op"><font color="#808080">NOT</font></span><span> </span><span class="op"><font color="#808080">NULL</font></span><span> </span> </li> </ol>
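Strictly speaking, this is not an error: the query is valid and does what was intended. The IS NOT NULL here is redundant, though.
If b.column is NULL, the condition b.column &lt; 'something' can never be satisfied, because any comparison with NULL evaluates to NULL and does not pass the filter.
Interestingly, this extra NULL check is never added to "greater than" queries (such as b.column > 'something').
That is because in MySQL NULLs come first in an ascending ORDER BY, which leads some people to the mistaken belief that NULL is less than any other value.
The query can be simplified: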
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> * </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> b </span> </li> <li class="alt"> <span class="keyword"><strong><font color="#006699">WHERE</font></strong></span><span> b.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> &lt; </span><span class="string"><font color="#0000ff">'something'</font></span><span> </span> </li> </ol>
This query can never return a NULL in b.column.
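A short check shows that NULL sorts first and still fails a "less than" filter. The sketch assumes SQLite as a stand-in for MySQL (both put NULLs first in an ascending sort); the rows and the name col are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY, col TEXT)")
conn.executemany("INSERT INTO b VALUES (?, ?)", [(1, "apple"), (2, "zebra"), (3, None)])

# The redundant form and the simplified form return the same rows.
with_check = conn.execute(
    "SELECT col FROM b WHERE col < 'something' AND col IS NOT NULL"
).fetchall()
without_check = conn.execute("SELECT col FROM b WHERE col < 'something'").fetchall()

# "Greater than" drops the NULL row just as well.
greater = conn.execute("SELECT col FROM b WHERE col > 'something'").fetchall()

# NULL comes first in the sort order, yet it never passed the < filter above.
ordered = conn.execute("SELECT col FROM b ORDER BY col").fetchall()

print(with_check, without_check)  # [('apple',)] [('apple',)]
print(greater)                    # [('zebra',)]
print(ordered)                    # [(None,), ('apple',), ('zebra',)]
```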
#7. Joining on NULL
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> * </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"> <span class="op"><font color="#808080">JOIN</font></span><span> b </span> </li> <li> <span class="keyword"><strong><font color="#006699">ON</font></strong></span><span> a.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> = b.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> </span> </li> </ol>
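When column is nullable in both tables, this query does not return the records in which both fields are NULL, for the reason given above: two NULLs are not equal.
The query should be written like this: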
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> * </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"> <span class="op"><font color="#808080">JOIN</font></span><span> b </span> </li> <li> <span class="keyword"><strong><font color="#006699">ON</font></strong></span><span> a.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> = b.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> </span> </li> <li class="alt"> <span> </span><span class="op"><font color="#808080">OR</font></span><span> (a.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">IS</font></strong></span><span> </span><span class="op"><font color="#808080">NULL</font></span><span> </span><span class="op"><font color="#808080">AND</font></span><span> b.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">IS</font></strong></span><span> </span><span class="op"><font color="#808080">NULL</font></span><span>) </span> </li> </ol>
MySQL's optimizer treats this as an equijoin and provides a special join type for it: ref_or_null.
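Both join conditions can be compared on two tiny tables. This is a sketch with SQLite standing in for MySQL, so it shows only the result sets, not the ref_or_null plan; the rows and the name col are assumptions.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, col TEXT)")
conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY, col TEXT)")
conn.executemany("INSERT INTO a VALUES (?, ?)", [(1, "x"), (2, None)])
conn.executemany("INSERT INTO b VALUES (?, ?)", [(10, "x"), (20, None)])

# Plain equality: the NULL/NULL pair is not joined.
plain = conn.execute("""
    SELECT a.id, b.id
    FROM a
    JOIN b ON a.col = b.col
    ORDER BY a.id
""").fetchall()

# Equality extended with an explicit NULL/NULL branch.
null_aware = conn.execute("""
    SELECT a.id, b.id
    FROM a
    JOIN b ON a.col = b.col
          OR (a.col IS NULL AND b.col IS NULL)
    ORDER BY a.id
""").fetchall()

print(plain)       # [(1, 10)]
print(null_aware)  # [(1, 10), (2, 20)]
```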
#6. NOT IN and NULL values
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> a.* </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"> <span class="keyword"><strong><font color="#006699">WHERE</font></strong></span><span> a.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> </span><span class="op"><font color="#808080">NOT</font></span><span> </span><span class="op"><font color="#808080">IN</font></span><span> </span> </li> <li><span> ( </span></li> <li class="alt"> <span> </span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> </span> </li> <li> <span> </span><span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> b </span> </li> <li class="alt"><span> ) </span></li> </ol>
If b.column contains even a single NULL, this query returns nothing. Like other predicates, IN and NOT IN evaluate to NULL when the comparison runs into a NULL.
Rewrite the query with NOT EXISTS:
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> a.* </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"> <span class="keyword"><strong><font color="#006699">WHERE</font></strong></span><span> </span><span class="op"><font color="#808080">NOT</font></span><span> EXISTS </span> </li> <li><span> ( </span></li> <li class="alt"> <span> </span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> </span><span class="op"><font color="#808080">NULL</font></span><span> </span> </li> <li> <span> </span><span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> b </span> </li> <li class="alt"> <span> </span><span class="keyword"><strong><font color="#006699">WHERE</font></strong></span><span> b.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> = a.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> </span> </li> <li><span> ) </span></li> </ol>
Unlike IN, EXISTS always evaluates to either true or false.
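The two forms can be run side by side. In this sketch SQLite stands in for MySQL, and the rows and the column name col are made up; the single NULL in b is what empties the NOT IN result.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (col INTEGER)")
conn.execute("CREATE TABLE b (col INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO b VALUES (?)", [(1,), (None,)])

# For 2 and 3 the predicate is NULL (not true); for 1 it is false.
not_in = conn.execute("""
    SELECT a.col
    FROM a
    WHERE a.col NOT IN (SELECT col FROM b)
""").fetchall()

# NOT EXISTS only asks whether a matching row exists, so NULLs in b are harmless.
not_exists = conn.execute("""
    SELECT a.col
    FROM a
    WHERE NOT EXISTS (SELECT NULL FROM b WHERE b.col = a.col)
    ORDER BY a.col
""").fetchall()

print(not_in)      # []
print(not_exists)  # [(2,), (3,)]
```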
#5. Ordering a random sample
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> * </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"> <span class="keyword"><strong><font color="#006699">ORDER</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">BY</font></strong></span><span> </span> </li> <li> <span> RAND(), </span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> </span> </li> <li class="alt"><span>LIMIT 10 </span></li> </ol>
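This query attempts to select 10 random records, ordered by column.
ORDER BY sorts the output lexicographically: records are ordered by the second expression only when the values of the first expression are equal.
The results of RAND(), however, are random. Two equal RAND() values are practically impossible, so ordering by column after ordering by RAND() is pointless.
To order a random sample of records, use this query: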
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> * </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> ( </span> </li> <li class="alt"> <span> </span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> * </span> </li> <li> <span> </span><span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> mytable </span> </li> <li class="alt"> <span> </span><span class="keyword"><strong><font color="#006699">ORDER</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">BY</font></strong></span><span> </span> </li> <li><span> RAND() </span></li> <li class="alt"><span> LIMIT 10 </span></li> <li><span> ) q </span></li> <li class="alt"> <span class="keyword"><strong><font color="#006699">ORDER</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">BY</font></strong></span><span> </span> </li> <li> <span> </span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> </span> </li> </ol>
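The nested form can be tried out as follows. This is a sketch under assumptions: SQLite stands in for MySQL, so RANDOM() takes the place of RAND(), and the 100 rows of mytable and the column name col are generated just for the example.

```python
import sqlite3

# SQLite stands in for MySQL; mytable is filled with made-up, distinct values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, col INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(i, (i * 37) % 101) for i in range(1, 101)],
)

# The inner query draws the random sample, the outer query sorts it.
rows = conn.execute("""
    SELECT id, col
    FROM (
        SELECT id, col
        FROM mytable
        ORDER BY RANDOM()   -- SQLite's counterpart of MySQL's RAND()
        LIMIT 10
    ) q
    ORDER BY col
""").fetchall()

cols = [col for _, col in rows]
print(cols)  # ten values in ascending order; which ten differs from run to run
```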
#4. Selecting an arbitrary record from a group
This query is meant to select one record from each group (defined by grouper):
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">DISTINCT</font></strong></span><span>(grouper), a.* </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> </ol>
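DISTINCT is not a function; it is part of the SELECT clause. It applies to all columns in the SELECT list, and the parentheses here can simply be omitted. This query may therefore return several records with the same value of grouper (whenever at least one of the other columns differs).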
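A quick check makes the point concrete. The sketch uses SQLite in place of MySQL, with hypothetical rows and a made-up extra column named other.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, grouper TEXT, other TEXT)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?, ?)",
    [(1, "g1", "x"), (2, "g1", "y"), (3, "g2", "z")],
)

# The parentheses do nothing: DISTINCT covers the whole select list.
rows = conn.execute("SELECT DISTINCT (grouper), a.* FROM a").fetchall()
groupers = {row[0] for row in rows}

print(len(rows))      # 3 rows come back ...
print(len(groupers))  # ... although there are only 2 distinct groupers
```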
Sometimes this is worked around with the following query, which relies on MySQL's extension to GROUP BY:
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> a.* </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"> <span class="keyword"><strong><font color="#006699">GROUP</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">BY</font></strong></span><span> </span> </li> <li><span> grouper </span></li> </ol>
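The values of the non-aggregated columns returned for each group are taken arbitrarily.
At first glance this looks like a good solution, but it has a serious flaw. It relies on the assumption that all the values returned, although taken arbitrarily from the group, still belong to a single record.
The current implementation appears to behave this way, but the behavior is not documented and may change at any time (especially if MySQL ever learns to apply index_union after GROUP BY). Relying on it is therefore unsafe.
If MySQL supported analytic functions, this query could easily be rewritten in a cleaner way. Even without them, it can be done as long as the table has a PRIMARY KEY: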
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> a.* </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> ( </span> </li> <li class="alt"> <span> </span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">DISTINCT</font></strong></span><span> grouper </span> </li> <li> <span> </span><span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"><span> ) ao </span></li> <li> <span class="op"><font color="#808080">JOIN</font></span><span> a </span> </li> <li class="alt"> <span class="keyword"><strong><font color="#006699">ON</font></strong></span><span> a.id = </span> </li> <li><span> ( </span></li> <li class="alt"> <span> </span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> id </span> </li> <li> <span> </span><span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a ai </span> </li> <li class="alt"> <span> </span><span class="keyword"><strong><font color="#006699">WHERE</font></strong></span><span> ai.grouper = ao.grouper </span> </li> <li><span> LIMIT 1 </span></li> <li class="alt"><span> ) </span></li> </ol>
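The PRIMARY KEY approach can be exercised on a small table. In this sketch SQLite stands in for MySQL and the rows are invented; the check only asserts what the query promises, namely one whole, existing record per group, not which one.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are hypothetical.
data = [(1, "g1"), (2, "g1"), (3, "g2"), (4, "g2"), (5, "g3")]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, grouper TEXT)")
conn.executemany("INSERT INTO a VALUES (?, ?)", data)

# For each distinct grouper, join back to one record picked by its primary key.
rows = conn.execute("""
    SELECT a.*
    FROM (
        SELECT DISTINCT grouper
        FROM a
    ) ao
    JOIN a
      ON a.id = (
        SELECT id
        FROM a ai
        WHERE ai.grouper = ao.grouper
        LIMIT 1
      )
""").fetchall()

picked_groups = sorted(grouper for _, grouper in rows)
print(picked_groups)  # ['g1', 'g2', 'g3']
```

Because every returned row is fetched by its id, all of its columns come from the same record, which is exactly the guarantee the GROUP BY shortcut lacks.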
#3. Selecting the first record from a group
A slight variation of the previous query:
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> a.* </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"> <span class="keyword"><strong><font color="#006699">GROUP</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">BY</font></strong></span><span> </span> </li> <li><span> grouper </span></li> <li class="alt"> <span class="keyword"><strong><font color="#006699">ORDER</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">BY</font></strong></span><span> </span> </li> <li> <span> </span><span class="keyword"><strong><font color="#006699">MIN</font></strong></span><span>(id) </span><span class="keyword"><strong><font color="#006699">DESC</font></strong></span><span> </span> </li> </ol>
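Unlike the previous query, this one attempts to select the record with the minimal id from each group.
Again, there is no guarantee that the non-aggregated values returned by a.* belong to the record with the minimal id (or even to a single record at all).
This is a cleaner way to do it: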
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> a.* </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> ( </span> </li> <li class="alt"> <span> </span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">DISTINCT</font></strong></span><span> grouper </span> </li> <li> <span> </span><span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"><span> ) ao </span></li> <li> <span class="op"><font color="#808080">JOIN</font></span><span> a </span> </li> <li class="alt"> <span class="keyword"><strong><font color="#006699">ON</font></strong></span><span> a.id = </span> </li> <li><span> ( </span></li> <li class="alt"> <span> </span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> id </span> </li> <li> <span> </span><span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a ai </span> </li> <li class="alt"> <span> </span><span class="keyword"><strong><font color="#006699">WHERE</font></strong></span><span> ai.grouper = ao.grouper </span> </li> <li> <span> </span><span class="keyword"><strong><font color="#006699">ORDER</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">BY</font></strong></span><span> </span> </li> <li class="alt"><span> ai.grouper, ai.id </span></li> <li><span> LIMIT 1 </span></li> <li class="alt"><span> ) </span></li> </ol>
This query is like the previous one, but the additional ORDER BY ensures that the first record in id order is returned.
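With the ORDER BY in place the result becomes deterministic, so it can be checked exactly. The sketch again uses SQLite as a stand-in for MySQL; the rows, inserted out of order on purpose, and the outer ORDER BY a.id (added only to make the output stable) are assumptions of the example.

```python
import sqlite3

# SQLite stands in for MySQL; rows are hypothetical and inserted out of order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, grouper TEXT)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?)",
    [(4, "g2"), (2, "g1"), (1, "g1"), (3, "g2"), (5, "g3")],
)

# The inner ORDER BY makes LIMIT 1 pick the smallest id of each group.
first_rows = conn.execute("""
    SELECT a.*
    FROM (
        SELECT DISTINCT grouper
        FROM a
    ) ao
    JOIN a
      ON a.id = (
        SELECT id
        FROM a ai
        WHERE ai.grouper = ao.grouper
        ORDER BY ai.grouper, ai.id
        LIMIT 1
      )
    ORDER BY a.id
""").fetchall()

print(first_rows)  # [(1, 'g1'), (3, 'g2'), (5, 'g3')]
```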
#2. IN and a comma-separated list of values
This query attempts to match column against any of the values in a comma-separated string:
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> * </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"> <span class="keyword"><strong><font color="#006699">WHERE</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> </span><span class="op"><font color="#808080">IN</font></span><span> (</span><span class="string"><font color="#0000ff">'1, 2, 3'</font></span><span>) </span> </li> </ol>
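This does not work as intended, because the string is not expanded into an IN list.
If column is a VARCHAR, it is compared (as a string) against the whole list (also as a string), and of course it will not match. If column is of a numeric type, the list is cast to that numeric type (so in the best case only the first item will match).
The proper way to handle this query is to rewrite it with a proper IN list: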
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> * </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"> <span class="keyword"><strong><font color="#006699">WHERE</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> </span><span class="op"><font color="#808080">IN</font></span><span> (1, 2, 3) </span> </li> </ol>
Alternatively, use an inline view:
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> * </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> ( </span> </li> <li class="alt"> <span> </span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> 1 </span><span class="keyword"><strong><font color="#006699">AS</font></strong></span><span> id </span> </li> <li> <span> </span><span class="keyword"><strong><font color="#006699">UNION</font></strong></span><span> </span><span class="op"><font color="#808080">ALL</font></span><span> </span> </li> <li class="alt"> <span> </span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> 2 </span><span class="keyword"><strong><font color="#006699">AS</font></strong></span><span> id </span> </li> <li> <span> </span><span class="keyword"><strong><font color="#006699">UNION</font></strong></span><span> </span><span class="op"><font color="#808080">ALL</font></span><span> </span> </li> <li class="alt"> <span> </span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> 3 </span><span class="keyword"><strong><font color="#006699">AS</font></strong></span><span> id </span> </li> <li><span> ) q </span></li> <li class="alt"> <span class="op"><font color="#808080">JOIN</font></span><span> a </span> </li> <li> <span class="keyword"><strong><font color="#006699">ON</font></strong></span><span> a.</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span> = q.id </span> </li> </ol>
Sometimes, however, this is not possible.
If you cannot change the query parameters, use FIND_IN_SET:
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> * </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"> <span class="keyword"><strong><font color="#006699">WHERE</font></strong></span><span> FIND_IN_SET(</span><span class="keyword"><strong><font color="#006699">column</font></strong></span><span>, </span><span class="string"><font color="#0000ff">'1,2,3'</font></span><span>) </span> </li> </ol>
However, this function cannot use an index to retrieve rows, so it results in a full table scan on a.
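Where the calling code can be changed, one possible way to get a proper IN list out of such a string is to split it in the application and bind one placeholder per value. The sketch below is an illustration only, not something the original article prescribes: SQLite stands in for MySQL, and the names raw, values and col are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, col INTEGER)")
conn.executemany("INSERT INTO a VALUES (?, ?)", [(i, i) for i in range(1, 6)])

raw = "1, 2, 3"  # hypothetical input that arrives as one string

# int() ignores the whitespace around each item.
values = [int(part) for part in raw.split(",")]

# One placeholder per value, so the database sees a real three-item IN list.
placeholders = ", ".join("?" for _ in values)
sql = f"SELECT col FROM a WHERE col IN ({placeholders}) ORDER BY col"
matched = conn.execute(sql, values).fetchall()

print(matched)  # [(1,), (2,), (3,)]
```

Only the placeholders are interpolated into the SQL text; the values themselves are still passed as bound parameters.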
#1. LEFT JOIN and COUNT(*)
<ol class="dp-sql"> <li class="alt"><span><span class="keyword"><strong><font color="#006699">SELECT</font></strong></span><span> a.id, </span><span class="func"><font color="#ff1493">COUNT</font></span><span>(*) </span></span></li> <li> <span class="keyword"><strong><font color="#006699">FROM</font></strong></span><span> a </span> </li> <li class="alt"> <span class="func"><font color="#ff1493">LEFT</font></span><span> </span><span class="op"><font color="#808080">JOIN</font></span><span> </span> </li> <li><span> b </span></li> <li class="alt"> <span class="keyword"><strong><font color="#006699">ON</font></strong></span><span> b.a = a.id </span> </li> <li> <span class="keyword"><strong><font color="#006699">GROUP</font></strong></span><span> </span><span class="keyword"><strong><font color="#006699">BY</font></strong></span><span> </span> </li> <li class="alt"><span> a.id </span></li> </ol>
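This query attempts to count, for each record in a, the number of matching records in b.
The problem is that COUNT(*) in such a query never returns 0. If a record in a has no matches, that record is still returned once and counted.
COUNT should be made to count only the actual records in b. Since COUNT, when called with an argument, ignores NULLs, we can pass b.a to it. As the join key, b.a cannot be NULL when there is an actual match, but it is NULL when there is none.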
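The difference between the two counts shows up as soon as one record in a has no matches. This is a sketch with SQLite standing in for MySQL and with made-up rows; the second query is the first one with COUNT(*) swapped for COUNT(b.a), as the paragraph above describes.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY, a INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,)])
# a.id = 3 has no rows in b.
conn.executemany("INSERT INTO b VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])

# COUNT(*) counts the NULL-padded row too, so it never drops below 1.
count_star = conn.execute("""
    SELECT a.id, COUNT(*)
    FROM a
    LEFT JOIN b ON b.a = a.id
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

# COUNT(b.a) skips NULLs, so unmatched rows of a get 0.
count_key = conn.execute("""
    SELECT a.id, COUNT(b.a)
    FROM a
    LEFT JOIN b ON b.a = a.id
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

print(count_star)  # [(1, 2), (2, 1), (3, 1)]
print(count_key)   # [(1, 2), (2, 1), (3, 0)]
```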
Original title: 10 things in MySQL (that won’t work as expected)