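In MySQL, "ft" stands for FullText, that is, the full-text index. A full-text index serves queries based on similarity rather than exact value comparison, and on large data sets it can be orders of magnitude faster than LIKE.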
MySQL Full-Text Index (FullText)
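A full-text index is meant for queries based on similarity (relevance) rather than exact value comparison.
LIKE with % wildcards can also do fuzzy matching, but it is impractical for searching large volumes of text. On large data sets a full-text index can be orders of magnitude faster than LIKE.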
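As a minimal sketch of the difference, assume a hypothetical table `articles` with a `body` column (neither name comes from the article). The first query has to examine every row, while the second can be answered from the FULLTEXT index.

```sql
-- Hypothetical table for illustration only; the names are not from the article.
CREATE TABLE articles (
    id    INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200),
    body  TEXT,
    FULLTEXT INDEX ft_body (body)
) ENGINE = InnoDB;

-- Pattern match with a leading wildcard: an ordinary index on body
-- could not help here, so every row is examined.
SELECT id, title
FROM articles
WHERE body LIKE '%database%';

-- Full-text search: uses the FULLTEXT index ft_body and matches whole words.
SELECT id, title
FROM articles
WHERE MATCH(body) AGAINST('database' IN NATURAL LANGUAGE MODE);
```

The two queries are not equivalent: LIKE matches any substring (for example "databases"), whereas the full-text query matches indexed words, so the result sets can differ.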
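Before MySQL 5.6, only the MyISAM storage engine supported full-text indexes.
From MySQL 5.6 onward, both the MyISAM and InnoDB storage engines support full-text indexes.
MySQL 5.7.6 added a built-in full-text ngram parser that supports Chinese, Japanese, and Korean (CJK), as well as an installable MeCab full-text parser plugin for Japanese.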
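A short sketch of how the ngram parser might be used, assuming MySQL 5.7.6 or later and a hypothetical table `notes` (the table and column names are illustrative, not from the article):

```sql
-- Hypothetical table; WITH PARSER ngram requires MySQL 5.7.6 or later.
CREATE TABLE notes (
    id      INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    content TEXT,
    FULLTEXT INDEX ft_content (content) WITH PARSER ngram
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;

INSERT INTO notes (content) VALUES ('今天买了新鲜的葡萄');

-- The ngram parser splits text into fixed-length character sequences
-- (2 characters by default), so a two-character term such as 葡萄 is searchable.
SELECT id, content
FROM notes
WHERE MATCH(content) AGAINST('葡萄' IN BOOLEAN MODE);
```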
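Full-text indexes can be used only with InnoDB or MyISAM tables, and can be created only on CHAR, VARCHAR, or TEXT columns.
For large data sets, loading the data into a table that has no full-text index and then creating the index is much faster than loading it into a table with an existing full-text index.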
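The load-then-index order could look roughly like the following. The table `docs` and the file path are placeholders chosen for this sketch, and LOAD DATA INFILE is subject to the server's file-access settings (such as secure_file_priv).

```sql
-- Hypothetical table and file path, for illustration only.
CREATE TABLE docs (
    id   INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    body TEXT
) ENGINE = InnoDB;

-- 1. Load the data while the table has no FULLTEXT index.
LOAD DATA INFILE '/path/to/docs.tsv' INTO TABLE docs (body);

-- 2. Build the full-text index once, after all rows are in place.
ALTER TABLE docs ADD FULLTEXT INDEX ft_body (body);
```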
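RDS MySQL 5.6 also supports Chinese full-text search, but it has bugs.
Full-text indexes consume a large amount of disk space. A full-text index is inherently a way of trading disk space for performance; it is large because the text is tokenized according to the rules of a particular language.
Full-text indexes are slow to create, and data modifications on tables that have them are slow as well.
Full-text indexes are not transparent to the application. To benefit from one, the query must be rewritten in full-text search syntax; existing queries cannot use the index as they are.
Searches are case-insensitive.
Partitioned tables do not support full-text search.
All columns in a multi-column full-text index must use the same character set and collation.
Full-text search may have precision issues: the rows it finds may differ from those found with LIKE.
The columns named in MATCH() must be exactly the columns defined in a FULLTEXT index, unless the search is IN BOOLEAN MODE on a MyISAM table (which can search columns that have no index, although slowly).
If full-text indexes are created separately on individual columns, a multi-column MATCH() across those columns cannot use them.
Full-text indexes on different tables cannot be searched in a single MATCH(); use separate MATCH() expressions joined with OR instead.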
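To illustrate the rule that the MATCH() column list must correspond to an index, here is a sketch on a hypothetical `products` table (names invented for this example) using InnoDB:

```sql
-- Hypothetical table with two separate single-column full-text indexes.
CREATE TABLE products (
    id          INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    name        VARCHAR(100),
    description TEXT,
    FULLTEXT INDEX ft_name (name),
    FULLTEXT INDEX ft_description (description)
) ENGINE = InnoDB;

-- Works: each MATCH() names exactly the column list of one index.
SELECT id
FROM products
WHERE MATCH(name) AGAINST('grape' IN BOOLEAN MODE)
   OR MATCH(description) AGAINST('grape' IN BOOLEAN MODE);

-- At this point MATCH(name, description) would be rejected with
-- "Can't find FULLTEXT index matching the column list",
-- because no index covers both columns together.

-- Add a composite index so the two-column MATCH() becomes valid.
ALTER TABLE products ADD FULLTEXT INDEX ft_name_description (name, description);

SELECT id
FROM products
WHERE MATCH(name, description) AGAINST('grape' IN BOOLEAN MODE);
```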
We can check the currently configured minimum search length (token length) with the following SQL statement:
SHOW VARIABLES LIKE 'ft%';
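Note that the ft% variables apply to MyISAM full-text indexes. For InnoDB tables and for the ngram parser, the corresponding settings live in other system variables; the article does not mention them, so the following is offered only as a supplementary sketch:

```sql
-- Minimum and maximum token length for InnoDB full-text indexes
-- (innodb_ft_min_token_size, innodb_ft_max_token_size).
SHOW VARIABLES LIKE 'innodb_ft_%token_size';

-- Token length, in characters, used by the ngram parser.
SHOW VARIABLES LIKE 'ngram_token_size';
```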
The output of SHOW VARIABLES LIKE 'ft%' includes ft_boolean_syntax, which lists the operators available in boolean mode. Among them:
The ">" operator increases the relevance of rows that contain the word, so they rank higher in the results.
The "<" operator decreases the relevance of rows that contain the word, so they rank lower in the results.
Double quotes ("") mark a phrase that must match exactly as a whole and is not split into words, similar to like '%keyword%'.
For example, the expression +aaa +(>bbb <ccc) finds rows that contain aaa together with either bbb or ccc, and ranks rows containing bbb higher than rows containing ccc. A boolean-mode query against a table named tommy, for instance, ends with weighted terms like this:

    select … from tommy where match(…) against('… >李秀琴 <练习册 <不是人>是个鬼' in boolean mode);

4. Test results

Test environment: local machine with 4 cores and 16 GB of RAM, Windows 10, MySQL 8.0.

The following full-text indexes were added for the test SQL statements:

    CREATE FULLTEXT INDEX billno_fulltext ON salebill(billno) WITH PARSER ngram;
    CREATE FULLTEXT INDEX remarks_fulltext ON salebill(remarks) WITH PARSER ngram;
    CREATE FULLTEXT INDEX remarks_fulltext ON salebilldetail(remarks) WITH PARSER ngram;
    CREATE FULLTEXT INDEX goodsremarks_fulltext ON salebilldetail(goodsremarks) WITH PARSER ngram;
    CREATE FULLTEXT INDEX remarks_goodsremarks_fulltext ON salebilldetail(remarks, goodsremarks) WITH PARSER ngram;
    CREATE FULLTEXT INDEX custname_fulltext ON customer(custname) WITH PARSER ngram;
    CREATE FULLTEXT INDEX goodsname_fulltext ON goods(goodsname) WITH PARSER ngram;
    CREATE FULLTEXT INDEX goodscode_fulltext ON goods(goodscode) WITH PARSER ngram;

The results were, on the whole, quite surprising.

    -- Test 1: original LIKE query, 0.765 s
    select 1 from salebilldetail d
    where d.tid=260434
      and ((d.remarks like concat('%','葡萄','%')) or (d.goodsremarks like concat('%','葡萄','%')));

    -- Test 2: full-text indexes remarks_fulltext and goodsremarks_fulltext, 0.834 s
    select 1 from salebilldetail d
    where d.tid=260434
      and ((match(d.remarks) against(concat('"','葡萄','"') in boolean mode))
        or (match(d.goodsremarks) against(concat('"','葡萄','"') in boolean mode)));

    -- Test 3: full-text index remarks_goodsremarks_fulltext, 0.242 s
    select 1 from salebilldetail d
    where d.tid=260434
      and ((match(d.remarks,d.goodsremarks) against(concat('"','葡萄','"') in boolean mode)));

    -- Test 4: original LIKE query, no tid filter, 22.654 s
    select 1 from salebilldetail d
    where ((d.remarks like concat('%','葡萄','%')) or (d.goodsremarks like concat('%','葡萄','%')));

    -- Test 5: full-text indexes remarks_fulltext and goodsremarks_fulltext, no tid filter, 24.855 s
    select 1 from salebilldetail d
    where ((match(d.remarks) against(concat('"','葡萄','"') in boolean mode))
        or (match(d.goodsremarks) against(concat('"','葡萄','"') in boolean mode)));

    -- Test 6: full-text index remarks_goodsremarks_fulltext, no tid filter, 0.213 s
    select 1 from salebilldetail d
    where ((match(d.remarks,d.goodsremarks) against(concat('"','葡萄','"') in boolean mode)));

    -- Test 7: full-text index remarks_goodsremarks_fulltext, 0.22 s
    select count(1) from salebilldetail d
    where d.tid=260434
      and ((match(d.remarks,d.goodsremarks) against(concat('"','葡萄','"') in boolean mode)));

    -- Test 8: full-text index remarks_goodsremarks_fulltext, no tid filter, 0.007 s
    select count(1) from salebilldetail d
    where ((match(d.remarks,d.goodsremarks) against(concat('"','葡萄','"') in boolean mode)));

These tests show that the more data there is and the simpler the query, the greater the benefit of the full-text index.

Now consider our business test SQL:

    -- Test 9
    select i.billid
      ,if(0,0,i.qty) as qty
      ,if(0,0,i.goodstotal) as total
      ,if(0,0,i.chktotal) as selfchktotal
      ,if(0,0,i.distotal) as distotal
      ,if(0,0,i.otherpay) as feetotal
      ,if(0,0,ifnull(d.costtotal,0)) as costtotal
      ,if(0,0,ifnull(d.maoli,0)) as maoli
      ,i.billno
      ,from_unixtime(i.billdate,'%Y-%m-%d') as billdate /* bill date */
      ,from_unixtime(i.createdate,'%Y-%m-%d %H:%i:%s') as createdate /* creation date */
      ,if(i.sdate=0,'',from_unixtime(i.sdate,'%Y-%m-%d %H:%i:%s')) as sdate /* posting date */
      ,from_unixtime(i.udate,'%Y-%m-%d %H:%i:%s') as udate /* last modified time */
      ,i.custid ,c.custname
      ,i.storeid ,k.storename
      ,i.empid ,e.empname
      ,i.userid ,u.username
      ,i.remarks /* bill remarks */
      ,i.effect,i.settle,i.redold,i.rednew /* bill status flags */
      ,i.printtimes /* print count */
      ,(case when i.rednew=1 then 1 when i.redold=1 then 2 when i.settle=1 then 3 when i.effect=1 then 4 else 9 end) as state /* bill status */
      ,(case when i.rednew=1 then '红冲单' when i.redold=1 then '已红冲' when i.settle=1 then '已结算' when i.effect=1 then '已过账' else '草稿' end) as statetext
      ,'' as susername /* operator */
      ,'' as accname /* account */
    from salebill i
    left join coursecentersale d on d.tid=i.tid and d.billid=i.billid
    left join customer c on c.tid=i.tid and c.custid=i.custid
    left join store k on k.tid=i.tid and k.storeid=i.storeid
    left join employee e on e.tid=i.tid and e.empid=i.empid
    left join user u on u.tid=i.tid and u.userid=i.userid
    where i.tid=260434
      and (i.billtype = 5 or i.effect = 1)
      and ('_billdate_f_'!='') and ('_billdate_t_'!='')
      and ('_sdate_f_'!='') and ('_sdate_t_'!='')
      and ('_udate_f_'!='') and ('_udate_t_'!='')
      and ('_cdate_f_'!='') and ('_cdate_t_'!='')
      and ('_billid_'!='') /* bill id */
      and ('_custid_'!='') /* customer ID */
      and ('_storeid_'!='') /* store ID */
      and ('_empid_'!='') /* salesperson ID */
      and ('_custstop_'!='') /* whether the customer is disabled */
      and (
        (i.billno like concat('%','葡萄','%'))
        or (i.remarks like concat('%','葡萄','%'))
        or exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((d.remarks like concat('%','葡萄','%')) or (d.goodsremarks like concat('%','葡萄','%'))))
        or exists(select 1 from customer c where c.tid=260434 and c.custid=i.custid and (c.custname like concat('%','葡萄','%')))
        or exists(select 1 from goods g join salebilldetail d on d.tid=g.tid and d.goodsid=g.goodsid where d.tid=260434 and d.billid=i.billid and ((g.goodsname like concat('%','葡萄','%')) or (g.goodscode like concat('%','葡萄','%'))))
      )
      and i.rednew=0 /* the bill list excludes reversal bills */
      and i.billid not in (select billid from coursecenter_del t where t.tid=260434)
      and ((i.settle=1 and i.effect=1 and i.redold=0 and i.rednew=0)) /* settled */
    order by udate desc,billno desc
    limit 0,100;

Execution time: approximately …

Rewritten to use full-text indexes:

    -- Test 10
    select i.billid
      ,if(0,0,i.qty) as qty
      ,if(0,0,i.goodstotal) as total
      ,if(0,0,i.chktotal) as selfchktotal
      ,if(0,0,i.distotal) as distotal
      ,if(0,0,i.otherpay) as feetotal
      ,if(0,0,ifnull(d.costtotal,0)) as costtotal
      ,if(0,0,ifnull(d.maoli,0)) as maoli
      ,i.billno
      ,from_unixtime(i.billdate,'%Y-%m-%d') as billdate /* bill date */
      ,from_unixtime(i.createdate,'%Y-%m-%d %H:%i:%s') as createdate /* creation date */
      ,if(i.sdate=0,'',from_unixtime(i.sdate,'%Y-%m-%d %H:%i:%s')) as sdate /* posting date */
      ,from_unixtime(i.udate,'%Y-%m-%d %H:%i:%s') as udate /* last modified time */
      ,i.custid ,c.custname
      ,i.storeid ,k.storename
      ,i.empid ,e.empname
      ,i.userid ,u.username
      ,i.remarks /* bill remarks */
      ,i.effect,i.settle,i.redold,i.rednew /* bill status flags */
      ,i.printtimes /* print count */
      ,(case when i.rednew=1 then 1 when i.redold=1 then 2 when i.settle=1 then 3 when i.effect=1 then 4 else 9 end) as state /* bill status */
      ,(case when i.rednew=1 then '红冲单' when i.redold=1 then '已红冲' when i.settle=1 then '已结算' when i.effect=1 then '已过账' else '草稿' end) as statetext
      ,'' as susername /* operator */
      ,'' as accname /* account */
    from salebill i
    left join coursecentersale d on d.tid=i.tid and d.billid=i.billid
    left join customer c on c.tid=i.tid and c.custid=i.custid
    left join store k on k.tid=i.tid and k.storeid=i.storeid
    left join employee e on e.tid=i.tid and e.empid=i.empid
    left join user u on u.tid=i.tid and u.userid=i.userid
    where i.tid=260434
      and (i.billtype = 5 or i.effect = 1)
      and ('_billdate_f_'!='') and ('_billdate_t_'!='')
      and ('_sdate_f_'!='') and ('_sdate_t_'!='')
      and ('_udate_f_'!='') and ('_udate_t_'!='')
      and ('_cdate_f_'!='') and ('_cdate_t_'!='')
      and ('_billid_'!='') /* bill id */
      and ('_custid_'!='') /* customer ID */
      and ('_storeid_'!='') /* store ID */
      and ('_empid_'!='') /* salesperson ID */
      and ('_custstop_'!='') /* whether the customer is disabled */
      and (
        (match(i.billno) against(concat('"','葡萄','"') in boolean mode))
        or (match(i.remarks) against(concat('"','葡萄','"') in boolean mode))
        or exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((match(d.remarks) against(concat('"','葡萄','"') in boolean mode)) or (match(d.goodsremarks) against(concat('"','葡萄','"') in boolean mode))))
        or exists(select 1 from customer c where c.tid=260434 and c.custid=i.custid and (match(c.custname) against(concat('"','葡萄','"') in boolean mode)))
        or exists(select 1 from goods g join salebilldetail d on d.tid=g.tid and d.goodsid=g.goodsid where d.tid=260434 and d.billid=i.billid and ((match(g.goodsname) against(concat('"','葡萄','"') in boolean mode)) or (match(g.goodscode) against(concat('"','葡萄','"') in boolean mode))))
      )
      and i.rednew=0 /* the bill list excludes reversal bills */
      and i.billid not in (select billid from coursecenter_del t where t.tid=260434)
      and ((i.settle=1 and i.effect=1 and i.redold=0 and i.rednew=0)) /* settled */
    order by udate desc,billno desc
    limit 0,100;

Execution time: approximately …

Here is the strangest part. If the following clause in the SQL statement above:

    exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((match(d.remarks) against(concat('"','葡萄','"') in boolean mode)) or (match(d.goodsremarks) against(concat('"','葡萄','"') in boolean mode))))

is changed to use the composite full-text index:

    -- Test 11
    exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((match(d.remarks,d.goodsremarks) against(concat('"','葡萄','"') in boolean mode))))

the execution time becomes seemingly unbounded (the query ran for a long time without finishing).

    -- Normal when the AND group contains only one full-text search: 0.2 s
    select xxx from xxx ...
      and (
        exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((match(d.remarks,d.goodsremarks) against(concat('"','葡萄','"') in boolean mode))))
      ) ...

    -- This form is abnormal: hundreds or thousands of times slower, 160 s.
    -- With more MATCH expressions it slows down even further.
    select xxx from xxx ...
      and (
        exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((match(d.remarks,d.goodsremarks) against(concat('"','葡萄','"') in boolean mode))))
        or match(i.billno) against(concat('"','葡萄','"') in boolean mode)
      ) ...

Summary of test results:
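One rewrite that is sometimes tried for this kind of slowdown is to run each MATCH in its own SELECT and combine the matching bill ids with UNION, so that every branch uses exactly one FULLTEXT index. This is only an untested sketch and not something the article itself did; the table and column names are taken from the test queries, and whether it helps on a given MySQL version has to be measured (for example with EXPLAIN).

```sql
-- Candidate bill ids: one MATCH per branch instead of
-- MATCH(...) OR EXISTS(... MATCH(...)) inside a single WHERE clause.
SELECT i.billid
FROM salebill i
WHERE i.tid = 260434
  AND MATCH(i.billno) AGAINST('"葡萄"' IN BOOLEAN MODE)
UNION
SELECT d.billid
FROM salebilldetail d
WHERE d.tid = 260434
  AND MATCH(d.remarks, d.goodsremarks) AGAINST('"葡萄"' IN BOOLEAN MODE);
```

The resulting id list could then be joined back to salebill to apply the remaining filters and the ordering.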
5. MySQL version upgrade

Because the production system currently runs RDS MySQL 5.6, this section briefly describes upgrade-related issues.