mysql ft refers to FullText, that is, full-text index; full-text index is to solve queries that need to be based on similarity, rather than precise numerical comparison; full-text index can be N times faster than like in the face of a large amount of data. Speed is not an order of magnitude.
The operating environment of this tutorial: Windows 10 system, mysql8 version, Dell G3 computer.
What is mysql ft?
That is, full-text index (FullText).
MySQL full-text index (FullText)
1. Introduction
Basic concepts
Full-text index is to solve the need to be based on similarity queries rather than exact numerical comparisons.
Although fuzzy matching can also be achieved using like %
, it is unimaginable for retrieval of large amounts of text data. In the face of a large amount of data, full-text indexing can be N times faster than like
, but the speed is not an order of magnitude.
Version support
-
MySQL 5.6
In previous versions, onlyMyISAM
storage engine supports full-text index -
MySQL 5.6
and later versions,MyISAM
andInnoDB
storage engines support full-text index -
MySQL 5.7.6
, Provides a built-in full-textngram parser
that supports Chinese, Japanese, and Korean (CJK), as well as an installableMeCab
full-text parser plug-in for Japanese - Full-text index Can only be used for
InnoDB
orMyISAM
tables, and can only be created# forCHAR
,VARCHAR
,TEXT
columns ##For large data sets, Load the data into a database without full-text indexing Then creating an index in the table is much faster than loading data into a table with an existing full-text index - RDS MySQL 5.6
Although Chinese full-text retrieval is also supported, there are BUG
- Results in a large occupation of disk resources. Full-text indexing itself is a method of using disk space for performance. The reason why the full-text index is large is that word segmentation is performed according to a certain language The creation of the full-text index is slow, and the modification operations of various data with the full-text index are also slow
- Using the full-text index is not Transparent to the application. If you want to use the full-text index, you must modify the query statement. It is impossible to use the full-text index for the original query statement, and needs to be changed to the syntax specified by the full-text index
- Not case sensitive
- Partitioned tables do not support full-text search
- By The index of the full-text search composed of multiple columns must use the same character set and sorting rules
- Full-text index may have accuracy issues, that is, the data found by the full-text index may not be the same as like
Inconsistency
The columns in the MATCH() function must be exactly the same as the columns defined in the FULLTEXT index, unless full-text search in the IN BOOLEAN MODE mode is used in the MyISAM table (can be used without creating The indexed columns are searched, but the speed is very slow) - When the full-text index is established separately for a single column, the multi-column fuzzy query does not take effect
- The full-text indexes of different tables cannot be queried together. They can be queried in two Add OR
SHOW VARIABLES LIKE 'ft%';
Value | ||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ft_max_word_len | ||||||||||||||||||||||||||||||||||||||||||||||
##ft_min_word_len | ||||||||||||||||||||||||||||||||||||||||||||||
ft_query_expansion_limit | ||||||||||||||||||||||||||||||||||||||||||||||
ft_stopword_file | ||||||||||||||||||||||||||||||||||||||||||||||
全文索引的相关参数都无法进行动态修改,必须通过修改 MySQL 的配置文件来完成。修改最小搜索长度的值为 1,首先打开 MySQL 的配置文件 /etc/my.cnf,在 [mysqld] 的下面追加以下内容: [mysqld] innodb_ft_min_token_size = 1 # 最短的索引字符串,默认值为4 ft_min_word_len = 1 配置完后重启 MySQL 服务器,并修复或重建全文索引方可生效。 repair table test quick; 2.2 创建索引
CREATE TABLE fulltext_test ( id int(11) NOT NULL AUTO_INCREMENT, content TEXT NOT NULL, tag VARCHAR(255), PRIMARY KEY (id), FULLTEXT KEY content_tag_fulltext(content, tag) WITH PARSER ngram ) ENGINE = InnoDB DEFAULT CHARSET=utf8mb4;
CREATE FULLTEXT INDEX content_fulltext ON fulltext_test(content) with parser ngram;
ALTER TABLE fulltext_test ADD FULLTEXT INDEX content_fulltext(content) with parser ngram; 2.3 删除索引
DROP INDEX content_fulltext ON fulltext_test;
ALTER TABLE fulltext_test DROP INDEX content_fulltext; 三、检索数据3.1 自然语言的全文检索默认情况下,或者使用 in natural language mode 修饰符时,match() 函数对文本集合执行自然语言搜索。 SELECT * FROM 表名 WHERE Match(列名1,列名2) Against (检索内容1 检索内容2); 检索内容不需要用逗号隔开! 自然语言搜索引擎将计算每一个文档对象和查询的相关度。这里,相关度是基于匹配的关键词的个数,以及关键词在文档中出现的次数。在整个索引中出现次数越少的词语,匹配时的相关度就越高。相反,非常常见的单词将不会被搜索,如果一个词语的在超过 50% 的记录中都出现了,那么自然语言的搜索将不会搜索这类词语。 3.2 布尔全文检索在布尔搜索中,我们可以在查询中自定义某个被搜索的词语的相关性,当编写一个布尔搜索查询时,可以通过一些前缀修饰符来定制搜索。
+aaa +(>bbb <ccc>李秀琴 是个鬼' in boolean mode);</ccc> 四、测试结果测试环境:本机4核16G Windows10,MySQL 8.0 争对测试用的SQL语句,增加了以下全文索引: CREATE FULLTEXT INDEX billno_fulltext ON salebill(billno) WITH PARSER ngram; CREATE FULLTEXT INDEX remarks_fulltext ON salebill(remarks) WITH PARSER ngram; CREATE FULLTEXT INDEX remarks_fulltext ON salebilldetail(remarks) WITH PARSER ngram; CREATE FULLTEXT INDEX goodsremarks_fulltext ON salebilldetail(goodsremarks) WITH PARSER ngram; CREATE FULLTEXT INDEX remarks_goodsremarks_fulltext ON salebilldetail(remarks, goodsremarks) WITH PARSER ngram; CREATE FULLTEXT INDEX custname_fulltext ON customer(custname) WITH PARSER ngram; CREATE FULLTEXT INDEX goodsname_fulltext ON goods(goodsname) WITH PARSER ngram; CREATE FULLTEXT INDEX goodscode_fulltext ON goods(goodscode) WITH PARSER ngram; 测试结果,总的来说很魔幻。 test_1-- 测试1,原始 like 查询方式,用时 0.765s select 1 from salebilldetail d where d.tid=260434 and ((d.remarks like concat('%','葡萄','%')) or (d.goodsremarks like concat('%','葡萄','%'))); test_2-- 测试2,使用全文索引 remarks_fulltext、goodsremarks_fulltext, 用时 0.834s select 1 from salebilldetail d where d.tid=260434 and ((match(d.remarks) Against(concat('"','葡萄','"') in boolean mode)) or (match(d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode))); test_3-- 测试3,使用全文索引 remarks_goodsremarks_fulltext, 用时 0.242s select 1 from salebilldetail d where d.tid=260434 and ((match(d.remarks,d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode))); test_4-- 测试4,原始 like 查询方式,不过滤 tid ,用时 22.654s select t from salebilldetail d where ((d.remarks like concat('%','葡萄','%')) or (d.goodsremarks like concat('%','葡萄','%'))); test_5-- 测试5,使用全文索引 remarks_fulltext、goodsremarks_fulltext, 不过滤 tid ,用时 24.855s select 1 from salebilldetail d where ((match(d.remarks) Against(concat('"','葡萄','"') in boolean mode)) or (match(d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode))); test_6-- 测试6,使用全文索引 remarks_goodsremarks_fulltext, 不过滤 tid ,用时 0.213s select 1 from salebilldetail d where ((match(d.remarks,d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode))); test_7-- 测试7,使用全文索引 remarks_goodsremarks_fulltext, 用时 0.22s select count(1) from salebilldetail d where d.tid=260434 and ((match(d.remarks,d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode))); test_8-- 测试8,使用全文索引 remarks_goodsremarks_fulltext, 不过滤 tid ,用时 0.007s select count(1) from salebilldetail d where ((match(d.remarks,d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode))); 从上面的测试语句可以看出,数据量越多,查询越简单,全文索引的效果越好。 再来看看我们的业务测试SQL: test_9-- 测试9 select i.billid ,if(0,0,i.qty) as qty ,if(0,0,i.goodstotal) as total ,if(0,0,i.chktotal) as selfchktotal ,if(0,0,i.distotal) as distotal ,if(0,0,i.otherpay) as feetotal ,if(0,0,ifnull(d.costtotal,0)) as costtotal ,if(0,0,ifnull(d.maoli,0)) as maoli ,i.billno ,from_unixtime(i.billdate,'%Y-%m-%d') as billdate /*单据日期*/ ,from_unixtime(i.createdate,'%Y-%m-%d %H:%i:%s') as createdate /*制单日期*/ ,if(i.sdate=0,'',from_unixtime(i.sdate,'%Y-%m-%d %H:%i:%s')) as sdate /*过账日期*/ ,from_unixtime(i.udate,'%Y-%m-%d %H:%i:%s') as udate /*最后修改时间*/ ,i.custid ,c.custname ,i.storeid ,k.storename ,i.empid ,e.empname ,i.userid ,u.username ,i.remarks /*单据备注*/ ,i.effect,i.settle,i.redold,i.rednew /*单据状态*/ ,i.printtimes /* 打印次数 */ ,(case when i.rednew=1 then 1 when i.redold=1 then 2 when i.settle=1 then 3 when i.effect=1 then 4 else 9 end) as state /*单据状态*/ ,(case when i.rednew=1 then '红冲单' when i.redold=1 then '已红冲' when i.settle=1 then '已结算' when i.effect=1 then '已过账' else '草稿' end) as statetext ,'' as susername /* 操作人 */ ,'' as accname /* 科目 */ from salebill i left join coursecentersale d on d.tid=i.tid and d.billid=i.billid left join customer c on c.tid=i.tid and c.custid=i.custid left join store k on k.tid=i.tid and k.storeid=i.storeid left join employee e on e.tid=i.tid and e.empid=i.empid left join user u on u.tid=i.tid and u.userid=i.userid where i.tid=260434 and (i.billtype = 5 or i.effect = 1) and ('_billdate_f_'!='') and ('_billdate_t_'!='') and ('_sdate_f_'!='') and ('_sdate_t_'!='') and ('_udate_f_'!='') and ('_udate_t_'!='') and ('_cdate_f_'!='') and ('_cdate_t_'!='') and ('_billid_'!='') /*单据id*/ and ('_custid_'!='') /*客户ID*/ and ('_storeid_'!='') /*店仓ID*/ and ('_empid_'!='') /*业务员ID*/ and ('_custstop_'!='') /*客户是否停用*/ and ( (i.billno like concat('%','葡萄','%')) or (i.remarks like concat('%','葡萄','%')) or exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((d.remarks like concat('%','葡萄','%')) or (d.goodsremarks like concat('%','葡萄','%')))) or exists(select 1 from customer c where c.tid=260434 and c.custid=i.custid and (c.custname like concat('%','葡萄','%'))) or exists(select 1 from goods g join salebilldetail d on d.tid=g.tid and d.goodsid=g.goodsid where d.tid=260434 and d.billid=i.billid and ((g.goodsname like concat('%','葡萄','%')) or (g.goodscode like concat('%','葡萄','%')))) ) and i.rednew=0 /*单据列表不含红冲单*/ and i.billid not in (select billid from coursecenter_del t where t.tid=260434) and ((i.settle=1 and i.effect=1 and i.redold=0 and i.rednew=0)) /*已结算*/ order by udate desc,billno desc limit 0,100; 执行时间约 改成使用全文索引方式: test_10-- 测试10 select i.billid ,if(0,0,i.qty) as qty ,if(0,0,i.goodstotal) as total ,if(0,0,i.chktotal) as selfchktotal ,if(0,0,i.distotal) as distotal ,if(0,0,i.otherpay) as feetotal ,if(0,0,ifnull(d.costtotal,0)) as costtotal ,if(0,0,ifnull(d.maoli,0)) as maoli ,i.billno ,from_unixtime(i.billdate,'%Y-%m-%d') as billdate /*单据日期*/ ,from_unixtime(i.createdate,'%Y-%m-%d %H:%i:%s') as createdate /*制单日期*/ ,if(i.sdate=0,'',from_unixtime(i.sdate,'%Y-%m-%d %H:%i:%s')) as sdate /*过账日期*/ ,from_unixtime(i.udate,'%Y-%m-%d %H:%i:%s') as udate /*最后修改时间*/ ,i.custid ,c.custname ,i.storeid ,k.storename ,i.empid ,e.empname ,i.userid ,u.username ,i.remarks /*单据备注*/ ,i.effect,i.settle,i.redold,i.rednew /*单据状态*/ ,i.printtimes /* 打印次数 */ ,(case when i.rednew=1 then 1 when i.redold=1 then 2 when i.settle=1 then 3 when i.effect=1 then 4 else 9 end) as state /*单据状态*/ ,(case when i.rednew=1 then '红冲单' when i.redold=1 then '已红冲' when i.settle=1 then '已结算' when i.effect=1 then '已过账' else '草稿' end) as statetext ,'' as susername /* 操作人 */ ,'' as accname /* 科目 */ from salebill i left join coursecentersale d on d.tid=i.tid and d.billid=i.billid left join customer c on c.tid=i.tid and c.custid=i.custid left join store k on k.tid=i.tid and k.storeid=i.storeid left join employee e on e.tid=i.tid and e.empid=i.empid left join user u on u.tid=i.tid and u.userid=i.userid where i.tid=260434 and (i.billtype = 5 or i.effect = 1) and ('_billdate_f_'!='') and ('_billdate_t_'!='') and ('_sdate_f_'!='') and ('_sdate_t_'!='') and ('_udate_f_'!='') and ('_udate_t_'!='') and ('_cdate_f_'!='') and ('_cdate_t_'!='') and ('_billid_'!='') /*单据id*/ and ('_custid_'!='') /*客户ID*/ and ('_storeid_'!='') /*店仓ID*/ and ('_empid_'!='') /*业务员ID*/ and ('_custstop_'!='') /*客户是否停用*/ and ( (match(i.billno) against(concat('"','葡萄','"') in boolean mode)) or (match(i.remarks) against(concat('"','葡萄','"') in boolean mode)) or exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((match(d.remarks) Against(concat('"','葡萄','"') in boolean mode)) or (match(d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode)))) or exists(select 1 from customer c where c.tid=260434 and c.custid=i.custid and (match(c.custname) Against(concat('"','葡萄','"') in boolean mode))) or exists(select 1 from goods g join salebilldetail d on d.tid=g.tid and d.goodsid=g.goodsid where d.tid=260434 and d.billid=i.billid and ((match(g.goodsname) Against(concat('"','葡萄','"') in boolean mode)) or (match(g.goodscode) Against(concat('"','葡萄','"') in boolean mode)))) ) and i.rednew=0 /*单据列表不含红冲单*/ and i.billid not in (select billid from coursecenter_del t where t.tid=260434) and ((i.settle=1 and i.effect=1 and i.redold=0 and i.rednew=0)) /*已结算*/ order by udate desc,billno desc limit 0,100; 执行时间约 最魔幻的地方来了,如果将上面的SQL语句中( exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((match(d.remarks) Against(concat('"','葡萄','"') in boolean mode)) or (match(d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode)))) test_11改成使用全文索引 -- 测试11 exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((match(d.remarks,d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode)))) 执行时间无限长(跑了半天没成功)? -- and 中只有一个全文检索时正常, 用时0.2秒 select xxx from xxx ... and ( exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((match(d.remarks,d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode)))) ) ... -- 下面这样就异常了,会慢成百上千倍,用时 160 秒, 如果有更多的 match ,会更夸张的慢下去 select xxx from xxx ... and ( exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((match(d.remarks,d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode)))) or match(i.billno) against(concat('"','葡萄','"') in boolean mode) ) ... 测试结果汇总:
五、MySQL 版本升级因线上系统目前是 RDS MySQL 5.6,故简单描述升级相关问题。
【相关推荐:mysql视频教程】 |
The above is the detailed content of what is mysql ft. For more information, please follow other related articles on the PHP Chinese website!

MySQL'sBLOBissuitableforstoringbinarydatawithinarelationaldatabase,whileNoSQLoptionslikeMongoDB,Redis,andCassandraofferflexible,scalablesolutionsforunstructureddata.BLOBissimplerbutcanslowdownperformancewithlargedata;NoSQLprovidesbetterscalabilityand

ToaddauserinMySQL,use:CREATEUSER'username'@'host'IDENTIFIEDBY'password';Here'showtodoitsecurely:1)Choosethehostcarefullytocontrolaccess.2)SetresourcelimitswithoptionslikeMAX_QUERIES_PER_HOUR.3)Usestrong,uniquepasswords.4)EnforceSSL/TLSconnectionswith

ToavoidcommonmistakeswithstringdatatypesinMySQL,understandstringtypenuances,choosetherighttype,andmanageencodingandcollationsettingseffectively.1)UseCHARforfixed-lengthstrings,VARCHARforvariable-length,andTEXT/BLOBforlargerdata.2)Setcorrectcharacters

MySQloffersechar, Varchar, text, Anddenumforstringdata.usecharforfixed-Lengthstrings, VarcharerForvariable-Length, text forlarger text, AndenumforenforcingdataAntegritywithaetofvalues.

Optimizing MySQLBLOB requests can be done through the following strategies: 1. Reduce the frequency of BLOB query, use independent requests or delay loading; 2. Select the appropriate BLOB type (such as TINYBLOB); 3. Separate the BLOB data into separate tables; 4. Compress the BLOB data at the application layer; 5. Index the BLOB metadata. These methods can effectively improve performance by combining monitoring, caching and data sharding in actual applications.

Mastering the method of adding MySQL users is crucial for database administrators and developers because it ensures the security and access control of the database. 1) Create a new user using the CREATEUSER command, 2) Assign permissions through the GRANT command, 3) Use FLUSHPRIVILEGES to ensure permissions take effect, 4) Regularly audit and clean user accounts to maintain performance and security.

ChooseCHARforfixed-lengthdata,VARCHARforvariable-lengthdata,andTEXTforlargetextfields.1)CHARisefficientforconsistent-lengthdatalikecodes.2)VARCHARsuitsvariable-lengthdatalikenames,balancingflexibilityandperformance.3)TEXTisidealforlargetextslikeartic

Best practices for handling string data types and indexes in MySQL include: 1) Selecting the appropriate string type, such as CHAR for fixed length, VARCHAR for variable length, and TEXT for large text; 2) Be cautious in indexing, avoid over-indexing, and create indexes for common queries; 3) Use prefix indexes and full-text indexes to optimize long string searches; 4) Regularly monitor and optimize indexes to keep indexes small and efficient. Through these methods, we can balance read and write performance and improve database efficiency.


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Dreamweaver CS6
Visual web development tools

WebStorm Mac version
Useful JavaScript development tools

PhpStorm Mac version
The latest (2018.2.1) professional PHP integrated development tool

mPDF
mPDF is a PHP library that can generate PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files "on the fly" from his website and handle different languages. It is slower than original scripts like HTML2FPDF and produces larger files when using Unicode fonts, but supports CSS styles etc. and has a lot of enhancements. Supports almost all languages, including RTL (Arabic and Hebrew) and CJK (Chinese, Japanese and Korean). Supports nested block-level elements (such as P, DIV),
