Home  >  Article  >  Database  >  A brief tutorial on MySQL full-text index application

A brief tutorial on MySQL full-text index application

黄舟
黄舟Original
2017-02-07 17:08:151264browse

This article introduces the basic knowledge of MySQL full-text index from the following aspects:

  1. Several precautions for MySQL full-text index

  2. Syntax of full-text index

  3. Introduction to several search types

  4. Examples of several search types

Several notes on full-text index


  • Search must be on an index column of type fulltext, and the column specified in match must be in fulltext Specified

  • can only be applied to tables whose table engine is MyIsam type (it can also be used in Innodb table engine after MySQL 5.6)

  • Only full-text indexes can be created on columns of type char, varchar, and text.

  • Like ordinary indexes, they can be specified when defining the table, or added or modified after creating the table.

  • For a large-scale record insertion, the process of inserting data into a table without an index and then creating an index is much faster than inserting into a data table with an index.

  • The search string must be a constant string and cannot be the column name of the table

  • When the selectivity of the search record exceeds 50%, it is considered that there is no match ( Limited to natural search only)

Full-text index search syntax

MATCH (column name 1, column name 2,…) AGAINST (search string [search modifier ])

The column names 1, 2, etc. specified in match are the column names specified in establishing the full-text index. The subsequent search modifiers are described as follows:

search_modifier:
{
IN NATURAL LANGUAGE MODE
| IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION
| IN BOOLEAN MODE
| WITH QUERY EXPANSION
}

Several kinds of searches Introduction to types


#The search modifiers above actually illustrate 3 full-text search types

IN NATURAL LANGUAGE MODE

Introduction: Default search form (without any search modifier or the modifier is IN NATURAL LANGUAGE MODE)

Features:


  1. The characters in the search string are parsed into normal characters and have no special meaning

  2. Filter the strings in the masked character list

  3. When the selectivity of a record exceeds 50%, it is usually considered to be a mismatch

  4. The returned records are sorted and displayed according to the relevance of the record


##IN BOOLEAN MODE


Introduction: Boolean Mode Search (Search When the modifier is IN BOOLEAN MODE)

Features:

  1. will analyze the meaning of special characters in the search string according to certain rules and perform some logical analysis rule. For example: a certain word must appear, or cannot appear, etc.

  2. The records returned by this type of search are not sorted according to relevance


WITH QUERY EXPANSION

Introduction: A slightly complex search form that actually performs 2 natural searches and can return records directly Introduction to records of sexual relationships, modifiers IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION or WITH QUERY EXPANSION modifier

Features: This type of search actually provides an indirect search function, such as: I search a word, and the first row returned does not contain any of the strings in the search term. A second match can be performed based on the record words of the first search result, so that it is possible to find matching records with some indirect relationships.

Examples of several search types

Application in IN NATURAL LANGUAGE MODE mode:

It is still applied to the product table, where we have established a full-text index in the name field. Because I need to match relevant records in the name column based on keywords

The Sql statement is as follows:

SELECT * FROM product WHERE match(name) against(‘auto’)

The time is not bad, more than 10,000 records were hit among nearly 870,000 records, and it took 1.15 seconds , the effect is still good

Note: By default, records are returned from high to low based on correlation

We can SELECT match(name) against('auto') FROM product Check the correlation value of the record. The value is between 0 and 1. 0 means the record does not match.

Several important features:

1. Which words will be ignored

The search word is too short. The default full-text index considers words with more than 4 characters as valid words. We can modify ft_min_word_len in the configuration to configure

The default full-text index blocks words in the vocabulary list. Some common words are blocked. Because these words are too common and have no semantic effect, they are ignored in the search process. Of course, this list is also configurable.

2. How to perform word segmentation

Full-text index considers a continuous valid character (character set matched by \w in regular expressions) to be a word, which may also contain a "'", but continuous The two ' will be considered as a separator. Other delimiters such as: spaces, commas, periods, etc.


IN Application in BOOLEAN MODE mode:


In the Boolean matching mode, we can add some special symbols to increase some logical functions of the search process. Such as the examples provided on the official website (search for statements containing mysql string and not containing Yousql):

SELECT * FROM articles WHERE MATCH (title,body)
-> AGAINST (‘+MySQL -YourSQL’ IN BOOLEAN MODE);

可见,我们对搜索的控制又多了一些,看起来“高大上”了些。

实际上,上面的操作隐含了几个意思:

  • 加号:相当于and

  • 减号:相当于not

  • 没有:相当于or


下面看下布尔类型搜索的几个重要特性: 


1. 没有50%记录选择性的限制,即使搜索结果记录超过总数的50%也同样返回结果

2. 不会自动的按记录的相关性进行降序排序

3. 可以直接应用在没有创建fulltext的全文索引上,但是这样会查询的非常慢,所以说还是别用了。

4. 支持最小、最大单词长度

5. 应用屏蔽词列表


布尔搜索支持的操作符:

n 加号 +:指示修饰的单词必须出现在记录中

n 减号 -:指示修饰的单词必须不能出现在记录中

n 没有任何操作符:单词可有可无,但是包含该词的记录相关性高

n 双引号 “ : 将一个词组作为一个匹配。如:”one word” 匹配one word在一起的单词


下面是官方的一些实例:

至少包含一个词的记录
‘apple banana’
必须包含着两个词
‘+apple +juice’
必须包含apple,包含macintosh的记录相关性高,也可以不包含
‘+apple macintosh’
必须包含apple且不能喊有macintosh
‘+apple -macintosh’
查找apple开头单词的记录
‘apple*’
完整匹配some words单词
‘”some words”‘

了解了基本的mysql全文索引知识,觉得它的全文索引比like当然是强了很多。但是面对高级的搜索还是略显简陋,且性能问题也是担忧。

以上就是MySQL 全文索引应用简明教程的内容,更多相关内容请关注PHP中文网(www.php.cn)!


Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Previous article:mysql8.0 is comingNext article:mysql8.0 is coming