This article introduces the basics of MySQL full-text indexes from the following aspects:
Several notes on MySQL full-text indexes
Full-text index search syntax
Introduction to several search types
Examples of several search types
Several notes on full-text indexes
The search must be run against columns covered by a FULLTEXT index, and the column list given in MATCH() must be the same as the column list of that FULLTEXT index
Full-text indexes can only be used with MyISAM tables and, from MySQL 5.6 onward, with InnoDB tables
Full-text indexes can only be created on CHAR, VARCHAR, and TEXT columns
Like ordinary indexes, they can be specified when the table is defined, or added or modified after the table has been created
For a large bulk insert, loading the data into a table without the index and then creating the index is much faster than inserting into a table that already has the index
The search string must be a constant string; it cannot be a column name of the table
When a word appears in more than 50% of the rows, it is treated as matching nothing (this limit applies only to natural language searches on MyISAM tables)
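The notes on creating indexes can be sketched as follows. This is only an illustration: the table articles_demo and the index names ft_title_body and ft_title are hypothetical, and ENGINE=InnoDB assumes MySQL 5.6 or later.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE articles_demo (
    id    INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200),
    body  TEXT,
    FULLTEXT KEY ft_title_body (title, body)  -- index declared with the table
) ENGINE=InnoDB;

-- Adding a second full-text index after the table already exists.
ALTER TABLE articles_demo ADD FULLTEXT KEY ft_title (title);

-- Bulk-load pattern: drop the index, load the data, then build the index once.
ALTER TABLE articles_demo DROP INDEX ft_title;
-- ... run the large INSERT or LOAD DATA statements here ...
ALTER TABLE articles_demo ADD FULLTEXT KEY ft_title (title);
```

Building the index once after the load avoids updating it for every inserted row, which is where the speed difference mentioned above comes from.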
Full-text index search syntax
MATCH (column1, column2, ...) AGAINST (search_string [search_modifier])
The columns listed in MATCH are the columns that were named when the full-text index was created. The search modifier is described as follows:
search_modifier:
{ IN NATURAL LANGUAGE MODE | IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION | IN BOOLEAN MODE | WITH QUERY EXPANSION }
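A minimal sketch of the syntax in use. It assumes an articles table with a single FULLTEXT index on (title, body); the id column and the search word 'database' are assumptions made for illustration.

```sql
-- Assumption: articles has FULLTEXT (title, body) and an id column.
SELECT id, title
FROM articles
WHERE MATCH (title, body) AGAINST ('database' IN NATURAL LANGUAGE MODE);

-- Omitting the modifier is the same as IN NATURAL LANGUAGE MODE.
SELECT id, title
FROM articles
WHERE MATCH (title, body) AGAINST ('database');

-- With only that index, MATCH (title) does not correspond to any FULLTEXT
-- index; in natural language mode MySQL rejects it with error 1191
-- ("Can't find FULLTEXT index matching the column list").
-- SELECT id FROM articles WHERE MATCH (title) AGAINST ('database');
```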
Introduction to several search types
The search modifiers above correspond to three full-text search types
IN NATURAL LANGUAGE MODE
Introduction: the default search form (used when no search modifier is given or the modifier is IN NATURAL LANGUAGE MODE)
Features:
The characters in the search string are treated as ordinary characters and have no special meaning
Words in the stopword list are filtered out
When a word appears in more than 50% of the rows, it is usually treated as not matching
The returned records are sorted by relevance
IN BOOLEAN MODE
Introduction: Boolean mode search (when the search modifier is IN BOOLEAN MODE)
Features:
Special characters in the search string are interpreted as operators that apply logical rules, for example that a certain word must appear or must not appear
The records returned by this type of search are not automatically sorted by relevance
WITH QUERY EXPANSION
Introduction: A slightly more complex search form that actually performs two natural language searches and can return records that are only indirectly related to the search string. The modifier is IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION or WITH QUERY EXPANSION
Features: This type of search provides an indirect search function. For example, when I search for a certain word, some of the returned rows may not contain any of the words in the search string. A second search is performed using words taken from the results of the first search, which makes it possible to find records that are related only indirectly.
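The difference between a plain natural language search and query expansion can be sketched by running the same search both ways. The articles table with FULLTEXT (title, body), the id column, and the search word are assumptions made for illustration.

```sql
-- Single pass: only rows that contain the search word itself.
SELECT id, title
FROM articles
WHERE MATCH (title, body) AGAINST ('database' IN NATURAL LANGUAGE MODE);

-- Two passes: words from the most relevant rows of the first pass are added
-- to the search string, so rows that never mention 'database' may appear.
SELECT id, title
FROM articles
WHERE MATCH (title, body) AGAINST ('database' WITH QUERY EXPANSION);
```

Because the second pass widens the search, query expansion tends to return more loosely related rows, so it is usually better suited to short search strings.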
Examples of several search types
Application in NATURAL LANGUAGE MODE:
We again use the product table, where a full-text index has been created on the name field, because I need to match records by keywords in the name column.
The SQL statement is as follows:
SELECT * FROM product WHERE MATCH(name) AGAINST('auto');
The timing is not bad: more than 10,000 records were hit out of nearly 870,000 records, and it took 1.15 seconds. The result is quite good
Note: By default, the records are already returned in order of relevance, from high to low
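The ordering in the note can be made visible by selecting the relevance score next to each row. This is a sketch: the id column and the LIMIT value are assumptions, and the explicit ORDER BY only restates the default order.

```sql
SELECT id,
       name,
       MATCH (name) AGAINST ('auto') AS score  -- relevance of each row
FROM product
WHERE MATCH (name) AGAINST ('auto')            -- keeps only matching rows
ORDER BY score DESC                            -- same as the default order
LIMIT 10;
```

Repeating the same MATCH ... AGAINST expression in the SELECT list and in the WHERE clause does not run the search twice, because the optimizer recognizes that the two expressions are identical.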
We can run SELECT MATCH(name) AGAINST('auto') FROM product to view the relevance value of each record. The values are non-negative floating-point numbers; 0 means the record does not match.
Several important features:
1. Which words will be ignored
The search term is too short. By default the full-text index treats only words of at least 4 characters as valid words. We can change this by setting ft_min_word_len in the configuration (for InnoDB tables the corresponding setting is innodb_ft_min_token_size)
The word is in the stopword list. By default the full-text index ignores some common words, because they are too common to carry any semantic weight, so they are skipped during the search. Of course, this list is also configurable.
2. How to perform word segmentation
The full-text index treats a continuous run of valid characters (the character set matched by \w in a regular expression) as a word. A word may also contain an apostrophe ('), but two consecutive apostrophes are treated as a separator. Other separators include spaces, commas, periods, etc.
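The word-length and stopword settings can be inspected from SQL. The sketch below assumes the product table uses MyISAM; the value 3 in the comment is only an example.

```sql
-- Show the full-text settings that apply to MyISAM tables
-- (ft_min_word_len, ft_max_word_len, ft_stopword_file, ...).
SHOW VARIABLES LIKE 'ft%';

-- ft_min_word_len cannot be changed at runtime. It is set in the server
-- configuration file, for example:
--   [mysqld]
--   ft_min_word_len = 3
-- After the server is restarted, existing MyISAM full-text indexes must be
-- rebuilt before the new limit takes effect:
REPAIR TABLE product QUICK;
```

REPAIR TABLE applies to MyISAM only; for an InnoDB table the index would instead be dropped and added again with ALTER TABLE.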
Application in BOOLEAN MODE:
In Boolean mode, we can add special symbols to the search string to give the search some logical functions. For example, the example provided on the official website (search for rows that contain the string MySQL and do not contain YourSQL):
SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE);
It can be seen that we have more control over the search and it looks more "high-end".
In fact, the above operation implies several meanings:
Plus sign: equivalent to AND
Minus sign: equivalent to NOT
No operator: equivalent to OR
Let’s take a look at several important features of Boolean type search:
1. There is no limit of 50% record selectivity. Even if the search result records exceed 50% of the total number, the results will still be returned.
2. It will not automatically sort in descending order according to the relevance of the records.
3. On MyISAM tables it can be applied directly without creating a FULLTEXT index, but this makes the query very slow, so it is better not to use it this way.
4. The minimum and maximum word length limits apply
5. The stopword list applies
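Since Boolean mode does not sort by relevance on its own, the score can be selected and sorted explicitly when ranked output is wanted. A sketch, assuming the articles table has an id column:

```sql
SELECT id,
       title,
       MATCH (title, body) AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE) AS score
FROM articles
WHERE MATCH (title, body) AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE)
ORDER BY score DESC;  -- explicit ordering, because Boolean mode does not sort
```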
Plus sign +: the word must appear in the record
Minus sign -: the word it modifies must not appear in the record
No operator: the word is optional, but records containing it are ranked as more relevant
Double quotation marks ": match a phrase. For example, "one word" matches records in which the words one and word appear together in that order
Contains at least one of the two words
'apple banana'
Must contain both words
'+apple +juice'
Must contain apple; records that also contain macintosh are ranked higher, but macintosh is not required
'+apple macintosh'
Must contain apple and must not contain macintosh
'+apple -macintosh'
Finds records containing words that begin with apple
'apple*'
Matches the exact phrase some words
'"some words"'
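Each of the search strings above goes into AGAINST with the IN BOOLEAN MODE modifier. A sketch against the articles table from the earlier example, assuming it has FULLTEXT (title, body):

```sql
-- At least one of the two words.
SELECT * FROM articles
WHERE MATCH (title, body) AGAINST ('apple banana' IN BOOLEAN MODE);

-- Both words are required.
SELECT * FROM articles
WHERE MATCH (title, body) AGAINST ('+apple +juice' IN BOOLEAN MODE);

-- apple is required, macintosh is excluded.
SELECT * FROM articles
WHERE MATCH (title, body) AGAINST ('+apple -macintosh' IN BOOLEAN MODE);

-- Any word that begins with apple (apple, apples, applesauce, ...).
SELECT * FROM articles
WHERE MATCH (title, body) AGAINST ('apple*' IN BOOLEAN MODE);

-- The exact phrase; the double quotes sit inside the single-quoted string.
SELECT * FROM articles
WHERE MATCH (title, body) AGAINST ('"some words"' IN BOOLEAN MODE);
```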