A brief tutorial on MySQL full-text index application

大家讲道理
2016-11-07 16:19:14

This article introduces the basic knowledge of MySQL full-text index from the following aspects:

  • Several notes on MySQL full-text indexes

  • The syntax of full-text search

  • Introduction to the search types

  • Examples of the search types

Several notes on full-text indexes

The search must run against columns covered by a FULLTEXT index, and the columns listed in MATCH must be the same columns that were specified when the FULLTEXT index was created

Full-text indexes can only be used with MyISAM tables (from MySQL 5.6 onward they are also supported on InnoDB tables)

Full-text indexes can only be created on CHAR, VARCHAR, and TEXT columns

Like ordinary indexes, they can be specified when the table is defined, or added or modified after the table has been created

For a large batch of inserts, loading the data into a table without the index and then creating the index is much faster than inserting into a table that already has the index

The search string must be a constant string; it cannot be a column name of the table

When a search term matches more than 50% of the records, it is treated as matching nothing (this limit applies only to natural language searches)
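
The notes above can be sketched in SQL roughly as follows. This is only an illustration: the index names ft_title_body and ft_title are hypothetical, and the articles table with title and body columns is assumed to look like the one used in the official MySQL example.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE articles (
    id    INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200),
    body  TEXT,
    -- Full-text index declared together with the table,
    -- on VARCHAR and TEXT columns.
    FULLTEXT KEY ft_title_body (title, body)
) ENGINE=InnoDB;  -- InnoDB needs MySQL 5.6+; use ENGINE=MyISAM on older servers

-- A full-text index can also be added after the table exists.
-- For bulk loads, insert the rows first and run this afterwards.
ALTER TABLE articles ADD FULLTEXT INDEX ft_title (title);

-- MATCH lists exactly the columns of one FULLTEXT index,
-- and the search string is a constant, not a column name.
SELECT id, title
FROM articles
WHERE MATCH (title, body) AGAINST ('database');
```

Creating the index after a bulk load avoids updating it once per inserted row, which is where the speed difference mentioned above comes from.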

Full-text index search syntax

MATCH (column name 1, column name 2, ...) AGAINST (search string [search modifier])

The column names 1, 2, etc. specified in MATCH are the column names that were specified when the full-text index was created. The search modifier is described as follows:

search_modifier:

{
IN NATURAL LANGUAGE MODE
| IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION
| IN BOOLEAN MODE
| WITH QUERY EXPANSION
}
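
As a rough sketch, the four modifier forms look like this in complete queries. The articles table with a FULLTEXT index on (title, body) is an assumption made for illustration, as is the search word.

```sql
-- No modifier: same behaviour as IN NATURAL LANGUAGE MODE.
SELECT * FROM articles
WHERE MATCH (title, body) AGAINST ('database');

SELECT * FROM articles
WHERE MATCH (title, body) AGAINST ('database' IN NATURAL LANGUAGE MODE);

-- Boolean mode: + and - carry logical meaning.
SELECT * FROM articles
WHERE MATCH (title, body) AGAINST ('+database -oracle' IN BOOLEAN MODE);

-- The two spellings below both request query expansion.
SELECT * FROM articles
WHERE MATCH (title, body) AGAINST ('database' WITH QUERY EXPANSION);

SELECT * FROM articles
WHERE MATCH (title, body)
      AGAINST ('database' IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION);
```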

Introduction to several search types

The search modifiers above correspond to 3 full-text search types

IN NATURAL LANGUAGE MODE

Introduction: the default search form (used when no search modifier is given, or when the modifier is IN NATURAL LANGUAGE MODE)

Features:

The characters in the search string are parsed as ordinary characters and have no special meaning

Words in the stopword list are filtered out

When a term matches more than 50% of the records, it is usually treated as not matching

The returned records are sorted and displayed according to the relevance of the records

IN BOOLEAN MODE

Introduction: Boolean mode search (when the search modifier is IN BOOLEAN MODE)

Features:

Special characters in the search string are interpreted according to certain rules and carry logical meaning. For example: a certain word must appear, or must not appear, etc.

The records returned by this type of search are not sorted by relevance

WITH QUERY EXPANSION

Introduction: A slightly more complex search form that actually performs 2 natural language searches and can return records that are only indirectly related to the search string. It is selected with the modifier IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION or WITH QUERY EXPANSION

Features: This type of search provides an indirect search function. For example, I search for a certain word, and a returned row may not contain any of the words in the search string. This is because a second search is performed using words taken from the records returned by the first search, which makes it possible to find records that are only indirectly related.
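
A small sketch of the two-pass behaviour, assuming a hypothetical articles table with a FULLTEXT index on (title, body). The sample rows are invented, and actual results on such a tiny table depend on the engine, stopwords, and word-length settings.

```sql
-- Invented sample rows, for illustration only.
INSERT INTO articles (title, body) VALUES
    ('Choosing a database', 'MySQL is a popular open source database'),
    ('Tuning MySQL',        'Adjust the buffer pool and the query cache'),
    ('Cooking pasta',       'Boil water and add salt');

-- Plain natural language search: only rows containing the word itself.
SELECT title FROM articles
WHERE MATCH (title, body) AGAINST ('database');

-- With query expansion: words from the best rows of the first pass
-- (for example MySQL) are added to a second search, so a row such as
-- 'Tuning MySQL' may be returned although it never mentions 'database'.
SELECT title FROM articles
WHERE MATCH (title, body) AGAINST ('database' WITH QUERY EXPANSION);
```

Because the second pass widens the search, it tends to return more rows that are less relevant, so it is usually best kept for short search strings.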

Examples of several search types

Application in NATURAL LANGUAGE MODE mode:

We again use the product table, where a full-text index has been created on the name field, because I need to find the records whose name column matches the relevant keywords

The SQL statement is as follows:

SELECT * FROM product WHERE MATCH(name) AGAINST('auto')

The timing is not bad: more than 10,000 of nearly 870,000 records were hit, and the query took 1.15 seconds. The result is still good

Note: By default, the records are already returned in order of relevance, from high to low
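
To make the relevance ordering visible, the score can be selected alongside each row. This is a sketch only: the id column is assumed to exist in the product table, and the LIMIT is arbitrary.

```sql
-- MATCH ... AGAINST in the select list returns the relevance score;
-- the same expression in WHERE restricts the rows to the matches.
SELECT id,
       name,
       MATCH(name) AGAINST('auto') AS score
FROM product
WHERE MATCH(name) AGAINST('auto')
ORDER BY score DESC   -- explicit, although this is already the default order
LIMIT 10;
```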

We can run SELECT MATCH(name) AGAINST('auto') FROM product to view the relevance value of each record. The values are non-negative floating-point numbers, and 0 means the record does not match.

Several important features:

1. Which words are ignored

Words that are too short. By default the full-text index treats only words of at least 4 characters as valid words. This can be changed through ft_min_word_len in the configuration

Words in the stopword list. By default the full-text index blocks some common words, because they are too common to carry any semantic meaning, so they are ignored during the search. Of course, this list is also configurable.

2. How word segmentation is performed

The full-text index treats a continuous run of valid characters (the character set matched by \w in regular expressions) as a word. A word may also contain an apostrophe ('), but two consecutive apostrophes are treated as a separator. Other separators include spaces, commas, periods, etc.
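
The settings that decide which words are ignored can be inspected from SQL. A minimal sketch, assuming a MyISAM product table. ft_min_word_len and ft_stopword_file apply to MyISAM full-text indexes, while InnoDB (MySQL 5.6+) uses its own innodb_ft_min_token_size setting.

```sql
-- Minimum word length and stopword file used by MyISAM full-text indexes.
SHOW VARIABLES LIKE 'ft_min_word_len';
SHOW VARIABLES LIKE 'ft_stopword_file';

-- InnoDB equivalent of the minimum word length (MySQL 5.6+).
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';

-- Default InnoDB stopword list.
SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD;

-- ft_min_word_len is set in the server configuration file and needs a
-- restart. Existing MyISAM full-text indexes must then be rebuilt:
REPAIR TABLE product QUICK;
```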

Application in BOOLEAN MODE mode:

In Boolean mode, we can add some special symbols to give the search some logical functions. For example, the example provided on the official website (search for rows that contain the string MySQL and do not contain YourSQL):

SELECT * FROM articles WHERE MATCH (title,body)
AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE);

It can be seen that we have more control over the search and it looks more "high-end".

In fact, the operators above carry the following meanings:

Plus sign: equivalent to AND

Minus sign: equivalent to NOT

No operator: equivalent to OR

Let’s take a look at several important features of Boolean type search:

1. There is no 50% selectivity limit. Even if the matching records exceed 50% of the total, the results are still returned.
2. Results are not automatically sorted in descending order of relevance.
3. It can be applied directly without a FULLTEXT index (on MyISAM tables), but this makes the query very slow, so it is better avoided.
4. The minimum and maximum word length limits apply
5. The stopword list applies
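
Since Boolean mode does not sort by relevance on its own, the ordering can be requested explicitly. A sketch, assuming the articles table from the official example has an id column and a FULLTEXT index on (title, body):

```sql
-- Select the Boolean-mode score and sort on it explicitly.
SELECT id,
       title,
       MATCH (title, body) AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE) AS score
FROM articles
WHERE MATCH (title, body) AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE)
ORDER BY score DESC;
```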

Operators supported by Boolean search:

  • Plus sign +: the word it precedes must appear in the record

  • Minus sign -: the word it precedes must not appear in the record

  • No operator: the word is optional, but records containing it are rated as more relevant

  • Double quotation marks ": match a phrase literally. For example, "one word" matches records in which the words one and word appear together in that order

Here are some official examples:

Records containing at least one of the two words
'apple banana'

Records that must contain both words
'+apple +juice'

Records that must contain apple; macintosh is optional, but records that contain it are rated as more relevant
'+apple macintosh'

Records that must contain apple and must not contain macintosh
'+apple -macintosh'

Records containing words that begin with apple
'apple*'

Records containing the exact phrase some words
'"some words"'
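
Each of these search strings is used as the first argument of AGAINST together with IN BOOLEAN MODE. A short sketch, again assuming a hypothetical articles table with a FULLTEXT index on (title, body):

```sql
-- Must contain apple, must not contain macintosh.
SELECT * FROM articles
WHERE MATCH (title, body) AGAINST ('+apple -macintosh' IN BOOLEAN MODE);

-- Prefix search: apple, apples, applesauce, ...
SELECT * FROM articles
WHERE MATCH (title, body) AGAINST ('apple*' IN BOOLEAN MODE);

-- Exact phrase: the double quotes sit inside the SQL string literal.
SELECT * FROM articles
WHERE MATCH (title, body) AGAINST ('"some words"' IN BOOLEAN MODE);
```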

Having learned the basics of MySQL full-text indexing, I feel that it is certainly much stronger than LIKE, but it is still a bit crude for advanced searches, and performance is also a concern.

This is just an introduction and a translation of some basic knowledge on the official website.