
PHP uses scws to implement mysql full-text search function

PHP中文网 (original), 2016-07-13 17:05:54

This article shows how PHP can use scws, a Chinese word segmentation extension, to implement full-text search on MySQL. The method is described below.

scws is a capable Chinese word segmentation extension. It ships with a set of rules for proper nouns, personal names, place names, numbers and dates, and similar patterns, and uses these rules to split sentences directly into individual keywords with an accuracy between 90% and 95%. To set it up, follow the installation instructions: put the scws extension into the PHP extension directory, download the rule file and the dictionary file, and reference them in the PHP configuration file. scws can then be used for word segmentation.
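Before running any segmentation code, it can help to confirm that PHP actually sees the extension and its defaults. The following is a minimal sketch, assuming the extension registers under the name "scws" and exposes the ini keys scws.default.charset and scws.default.fpath; the helper name scws_status() is made up for this illustration.

```php
<?php
// Report whether the scws extension and its default settings are visible to PHP.
// Assumption: the module name is "scws" and the ini keys below are the ones it registers.
function scws_status()
{
    return array(
        'loaded'  => extension_loaded('scws'),
        // ini_get() returns false when the key is unknown, e.g. extension not loaded.
        'charset' => ini_get('scws.default.charset'),
        'fpath'   => ini_get('scws.default.fpath'),
    );
}

print_r(scws_status());
```

If 'loaded' is false, the extension line in the PHP configuration file or the extension directory is the first thing to check.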

Recent scws changes include:

1) The PHP extension code was modified to be compatible with PHP 5.4.x.

2) Fixed the problem that the limit parameter of scws_get_tops in the PHP extension could not be less than 10.

3) libscws adds scws_fork(), which creates a branch from an existing scws instance and shares its dictionaries and rule sets. It is mainly intended for multi-threaded development.

4) Win32 DLL builds of the extension were added for several PHP versions.

The PHP example code is as follows:

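<?php
// Instantiate the core class of the word segmentation extension
$so = scws_new();
// Set the character encoding used for segmentation
$so->set_charset('utf-8');
// Set the dictionary to use (here, the UTF-8 dictionary)
$so->set_dict('/path/dict.utf8.xdb');
// Set the rule file to use
$so->set_rule('/path/rules.utf8.ini');
// Strip punctuation before segmenting
$so->set_ignore(true);
// Enable compound segmentation, e.g. "中国人" yields the three words "中国", "人", and "中国人"
$so->set_multi(true);
// Automatically group loose characters using two-character segmentation
$so->set_duality(true);
// The text to segment
$so->send_text("欢迎来到火星时代IT开发");
// Fetch the segmentation results; to extract high-frequency words, use get_tops instead
while ($tmp = $so->get_result())
{
  print_r($tmp);
}
$so->close();
?>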
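The comment in the example mentions get_tops for high-frequency words. A minimal sketch of that call follows. The helper name top_keywords() is hypothetical, and the code assumes that $so is an already configured object from scws_new() and that each row returned by get_tops() is an array containing a 'word' key.

```php
<?php
// Return up to $limit top-ranked words found in $text.
// Assumptions: $so comes from scws_new() and is already configured;
// get_tops() returns a list of arrays that each contain a 'word' key.
function top_keywords($so, $text, $limit = 5)
{
    $so->send_text($text);
    $rows = $so->get_tops($limit);
    if (!is_array($rows)) {
        return array();   // nothing could be extracted
    }
    $words = array();
    foreach ($rows as $row) {
        $words[] = $row['word'];
    }
    return $words;
}

// Usage, attempted only when the extension is available.
if (extension_loaded('scws')) {
    $so = scws_new();
    $so->set_charset('utf-8');
    $so->set_dict('/path/dict.utf8.xdb');
    $so->set_rule('/path/rules.utf8.ini');
    print_r(top_keywords($so, '欢迎来到火星时代IT开发', 3));
    $so->close();
}
```

Keeping the scws object as a parameter means the same helper can be exercised with a stand-in object when the extension is not installed.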


Note: as in the example above, the input text, the dictionary, and the rule file must all use the same character set. In addition, some MySQL 4.x versions do not support Chinese full-text search. As a workaround, you can store the location code corresponding to each keyword and run the full-text search against those codes.
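One way to read the location-code workaround is sketched below. It assumes that "location code" means the GB2312 zone/position code (区位码), in which every Chinese character becomes four decimal digits, and that iconv is available with GB2312 support. The function names to_location_code() and build_index_text() are made up for this illustration.

```php
<?php
// Convert one UTF-8 keyword into GB2312 location codes (4 digits per character).
// Assumption: "location code" = GB2312 zone/position code (区位码).
// ASCII characters are kept as-is, lowercased. Returns false on conversion failure.
function to_location_code($word)
{
    $gb = @iconv('UTF-8', 'GB2312', $word);
    if ($gb === false) {
        return false;   // the word contains characters outside GB2312
    }
    $out = '';
    $n = strlen($gb);
    for ($i = 0; $i < $n; $i++) {
        $hi = ord($gb[$i]);
        if ($hi < 0x80) {               // single-byte ASCII
            $out .= strtolower($gb[$i]);
            continue;
        }
        if ($i + 1 >= $n) {
            return false;               // truncated double-byte character
        }
        $lo = ord($gb[++$i]);
        // zone = first byte - 0xA0, position = second byte - 0xA0
        $out .= sprintf('%02d%02d', $hi - 0xA0, $lo - 0xA0);
    }
    return $out;
}

// Join the codes of all segmented words into one space-separated string
// that can be stored in a column covered by a FULLTEXT index.
function build_index_text(array $words)
{
    $codes = array();
    foreach ($words as $w) {
        $c = to_location_code($w);
        if ($c !== false && $c !== '') {
            $codes[] = $c;
        }
    }
    return implode(' ', $codes);
}

echo build_index_text(array('中国', 'PHP')), "\n";   // prints "54482590 php"
```

The resulting string would be stored next to the original text, for example in a hypothetical keyword_codes column. A search term is encoded the same way and then passed to MATCH(keyword_codes) AGAINST('54482590'), so MySQL only ever has to index plain digit strings.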

Version List

| Version | Type | Platform | Performance |
|---|---|---|---|
| SCWS-1.1.x | C code | Unix/PHP | Accuracy: 95%, recall: 91%, speed: 1.2 MB/sec (PHP extension segmentation speed: 250 KB/sec) |
| php_scws.dll (1) | PHP extension library | Windows/PHP 4.4.x | Accuracy: 95%, recall: 91% |
| php_scws.dll (2) | PHP extension library | Windows/PHP 5.2.x | Accuracy: 95%, recall: 91% |
| php_scws.dll (3) | PHP extension library | Windows/PHP 5.3.x | Accuracy: 95%, recall: 91% |
| php_scws.dll (4) | PHP extension library | Windows/PHP 5.4.x | Accuracy: 95%, recall: 91% |
| PSCWS23 | PHP source code | Any (does not support UTF-8) | Accuracy: 93%, recall: 89% |
| PSCWS4 | PHP source code | Any | Accuracy: 95%, recall: 91% |
