coreseek configuration and incremental index merge index

不言 · Original · 2018-05-15 14:56:01


Guide: I am a PHP novice, and my company's business is not complicated, but I recently needed full-text search, so I decided to use Sphinx (via coreseek).
The work falls roughly into three parts: 1. installation; 2. configuration; 3. calling the API. This post mainly covers configuration and calling the API. I wrote a separate post about the installation steps earlier; if anything there is unclear, the official website explains the installation very clearly. Without further ado, let's get started.
1. Why use an incremental index? Personally, I think incremental indexes are unnecessary for businesses with small data volumes; regenerating the index regularly is enough. An incremental index is a separate index built only from the content added since the main index was last generated, so each run handles a relatively small amount of data and does not affect business processing. The incremental index is then merged into the main index at regular intervals. To keep the data consistent, the indexes also need to be fully regenerated regularly.
1. The largest ID covered by the last generated index needs to be recorded; it can be stored in a table.

CREATE TABLE tbl_pre_coursevideo (
  id int(11) NOT NULL DEFAULT '0',
  maxid int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

2. Configuration file

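source mysql
{
    type                  = mysql
    sql_host              = 127.0.0.1
    sql_user              = root
    sql_pass              = 123456
    sql_db                = test
    sql_port              = 3306
    sql_query_pre         = SET NAMES utf8
    sql_query_pre         = SET SESSION query_cache_type=OFF    # if this reports an error, change the setting in the MySQL configuration file
    # Store the largest id covered by this index run in the table created above, in preparation for the incremental index.
    sql_query_pre         = REPLACE INTO tbl_pre_coursevideo SELECT 1, MAX(id) FROM tbl_coursevideo

    sql_query             = SELECT id,title,create_time,subtitle,content,type FROM tbl_coursevideo WHERE id <= (SELECT maxid FROM tbl_pre_coursevideo WHERE id=1)
    # The SQL above has two parts: everything before WHERE selects the data (adapt it to your own business);
    # everything after WHERE filters by the largest ID recorded just now.

    sql_attr_uint         = id             # the value read from SQL must be an integer
    sql_attr_timestamp    = create_time    # the value read from SQL must be an integer; used as a time attribute

    sql_field_string      = title          # string field (full-text searchable; the original text can be returned)
    sql_field_string      = subtitle       # string field (full-text searchable; the original text can be returned)
    sql_field_string      = content        # string field (full-text searchable; the original text can be returned)
}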
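source increment : mysql
{
    # sql_query_pre is overridden here so that the inherited REPLACE INTO query is not run when the incremental index is built.
    sql_query_pre         = SET NAMES utf8
    sql_query             = SELECT id,title,create_time,subtitle,content,type FROM tbl_coursevideo WHERE id > (SELECT maxid FROM tbl_pre_coursevideo WHERE id=1)
    # This is the data-source SQL for the incremental index. It matches the query above; the only change is the WHERE condition,
    # which selects ids greater than the one recorded when the index was last regenerated, i.e. the newly added data.
}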


# index definition
index mysql
{
    source            = mysql    # name of the corresponding source
    path              = /usr/local/coreseek/var/data/mysql    # change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
    docinfo           = extern
    mlock             = 0
    morphology        = none
    min_word_len      = 1
    html_strip        = 0
    # Chinese word segmentation settings; for details see http://www.coreseek.cn/products-install/coreseek_mmseg/
    charset_dictpath  = /usr/local/mmseg3/etc/    # for BSD and Linux; must end with /
    #charset_dictpath = etc/    # for Windows; must end with /; an absolute path is best, e.g. C:/usr/local/coreseek/etc/...
    charset_type      = zh_cn.utf-8
}


index increment : mysql
{
    source            = increment
    path              = /usr/local/coreseek/var/data/increment
    charset_dictpath  = /usr/local/mmseg3/etc/
    charset_type      = zh_cn.utf-8
}
# global indexer definition
indexer
{
    mem_limit            = 128M
}


# searchd service definition
searchd
{
    listen            = 9312
    read_timeout      = 5
    max_children      = 30
    max_matches       = 1000
    seamless_rotate   = 0
    preopen_indexes   = 0
    unlink_old        = 1
    pid_file          = /usr/local/coreseek/var/log/searchd_mysql.pid    # change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
    log               = /usr/local/coreseek/var/log/searchd_mysql.log    # change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
    query_log         = /usr/local/coreseek/var/log/query_mysql.log      # change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
}

Use:
1. Generate the index (the commands below use the paths of my own installation; please change them to yours)
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft.conf --all    # generate all indexes
At this point a record is added to the tbl_pre_coursevideo table; it stores the largest ID in your content table.

2. Start the search daemon
/usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft.conf    # start the background process

At this point, searching the mysql index already returns data.
3. Incremental index (prerequisite: new data has been added to the content table)
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft.conf increment --rotate
After it finishes, the output reports how many documents the incremental index contains, which should be the number of rows you just added to the content table. You can now test whether the incremental index works: calling $cl->Query($keyword, 'increment'); through the API queries the incremental index and should return the content you just added.
4. Merge index

/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft.conf --merge mysql increment --rotate

$cl->Query('search keyword', 'mysql'); // now returns both the newly added data and the earlier data

5. To keep the data consistent, regenerate the index regularly

/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft.conf --all --rotate    # add --rotate so that searches on the running server are not affected
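
The steps above can be collected into a small wrapper script. This is only a sketch under assumptions: the function names, the DRY_RUN switch, and the increment/full arguments are made up for illustration, and the paths are the ones from my installation. It runs the commands in order and does not check whether searchd has finished rotating between steps. By default it only prints the commands, so it is safe to try.

```shell
#!/bin/sh
# Sketch of a wrapper around the indexer steps (paths are those used in this post).
INDEXER="${INDEXER:-/usr/local/coreseek/bin/indexer}"
CONF="${CONF:-/usr/local/coreseek/etc/csft.conf}"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = only print the command, 0 = execute it

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Rebuild only the incremental index, then merge it into the main index.
update_increment() {
    run "$INDEXER" -c "$CONF" increment --rotate &&
    run "$INDEXER" -c "$CONF" --merge mysql increment --rotate
}

# Regenerate all indexes from scratch.
rebuild_all() {
    run "$INDEXER" -c "$CONF" --all --rotate
}

case "${1:-}" in
    increment) update_increment ;;
    full)      rebuild_all ;;
    *)         echo "usage: $0 {increment|full}" ;;
esac
```

Saved as, say, reindex.sh (a hypothetical name), `sh reindex.sh increment` prints the two incremental commands, and `DRY_RUN=0 sh reindex.sh full` actually regenerates everything.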

2.
1. Call the API. This class is my own wrapper around sphinxapi.php, the file that is available once the installation is complete.

require 'sphinxapi.php';

class Sphinx {
    private $host = '127.0.0.1';
    private $port = 9312;
    private $cl;

    /*
     * @desc Constructor; initializes the Sphinx client object
     */
    public function __construct() {
        $this->cl = new SphinxClient();
        $this->cl->SetServer($this->host, $this->port);
        $this->cl->SetConnectTimeout(1);
        $this->cl->SetArrayResult(true);
        $this->cl->SetMatchMode(SPH_MATCH_EXTENDED2);
        $this->cl->SetRankingMode(SPH_RANK_WORDCOUNT);
    }

    /*
     * @desc Search
     * @param $keyword   search keyword
     * @param $p         page number (1-based)
     * @param $pagesize  results per page
     * @param $source    index to search (defaults to the main index, mysql)
     */
    public function search($keyword, $p, $pagesize, $source = 'mysql') {
        $offset = ($p - 1) * $pagesize;
        $this->cl->SetLimits($offset, $pagesize); // pagination
        return $this->cl->Query($keyword, $source); // Sphinx query
    }
}

2. Scheduled tasks
yum install crontabs    // install
crontab -e              // open the editor
Then schedule the incremental index, the index merge, and the full index regeneration to run regularly.
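
As a rough illustration of such a schedule, the snippet below prints three crontab entries, one for each of the three jobs. The times are purely my own assumptions (every 10 minutes, hourly, and nightly) and should be tuned to how fast your data grows; print_schedule is a made-up helper name, and the paths are again those of my installation.

```shell
#!/bin/sh
# Print a crontab fragment for the three indexing jobs; the times are illustrative assumptions.
INDEXER=/usr/local/coreseek/bin/indexer
CONF=/usr/local/coreseek/etc/csft.conf

print_schedule() {
    cat <<EOF
# every 10 minutes: rebuild the incremental index
*/10 * * * * $INDEXER -c $CONF increment --rotate
# every hour at minute 5: merge the incremental index into the main index
5 * * * * $INDEXER -c $CONF --merge mysql increment --rotate
# every day at 03:15: regenerate everything
15 3 * * * $INDEXER -c $CONF --all --rotate
EOF
}

print_schedule
```

The printed lines can be pasted into the editor opened by crontab -e.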

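Note: if the index merge reports success but queries always return empty results, take a look at this setting in the configuration file:

 path            = /usr/local/coreseek/var/data/increment    # check whether the main index and the incremental index are set to the same path; each index needs its own. increment is the index file name, and the files are created under the data folder.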
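
One quick way to run that check is to list any path value that appears more than once in the configuration file. The sketch below assumes one "path = ..." directive per line, as in the config in this post; duplicate_paths is a made-up helper name, and the default config location is the one from my installation.

```shell
#!/bin/sh
# List index paths that appear more than once in a coreseek/Sphinx config file.
# Assumes one "path = ..." directive per line; trailing "# comments" are ignored.
duplicate_paths() {
    sed -n 's/^[[:space:]]*path[[:space:]]*=[[:space:]]*\([^#[:space:]]*\).*$/\1/p' "$1" |
        sort | uniq -d
}

CONF="${1:-/usr/local/coreseek/etc/csft.conf}"
if [ -r "$CONF" ]; then
    dups=$(duplicate_paths "$CONF")
    if [ -n "$dups" ]; then
        echo "paths used by more than one index:"
        echo "$dups"
    else
        echo "all index paths are distinct"
    fi
fi
```

Lines such as charset_dictpath are not matched, because the pattern only accepts lines that begin with path.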

That's all. If anything is unclear, read this post carefully, including the comments in the code; they record the problems I ran into. If there is still something you don't understand, leave me a message and I will tell you everything I know. Next time I will cover word segmentation.
