Questions about some options when creating an index in Elasticsearch

WBOY
2016-07-06 13:53:08

I want to use Elasticsearch to provide on-site search for my blog's articles. The backend is written in PHP.

The articles table has the following columns:

<code>id     title     content     user_id    created_at     updated_at</code>

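I now want to index three of those fields: title, content, and updated_at.

Below is a demo, written by following the official elasticsearch-php client documentation, that creates the index blog and the type article. Word segmentation uses the IK analyzer.

I don't fully understand some of the options. My four specific questions are marked as comments in the code below. Any help would be appreciated, thanks.

Official documentation: https://www.elastic.co/guide/en/elasticsearch/client/php-api/current/_index_management_operations.html#_create_an_index_advanced_example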

<code>        $params = [
            'index' => 'blog',
            'body' => [
                'settings' => [
                    'number_of_shards' => 1,
                    'number_of_replicas' => 0,
                    'analysis' => [
                        'filter' => [
                            // 1. Should the two occurrences of "shingle" here be changed to "article"?
                            'shingle' => [
                                'type' => 'shingle'
                            ]
                        ],

                        // 2. What does the content of char_filter mean, including pre_negs and post_negs?
                        'char_filter' => [

                            'pre_negs' => [
                                'type' => 'pattern_replace',
                                'pattern' => '(\\w+)\\s+((?i:never|no|nothing|nowhere|noone|none|not|havent|hasnt|hadnt|cant|couldnt|shouldnt|wont|wouldnt|dont|doesnt|didnt|isnt|arent|aint))\\b',
                                'replacement' => '~$1 $2'
                            ],
                            'post_negs' => [
                                'type' => 'pattern_replace',
                                'pattern' => '\\b((?i:never|no|nothing|nowhere|noone|none|not|havent|hasnt|hadnt|cant|couldnt|shouldnt|wont|wouldnt|dont|doesnt|didnt|isnt|arent|aint))\\s+(\\w+)',
                                'replacement' => '$1 ~$2'
                            ]
                        ],

                        // 3. Does the analyzer section need to be changed, and if so, how?
                        'analyzer' => [
                            'blog' => [
                                'type' => 'custom',
                                'tokenizer' => 'standard',
                                'filter' => ['lowercase', 'stop', 'kstem']
                            ]
                        ]
                    ]
                ],
                'mappings' => [
                    'article' => [
                        "_all" => [
                            "analyzer" => "ik_max_word",
                            "search_analyzer" => "ik_max_word",
                            "term_vector" => "no",
                            "store" => "false"
                        ],
                        'properties' => [
                            'title' => [
                                'type' => 'string',
                                'store' => 'no',
                                'term_vector' => 'with_positions_offsets',
                                'analyzer' => 'ik_max_word',
                                'search_analyzer' => 'ik_max_word',
                                'include_in_all' => 'true',
                                'boost' => 9
                            ],
                            'content' => [
                                'type' => 'string',
                                'store' => 'no',
                                'term_vector' => 'with_positions_offsets',
                                'analyzer' => 'ik_max_word',
                                'search_analyzer' => 'ik_max_word',
                                'include_in_all' => 'true',
                                'boost' => 8
                            ],
                            // 4. The timestamp is only used for sorting search results. How should the options below be filled in?
                            'updated_at' => [
                                'type' => '',
                                'store' => '',
                                'term_vector' => '',
                                'analyzer' => '',
                                'search_analyzer' => '',
                                'include_in_all' => '',
                                'boost' => null // placeholder so the array parses; the value is part of question 4
                            ]
                        ]
                    ]


                ]
            ]
        ];
        $client->indices()->create($params);</code>
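
On question 2, a rough way to see what pre_negs and post_negs do is to run the same regular expressions outside Elasticsearch. The sketch below applies them with PHP's preg_replace. It assumes PCRE treats these particular patterns the same way as the Java regex engine Elasticsearch uses, and the helper name markNegations is hypothetical. Judging only from the posted settings, the blog analyzer does not list a char_filter entry and the mappings use ik_max_word, so these two filters (and the shingle filter) look defined but unused.

```php
<?php
// Negation words copied from the patterns in the question.
$negations = 'never|no|nothing|nowhere|noone|none|not|havent|hasnt|hadnt|cant|couldnt'
    . '|shouldnt|wont|wouldnt|dont|doesnt|didnt|isnt|arent|aint';

// Same regexes as pre_negs / post_negs, wrapped in PCRE delimiters.
$preNegs  = '/(\\w+)\\s+((?i:' . $negations . '))\\b/';
$postNegs = '/\\b((?i:' . $negations . '))\\s+(\\w+)/';

// Hypothetical helper: applies both replacements in the order they are listed.
function markNegations(string $text, string $preNegs, string $postNegs): string
{
    // pre_negs: prefix the word BEFORE a negation with "~".
    $text = preg_replace($preNegs, '~$1 $2', $text);
    // post_negs: prefix the word AFTER a negation with "~".
    return preg_replace($postNegs, '$1 ~$2', $text);
}

echo markNegations('I do not like it', $preNegs, $postNegs), PHP_EOL;
// prints "I ~do not ~like it"
```

The word list is English only, so for Chinese articles analyzed with IK these char filters would probably have little effect.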
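
On question 4, a field that is only used for sorting is usually mapped as a date rather than an analyzed string, in which case analyzer, search_analyzer, term_vector and boost do not need to be set. The fragment below is a minimal sketch in the same Elasticsearch 2.x style as the question (string type, _all field). The date format is an assumption about how the application sends updated_at, and the search body is a hypothetical example, not something from the original post.

```php
<?php
// Sketch of a mapping entry for updated_at, to go under 'properties'.
$updatedAtMapping = [
    'type' => 'date',
    // Assumption: dates are sent as 'Y-m-d H:i:s' strings, e.g. "2016-07-06 13:53:08".
    'format' => 'yyyy-MM-dd HH:mm:ss',
    // Keep the timestamp out of the _all full-text field; it is only used for sorting.
    'include_in_all' => false,
];

// Hypothetical search request that sorts matching articles by updated_at, newest first.
$searchParams = [
    'index' => 'blog',
    'type' => 'article',
    'body' => [
        'query' => ['match' => ['title' => 'elasticsearch']],
        'sort' => [['updated_at' => ['order' => 'desc']]],
    ],
];
```

With elasticsearch-php, a request like this would be passed to $client->search($searchParams). Later Elasticsearch versions replaced the string type with text/keyword and dropped _all and mapping types, so the exact options depend on the server version.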
