PHP's simple word segmentation library
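<?php
header("Content-Type:text/html; charset=utf-8");
// Normalize Windows backslashes so the paths below work on every platform.
define('APP_ROOT', str_replace('\\', '/', dirname(__FILE__)));

// Placeholder input: $con was not defined in the original snippet.
// Replace it with the text you want to segment.
$con = 'Text to segment goes here';

// Returns the top 5 keywords of $title as an array, using PSCWS4 (scws).
function get_tags_arr($title)
{
    // require_once, so a second call does not redeclare the class.
    require_once(APP_ROOT.'/pscws4.class.php');
    $pscws = new PSCWS4();
    $pscws->set_dict(APP_ROOT.'/scws/dict.utf8.xdb');
    $pscws->set_rule(APP_ROOT.'/scws/rules.utf8.ini');
    $pscws->set_ignore(true);
    $pscws->send_text($title);
    $words = $pscws->get_tops(5);
    $tags = array();
    foreach ($words as $val) {
        $tags[] = $val['word'];
    }
    $pscws->close();
    return $tags;
}
print_r(get_tags_arr($con));

// Returns the segmentation result of $content as a string, using phpanalysis.
function get_keywords_str($content)
{
    require_once(APP_ROOT.'/phpanalysis.class.php');
    PhpAnalysis::$loadInit = false;
    $pa = new PhpAnalysis('utf-8', 'utf-8', false);
    $pa->LoadDict();
    $pa->SetSource($content);
    $pa->StartAnalysis(false);
    $tags = $pa->GetFinallyResult();
    return $tags;
}
print(get_keywords_str($con));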

No PHP extension needs to be installed: the package ships with its own dictionaries and is easy to use.
It bundles two segmenters: scws (through the pscws4.class.php class), which many people are familiar with, and phpanalysis, written by IT Plato.
See the example in index.php for usage.
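
To make the shape of a "top N words" result more concrete, here is a minimal sketch that ranks words that have already been segmented and returns entries carrying a 'word' key, which is what the foreach loop in get_tags_arr() reads. The function name top_words_sketch, the 'times' key and the sample word list are assumptions made for illustration. The ranking uses raw frequency only and is not the weighting scws applies in get_tops().

```php
<?php
// Hypothetical helper, not part of scws or phpanalysis.
// Ranks already-segmented words by how often they occur.
function top_words_sketch(array $words, int $limit = 5): array
{
    $counts = array_count_values($words); // word => number of occurrences
    arsort($counts);                      // highest count first, keys kept
    $top = [];
    foreach (array_slice($counts, 0, $limit, true) as $word => $times) {
        // Cast the key back to string: numeric words become integer keys.
        $top[] = ['word' => (string) $word, 'times' => $times];
    }
    return $top;
}

// Illustrative input: the output a segmenter might produce for a short text.
$segmented = ['php', 'word', 'segmentation', 'php', 'library', 'word', 'php'];

// Same post-processing as get_tags_arr(): keep only the 'word' field.
$tags = array_column(top_words_sketch($segmented, 2), 'word');
print_r($tags);
```

The result can be passed on as a tag list in the same way as the return value of get_tags_arr(), for example joined with implode(',', $tags) for a keywords field.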

