Use PHP extension trie_filter to filter Chinese sensitive words

1. Install libiconv, which is a dependency of libdatrie

wget http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.14.tar.gz 
tar zxvf libiconv-1.14.tar.gz 
cd libiconv-1.14
./configure 
make 
make install

2. Install libdatrie (download it from http://linux.thai.net/~thep/datrie/datrie.html#Download)

tar zxf libdatrie-0.2.4.tar.gz   
cd libdatrie-0.2.4  
./configure --prefix=/usr/local   
make   
make install

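If make fails with the error trietool.c:125: undefined reference to `libiconv', the linker cannot find the libiconv library installed in step 1.

The solution is to rerun configure with the library path and link flag, then run make again: ./configure LDFLAGS=-L/usr/local/lib LIBS=-liconv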

3. Install the trie_filter extension

The official trie_filter extension does not handle Chinese very well, so I used an extension found on GitHub that is a rewrite of the official one. It worked without problems in my testing.

Download the source code from https://github.com/wulijun/php-ext-trie-filter, then build it from the source directory:

phpize
./configure --with-php-config=/usr/local/bin/php-config 
make
make install

4. Edit php.ini, add the line extension=trie_filter.so, and restart PHP.

Check phpinfo() to confirm that the trie_filter extension is listed.

[Figure: phpinfo output showing the trie_filter extension]
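The same check can be made from the command line. The sketch below is only an illustration: the function name check_trie_filter is made up here, and it inspects the command-line PHP binary, which may read a different php.ini than the web server does (php --ini lists the files in use).

```shell
# Report whether the trie_filter module is visible to the command-line php binary.
# check_trie_filter is a hypothetical helper name used only for this sketch.
check_trie_filter() {
    if ! command -v php >/dev/null 2>&1; then
        echo "php not found on PATH"
        return 2
    fi
    # `php -m` prints the name of each loaded module on its own line.
    modules=$(php -m 2>/dev/null)
    if printf '%s\n' "$modules" | grep -qix 'trie_filter'; then
        echo "trie_filter is loaded"
        return 0
    fi
    echo "trie_filter is NOT loaded"
    return 1
}

status=0
check_trie_filter || status=$?
```

A status of 0 means the module is loaded, 1 means php ran but did not list it, and 2 means no php binary was found.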

5. Generate the dictionary used for word detection. The command that compiles a dictionary is not included in the source package downloaded above, so you also need to download the official source package:

(https://code.google.com/p/as3chat/downloads/detail?name=trie_filter-2011-03-21.tar.gz)

tar zxf trie_filter-2011.03.21.tar.gz
cd trie_filter-2011.03.21
gcc -o dpp dpp.c -ldatrie   # build the dpp command, which compiles dictionaries
./dpp words.txt words.dic   # compile words.txt into the dictionary used by trie_filter; words.txt holds one word per line
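As a minimal sketch of preparing the input for dpp, the commands below write a sample word list with one word per line and clean it up. The sample words, the file name words.clean.txt, and the temporary directory are placeholders for illustration. The source does not state which text encoding dpp expects, so treat the encoding of your own list as something to verify.

```shell
# Work in a scratch directory so nothing in the source tree is overwritten.
workdir=$(mktemp -d)

# Write a sample word list: one word per line (placeholder entries).
printf '%s\n' '敏感词' '违禁词' 'badword' > "$workdir/words.txt"

# Drop blank lines and duplicate entries before compiling.
grep -v '^[[:space:]]*$' "$workdir/words.txt" | sort -u > "$workdir/words.clean.txt"

# Count the entries that would be handed to dpp.
word_count=$(wc -l < "$workdir/words.clean.txt" | tr -d ' ')
echo "words ready to compile: $word_count"
```

The cleaned file could then be passed to dpp in place of words.txt.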

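If dpp fails with the error ./dpp: error while loading shared libraries: libdatrie.so.1: cannot open shared object file: No such file or directory, the dynamic linker has not yet picked up the newly installed libdatrie.

Solution: refresh the shared library cache by running (as root)

ldconfig

and then run dpp again:

./dpp words.txt words.dic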
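To see whether ldconfig has picked up the library, the linker cache can be inspected with ldconfig -p. This is a rough sketch, and the variable name datrie_cached is illustrative only. If the result is "no" even after running ldconfig, one possible cause is that /usr/local/lib is not listed in /etc/ld.so.conf or a file under /etc/ld.so.conf.d on your system.

```shell
# Report whether the dynamic linker cache lists libdatrie.
if command -v ldconfig >/dev/null 2>&1; then
    # `ldconfig -p` prints the libraries currently stored in the cache.
    if ldconfig -p 2>/dev/null | grep -q 'libdatrie\.so'; then
        datrie_cached=yes
    else
        datrie_cached=no
    fi
else
    # ldconfig is often installed in /sbin, which may not be on a normal user's PATH.
    datrie_cached=unknown
fi
echo "libdatrie in linker cache: $datrie_cached"
```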

6. Test:

<?php
/**
 * trie_filter sensitive-word filtering example
 */

// Load the dictionary: returns a Trie_Filter resource handle on success, NULL on failure
$file = trie_filter_load('./words.dic');
var_dump($file);
$str1 = '今天利用trie_filter做敏感词过滤示例';
$str2 = '今天利用trie_filter做过滤示例';
// Check whether the text contains sensitive words defined in the dictionary
// (assuming the dictionary contains the word '敏感词')
$res1 = trie_filter_search_all($file, $str1);  // finds all sensitive words in one pass
$res2 = trie_filter_search($file, $str2);      // finds only one sensitive word per call
var_dump($res1);
echo "<br/>";
var_dump($res2);
trie_filter_free($file); // don't forget to call free at the end

PHP 5.3.3 or later is recommended; I used 5.3.3.
