百度百科的关键词链接是怎样实现的呢-PHP Tutorial-php.cn

Home

Backend Development

PHP Tutorial

百度百科的关键词链接是怎样实现的呢

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB

Jun 23, 2016 pm 02:14 PM

关键字链接

百度百科的关键词带有链接。我在想少量关键词，只需要简单替换成链接就可以。可是百度的关键词是狠多的，可能成万上千万个。如果替换上万次，那效率也太低了吧。请教这样的功能是怎样实现的呢？谢谢！

附截图：

回复讨论(解决方案)

我也想知道。

百科的关键词是按类别相关性来分配的，所以不会有很多个关键词。
另外你感觉可能要调用replace函数很多次，这只是PHP的正常实现方式。实际上，用C语言来遍历一次整篇文章即可，这个效率还是远远超过PHP的实现方式的。

dream1206 如果一个类别的关键字有一万个一篇文章替换一万次；你认为合理不？

dream1206 如果一个类别的关键字有一万个一篇文章替换一万次；你认为合理不？你还没明白我的意思，如果算法得当，只需要遍历一次整篇文章。
替换只是针对文章中的某个字符串，已经检查过的内容并不需要再去检查，明白吗？
当然如果考虑到其它因素，例如关键词冲突例如研究,研究生这个功能还是蛮复杂的

我也想知道啊，老师现在逼着我做啊，不会。。

少量的关键词 php有个strtr函数

dream1206 如果一个类别的关键字有一万个一篇文章替换一万次；你认为合理不？
当然不合理！
但是你为什么不反过来做呢？
抄写一遍文章，对于文章中的每一个词去检查是否在关键词集合中，不就快多了吗？

记得我发过基于 trie 的关键词匹配代码

引用 3 楼 anydy2008 的回复:dream1206    如果一个类别的关键字有一万个  一篇文章替换一万次；你认为合理不？
当然不合理！
但是你为什么不反过来做呢？
抄写一遍文章，对于文章中的每一个词去检查是否在关键词集合中，不就快多了吗？

记得我发过基于 trie 的关键词匹配代码

版主  但我怎么可以知道文章里的是词语呢。
比如：

文章  秦始皇东巡洛阳

关键词集合  秦始皇  洛阳

程序是不知道应该将文章的  秦始皇在关键词中也匹配，因为它不知道“秦始皇”是个词呢。

这就只能说中文的自身的问题了，比如魔兽世界经典的黑色魔纹胸甲，断句失败就是黑/色魔/纹胸/甲

好吧，我再发一遍

include 'TTrie.php';class wordkey extends TTrie {  function b() {    $t = array_pop($this->buffer);    $this->buffer[] = "<b>$t</b>";  }}$p = new wordkey;$p->set('秦始皇', 'b');$p->set('洛阳', 'b');$t = $p->match('秦始皇东巡洛阳');echo join('', $t);

秦始皇东巡洛阳

TTrie.php

class TTrie {  protected $buffer = array();  protected $dict = array( array() );  protected $input = 0; //字符串当前偏移  protected $backtracking = 0; //字符串回溯位置  public $debug = 0;  public $savematch = 1;  function set($word, $action='') {	if(is_array($word)) {		foreach($word as $k=>$v) $this->set($k, $v);		return;	}	$p = count($this->dict);	$cur = 0; //当前节点号	foreach(str_split($word) as $c) {		if (isset($this->dict[$cur][$c])) { //已存在就下移			$cur = $this->dict[$cur][$c];			continue;		}		$this->dict[$p]= array(); //创建新节点		$this->dict[$cur][$c] = $p; //在父节点记录子节点号		$cur = $p; //把当前节点设为新插入的		$p++;	}	$this->dict[$cur]['acc'] = $action; //一个词结束，标记叶子节点  }  function getto($ch) {	$i =& $this->input; //字符串当前偏移	$p =& $this->backtracking; //字符串回溯位置	$len = strlen($this->doc);	$t = '';	$this->input++;//	while($this->input<$len && $this->doc{$this->input} != $ch) $t .= $this->doc{$this->input++};//	$t .= $this->doc{$this->input++};	do {		if($this->input >= $len) break;		$t .= $this->doc{$this->input};	}while($this->doc{$this->input++} != $ch);	return trim($t);  }	  function match($s) {	$this->doc =& $s;	$this->buffer = array();	$ret = array();	$cur = 0; //当前节点，初始为根节点	$i =& $this->input; //字符串当前偏移	$p =& $this->backtracking; //字符串回溯位置	$i = $p = 0;	$s .= "\0"; //附加结束符	$len = strlen($s);	$buf = '';	while($i < $len) {		$c = $s{$i};		if(isset($this->dict[$cur][$c])) { //如果存在			$cur = $this->dict[$cur][$c]; //转到对应的位置			if(isset($this->dict[$cur][$s[$i+1]])) {//检查下一个字符是否也能匹配，长度优先				$i++;				continue;			}			if(isset($this->dict[$cur]['acc'])) { //是叶子节点，单词匹配！				if($buf != '') {					$this->buffer[] = $buf;					$buf = '';				}				if($this->savematch) $this->buffer[] = substr($s, $p, $i - $p + 1); //取出匹配位置和匹配的词				$ar = explode(',', $this->dict[$cur]['acc']);				call_user_func_array( array($this, array_shift($ar)), $ar );				$p = $i + 1; //设置下一个回溯位置				$cur = 0; //重置当前节点为根节点			}		} else { //不匹配			$buf .= $s{$p}; //substr($s, $p, $i - $p + 1); //保存未匹配位置和未匹配的内容			$cur = 0; //重置当前节点为根节点			$i = $p; //把当前偏移设为回溯位置			$p = $i + 1; //设置下一个回溯位置		}		$i++; //下一个字符	}	if(trim($buf, "\0")) $this->buffer[] = trim($buf, "\0");	return $this->buffer;  }  function __call($method, $param) {	if($this->debug) printf("偏移:%d 回溯:%d\n", $this->input, $this->backtracking);  }}

传说中的 PHP文字高亮，很好的class啊……

mark 我是来学习的

Statement

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

The Continued Use of PHP: Reasons for Its EnduranceApr 19, 2025 am 12:23 AM

What’s still popular is the ease of use, flexibility and a strong ecosystem. 1) Ease of use and simple syntax make it the first choice for beginners. 2) Closely integrated with web development, excellent interaction with HTTP requests and database. 3) The huge ecosystem provides a wealth of tools and libraries. 4) Active community and open source nature adapts them to new needs and technology trends.

PHP and Python: Exploring Their Similarities and DifferencesApr 19, 2025 am 12:21 AM

PHP and Python are both high-level programming languages that are widely used in web development, data processing and automation tasks. 1.PHP is often used to build dynamic websites and content management systems, while Python is often used to build web frameworks and data science. 2.PHP uses echo to output content, Python uses print. 3. Both support object-oriented programming, but the syntax and keywords are different. 4. PHP supports weak type conversion, while Python is more stringent. 5. PHP performance optimization includes using OPcache and asynchronous programming, while Python uses cProfile and asynchronous programming.

PHP and Python: Different Paradigms ExplainedApr 18, 2025 am 12:26 AM

PHP is mainly procedural programming, but also supports object-oriented programming (OOP); Python supports a variety of paradigms, including OOP, functional and procedural programming. PHP is suitable for web development, and Python is suitable for a variety of applications such as data analysis and machine learning.

PHP and Python: A Deep Dive into Their HistoryApr 18, 2025 am 12:25 AM

PHP originated in 1994 and was developed by RasmusLerdorf. It was originally used to track website visitors and gradually evolved into a server-side scripting language and was widely used in web development. Python was developed by Guidovan Rossum in the late 1980s and was first released in 1991. It emphasizes code readability and simplicity, and is suitable for scientific computing, data analysis and other fields.

Choosing Between PHP and Python: A GuideApr 18, 2025 am 12:24 AM

PHP is suitable for web development and rapid prototyping, and Python is suitable for data science and machine learning. 1.PHP is used for dynamic web development, with simple syntax and suitable for rapid development. 2. Python has concise syntax, is suitable for multiple fields, and has a strong library ecosystem.

PHP and Frameworks: Modernizing the LanguageApr 18, 2025 am 12:14 AM

PHP remains important in the modernization process because it supports a large number of websites and applications and adapts to development needs through frameworks. 1.PHP7 improves performance and introduces new features. 2. Modern frameworks such as Laravel, Symfony and CodeIgniter simplify development and improve code quality. 3. Performance optimization and best practices further improve application efficiency.

PHP's Impact: Web Development and BeyondApr 18, 2025 am 12:10 AM

PHPhassignificantlyimpactedwebdevelopmentandextendsbeyondit.1)ItpowersmajorplatformslikeWordPressandexcelsindatabaseinteractions.2)PHP'sadaptabilityallowsittoscaleforlargeapplicationsusingframeworkslikeLaravel.3)Beyondweb,PHPisusedincommand-linescrip

How does PHP type hinting work, including scalar types, return types, union types, and nullable types?Apr 17, 2025 am 12:25 AM

PHP type prompts to improve code quality and readability. 1) Scalar type tips: Since PHP7.0, basic data types are allowed to be specified in function parameters, such as int, float, etc. 2) Return type prompt: Ensure the consistency of the function return value type. 3) Union type prompt: Since PHP8.0, multiple types are allowed to be specified in function parameters or return values. 4) Nullable type prompt: Allows to include null values and handle functions that may return null values.

See all articles