Home  >  Article  >  Backend Development  >  Make your own on-site search engine_PHP tutorial

Make your own on-site search engine_PHP tutorial

WBOY
WBOYOriginal
2016-07-21 16:09:16892browse

ccterran(原作)

作者:iwind

  朋友用dreamweaver做了一个网站,没有动态的内容,只是一些个人收藏的文章,个人介绍等等。现在内容比较多了,想叫我帮他做一个搜索引擎。说实在的,这是一个不难的问题,于是就随手做了一个。现在我在其它论坛上也看到有人想做这个,于是就想说说这方面的知识,重在了解一下方法。

写程序前先要想好一个思路,下面是我的思路,可能谁有更好的,但注意这只是一个方法问题 :遍历所有文件  读取内容  搜索关键字,如果匹配就放入一个数组  读数组。在实现这些步骤之前,我假定你的网页都是标准的,就是有标题(),也有(),如果你是用dreamweaver或者frontpage设计的,那么除非你故意删掉,它们都在存在的。下面就让我们一步步来完成并在工程中改善这个搜索引擎。

一,设计搜索表单
在网站的根目录下建个search.htm,内容如下


搜索表单



 
   
     
     
   
 

       

         
       

     

       
     




二,搜索程序
再在根目录下建个search.php 的文件,用来处理search.htm表单传过来的数据.内容如下
//获取搜索关键字
$keyword=trim($_POST[“keyword”]);
//检查是否为空
if($keyword==””){
echo”您要搜索的关键字不能为空”;
exit;//结束程序
}
?>

这样如果访问者输入的关键字为空时,可以做出提示。下面是遍历所有文件。

我们可以用递归的方法遍历所有的文件,可以用函数opendir,readdir,也可以用PHP Directory的类。我们现在用前者.
//遍历所有文件的函数
function listFiles($dir){
$handle=opendir($dir);
while(false!==($file=readdir($handle))){
if($file!="."&&$file!=".."){
//如果是目录就继续搜索
if(is_dir("$dir/$file")){
listFiles("$dir/$file");
}
else{
//在这里进行处理
}
}
}
}

?>

在红字的地方我们可以对搜索到的文件进行读取,处理.下面就是读取文件内容,并检查内容中是否含有关键字$keyword,如果含有就把文件地址赋给一个数组。
//$dir是搜索的目录,$keyword是搜索的关键字 ,$array是存放的数组
function listFiles($dir,$keyword,&$array){
$handle=opendir($dir);
while(false!==($file=readdir($handle))){
if($file!="."&&$file!=".."){
if(is_dir("$dir/$file")){
listFiles("$dir/$file",$keyword,$array);
}
else{
//读取文件内容
$data=fread(fopen("$dir/$file","r"),filesize("$dir/$file"));
//不搜索自身
if($file!=”search.php”){
//是否匹配
if(eregi("$keyword",$data)){
$array[]="$dir/$file";
}
}
}
}
}
}
//定义数组$array
$array=array();
//执行函数
listFiles(".","php",$array);
//打印搜索结果
foreach($array as $value){
echo "$value"."
\n";
}
?>

现在把这个结果和开头的一段程序结合起来,输入一个关键字,然后就会发现你的网站中的相关结果都被搜索出来了。我们现在在把它完善一下。
1,列出内容的标题

                          if(eregi("$keyword",$data)){
                  $array[]="$dir/$file";
                          }
改成
                          if(eregi("$keyword",$data)){
                                   if(eregi("(.+)",$data,$m)){
                        $title=$m["1"];
                                   }
                                   else{
                        $title="没有标题";
                                   }
                                   $array[]="$dir/$file $title";
                           }
原理就是,如果在文件内容中找到xxx,那么就把xxx取出来作为标题,如果找不到那么就把标题命名未”没有标题”.

2,只搜索网页的内容的主题部分。
做网页时一定会有很多html代码在里面,而这些都不是我们想要搜索的,所以要去除它们。我现在用正则表达式和strip_tags的配合,并不能把所有的都去掉。

            $data=fread(fopen("$dir/$file","r"),filesize("$dir/$file"));
            //不搜索自身
            if($file!=”search.php”){
              //是否匹配
                          if(eregi("$keyword",$data)){
改为
$data=fread(fopen("$dir/$file","r"),filesize("$dir/$file"));
           if(eregi("]+)>(.+)",$data,$b)){
                 $body=strip_tags($b["2"]);
                        }
                        else{
                 $body=strip_tags($data);
                        }
                        if($file!="search.php"){
                            if(eregi("$keyword",$body)){

3,标题上加链接
foreach($array as $value){
   echo "$value"."
\n";
}
改成
foreach($array as $value){
   //拆开
   list($filedir,$title)=split(“[ ]”,$value,”2”);
   //输出
   echo "$value"."
\n";
}
4防止超时
如果文件比较多,那么防止PHP执行时间超时是必要的。可以在文件头加上
set_time_limit(“600”);
以秒为单位,所以上面是设10分钟为限。


So the complete program is
set_time_limit("600");
//Get the search keyword
$keyword=trim($_POST["keyword "]);
//Check if it is empty
if($keyword==""){
echo "The keyword you want to search cannot be empty";
exit;//End Program
}
function listFiles($dir,$keyword,&$array){
$handle=opendir($dir);
while(false!==($file=readdir($ handle))){
If($file!="."&&$file!=".."){
if(is_dir("$dir/$file")){
listFiles( "$dir/$file",$keyword,$array);
                                                                                                                                                           . $dir/$file"));
If(eregi("]+)>(.+)",$data,$b)){
                $body=strip_tags($b["2"]);                                                   $body=strip_tags($data);
                                                                                    "search.php"){
if(eregi("$keyword",$body)){
if(eregi("(.+)</title&g t;",$data, $m)){<BR>                                                                                           else{<BR> $title="No title";<BR> }<BR> $array[] ="$dir/$file $title";<BR>                                                                                                        🎜>}<BR>$array=array();<BR>listFiles( ".","$keyword",$array);<BR>foreach($array as $value){<BR> //Split<BR> list($filedir,$title)=split("[ ]" ,$value,"2");<BR> //Output<BR> echo "<a href=$filedir target=_blank>$title </a>"."<br>n";<br>}<br>?><br><br>So far, you have built your own search engine. You can also improve it by modifying the content processing part. You can search for titles or search for content. Function. Also consider pagination. Keep this to yourself. <br><br>Here is a description of using preg_match instead of eregi, which will be much faster. This is just for easy understanding, so the commonly used eregi is used. <br><br> <br> <br></p> <p align="left"></p> <div style="display:none;"> <span id="url" itemprop="url">http://www.bkjia.com/PHPjc/314558.html</span><span id="indexUrl" itemprop="indexUrl">www.bkjia.com</span><span id="isOriginal" itemprop="isOriginal">true</span><span id="isBasedOnUrl" itemprop="isBasedOnUrl">http: //www.bkjia.com/PHPjc/314558.html</span><span id="genre" itemprop="genre">TechArticle</span><span id="description" itemprop="description">ccterran (original work) Author: iwind A friend made a website using Dreamweaver. There is no dynamic content, just some personal collections. articles, personal introductions, etc. There is more content now...</span> </div> <div class="art_confoot"></div></div><div class="nphpQianMsg"><div class="clear"></div></div><div class="nphpQianSheng"><span>Statement:</span><div>The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn</div></div></div><div class="nphpSytBox"><span>Previous article:<a class="dBlack" title="Generation of Pinyin Code Table_PHP Tutorial" href="http://m.php.cn/faq/312307.html">Generation of Pinyin Code Table_PHP Tutorial</a></span><span>Next article:<a class="dBlack" title="Generation of Pinyin Code Table_PHP Tutorial" href="http://m.php.cn/faq/312309.html">Generation of Pinyin Code Table_PHP Tutorial</a></span></div><div class="nphpSytBox2"><div class="nphpZbktTitle"><h2>Related articles</h2><em><a href="http://m.php.cn/article.html" class="bBlack"><i>See more</i><b></b></a></em><div class="clear"></div></div><ul class="nphpXgwzList"><li><b></b><a href="http://m.php.cn/faq/1.html" title="How to use cURL to implement Get and Post requests in PHP" class="aBlack">How to use cURL to implement Get and Post requests in PHP</a><div class="clear"></div></li><li><b></b><a href="http://m.php.cn/faq/1.html" title="How to use cURL to implement Get and Post requests in PHP" class="aBlack">How to use cURL to implement Get and Post requests in PHP</a><div class="clear"></div></li><li><b></b><a href="http://m.php.cn/faq/1.html" title="How to use cURL to implement Get and Post requests in PHP" class="aBlack">How to use cURL to implement Get and Post requests in PHP</a><div class="clear"></div></li><li><b></b><a href="http://m.php.cn/faq/1.html" title="How to use cURL to implement Get and Post requests in PHP" class="aBlack">How to use cURL to implement Get and Post requests in PHP</a><div class="clear"></div></li><li><b></b><a href="http://m.php.cn/faq/2.html" title="All expression symbols in regular expressions (summary)" class="aBlack">All expression symbols in regular expressions (summary)</a><div class="clear"></div></li></ul></div></div><div class="nphpFoot"><div class="nphpFootBg"><ul class="nphpFootMenu"><li><a href="http://m.php.cn/"><b class="icon1"></b><p>Home</p></a></li><li><a href="http://m.php.cn/course.html"><b class="icon2"></b><p>Course</p></a></li><li><a href="http://m.php.cn/wenda.html"><b class="icon4"></b><p>Q&A</p></a></li><li><a href="http://m.php.cn/login"><b class="icon5"></b><p>My</p></a></li><div class="clear"></div></ul></div></div><div class="nphpYouBox" style="display: none;"><div class="nphpYouBg"><div class="nphpYouTitle"><span onclick="$('.nphpYouBox').hide()"></span><a href="http://m.php.cn/"></a><div class="clear"></div></div><ul class="nphpYouList"><li><a href="http://m.php.cn/"><b class="icon1"></b><span>Home</span><div class="clear"></div></a></li><li><a href="http://m.php.cn/course.html"><b class="icon2"></b><span>Course</span><div class="clear"></div></a></li><li><a href="http://m.php.cn/article.html"><b class="icon3"></b><span>Article</span><div class="clear"></div></a></li><li><a href="http://m.php.cn/wenda.html"><b class="icon4"></b><span>Q&A</span><div class="clear"></div></a></li><li><a href="http://m.php.cn/dic.html"><b class="icon6"></b><span>Dictionary</span><div class="clear"></div></a></li><li><a href="http://m.php.cn/course/type/99.html"><b class="icon7"></b><span>Manual</span><div class="clear"></div></a></li><li><a href="http://m.php.cn/xiazai/"><b class="icon8"></b><span>Download</span><div class="clear"></div></a></li><li><a href="http://m.php.cn/faq/zt" title="Topic"><b class="icon12"></b><span>Topic</span><div class="clear"></div></a></li><div class="clear"></div></ul></div></div><div class="nphpDing" style="display: none;"><div class="nphpDinglogo"><a href="http://m.php.cn/"></a></div><div class="nphpNavIn1"><div class="swiper-container nphpNavSwiper1"><div class="swiper-wrapper"><div class="swiper-slide"><a href="http://m.php.cn/" >Home</a></div><div class="swiper-slide"><a href="http://m.php.cn/article.html" class="hover">Article</a></div><div class="swiper-slide"><a href="http://m.php.cn/wenda.html" >Q&A</a></div><div class="swiper-slide"><a href="http://m.php.cn/course.html" >Course</a></div><div class="swiper-slide"><a href="http://m.php.cn/faq/zt" >Topic</a></div><div class="swiper-slide"><a href="http://m.php.cn/xiazai" >Download</a></div><div class="swiper-slide"><a href="http://m.php.cn/game" >Game</a></div><div class="swiper-slide"><a href="http://m.php.cn/dic.html" >Dictionary</a></div><div class="clear"></div></div></div><div class="langadivs" ><a href="javascript:;" class="bg4 bglanguage"></a><div class="langadiv" ><a onclick="javascript:setlang('zh-cn');" class="language course-right-orders chooselan " href="javascript:;"><span>简体中文</span><span>(ZH-CN)</span></a><a onclick="javascript:;" class="language course-right-orders chooselan chooselanguage" href="javascript:;"><span>English</span><span>(EN)</span></a><a onclick="javascript:setlang('zh-tw');" class="language course-right-orders chooselan " href="javascript:;"><span>繁体中文</span><span>(ZH-TW)</span></a><a onclick="javascript:setlang('ja');" class="language course-right-orders chooselan " href="javascript:;"><span>日本語</span><span>(JA)</span></a><a onclick="javascript:setlang('ko');" class="language course-right-orders chooselan " href="javascript:;"><span>한국어</span><span>(KO)</span></a><a onclick="javascript:setlang('ms');" class="language course-right-orders chooselan " href="javascript:;"><span>Melayu</span><span>(MS)</span></a><a onclick="javascript:setlang('fr');" class="language course-right-orders chooselan " href="javascript:;"><span>Français</span><span>(FR)</span></a><a onclick="javascript:setlang('de');" class="language course-right-orders chooselan " href="javascript:;"><span>Deutsch</span><span>(DE)</span></a></div></div><script> var swiper = new Swiper('.nphpNavSwiper1', { slidesPerView : 'auto', observer: true,//修改swiper自己或子元素时,自动初始化swiper observeParents: true,//修改swiper的父元素时,自动初始化swiper }); </script></div></div><!--顶部导航 end--><script>isLogin = 0;</script><script type="text/javascript" src="/static/layui/layui.js"></script><script type="text/javascript" src="/static/js/global.js?4.9.47"></script></div><script src="https://vdse.bdstatic.com//search-video.v1.min.js"></script><link rel='stylesheet' id='_main-css' href='/static/css/viewer.min.css' type='text/css' media='all'/><script type='text/javascript' src='/static/js/viewer.min.js?1'></script><script type='text/javascript' src='/static/js/jquery-viewer.min.js'></script><script>jQuery.fn.wait = function (func, times, interval) { var _times = times || -1, //100次 _interval = interval || 20, //20毫秒每次 _self = this, _selector = this.selector, //选择器 _iIntervalID; //定时器id if( this.length ){ //如果已经获取到了,就直接执行函数 func && func.call(this); } else { _iIntervalID = setInterval(function() { if(!_times) { //是0就退出 clearInterval(_iIntervalID); } _times <= 0 || _times--; //如果是正数就 -- _self = $(_selector); //再次选择 if( _self.length ) { //判断是否取到 func && func.call(_self); clearInterval(_iIntervalID); } }, _interval); } return this; } $("table.syntaxhighlighter").wait(function() { $('table.syntaxhighlighter').append("<p class='cnblogs_code_footer'><span class='cnblogs_code_footer_icon'></span></p>"); }); $(document).on("click", ".cnblogs_code_footer",function(){ $(this).parents('table.syntaxhighlighter').css('display','inline-table');$(this).hide(); }); $('.nphpQianCont').viewer({navbar:true,title:false,toolbar:false,movable:false,viewed:function(){$('img').click(function(){$('.viewer-close').trigger('click');});}}); </script></body></html>