
How to record spider crawling logs in ThinkPHP6

藏色散人
2021-12-10 14:32:03

This tutorial from the ThinkPHP framework column shows how to record visits from search engine spiders such as Baidu in ThinkPHP 6. I hope it helps anyone who needs it.

Recording Baidu spider logs in ThinkPHP 6:

Add the following code to the parent class of your controllers, for example IndexBase. All front-end controllers inherit from this controller, so its initialize() method runs on every front-end request.

    public function initialize()
    {
        parent::initialize();
        if ($this->Config['web_status'] == 0) {  // Check whether the site has been switched off
            die('The website is closed');
        }
        $this->baiduLog();
    }
    protected function baiduLog()
    {
        $useragent = strtolower($_SERVER['HTTP_USER_AGENT'] ?? ''); // the header may be missing
        $url = $this->request->controller() . "/" . $this->request->action();
        $param = input("param.","","htmlspecialchars");
        $url = (string) url($url,$param);
        $ip = get_real_ip();
        $title = "";
        if (strpos($useragent, 'googlebot') !== false){
            $title =  'Google';
        } elseif (strpos($useragent, 'baiduspider') !== false){
            $title =  'Baidu';
        } elseif (strpos($useragent, 'bingbot') !== false || strpos($useragent, 'msnbot') !== false){
            $title =  'Bing';
        } elseif (strpos($useragent, 'slurp') !== false){
            $title =  'Yahoo';
        } elseif (strpos($useragent, 'sosospider') !== false){
            $title =  'Soso';
        } elseif (strpos($useragent, 'sogou spider') !== false){
            $title =  'Sogou';
        } elseif (strpos($useragent, 'yodaobot') !== false){
            $title =  'Yodao';
        } else {
            // $title = $useragent; // Uncomment to log every visit, if the extra data volume is acceptable
        }
        if (!empty($title)) {
            BaiduLog::create(["title"=>$title,"href"=>$url,"ip"=>$ip]);
        }
    }

That is how ThinkPHP 6 can record crawl logs for Baidu and the other spiders listed above.
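The long if/elseif chain can also be written as a lookup table, which makes it harder to repeat a branch by accident. The following is only a minimal sketch in plain PHP: the function name detectSpider() and the $signatures table are hypothetical and are not part of ThinkPHP or of the original code. The substrings are the ones used in the article, plus bingbot, the token used by Bing's current crawler.

```php
<?php

/**
 * Map a User-Agent string to a search engine label, or '' when nothing matches.
 * detectSpider() is an illustrative name, not a ThinkPHP API.
 */
function detectSpider(string $userAgent): string
{
    // Lowercase substring => label. Order matters only if two substrings could overlap.
    $signatures = [
        'googlebot'    => 'Google',
        'baiduspider'  => 'Baidu',
        'bingbot'      => 'Bing',
        'msnbot'       => 'Bing',
        'slurp'        => 'Yahoo',
        'sosospider'   => 'Soso',
        'sogou spider' => 'Sogou',
        'yodaobot'     => 'Yodao',
    ];

    $userAgent = strtolower($userAgent);
    foreach ($signatures as $needle => $label) {
        if (strpos($userAgent, $needle) !== false) {
            return $label;
        }
    }

    return '';
}

// Example: a typical Baiduspider User-Agent string.
echo detectSpider('Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)'), PHP_EOL; // prints "Baidu"
```

Inside baiduLog(), the whole chain could then shrink to something like $title = detectSpider($useragent), with the rest of the method unchanged.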

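get_real_ip() is a custom function that returns the visitor's real IP address. BaiduLog is the model class used to store each log record, so it has to be imported into the controller with a use statement.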
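The article does not show get_real_ip(), so the version below is only a hypothetical sketch of what such a helper might look like. It assumes the site may sit behind a reverse proxy that sets X-Forwarded-For or X-Real-IP. The optional $server parameter is an addition for illustration, so the function can be tried outside a web request.

```php
<?php

/**
 * Hypothetical sketch of a get_real_ip() helper; the original implementation is not shown.
 * Pass an array to test it, or call it without arguments to read $_SERVER.
 */
function get_real_ip(?array $server = null): string
{
    $server = $server ?? $_SERVER;

    // Headers that proxies commonly set. Clients can forge them, so only
    // rely on them when requests really arrive through your own proxy.
    foreach (['HTTP_X_FORWARDED_FOR', 'HTTP_X_REAL_IP', 'HTTP_CLIENT_IP'] as $key) {
        if (empty($server[$key])) {
            continue;
        }
        // X-Forwarded-For may hold "client, proxy1, proxy2"; take the first entry.
        $candidate = trim(explode(',', $server[$key])[0]);
        if (filter_var($candidate, FILTER_VALIDATE_IP) !== false) {
            return $candidate;
        }
    }

    // Fall back to the address of the direct connection.
    $remote = $server['REMOTE_ADDR'] ?? '';
    return filter_var($remote, FILTER_VALIDATE_IP) !== false ? $remote : '0.0.0.0';
}

echo get_real_ip([
    'HTTP_X_FORWARDED_FOR' => '203.0.113.7, 10.0.0.1',
    'REMOTE_ADDR'          => '10.0.0.1',
]), PHP_EOL; // prints "203.0.113.7"
```

Because forwarded headers are easy to spoof, a logged IP should not be treated as proof that a visit came from a real search engine. ThinkPHP 6's request object also provides an ip() method, which may be enough if no custom logic is needed.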



Statement:
This article is reproduced from phpfv.com. If there is any infringement, please contact admin@php.cn to have it removed.