Teach you step by step how to build a keyword matching project (search engine): Day 3
The third day
Xiao Wang (Operations Director) noticed that Xiao Dingding spent all day searching for keywords on Taobao, Baidu, Rubik's Cube, and Paipai. It ate up hours every day and was very inefficient, so he came to me with that as his pretext.
He said: Xiao Shuaishuai, look at how long Xiao Dingding spends every day searching for keywords on Taobao, Baidu, Rubik's Cube, and Paipai. Could you help me find a way to let the system do it automatically? That would save a lot of manpower and bring big benefits. (Really, it was just to cover for their laziness.)
As soon as Xiao Shuaishuai heard how large the benefits could be, and that Mr. Wang was practically begging him, he had stars in his eyes. This was his chance to prove his worth: solving this problem would show what he was capable of.
Xiao Shuaishuai patted his chest and promised: Mr. Wang, this is a piece of cake. I'll get it done for you right away. (Programmers are so adorable: they love a challenge, accept whatever is asked, and never demand a big reward.)
Xiao Wang patted Xiao Shuaishuai on the shoulder: Good lad, work hard. I'm waiting for your good news.
Xiao Shuaishuai was cheerful for the rest of the day. He never stopped to think about how painful this task would be.
Keyword source example image:
When Xiao Shuaishuai came back down to earth, he realized this project was no ordinary headache.
He had no idea where to start, so he hurried over to Xiao Yu (Technical Director). (It's great to have a technical director: a strong backer, and someone to take the blame.)
He said: Boss Yu, Mr. Wang just gave me this task and I don't know how to approach it. Could you give me some guidance?
Boss Yu took a look and said: You're doing it in PHP, right? It's a bit more involved in PHP. Do you know how to use cURL and parse the HTML DOM?
Xiao Shuaishuai said: Not really. I've never used them, and they sound very advanced.
Boss Yu scoffed: What's so advanced about it? It's simple. Just search Baidu. (The boss is the boss: everything is simple in his eyes, which is why we admire him.)
Boss Yu typed "php curl" into the Baidu search box and pointed Xiao Shuaishuai to the results to study.
php curl manual: http://cn2.php.net/manual/zh/book.curl.php
After reading through the manual, Xiao Shuaishuai dashed off a first attempt:
    # Request the Taobao home page
    $curl = curl_init();
    curl_setopt_array($curl, array(
        CURLOPT_FAILONERROR => false,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => 1,
        CURLOPT_CONNECTTIMEOUT => 15,
        CURLOPT_TIMEOUT => 60,
        //CURLOPT_COOKIESESSION => 1,
        CURLOPT_URL => "http://www.taobao.com"
    ));
    // Execute on the local handle; there is no $this outside a class
    $result = curl_exec($curl);
    echo $result;
Xiao Shuaishuai was delighted to have learned something new. Then he started worrying again: he had fetched the content, but how could he extract the keywords from it?
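One possible direction for that question, sketched under assumptions: PHP's bundled DOM extension (`DOMDocument`) can parse fetched HTML so that text can be pulled out of specific elements. The helper name `extractLinkTexts`, the choice of `<a>` elements, and the sample HTML are all hypothetical; the story does not say which elements actually carry the keywords.

```php
<?php
// Hypothetical helper (not from the original article):
// collect the trimmed, non-empty text of every <a> element in an HTML string.
function extractLinkTexts($html)
{
    $doc = new DOMDocument();

    // Real-world pages are rarely valid HTML, so collect parser
    // warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = array();
    foreach ($doc->getElementsByTagName('a') as $link) {
        $text = trim($link->textContent);
        if ($text !== '') {
            $texts[] = $text;
        }
    }
    return $texts;
}

// Sample markup standing in for a fetched page (assumption).
$sampleHtml = '<html><body>'
    . '<a href="/a">phone case</a>'
    . '<a href="/b"> summer dress </a>'
    . '<a href="/c"></a>'
    . '</body></html>';

print_r(extractLinkTexts($sampleHtml));
```

The sample uses ASCII text only; pages in other encodings may need their charset declared before `loadHTML` reads them correctly.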
Xiao Shuaishuai rushed back to Xiao Yu (Technical Director) to ask for advice.
He said: Boss Yu, I've figured out cURL and fetched the Taobao home page. What should I do next?
Boss Yu glanced at the code and said, displeased: Well, it's written, but why does it look so awkward?
Xiao Shuaishuai was annoyed. He thought his code was well written and easy to use, so what was awkward about it?
Boss Yu dug out some of his old code and tossed it to Xiao Shuaishuai, saying: First understand this code, then rewrite yours using it.
File content:
    /**
     * Wrapper for common cURL operations
     *
     * @author oShine
     */
    class ExtendedCurl
    {
        /**
         * Return JSON content as an object
         */
        const JSON_OBJECT = 0;

        /**
         * Return JSON content as an array
         */
        const JSON_ARRAY = 1;

        /**
         * cURL handle
         *
         * @var resource
         */
        private $curl;

        /**
         * Current (default) cURL options
         *
         * @var array
         */
        private $options = array(
            CURLOPT_FAILONERROR => false,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_FOLLOWLOCATION => 1,
            CURLOPT_CONNECTTIMEOUT => 15,
            CURLOPT_TIMEOUT => 60,
            //CURLOPT_COOKIESESSION => 1,
        );

        /**
         * Error message from the last request
         *
         * @var null|string
         */
        private $error = null;

        /**
         * HTTP status code of the last request
         *
         * @var int
         */
        private $httpCode = null;

        /**
         * @param array $defaultOptions
         * @internal param array $options optional replacement for the default cURL options
         */
        public function __construct(array $defaultOptions = array())
        {
            $this->curl = curl_init();
            if (!empty($defaultOptions)) {
                $this->options = $defaultOptions;
            }
        }

        /**
         * Set multiple cURL options
         *
         * @param array $options
         */
        public function setOptions(array $options)
        {
            foreach ($options as $key => $value) {
                $this->setOption($key, $value);
            }
        }

        /**
         * Set a single cURL option
         *
         * @param $key
         * @param $value
         */
        public function setOption($key, $value)
        {
            $this->options[$key] = $value;
        }

        /**
         * Send a GET request and return the decoded JSON content
         *
         * @param $url
         * @param array $data
         * @param int $type
         * @return null|object|array
         */
        public function getJson($url, array $data = array(), $type = self::JSON_ARRAY)
        {
            $content = $this->get($url, $data);
            return json_decode($content, $type);
        }

        /**
         * Send a GET request
         *
         * @param $url
         * @param array $data
         * @return null|string
         */
        public function get($url, array $data = array())
        {
            if (!empty($data)) {
                // Start a query string, or extend the existing one
                if (false === strpos($url, '?')) {
                    $url .= '?';
                } else {
                    $url .= '&';
                }
                $url .= http_build_query($data);
            }
            $options = array(
                CURLOPT_URL => $url,
            );
            return $this->request($options);
        }

        /**
         * Send a cURL request
         *
         * @param array $options
         * @return mixed
         */
        private function request(array $options = array())
        {
            $this->setOptions($options);
            curl_setopt_array($this->curl, $this->options);
            $result = curl_exec($this->curl);
            $errorNo = curl_errno($this->curl);
            $response = curl_getinfo($this->curl);
            if ($errorNo) {
                $this->error = '[' . $errorNo . '] ' . curl_error($this->curl);
            } else {
                $this->error = null;
            }
            if (isset($response['http_code'])) {
                $this->httpCode = $response['http_code'];
            }
            return $result;
        }

        /**
         * Send a POST request and return the decoded JSON content
         *
         * @param $url
         * @param array $data
         * @param int $return
         * @return null|object|array
         */
        public function postJson($url, array $data = array(), $return = self::JSON_ARRAY)
        {
            $content = $this->post($url, $data);
            return json_decode($content, $return);
        }

        /**
         * Send a POST request
         *
         * @param $url
         * @param array $data
         * @return null|string
         */
        public function post($url, array $data = array())
        {
            $options = array(
                CURLOPT_URL => $url,
                CURLOPT_POST => 1,
            );
            if (!empty($data)) {
                if ($this->isMultiPart($data)) {
                    $options[CURLOPT_POSTFIELDS] = $data;
                } else {
                    $options[CURLOPT_POSTFIELDS] = http_build_query($data);
                }
            }
            return $this->request($options);
        }

        /**
         * Check whether any value uses the "@file" upload prefix
         */
        private function isMultiPart($data)
        {
            foreach ($data as $value) {
                // Guard against empty strings and non-string values
                if (is_string($value) && '' !== $value && '@' == $value[0]) {
                    return true;
                }
            }
            return false;
        }

        /**
         * Check whether the last request had an error
         *
         * @return bool
         */
        public function hasError()
        {
            return null !== $this->error;
        }

        /**
         * Get the error message from the last request
         *
         * @return null|string
         */
        public function getError()
        {
            return $this->error;
        }

        public function getHttpCode()
        {
            return $this->httpCode;
        }
    }
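To see what the `get()` method above does to a URL before it sends anything, the query-string step can be tried on its own without a network connection. This is a minimal sketch: the standalone function name `appendQuery` and the `example.com` URLs are hypothetical, and only the `strpos` / `http_build_query` logic is taken from the class.

```php
<?php
// Hypothetical standalone copy of the URL-building step in ExtendedCurl::get().
function appendQuery($url, array $data = array())
{
    if (!empty($data)) {
        // Use '?' to start a query string, '&' to extend an existing one.
        if (false === strpos($url, '?')) {
            $url .= '?';
        } else {
            $url .= '&';
        }
        // http_build_query URL-encodes keys and values (spaces become '+').
        $url .= http_build_query($data);
    }
    return $url;
}

// example.com stands in for a real search endpoint (assumption).
echo appendQuery('http://example.com/search', array('q' => 'php curl')), "\n";
echo appendQuery('http://example.com/search?page=2', array('q' => 'dom')), "\n";
```

With the class itself, the earlier hand-written snippet would presumably shrink to creating an `ExtendedCurl`, calling `get()` with the Taobao URL, and checking `hasError()` before using the result.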
Xiao Shuaishuai was annoyed and wanted to punch Boss Yu, but he had to yield to authority. He agreed and said: OK, I'll go back and study it first.
Xiao Shuaishuai sulked for a whole day, then went off to train with the secret martial arts manual he had been handed.
Baidu Promotion currently offers three match types: exact, phrase, and broad. Generally speaking, the "golden combination" is broad match + search term report + negative keywords. In terms of traffic, broad > phrase > exact. Choose according to your actual situation.
If you want an exact match for a keyword such as ABC, you need to put quotes around it.
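The difference between the three match types can be illustrated with a small sketch. The rules below are a deliberately simplified assumption, not Baidu's actual matching algorithm: exact means the query equals the keyword, phrase means the keyword's words appear in the query as one contiguous run, and broad means every keyword word appears somewhere in the query. The function name `matchesKeyword` and the sample queries are hypothetical.

```php
<?php
// Hypothetical, simplified model of exact / phrase / broad matching.
function matchesKeyword($query, $keyword, $mode)
{
    // Normalize: lowercase and split on whitespace.
    $queryWords = preg_split('/\s+/', strtolower($query), -1, PREG_SPLIT_NO_EMPTY);
    $keywordWords = preg_split('/\s+/', strtolower($keyword), -1, PREG_SPLIT_NO_EMPTY);
    if (count($keywordWords) === 0) {
        return false;
    }
    $q = implode(' ', $queryWords);
    $k = implode(' ', $keywordWords);

    switch ($mode) {
        case 'exact':
            // The query is the keyword and nothing else.
            return $q === $k;
        case 'phrase':
            // The keyword appears as whole words, in order, without gaps.
            return strpos(' ' . $q . ' ', ' ' . $k . ' ') !== false;
        case 'broad':
            // Every keyword word appears somewhere, in any order.
            return count(array_diff($keywordWords, $queryWords)) === 0;
        default:
            throw new InvalidArgumentException('Unknown match mode: ' . $mode);
    }
}

// Count how many sample queries each mode lets through.
$queries = array('phone case', 'red phone case', 'case for phone', 'laptop bag');
foreach (array('exact', 'phrase', 'broad') as $mode) {
    $hits = 0;
    foreach ($queries as $query) {
        if (matchesKeyword($query, 'phone case', $mode)) {
            $hits++;
        }
    }
    echo $mode, ': ', $hits, "\n";
}
```

Under these simplified rules every exact match is also a phrase match, and every phrase match is also a broad match, which is consistent with the traffic ordering broad > phrase > exact.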