
Analysis of the Baidu search result link?url= parameter (full)

WBOY (Original)
2016-07-21 15:15:38, 5087 views

A few days ago I wrote an article about how to get the target URL behind a Baidu redirect. Searching on Baidu, I found that others have also studied Baidu's link?url= parameter.

Their findings are roughly as follows:

1. The encryption is based on a random value, the input dwell time, and the snapshot address.
2. The whole code appears to consist of three parts: (1) the time of the search; (2) the search keywords; (3) a randomly generated unique identifier.
3. In any environment or browser, the url= value ends with a similar piece of code.
Of these findings, "ends with a similar piece of code" is the most usable, so we will start there.
I searched for "enenba" and found that the URL of my first search result did contain such a code:
http://www.baidu.com/link?url=………… ebac5573358cc3c0659257bfcf54763ec1c5ecff3b3fbd1d4c
Every search result contains the code ebac5573358cc3c0659257bfcf54 (confirmed after N searches).
The trailing 763ec1c5ecff3b3fbd1d4c looks like it corresponds to the real URL of the search result. (I have verified that it is the ciphertext of the real URL.)
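The shared code can also be found mechanically rather than by eye. The following is a minimal sketch, assuming only the three link tails quoted in this article as input; the helper name common_prefix is hypothetical and not part of the original script.

```php
<?php
// Hypothetical helper: longest common prefix of a list of strings.
function common_prefix(array $strings): string
{
    if (count($strings) === 0) {
        return '';
    }
    $prefix = $strings[0];
    foreach ($strings as $s) {
        $max = min(strlen($prefix), strlen($s));
        $i = 0;
        // Advance while both strings agree character by character.
        while ($i < $max && $prefix[$i] === $s[$i]) {
            $i++;
        }
        $prefix = substr($prefix, 0, $i);
    }
    return $prefix;
}

// The hexadecimal tails of the three result links quoted in this article.
$tails = [
    'ebac5573358cc3c0659257bfcf54763ec1c5ecff3b3fbd1d4c',
    'ebac5573358cc3c0659257bfcf546427d385fef6656de2404d6843da27',
    'ebac5573358cc3c0659257bfcf546427d385e6ff7a6de0434d6843da',
];

echo common_prefix($tails), "\n"; // prints ebac5573358cc3c0659257bfcf54
```

Whatever follows the common prefix in each tail is the part that varies with the result URL.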
I verified it like this:
1. First search for www.php100.com on Baidu.
The first result link:
http://www.baidu.com/link?url=…………ebac5573358cc3c0659257bfcf546427d385fef6656de2404d6843da27
Note how the remainder begins: 6427d385fef6656de2404d6843da27
2. Search for www.hao123.com on Baidu.
The first result link:
http://www.baidu.com/link?url=…………ebac5573358cc3c0659257bfcf546427d385e6ff7a6de0434d6843da
Note how the remainder begins: 6427d385e6ff7a6de0434d6843da
……
After searching for N websites, I found that whenever the domain name starts with "www.", the ciphertext starts with 6427d385.
"www." is four characters and the ciphertext 6427d385 is eight characters, so two characters of ciphertext correspond to one character of the URL.
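The two-to-one correspondence can be made concrete by splitting the ciphertext into two-character pairs and lining them up with the URL characters. This is only an illustrative sketch; pair_up is a hypothetical helper name, and the inputs are the "www." example from the text.

```php
<?php
// Hypothetical helper: line up each plaintext character with the
// two-hex-digit chunk at the same position in the ciphertext.
function pair_up(string $plain, string $cipherHex): array
{
    $chars = str_split($plain);        // one character per element
    $pairs = str_split($cipherHex, 2); // two hex digits per element
    $n = min(count($chars), count($pairs));
    $out = [];
    for ($i = 0; $i < $n; $i++) {
        $out[] = [$chars[$i], $pairs[$i]];
    }
    return $out;
}

foreach (pair_up('www.', '6427d385') as [$ch, $hex]) {
    echo $ch, ' => ', $hex, "\n";
}
```

Note that the three "w" characters map to three different pairs (64, 27, d3), so the code for a character depends on its position in the URL.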
So I wrote a PHP form that queries a link and extracts the ciphertext part, to make later inspection easier.
Here is the PHP source code:

    <html>
    <head>
    <title>Query Baidu link?url= real link form</title>
    </head>
    <body>
    <?php
    /*
     * getrealurl: get the URL address after a 301 or 302 redirect (by enenba.com)
     * @param  string $url  the URL to query
     * @return string       the real URL that $url redirects to
     */
    function getrealurl($url) {
        $header = get_headers($url, true);
        if (strpos($header[0], '301') || strpos($header[0], '302')) {
            if (is_array($header['Location'])) {
                // Several redirects in a row: take the last Location header.
                return $header['Location'][count($header['Location']) - 1];
            } else {
                return $header['Location'];
            }
        } else {
            return $url;
        }
    }
    $input = '<form method="get" action=""><input type="text" name="url" id="url" style="width:800px;" /><input type="submit" value="Submit" /></form></body></html>';
    $url = isset($_GET['url']) ? $_GET['url'] : '';
    if (empty($url)) exit($input);
    $urlreal = getrealurl($url);
    echo 'The real url is:' . $urlreal;
    // Strip the scheme. ltrim() takes a set of characters, not a prefix,
    // so it would also eat a leading "p", "h" or "t" of the host name.
    $urlreal = preg_replace('#^http://#', '', $urlreal);
    $search = '/ebac5573358cc3c0659257bfcf54([0-9a-f]+)/i';
    preg_match($search, $url, $r);
    $url_encode = $r[1]; unset($r);
    echo '<br/>The ciphertext part is: ' . $url_encode . '<br/>';
    $urlreal_arr = str_split($urlreal);
    $url_encode_arr = str_split($url_encode, 2);
    // (the listing breaks off here)

Research to be continued...

A statement up front: the articles on cnbeta were not published by me. My analysis is based only on my own ideas and research. It is just a process; as for whether it leads to results, I will draw my own conclusions. Please don't flame.

Continuing from the previous article: I looked carefully at the long code in Baidu result URLs and found that the ciphertext consists only of digits and the letters a to f, so it is hexadecimal.
Hexadecimal counts 0->1->2->3->4->5->6->7->8->9->a->b->c->d->e->f
I collected a series of URLs and tabulated the code in the first position:
ebac5573358cc3c0659257bfcf54XX...
The URL character corresponding to each XX code is as follows (SP is the space character, DEL is 0x7f):

    SP 33    0 23    @ 13    P 03    ` 73    p 63
    !  32    1 22    A 12    Q 02    a 72    q 62
    "  31    2 21    B 11    R 01    b 71    r 61
    #  30    3 20    C 10    S 00    c 70    s 60
    $  37    4 27    D 17    T 07    d 77    t 67
    %  36    5 26    E 16    U 06    e 76    u 66
    &  35    6 25    F 15    V 05    f 75    v 65
    '  34    7 24    G 14    W 04    g 74    w 64
    (  3b    8 2b    H 1b    X 0b    h 7b    x 6b
    )  3a    9 2a    I 1a    Y 0a    i 7a    y 6a
    *  39    : 29    J 19    Z 09    j 79    z 69
    +  38    ; 28    K 18    [ 08    k 78    { 68
    ,  3f    < 2f    L 1f    \ 0f    l 7f    | 6f
    -  3e    = 2e    M 1e    ] 0e    m 7e    } 6e
    .  3d    > 2d    N 1d    ^ 0d    n 7d    ~ 6d
    /  3c    ? 2c    O 1c    _ 0c    o 7c  DEL 6c

The characters evidently follow the ASCII table, but the order of the codes is scrambled. Within every column, though, the low digit runs in the same order:
3->2->1->0->7->6->5->4->b->a->9->8->f->e->d->c
Each group of four digits is in descending order, and the overall trend is decreasing.
What I don't understand is that _ (0c) and ` (73) are adjacent in ASCII, yet their codes are not. I can't see the pattern, so let's look at the second position:
ebac5573358cc3c0659257bfcf54XXYY...
The URL character corresponding to each YY code is as follows:

    SP 70    0 60    @ 50    P 40    ` 30    p 20
    !  71    1 61    A 51    Q 41    a 31    q 21
    "  72    2 62    B 52    R 42    b 32    r 22
    #  73    3 63    C 53    S 43    c 33    s 23
    $  74    4 64    D 54    T 44    d 34    t 24
    %  75    5 65    E 55    U 45    e 35    u 25
    &  76    6 66    F 56    V 46    f 36    v 26
    '  77    7 67    G 57    W 47    g 37    w 27
    (  78    8 68    H 58    X 48    h 38    x 28
    )  79    9 69    I 59    Y 49    i 39    y 29
    *  7a    : 6a    J 5a    Z 4a    j 3a    z 2a
    +  7b    ; 6b    K 5b    [ 4b    k 3b    { 2b
    ,  7c    < 6c    L 5c    \ 4c    l 3c    | 2c
    -  7d    = 6d    M 5d    ] 4d    m 3d    } 2d
    .  7e    > 6e    N 5e    ^ 4e    n 3e    ~ 2e
    /  7f    ? 6f    O 5f    _ 4f    o 3f  DEL 2f

The ciphertext of the second group follows the normal increasing hexadecimal order:
0->1->2->3->4->5->6->7->8->9->a->b->c->d->e->f
The overall trend is decreasing.
Now the third group:
ebac5573358cc3c0659257bfcf54XXYYZZ...
The URL character corresponding to each ZZ code is as follows (entries shown as … are missing):

    SP 84    0 94    @ a4    P b4    ` c4    p d4
    !  85    1 95    A a5    Q b5    a c5    q d5
    "  86    2 96    B a6    R b6    b c6    r d6
    #  87    3 97    C a7    S b7    c c7    s d7
    $  80    4 90    D a0    T b0    d c0    t d0
    %  81    5 91    E a1    U b1    e c1    u d1
    &  82    6 92    F a2    V b2    f c2    v d2
    '  83    7 93    G a3    W b3    g c3    w d3
    (  8c    8 9c    H ac    X …     h …     x …
    )  …     9 …     I …     Y …     i …     y …
    *  …     : …     J ae    Z be    j ce    z de
    +  8f    ; 9f    K af    [ bf    k cf    { df
    ,  88    < 98    L a8    \ b8    l c8    | d8
    -  89    = 99    M a9    ] b9    m c9    } d9
    .  8a    > 9a    N aa    ^ ba    n ca    ~ da
    /  8b    ? 9b    O ab    _ bb    o cb  DEL db

Without further explanation, the order is:
4->5->6->7->0->1->2->3->c->d->e->f->8->9->a->b
Overall, it is increasing.
I haven't looked at the later positions yet, but it looks like the hexadecimal digits are scrambled in groups of four; whether a given position is increasing or decreasing can only be determined with more data.
To be continued.

http://www.bkjia.com/PHPjc/326056.html
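One check worth trying on the samples quoted in this article is whether each position applies a fixed XOR value to the character code. The sketch below is a hypothesis test, not a statement of how Baidu generates these links. It assumes the plaintext behind each quoted ciphertext is the searched domain itself, and xor_keys is a hypothetical helper name.

```php
<?php
// Hypothetical helper: XOR each plaintext byte with the ciphertext byte
// (two hex digits) at the same position and return the resulting values.
function xor_keys(string $plain, string $cipherHex): array
{
    $n = min(strlen($plain), intdiv(strlen($cipherHex), 2));
    $keys = [];
    for ($i = 0; $i < $n; $i++) {
        $keys[] = ord($plain[$i]) ^ hexdec(substr($cipherHex, 2 * $i, 2));
    }
    return $keys;
}

// Assumption: these domains are the plaintexts of the two quoted samples.
$a = xor_keys('www.php100.com', '6427d385fef6656de2404d6843da27');
$b = xor_keys('www.hao123.com', '6427d385e6ff7a6de0434d6843da');

echo implode(' ', array_map('dechex', $a)), "\n";
echo implode(' ', array_map('dechex', $b)), "\n";
var_dump($a === $b); // bool(true) for these two samples
```

For these two samples the per-position values agree over all 14 characters, and the first three values (13, 50, a4) match the "w" entries 64, 27 and d3 in the three tables. The @ and P columns of the tables do not fit the same values, so this should be treated as a lead to test against more data rather than a conclusion.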