How to parse the Baidu search results link?url= parameter
Disclaimer: the articles on cnbeta were not published by me. This analysis is based only on my own ideas and research. It records a process; as for whether it leads to results, I have my own conclusions. Please don't flame.

After looking carefully at the long code in a Baidu result URL, I found that the ciphertext consists only of digits and the letters a to f, so it is hexadecimal. Hexadecimal digits run 0->1->2->3->4->5->6->7->8->9->a->b->c->d->e->f.

I collected a series of URLs and tallied the first code.

ebac5573358cc3c0659257bfcf54XX......

The URL characters corresponding to the XX code are as follows:

33 0    23 @    13 P    03 `    73 p    63 !
32 1    22 A    12 Q    02 a    72 q    62 "
31 2    21 B    11 R    01 b    71 r    61 #
30 3    20 C    10 S    00 c    70 s    60 $
37 4    27 D    17 T    07 d    77 t    67 %
36 5    26 E    16 U    06 e    76 u    66 &
35 6    25 F    15 V    05 f    75 v    65 '
34 7    24 G    14 W    04 g    74 w    64 (
3b 8    2b H    1b X    0b h    7b x    6b )
3a 9    2a I    1a Y    0a i    7a y    6a *
39 :    29 J    19 Z    09 j    79 z    69 +
38 ;    28 K    18 [    08 k    78 {    68 ,
3f
        2d N    1d ^    0d n    7d ~    6d /
3c ?    2c O    1c _    0c o    7c      6c

These look like the characters of an ASCII table, but with the order scrambled. Within a single digit position, however, the order is always the same: 3->2->1->0->7->6->5->4->b->a->9->8->f->e->d->c. Each group of four digits is in descending order, and the overall order is also decreasing. What is puzzling is that _ and ` are adjacent in ASCII, yet their codes, 0c and 73, jump far apart. I can't see the pattern there.

Let's look at the second code.

ebac5573358cc3c0659257bfcf54XXYY......

The URL characters corresponding to the YY code are as follows:

70 0    60 @    50 P    40 `    30 p    20 !
71 1    61 A    51 Q    41 a    31 q    21 "
72 2    62 B    52 R    42 b    32 r    22 #
73 3    63 C    53 S    43 c    33 s    23 $
74 4    64 D    54 T    44 d    34 t    24 %
75 5    65 E    55 U    45 e    35 u    25 &
76 6    66 F    56 V    46 f    36 v    26 '
77 7    67 G    57 W    47 g    37 w    27 (
78 8    68 H    58 X    48 h    38 x    28 )
79 9    69 I    59 Y    49 i    39 y    29 *
7a :    6a J    5a Z    4a j    3a z    2a +
7b ;    6b K    5b [    4b k    3b {    2b ,
7c
        6e N    5e ^    4e n    3e ~    2e /
7f ?    6f O    5f _    4f o    3f      2f

The ciphertext of the second group follows the normal increasing order of hexadecimal: 0->1->2->3->4->5->6->7->8->9->a->b->c->d->e->f. Overall it is decreasing.

Let's look at the third group.

ebac5573358cc3c0659257bfcf54XXYYZZ......

The URL characters corresponding to the ZZ code are as follows:

84 0    94 @    a4 P    b4 `    c4 p    d4 !
85 1    95 A    a5 Q    b5 a    c5 q    d5 "
86 2    96 B    a6 R    b6 b    c6 r    d6 #
87 3    97 C    a7 S    b7 c    c7 s    d7 $
80 4    90 D    a0 T    b0 d    c0 t    d0 %
81 5    91 E    a1 U    b1 e    c1 u    d1 &
82 6    92 F    a2 V    b2 f    c2 v    d2 '
83 7    93 G    a3 W    b3 g    c3 w    d3 (
8c 8    9c H    ac X    bc h    cc x    dc )
8b 9    9b I    ab Y    bb i    cd y    dd *
8e :    9e J    ae Z    be j    ce z    de +
8f ;    9f K    af [    bf k    cf {    df ,
88
        9a N    aa ^    ba n    ca ~    da /
8b ?    9b O    ab _    bb o    cb      db

I won't explain the order: 4->5->6->7->0->1->2->3->4->c->b->e->f->8->9->a->b. Overall it is increasing.

I haven't looked at the remaining digits, but I can probably say that the scheme scrambles the hexadecimal digits in groups of four. Whether a given position is increasing or decreasing needs more data to judge. Next time I will collect 1,000 URLs to make that judgment.
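The digit-order observations above can be checked mechanically rather than by eye. The following is a minimal Python sketch, not Baidu's actual scheme: it takes the intact rows of the first column of the XX and YY tables (the characters 0 through ;), lists the low hex digit of each code in ASCII order, and tests whether that digit differs from the low nibble of the character by one constant XOR mask. The names `XX_PAIRS`, `YY_PAIRS`, `low_nibble_order` and `constant_xor_mask` are made up for illustration, and the XOR test is only an assumption about what "scrambled in groups of four" might mean.

```python
# (hex code, URL character) pairs copied from the first column of the
# XX and YY tables in the article; only intact rows are included.
XX_PAIRS = [
    ("33", "0"), ("32", "1"), ("31", "2"), ("30", "3"),
    ("37", "4"), ("36", "5"), ("35", "6"), ("34", "7"),
    ("3b", "8"), ("3a", "9"), ("39", ":"), ("38", ";"),
]
YY_PAIRS = [
    ("70", "0"), ("71", "1"), ("72", "2"), ("73", "3"),
    ("74", "4"), ("75", "5"), ("76", "6"), ("77", "7"),
    ("78", "8"), ("79", "9"), ("7a", ":"), ("7b", ";"),
]


def low_nibble_order(pairs):
    """Low hex digit of each code, ordered by the character's ASCII value."""
    ordered = sorted(pairs, key=lambda pair: ord(pair[1]))
    return [code[-1] for code, _ in ordered]


def constant_xor_mask(pairs):
    """Return the one XOR mask linking code and ASCII low nibbles, else None."""
    masks = {int(code[-1], 16) ^ (ord(char) & 0xF) for code, char in pairs}
    return masks.pop() if len(masks) == 1 else None


if __name__ == "__main__":
    for name, pairs in (("XX", XX_PAIRS), ("YY", YY_PAIRS)):
        print(name, "->".join(low_nibble_order(pairs)), constant_xor_mask(pairs))
```

For these sample rows the XX position gives a mask of 3 and the YY position a mask of 0, which matches the two digit orders listed above. Rows with lost cells are left out on purpose, and a larger sample of URLs would be needed before drawing any conclusion about the remaining positions.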