Home  >  Article  >  Backend Development  >  Weibo short link algorithm PHP version implementation code_PHP tutorial

Weibo short link algorithm PHP version implementation code_PHP tutorial

WBOY
WBOYOriginal
2016-07-21 15:16:15852browse

Idea:
1) Generate a 32-bit signature string from md5 of the long URL, divided into 4 segments, each segment is 8 bytes;
2) Process these four segments in a loop, take 8 bytes, and read them into a hexadecimal string and operate with 0x3fffffff (30 bits 1), that is, more than 30 bits are ignored;
3) These 30 bits are divided into 6 segments, and each 5-digit number is used as an index of the alphabet to obtain a specific character, in sequence Get a 6-digit string;
4) The total md5 string can get 4 6-digit strings; any one of them can be used as the short URL address of this long URL;
The following is the PHP code:

Copy code The code is as follows:

function shorturl($url='', $prefix='', $suffix='' ) {
$base = array (
'a', 'b', 'c', 'd', 'e', ​​'f', 'g', 'h',
'i ', 'j', 'k', 'l', 'm', 'n', 'o', 'p',
'q', 'r', 's', 't', ' u', 'v', 'w', 'x',
'y', 'z', '0', '1', '2', '3', '4', '5') ;
$hex = md5($prefix.$url.$suffix);
$hexLen = strlen($hex);
$subHexLen = $hexLen / 8;
$output = array( );
for ($i = 0; $i < $subHexLen; $i++) {
$subHex = substr ($hex, $i * 8, 8);
$int = 0x3FFFFFFF & (1 * ('0x'.$subHex));
$out = '';
for ($j = 0; $j < 6; $j++) {
$val = 0x0000001F & $int;
$out .= $base[$val];
$int = $int >> 5;
}
$output[] = $out;
}
return $output;
}
$urls = shorturl('http://www.jb51.net/');
var_dump($urls);

Result
Copy code The code is as follows:

array(4) {
[0]=>
string(6) "alms1l"
[1]=>
string(6) "2ipmby"
[2]=>
string(6) "avo1hu"
[ 3]=>
string(6) "fdlban"
}

Another version:
Copy code The code is as follows:

function shorturl($url='', $prefix='', $suffix='') {
$base = array(
"a" ,"b","c","d","e","f","g","h",
"i","j","k","l","m ","n","o","p",
"q","r","s","t","u","v","w","x",
"y","z","0","1","2","3","4","5",
"6","7","8"," 9","A","B","C","D",
"E","F","G","H","I","J","K", "L",
"M","N","O","P","Q","R","S","T",
"U","V", "W","X","Y","Z");
$hex = md5($prefix.$url.$suffix);
$hexLen = strlen($hex);
$subHexLen = $hexLen / 8;
$output = array();
for ($i = 0; $i < $subHexLen; $i++) {
$subHex = substr ($hex, $i * 8, 8);
$int = 0x3FFFFFFF & (1 * ('0x'.$subHex));
$out = '';
for ($j = 0; $j < 6; $j++) {
$val = 0x0000003D & $int;
$out .= $base[$val];
$int = $int >> 5;
}
$output[] = $out;
}
return $output;
}

Result:
Copy the code The code is as follows:

array(4) {
[0] =>
string(6) "6jmMVj"
[1] =>
string(6) "2EnIby"
[2] =>
string(6) "6vIVfu"
[3] =>
string(6) "B7Fb6n "
}

But the collision rate of the upgraded version is higher, I don’t know why.
Test code for testing collision:
Copy code The code is as follows:

$result = array();
$repeats= array();
$loop = 20000;
for($i=0;$i<$loop;$i++){
$url = 'http://www.jb51 .net/?id='.$i;
$shorta = shorturl($url);
$short = $shorta[0];
if(in_array($short, $result)){
$repeats[] = $short;
}
$result[] = $short;
}
$result = array();
for($i=0; $i<$loop;$i++){
$url = 'http://www.jb51.net/?id='.$i;
$shorta = shorturl($url);
$short = $shorta[0];
if(in_array($short, $repeats)){
$result[$short][] = $url;
}
}
var_dump($repeats);
var_dump($result);

Result:
Copy code The code is as follows:

array(8) {
[0] =>
string(6) "3eQBzq"
[1] =>
string(6) "uQFnay"
[2] =>
string(6) "qEZbIv"
[3] =>
string(6) "fMneYf"
[4] =>
string(6) "FJj6Fr"
[5] =>
string(6) "3Eviym"
[6] =>
string(6) "j2mmuy"
[7] =>
string(6) "jyQfIv"
}
array(8) {
'jyQfIv' =>
array(2) {
[0] =>
string(26) "http://www.jb51.net/?id=1640"
[1] =>
string(27) "http://www.jb51.net/?id=18661"
}
'fMneYf' =>
array(2) {
[0] =>
string(26) "http://www.jb51.net/?id=2072"
[1] =>
string(26) "http://www.jb51.net/?id=8480"
}
'3eQBzq' =>
array(2) {
[0] =>
string(26) "http://www.jb51.net/?id=4145"
[1] =>
string(26) "http://www.jb51.net/?id=4273"
}
'j2mmuy' =>
array(2) {
[0] =>
string(26) "http://www.jb51.net/?id=7131"
[1] =>
string(27) "http://www.jb51.net/?id=17898"
}
'qEZbIv' =>
array(2) {
[0] =>
string(26) "http://www.jb51.net/?id=7320"
[1] =>
string(26) "http://www.jb51.net/?id=8134"
}
'uQFnay' =>
array(2) {
[0] =>
string(26) "http://www.jb51.net/?id=7347"
[1] =>
string(26) "http://www.jb51.net/?id=7962"
}
'FJj6Fr' =>
array(2) {
[0] =>
string(26) "http://www.jb51.net/?id=8628"
[1] =>
string(26) "http://www.jb51.net/?id=9031"
}
'3Eviym' =>
array(2) {
[0] =>
string(27) "http://www.jb51.net/?id=11175"
[1] =>
string(27) "http://www.jb51.net/?id=14437"
}
}

www.bkjia.comtruehttp://www.bkjia.com/PHPjc/325982.htmlTechArticle思路: 1)将长网址md5生成32位签名串,分为4段, 每段8个字节; 2)对这四段循环处理, 取8个字节, 将他看成16进制串与0x3fffffff(30位1)与操作, 即超过...
Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn