
Sorting 100 million QQ numbers in PHP: solutions

WBOY (original)
2016-06-13 13:50:06, 937 views

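Sorting 100 million QQ numbers with PHP
Fed and watered, so here is another post.
The idea: split the data into a few thousand pieces, sort each piece, then merge them.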


First, create a txt file containing 100 million QQ numbers.

PHP code
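<?php // Create a txt file with 100 million QQ numbers (takes roughly 85-100 seconds)

set_time_limit(0);
$fn = 'qq.txt';
$fp = fopen($fn, 'w');

$st = microtime(true);

// 10,000 blocks of 10,000 numbers each = 100 million numbers.
// range(0, 10000) would give 10,001 blocks, i.e. 10,000 numbers too many.
$l = range(0, 9999);
shuffle($l);
foreach ($l as $v)
{
    // Block $v covers ($v+1)*10000 .. ($v+1)*10000+9999, so the smallest number is 10000.
    $arr = range(($v + 1) * 10000, ($v + 1) * 10000 + 9999);
    shuffle($arr);
    fputs($fp, implode("\n", $arr)."\n");
    unset($arr);
}
fclose($fp);

echo microtime(true) - $st;

?>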



 

After a minute or two, the 100 million shuffled QQ numbers are ready.

The QQ numbers start at 10000. The file is about 840 MB.
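As a rough cross-check of that file size, here is a minimal sketch. It assumes the file holds 100 million consecutive numbers starting at 10000, one per line, each followed by a one-byte newline. The helper name `estimate_bytes` is made up for this illustration, and the type declarations assume PHP 7 or later.

```php
<?php
// Estimate the size of a file holding every integer from $min to $max,
// one per line, each followed by a single "\n" byte.
function estimate_bytes(int $min, int $max): int
{
    $bytes = 0;
    // Walk the digit-length bands: 10000-99999, 100000-999999, ...
    for ($lo = $min; $lo <= $max; $lo = $hi + 1) {
        $digits = strlen((string) $lo);
        $hi = min($max, 10 ** $digits - 1);          // last number with this many digits
        $bytes += ($hi - $lo + 1) * ($digits + 1);   // digits plus the newline
    }
    return $bytes;
}

$bytes = estimate_bytes(10000, 100009999); // 100 million consecutive numbers
printf("%d bytes, about %.0f MiB\n", $bytes, $bytes / (1024 * 1024));
```

Under those assumptions the estimate comes to 888,940,000 bytes, roughly 848 MiB, which is in line with the "about 840 MB" reported above. Almost all of it comes from the 90 million 8-digit numbers at 9 bytes per line.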



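Next, partition the numbers into a few thousand files.

The length of the QQ number is used as the folder name, and its first 3 digits as the file name.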
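The reason this layout works is that, for positive integers without leading zeros, ordering by (length, first 3 digits) never contradicts numeric order, so sorted buckets can simply be written out one after another. A small sketch of that property follows; `bucket_path` and `bucket_rank` are hypothetical helper names, the sample values are made up, and the `<=>` operator assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: map a QQ number to its bucket file,
// e.g. "12345678" -> "qq_no/8/123.txt" (directory = length, file = first 3 digits).
function bucket_path(string $qq, string $root = 'qq_no'): string
{
    return $root . '/' . strlen($qq) . '/' . substr($qq, 0, 3) . '.txt';
}

// Hypothetical helper: rank buckets by (length, prefix),
// which is the order in which they have to be written out.
function bucket_rank(string $qq): int
{
    return strlen($qq) * 1000 + (int) substr($qq, 0, 3);
}

$sample = ['100009999', '10000', '99999', '12345678', '123456', '999999', '12399999'];

// Sorting by bucket rank first and by value second...
$by_bucket = $sample;
usort($by_bucket, function ($a, $b) {
    return [bucket_rank($a), (int) $a] <=> [bucket_rank($b), (int) $b];
});

// ...gives the same order as a plain numeric sort.
$numeric = $sample;
sort($numeric, SORT_NUMERIC);

echo bucket_path('12345678'), "\n";   // qq_no/8/123.txt
var_dump($by_bucket === $numeric);    // bool(true)
```

With numbers between 5 and 9 digits long this yields at most 5 x 900 buckets, which matches the "few thousand files" mentioned above.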

PHP code
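<?php // Partition by number length and prefix (takes roughly 360-400 seconds)

set_time_limit(0);
$st = microtime(true);

if (!is_dir('qq_no')) mkdir('qq_no');
$file = fopen('qq.txt', 'r');

$g = 1024 * 1024;   // read about 1 MB per pass
$end_s = '';        // incomplete line carried over from the previous chunk
while (!feof($file))
{
    // fread continues from the current position, so no fseek is needed.
    $s = $end_s . fread($file, $g);

    // Process only up to the last newline; keep the remainder for the next pass.
    $end = strrpos($s, "\n");
    if ($end === false)
    {
        $end_s = $s;
        continue;
    }
    $end_s = substr($s, $end + 1);

    $text_arr = array();
    foreach (explode("\n", substr($s, 0, $end)) as $v)
    {
        if ($v != '')
        {
            // Directory = number length, file name = first 3 digits.
            $text_arr[strlen($v)][substr($v, 0, 3)][] = $v;
        }
    }

    foreach ($text_arr as $k => $v)
    {
        $n_dir = 'qq_no/'.$k;
        if (!is_dir($n_dir)) mkdir($n_dir);
        foreach ($v as $tag => $val)
        {
            $n_tf = fopen($n_dir.'/'.$tag.'.txt', 'a');
            fputs($n_tf, implode("\n", $val)."\n");
            fclose($n_tf);   // close each handle instead of leaving it open
        }
    }
    unset($text_arr);
}
fclose($file);

echo microtime(true) - $st;

?>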





Finally, sort each file and merge the data.

PHP code
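<?php // Sort and merge (takes roughly 800-920 seconds)

set_time_limit(0);
$st = microtime(true);

// 'w' rather than 'a+', so a rerun does not append to an old result.
$qq_done = fopen('qq_done.txt', 'w');

$root = 'qq_no';
$dirs = array();
foreach (scandir($root) as $val)
{
    if ($val != '.' && $val != '..')
        $dirs[$val] = scandir($root.'/'.$val);
}
// Visit shorter numbers first; numeric order also holds if a length reaches 10.
ksort($dirs, SORT_NUMERIC);

foreach ($dirs as $key => $val)
{
    foreach ($val as $v)
    {
        if ($v != '.' && $v != '..')
        {
            $file = $root.'/'.$key.'/'.$v;
            // Skip the empty string left by the trailing newline of each file.
            $arr = file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
            sort($arr, SORT_NUMERIC);
            // End every bucket with a newline so consecutive buckets do not run together.
            fputs($qq_done, implode("\n", $arr)."\n");
            unlink($file);
        }
    }
    rmdir($root.'/'.$key);
}
rmdir($root);
fclose($qq_done);

echo microtime(true) - $st;

?>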
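The merge only produces a globally sorted file if the length directories are visited in numeric order. `scandir` returns names in ascending alphabetical order by default, which matches numeric order only while every directory name is a single digit. A minimal sketch of the difference, using made-up directory names (the generated data here only produces lengths 5 through 9, so the "10" entry is hypothetical):

```php
<?php
// Directory names as they would appear if 10-digit numbers were present.
$names = ['10', '5', '6', '7', '8', '9'];

$as_strings = $names;
sort($as_strings, SORT_STRING);   // alphabetical: "10" sorts before "5"

$as_numbers = $names;
sort($as_numbers, SORT_NUMERIC);  // numeric: 5, 6, 7, 8, 9, 10

echo implode(' ', $as_strings), "\n"; // 10 5 6 7 8 9
echo implode(' ', $as_numbers), "\n"; // 5 6 7 8 9 10
```

The file names inside each directory do not have this problem, since every prefix is exactly 3 digits long and alphabetical order then coincides with numeric order.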




In total it took a little over 20 minutes.

It works, but the method is pretty crude 0_0. Experts of the forum, please help improve it.


------Solution--------------------
Here is a C version.
C/C++ code

#include <stdio.h>

#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 100000000

int a[1 + N/BITSPERWORD];

void set(int i)
{
    a[i>>SHIFT] |= (1<<(i & MASK));
}

void clr(int i)
{
    a[i>>SHIFT] &= ~(1<<(i & MASK));
}

int test(int i)
{
    return a[i>>SHIFT] & (1<<(i & MASK));
}

------Solution--------------------
Since a data file already exists, there is no need to build insert strings.
PHP code
set_time_limit(0);
$sql =
                 
              
              
        
            