
The data returned by a PHP query is too large and exporting a CSV file is very slow.

The framework is CodeIgniter (CI) and the database driver is sqlsrv (SQL Server 2008). The SQL query returns a result set of 20,000 rows. When I execute this statement through the CI framework, the web page keeps spinning and the 20,000 rows take about 30 seconds. The same statement runs in under a second directly in SQL Server 2008 R2, and CI's own query timing reports an execution time of 200 milliseconds, so the statement itself is not slow. Personally, I think it is slow because too much data is returned: 20,000 rows with about 20 fields. Since what I am doing is statistics, I want to export the data into a CSV file, which is why there is so much data. How should I deal with this?
My SQL statement is select * from table where create_time between 'xxxx-xx-xx 00:00:00' and 'xxxx-xx-xx 23:59:59'; it filters on a time period chosen by the user. If the chosen range is small there is little data; if the range is large there is a lot of data, and that is when it becomes very slow.

public function aa(){

    $sql = "select * from table where create_time between 'xxxx-xx-xx 00:00:00' and 'xxxx-xx-xx 23:59:59'";
    $result = $this->db->query($sql)->result_array();

}
This is already the simplest possible query. The SQL statement itself finishes in under a second, but it is very slow when run through the browser. With tens of thousands of rows in the result, the browser keeps spinning for more than 40 seconds, and even 500 rows take 3 or 4 seconds.

I am wondering whether the result_array() step takes too much time because it loops over tens of thousands of records with twenty fields each.
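A minimal sketch of one alternative, assuming CodeIgniter 3: the result object has an unbuffered_row() method that fetches one row at a time, so nothing like result_array()'s 20,000-row array is ever built in memory. The method name aa() and the query are just the ones from the snippet above.

public function aa(){

    $sql   = "select * from table where create_time between 'xxxx-xx-xx 00:00:00' and 'xxxx-xx-xx 23:59:59'";
    $query = $this->db->query($sql);

    // unbuffered_row('array') returns the next row as an array and NULL when
    // the result set is exhausted, so the loop ends by itself
    while ($row = $query->unbuffered_row('array')) {
        // handle a single row here (e.g. write it straight into the CSV)
        // instead of holding all 20,000 rows x 20 fields at once
    }

}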

==============2017-05-16 16:01 update===============
I stopped using the CI framework's result_array(). While reading the CI framework documentation, I came across this passage:
Often, you will need to provide a database connection ID or a result ID. The connection ID can be accessed like this:

$this->db->conn_id;

The result ID can be accessed from the result object returned by the query, like this:

$query = $this->db->query("SOME QUERY");
$query->result_id;

So I modified my code to use result_id and read each record in a loop:
$sql ="select xxx";
$query=$this->db->query($sql) ;
//Because I am using sqlsrv here, I use sqlsrv_fetch_array to read each line in a loop
//Then write a csv file after each line read
while($row=sqlsrv_fetch_array($query ->result_id,SQLSRV_FETCH_ASSOC)){
//Here is the code to write row into the csv file
}

The specific code is as follows

$sql="xxx";
$query=$this->db->query($sql);
$filename= "CostDetail.csv";//导出的文件名
header('Content-Type: application/vnd.ms-excel');
header('Content-Disposition: attachment;filename="'.$filename.'"');
header('Cache-Control: max-age=0');

// 打开PHP文件句柄,php://output 表示直接输出到浏览器
$fp = fopen('php://output', 'a');

// 输出Excel列名信息
$head = array(xxx);
foreach ($head as $i => $v) {
    // CSV的Excel支持GBK编码,一定要转换,否则乱码
    $head[$i] = iconv('utf-8', 'gbk', $v);
}

// 将数据通过fputcsv写到文件句柄
fputcsv($fp, $head);

// 计数器
$cnt = 0;
// 从数据库中获取数据,为了节省内存,不要把数据一次性读到内存,从句柄中一行一行读即可
$limit = 5000;

while ($row=sqlsrv_fetch_array($query->result_id,SQLSRV_FETCH_ASSOC)){
    $cnt ++;
    if ($limit == $cnt) { //刷新一下输出buffer,防止由于数据过多造成问题
        ob_flush();
        flush();
        $cnt = 0;
    }
    //这里是把每个字段的编码转成gbk
    $newRow[] = $this->_mb_convert_encoding($row['edis_orgsoid']);
    ....    
    unset($row);
    fputcsv($fp, $newRow);
    unset($newRow);
}

exit;

The file can be exported this way, but for a CSV of 19,204 rows and 16 columns (about 3 MB) the export still takes 40-45 seconds. Can this time be reduced any further?
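One small thing that might shave some time, sketched below under the assumption that part of the cost is the per-field encoding work inside the loop: convert the whole row with a single array_map() call and one fputcsv(), rather than one conversion statement per field. The poster's _mb_convert_encoding helper and explicit column order are replaced here by an illustrative iconv callback; the bigger win is still the background-export approach suggested in the replies below.

$cnt = 0;
while ($row = sqlsrv_fetch_array($query->result_id, SQLSRV_FETCH_ASSOC)) {
    // convert every field of the row to GBK in one pass
    $newRow = array_map(function ($v) {
        return iconv('utf-8', 'gbk//IGNORE', (string) $v);
    }, array_values($row));

    fputcsv($fp, $newRow);

    // flush only every 5000 rows, as before
    if (++$cnt % 5000 === 0) {
        ob_flush();
        flush();
    }
}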

Asked by 怪我咯, 2736 days ago

Replies (2)

  • 大家讲道理 — 2017-05-17 09:57:57

    That is simply how exporting all the data goes: it has to be queried out before it can be written into the file. I usually run the export in a background process and then send the result by email; a sketch of that idea follows below.

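A minimal sketch of that background-export idea, assuming CodeIgniter 3 and its CLI entry point (e.g. php index.php export cost_detail 'start' 'end', run from cron or a queue worker). The controller name Export, the method cost_detail, the output path and the mail step are all illustrative, not the poster's actual code.

class Export extends CI_Controller {

    public function cost_detail($start, $end)
    {
        // only allow this to run from the command line / cron, never the browser
        if (!$this->input->is_cli_request()) {
            show_error('CLI only');
        }

        $sql   = "select * from table where create_time between ? and ?";
        $query = $this->db->query($sql, array($start, $end));

        // write to a file on disk instead of php://output, so no web request
        // ever has to wait for the export to finish
        $path = APPPATH . 'cache/CostDetail_' . date('Ymd_His') . '.csv';
        $fp   = fopen($path, 'w');

        while ($row = $query->unbuffered_row('array')) {
            fputcsv($fp, $row);
        }
        fclose($fp);

        // then attach $path with the Email library and send it, or store the
        // path so the user can download the finished file later
    }
}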
  • 给我你的怀抱 — 2017-05-17 09:57:57

    Some ideas:

    • The front end should offer a few fixed time ranges to choose from, and the back end runs a scheduled task for each of those ranges, pulls the data from the database, stores it on the server as a static file, and records a timemodified value for it;

    • Every time the table is inserted into, deleted from or updated, the regeneration task is triggered; the scheduled regeneration can also be done at times of day when few customers are online, for example at night;

    • On every export, compare the timemodified values: if they match, serve the previously exported static file directly; if they differ, regenerate on the spot, but that case should be relatively rare (a sketch of this check follows below);

    Fetching the data from the database this way will not affect customers using the site. I hope this helps.
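A minimal sketch of that timemodified comparison, assuming one cached CSV per selectable time range plus a small sidecar file holding the timestamp of the last regeneration; every path and name here is illustrative.

function serve_cached_csv($rangeKey, $tableModifiedTs)
{
    $csvPath  = '/var/www/exports/' . $rangeKey . '.csv';
    $metaPath = $csvPath . '.meta';

    // timestamp recorded when the scheduled task last rebuilt this file
    $cachedTs = is_file($metaPath) ? (int) file_get_contents($metaPath) : 0;

    if (is_file($csvPath) && $cachedTs === $tableModifiedTs) {
        // cache still matches the table's timemodified: stream the
        // pre-built file, no query and no per-row work needed
        header('Content-Type: application/vnd.ms-excel');
        header('Content-Disposition: attachment;filename="' . basename($csvPath) . '"');
        readfile($csvPath);
        return true;
    }

    // otherwise regenerate on the spot (or let the scheduled task do it)
    // and write the new timestamp into the .meta file afterwards
    return false;
}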
