
Performance comparison of MySQL insertion methods under large data volumes

黄舟 (Original) · 2017-02-23 10:53:17



Whether in day-to-day business processing or database import/export, you may need to insert large volumes of data. Both the insertion method and the storage engine affect insertion speed. This article analyzes and compares several insertion methods, from both a theoretical and a practical perspective, to help choose the right one in future applications.

Insertion analysis

The time required to insert a record in MySQL is made up of the following factors (the numbers indicate approximate proportions):

  • Connection: (3)

  • Send query to server: (2)

  • Parse query: (2)

  • Insert record: (1 × size of record)

  • Insert indexes: (1 × number of indexes)

  • Close: (1)

If we execute one SQL statement per inserted row, every step except connecting and closing is repeated N times, which is very time-consuming. There are several ways to optimize:

  1. Put multiple rows in each INSERT statement (batch insert)

  2. Wrap all the insert statements in a single transaction

  3. Use LOAD DATA to import the data
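Method 1 can be sketched as a small helper that assembles multi-row INSERT statements in chunks, so that each statement stays well under max_allowed_packet. The function buildInsertChunks() below is a hypothetical helper for illustration, not part of the article's test script:

```php
<?php
// Assemble multi-row INSERT statements in fixed-size chunks so each
// statement stays under max_allowed_packet.
function buildInsertChunks(array $rows, int $chunkSize): array
{
    $statements = [];
    foreach (array_chunk($rows, $chunkSize) as $chunk) {
        $values = [];
        foreach ($chunk as $row) {
            $values[] = "({$row[0]}, '{$row[1]}')";
        }
        $statements[] = 'insert into `test` (`id`, `name`) values ' . implode(',', $values);
    }
    return $statements;
}

// Example: 5 rows in chunks of 2 produce 3 statements.
$rows = [];
for ($i = 0; $i < 5; $i++) {
    $rows[] = [$i, 'testtest'];
}
$statements = buildInsertChunks($rows, 2);
```

Each returned statement can then be passed to the database with a single query call, replacing thousands of round trips with a handful.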

The performance of each method is as follows.

InnoDB engine

InnoDB gives MySQL transaction-safe (ACID-compliant) tables with commit, rollback, and crash-recovery capabilities. It also provides row-level locking and FOREIGN KEY constraints.

InnoDB is designed for high-volume database systems, and its CPU efficiency is hard for other disk-based relational database engines to match. Technically, InnoDB is a complete database engine running inside MySQL; it maintains its own dedicated buffer pool in main memory for caching data and indexes.
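Because InnoDB is the transactional engine, method 2 (wrapping all inserts in one transaction) applies here: the engine flushes once at commit instead of after every statement. A minimal PDO sketch is shown below, using an in-memory SQLite database so it runs anywhere; for the article's setup you would swap the DSN for 'mysql:host=localhost;dbname=test'. Unlike the article's test script, this sketch also uses a prepared statement:

```php
<?php
// Transactional bulk insert via PDO. SQLite in-memory is used here only so
// the sketch is self-contained; the pattern is identical for MySQL/InnoDB.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('create table test (id integer primary key, name text)');

$db->beginTransaction();
$stmt = $db->prepare('insert into test (id, name) values (?, ?)');
for ($i = 0; $i < 1000; $i++) {
    $stmt->execute([$i, 'testtest']);
}
$db->commit();  // single flush for all 1000 rows

$count = (int) $db->query('select count(*) from test')->fetchColumn();
```

With autocommit on (the default), each of the 1000 inserts would be its own transaction; grouping them amortizes the commit cost across the whole batch.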

Test environment

MacBook Air (mid-2012), Apache 2.2.26, PHP 5.5.10, MySQL 5.6.16

1,000,000 rows in total

After insertion, the database size is 38.6MB (without index), 46.8MB (with index)

  • Single insert, without index: total time 229s, peak memory 246KB

  • Single insert, with index: total time 242s, peak memory 246KB

  • Batch insert, without index: total time 10s, peak memory 8,643KB

  • Batch insert, with index: total time 16s, peak memory 8,643KB

  • Transaction insert, without index: total time 78s, peak memory 246KB

  • Transaction insert, with index: total time 82s, peak memory 246KB

  • LOAD DATA, without index: total time 12s, peak memory 246KB

  • LOAD DATA, with index: total time 11s, peak memory 246KB

MyISAM engine

MyISAM was MySQL's default storage engine before version 5.5 (the MySQL 5.6 used here defaults to InnoDB). Its design is simple, and it supports full-text search.

Test environment

MacBook Air (mid-2012), Apache 2.2.26, PHP 5.5.10, MySQL 5.6.16

1,000,000 rows in total

After insertion, the database size is 19.1MB (without index), 38.6MB (with index)

  • Single insert, without index: total time 82s, peak memory 246KB

  • Single insert, with index: total time 86s, peak memory 246KB

  • Batch insert, without index: total time 3s, peak memory 8,643KB

  • Batch insert, with index: total time 7s, peak memory 8,643KB

  • LOAD DATA, without index: total time 6s, peak memory 246KB

  • LOAD DATA, with index: total time 8s, peak memory 246KB

Summary

The amount of data tested is not very large, but it gives a rough sense of the relative speed of these insertion methods. LOAD DATA is clearly the fastest. It is somewhat more cumbersome because it requires writing the data out to a file first, but it offers the best balance of memory use and speed.
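The file-writing half of the LOAD DATA approach can be sketched on its own in pure PHP: write one comma-separated row per line, then hand the file to MySQL in a single statement. The final LOAD DATA call is shown commented out since it needs a running MySQL server with appropriate FILE privileges:

```php
<?php
// Materialize rows as a comma-separated file for LOAD DATA INFILE.
// Values are written unquoted, matching `fields terminated by ','`
// with no ENCLOSED BY clause.
$filename = tempnam(sys_get_temp_dir(), 'loaddata');
$fp = fopen($filename, 'w');
for ($i = 0; $i < 3; $i++) {
    fputs($fp, "$i,testtest\n");  // one row per line: id,name
}
fclose($fp);

$contents = file_get_contents($filename);
// $db->exec("load data infile '$filename' into table test fields terminated by ','");
unlink($filename);  // remove the temporary file when done
```

Note that LOAD DATA expects lines terminated by '\n' by default; writing '\r\n' (as in the original test script) would leave a stray carriage return attached to the last field of every row.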

Test code

<?php
$dsn = 'mysql:host=localhost;dbname=test';
$db = new PDO($dsn, 'root', '', array(PDO::ATTR_PERSISTENT => true));
// Remove the data from the previous run
$db->query('delete from `test`');
// Start timing
$start_time = time();
$sum = 1000000;
// Which method to test: 1 = single, 2 = batch, 3 = transaction, 4 = load data
$num = 1;

if ($num == 1) {
    // Single-row inserts
    for ($i = 0; $i < $sum; $i++) {
        $db->query("insert into `test` (`id`, `name`) values ($i, 'testtest')");
    }
} elseif ($num == 2) {
    // Batch insert: flush every 100,000 rows so each statement
    // stays under max_allowed_packet
    $values = '';
    for ($i = 0; $i < $sum; $i++) {
        $values .= ($values === '' ? '' : ',') . "($i, 'testtest')";
        if (($i + 1) % 100000 == 0 || $i == $sum - 1) {
            $db->query("insert into `test` (`id`, `name`) values $values");
            $values = '';
        }
    }
} elseif ($num == 3) {
    // Transaction insert
    $db->beginTransaction();
    for ($i = 0; $i < $sum; $i++) {
        $db->query("insert into `test` (`id`, `name`) values ($i, 'testtest')");
    }
    $db->commit();
} elseif ($num == 4) {
    // Write the rows to a file, then import with LOAD DATA
    $filename = dirname(__FILE__) . '/test.sql';
    $fp = fopen($filename, 'w');
    for ($i = 0; $i < $sum; $i++) {
        fputs($fp, "$i,testtest\n");
    }
    fclose($fp);
    $db->exec("load data infile '$filename' into table test fields terminated by ','");
}

$end_time = time();
echo "Total time: ", ($end_time - $start_time), "s\n";
echo "Peak memory: ", round(memory_get_peak_usage() / 1000), "KB\n";

?>

The above is a performance analysis and comparison of various methods for inserting large amounts of data into MySQL.

